A MOMENT OF REST
For the past few days, I have been doing nothing. This comes after spending the first six weeks of my internship collecting data. I used that free time to explore some personal projects and to apply the skills I have learned so far. The biggest learning for me is the value of data in NLP. W spent more than half of my internship period collecting data, just to make sure that we get the best quality possible.
In most NLP domains such as topic modeling is very sensitive and a slight change in the input data will deeply affect the outcome of a model. This is the reason why my supervisor, the one who will work directly with the end result of our work, requires that we get the original edition of the books or the earliest edition available, this is because books centuries years old may have been published multiple times and the text and contents may have been changed to adapt to the period during which the edition is printed.
Now that I am planning out my personal projects, I will decide to spend more time collecting the data and figure out the best data for my project without worrying about how I would collect it. Once I figure out the best data, I will do everything to collect it. It took me 4 different methods to figure out the best one, so I will do the same thing for my project.
Comments
Post a Comment