FINAL POINT

    I recently completed my internship at Vanderbilt University, where I worked with an English professor and a librarian on a text-mining project. The goal of the project was to support the professor's research into how law is represented in general dialogue during the Romantic period (a period of literature between 1750 and 1850). The data we collected were therefore English books published in the United Kingdom and Ireland between 1770 and 1835. The main source of data was Project Gutenberg, an online library of free eBooks.


    We faced many challenges while working on this project. One of the biggest was figuring out which books to collect. While Project Gutenberg provides some information, such as the author and edition of a book, it does not provide the year of publication, so we had no way of knowing whether a given book fell within our date range. We first tried to solve the problem by collecting all the English-language books from Project Gutenberg and then checking their publication dates on third-party sources such as Wikidata or the Library of Congress. However, we then ran into the problem of identifying the correct edition. Since we were dealing with books that are two centuries old, finding the first edition is almost impossible. For some books we could only find the publication date of a later edition, which made the dates unreliable. In the end, we solved the problem by using the author as a reference. The English professor I was working with had a list of all British and Irish authors who published their works during the Romantic period, so we collected only the books written by these authors. While we did not know exactly when each book was published, we were confident that it was published during the era we were interested in.
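
    The author-based filter described above can be sketched in a few lines of plain Python. This is only an illustration, not the code from the project (which ran on PySpark): the record fields (title, author, language), the sample records, and the name normalization are all assumptions made for the example.

```python
import unicodedata


def normalize_author(name):
    """Reduce an author name to a comparable key.

    Assumption: catalog names may look like "Austen, Jane, 1775-1817",
    while the reference list may use "Jane Austen". We strip accents,
    punctuation, and life dates, then sort the remaining name tokens.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    tokens = [t for t in cleaned.lower().split() if not t.isdigit()]
    return " ".join(sorted(tokens))


def filter_by_author(books, reference_authors):
    """Keep English-language books whose author is on the reference list."""
    allowed = {normalize_author(a) for a in reference_authors}
    return [
        book
        for book in books
        if book["language"] == "en" and normalize_author(book["author"]) in allowed
    ]


# Hypothetical sample records, for illustration only.
sample_books = [
    {"title": "Emma", "author": "Austen, Jane, 1775-1817", "language": "en"},
    {"title": "Moby Dick", "author": "Melville, Herman, 1819-1891", "language": "en"},
    {"title": "Frankenstein", "author": "Shelley, Mary Wollstonecraft, 1797-1851", "language": "en"},
]
sample_authors = ["Jane Austen", "Mary Wollstonecraft Shelley"]

kept = filter_by_author(sample_books, sample_authors)
kept_titles = [book["title"] for book in kept]
```

    Matching on a normalized name is a simple choice that can confuse two authors who share a name, so a real pipeline would probably need some manual checking as well.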


    This internship helped me solidify my existing skills and learn new ones. The most valuable skill I strengthened was problem-solving. Since this was not like a traditional programming project, I had to improvise to accomplish the task. It is important to point out that I was the only intern and the only person working on the technical part of the project. I was given a problem and an expected outcome, but I had to come up with the ways to get there myself, and I received minimal technical support from my mentors. Because of this, I had to rely heavily on my problem-solving skills, and by the end of the internship I could see how much they had improved. One of the new technical skills I learned was writing SQL queries. I started the internship knowing only Python and its data processing libraries, and I did not expect to use SQL at any point in the project, but I did, so I had to learn it. With the help of one of my mentors and some time spent learning on my own, I managed to learn to write queries with MySQL, and I stored the final data in Avro format. I also learned to use Spark, a big data tool, since our work was done in Spark through PySpark and its libraries.


    Apart from the technical side, I also learned non-technical skills from this internship, the biggest one being time management. My internship was fully remote. It was my first ever remote job, and it took me some time to adapt to it. My mentors trusted me fully, so I was not supervised at all. I set the deadlines for my own work, and I could work whenever I wanted as long as I met them. That meant I had to make a schedule and decide when to work, when to take a break, and when to do other things. It required self-discipline to get work done during set hours every day.


    All in all, this internship was a unique experience for me. It was my first ever internship, it was remote, and I worked in the field of literature. It is unlikely that I will have the same experience again in my future career. I worked with great mentors who were very understanding and very supportive. I faced difficulties, and there were times when I thought I would fail, but in the end I got the job done and delivered clean final data that my mentors really appreciated. They were more than satisfied with my work and told me so as often as they could towards the end of the internship. I can say that I am satisfied with my internship, and I consider it an accomplishment. I would do it again if given the opportunity.

