I LEARNED SOMETHING NEW

    Today, I learned something very interesting about the world of libraries. As I mentioned in my last blog post, I had to come up with a new method to collect the year and region of publication of the books that we need for the project. One of my supervisors suggested that I use data from the Library of Congress. This is the national library of the United States, so its catalog data should be reliable, and we are very likely to find all the information we need in it. As far as I could find, the Library of Congress does not offer a conventional API for this data, but its catalog can be queried through a protocol called Search/Retrieve via URL (SRU). An SRU query starts with the base URL of the server you want the data from, followed by parameters for the keyword you want to search and the maximum number of results you want back, so the whole query takes the form of one long URL. When the query is made, the result comes back in the browser in XML format.
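
To make the "long URL" idea concrete, here is a minimal sketch of how such a query could be assembled in Python. The parameter names (version, operation, query, maximumRecords, recordSchema) are the standard SRU ones, but the base URL is a placeholder I made up, and the "marcxml" schema name is an assumption about what the server accepts.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: replace with the SRU base URL of the server you query.
BASE_URL = "https://sru.example.org/catalog"


def build_sru_url(keyword, max_records=5, base_url=BASE_URL):
    """Return a searchRetrieve URL for the given keyword and result limit."""
    params = {
        "version": "1.1",               # SRU protocol version
        "operation": "searchRetrieve",  # the SRU search operation
        "query": keyword,               # the search term
        "maximumRecords": max_records,  # upper limit on returned records
        "recordSchema": "marcxml",      # assumed schema name for the records
    }
    # urlencode escapes spaces and quotes so the result is a valid URL.
    return f"{base_url}?{urlencode(params)}"


url = build_sru_url('"moby dick"', max_records=3)
print(url)
```

Pasting a URL built this way into the browser should be equivalent to typing the query by hand, as long as the server supports the same parameters.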

    The result of the query is very interesting. It is not like a typical API response, where a key name identifies each value, or an SQL query result, where column names tell us what data we are getting. Instead, the result of the SRU query is an XML page that uses "tag numbers" to identify each piece of data. For example, if we search for a book, the title is in "tag 245", the author is in "tag 100", the general note (which mostly describes the publication of the first edition) is in "tag 500", the publication statement (year and country of publication) is in "tag 260", and so on. There are hundreds of these tags, and each tag number always stands for the same type of data, so "tag 100" will always represent an author regardless of what you are querying.
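
Here is a small sketch of how the tag numbers could be read from one record in Python. It assumes the records come back in the MARCXML layout (datafield elements with a tag attribute, each holding subfield elements). The sample record is written by hand for illustration and is not a real response from the server, and field_text is just a helper name I picked.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML records.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hand-written sample record for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Melville, Herman,</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Moby Dick /</subfield>
    <subfield code="c">Herman Melville.</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">New York :</subfield>
    <subfield code="b">Harper,</subfield>
    <subfield code="c">1851.</subfield>
  </datafield>
</record>"""


def field_text(record, tag):
    """Join the subfields of the first field with this tag, or return None."""
    field = record.find(f"marc:datafield[@tag='{tag}']", NS)
    if field is None:
        return None
    subfields = field.findall("marc:subfield", NS)
    return " ".join(sub.text.strip() for sub in subfields if sub.text)


record = ET.fromstring(SAMPLE)
print("author:     ", field_text(record, "100"))
print("title:      ", field_text(record, "245"))
print("publication:", field_text(record, "260"))
```

A tag that is missing from a record, such as "tag 500" in this sample, simply gives None, so the script has to be ready for books where the publication statement is not filled in.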

    I got to talk today with the supervisor who suggested the idea, and he explained to me how widely these tags are used among librarians. He said that some librarians even use the tag numbers instead of the actual words (such as author, title, etc.) when they talk to each other. But for me, it was my first time learning about these tags, and I only had a very short moment to familiarize myself with the concept before I could start working with it. As a computer science major, I never imagined I would come across a concept from a completely different field. This is an opportunity to increase my general knowledge, and another big reason why I enjoy applying my computer science skills in non-STEM fields.

    Going back to SRU, I think it is a very interesting way to query data. There is no need for a separate client or interpreter, as everything can be done from the URL bar of a browser. I wonder if it is possible to extend this technology to non-library databases. The simplicity of the query also makes it easy to include in a script written in a language such as Python. All we need is the "requests" module to make the URL request. But, thanks to the vast community support that Python has, someone has already created a module for SRU, which makes it even easier to include the query in a Python script.
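
As a rough sketch of the "requests" approach, the script below sends the query and collects every title ("tag 245") from the response. The function names and the fetch parameter are my own choices for illustration, the base URL has to be supplied by the caller, and the code assumes the server returns MARCXML records. This is not how the existing SRU module works; it is only the plain "requests" version.

```python
import xml.etree.ElementTree as ET

# Fully qualified name of a MARCXML data field element.
DATAFIELD = "{http://www.loc.gov/MARC21/slim}datafield"


def fetch_with_requests(url, params):
    """Send the GET request and return the raw XML bytes."""
    import requests  # third-party module, imported only when actually used

    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    return response.content


def search_titles(base_url, keyword, max_records=5, fetch=fetch_with_requests):
    """Run an SRU search and return the text of every 'tag 245' field."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": keyword,
        "maximumRecords": max_records,
        "recordSchema": "marcxml",  # assumed schema name
    }
    root = ET.fromstring(fetch(base_url, params))
    titles = []
    # Walk the whole response so the wrapper elements do not matter.
    for field in root.iter(DATAFIELD):
        if field.get("tag") == "245":
            parts = [sub.text.strip() for sub in field if sub.text]
            titles.append(" ".join(parts))
    return titles
```

Calling search_titles with the real base URL of the server and a keyword should return a list of title strings. Passing a different fetch function makes it possible to try the parsing part on a saved XML file without touching the network.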

Connection to discipline
