Nov 18 2012


Data Mining Large Digital Collections

Filed under The Digital Past

I think the concept of a Syllabus Finder is really interesting. An archive of syllabi from different courses at different universities would probably be very helpful, since it could let professors coordinate their classes. It could also help students by collecting the syllabus for each of their classes in a digital format that they could easily access throughout the term.

The idea of finding a document based on the words used in it reminds me of using Ctrl+F to find specific words within a block of text. Speaking as someone who has used online databases and journals frequently for over four years, I find this sort of thing incredibly helpful.
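To make the Ctrl+F comparison a little more concrete, here is a minimal sketch of how a search by word usage might work: count how often a few telltale words appear in each document and rank the documents by that count. This is only my own illustration, not how the Syllabus Finder itself is built; the function names, the sample documents, and the keyword list are all made up for the example.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rank_by_keywords(documents, keywords):
    """Rank documents by how many times the given keywords occur in them.

    `documents` maps a (hypothetical) file name to its text.
    Documents with no keyword hits are left out of the result.
    """
    wanted = [k.lower() for k in keywords]
    scored = []
    for name, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[k] for k in wanted)
        if score > 0:
            scored.append((name, score))
    # Highest score first, like the best match at the top of a results page.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up sample documents, purely for illustration.
documents = {
    "history_101.txt": "Syllabus: required readings, assignments, and grading for the term.",
    "blog_post.txt": "I went to the museum this weekend and saw a Tudor exhibit.",
    "tudor_seminar.txt": "Course syllabus. Week 1 readings. Week 2 readings. Grading policy.",
}

ranked = rank_by_keywords(documents, ["syllabus", "readings", "grading"])
for name, score in ranked:
    print(name, score)
```

The difference from plain Ctrl+F is that the words are counted across many documents at once, so the ones that "sound like" a syllabus rise to the top and the unrelated blog post drops out.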

For my Google Ngram Viewer search, I compared three of arguably the most influential people in getting Henry VIII to break from Rome and establish the Church of England: Anne Boleyn, Cardinal Wolsey, and Thomas Cromwell.

First I tried it in British English:

Then American English, using the same dates:

Then finally, English Fiction, still using the same dates:


I thought that the results were very interesting. Obviously British English goes back further than either of the other two, though there seems to be some sort of error around 1680, and if I had gone back any further the error would have gotten worse. In American English, Anne seems to be consistently the most popular of the three names, but in both of the other categories she and Wolsey alternate in the top spot, with English Fiction trailing behind British English by about one hundred years.
