This past spring, four of us here at GW Libraries had the privilege of attending the 2016 Code4Lib conference.
The Combined Blog: Posts from All GW Libraries Blogs
On Feb 1, GW’s Expert Finder launched. Expert Finder is an implementation of VIVO, a researcher discovery platform. The project is a collaboration between the Division of Information Technology and GW Libraries. As one of the software developers on the project, I want to take this opportunity to discuss some noteworthy aspects of our implementation.
In 1963, NEA teamed up with Hollywood to create Mr. Novak. The show was about an idealistic young high school teacher, played by James Franciscus, facing problems many teachers would recognize. As producer E. Jack Neumann described the show in an interview with NEA Reporter, "[o]ur stories sometimes will be provocative and controversial, they'll sometimes show the bad as well as the good among students and teachers. But we aim to keep everything in its proper, true perspective."
The latest in our social media harvesting experiments for the Social Feed Manager project involves analysis, discovery, and visualization of social media content. An analytics service may help satisfy two needs:
At the Access Conference in Toronto in September 2015, I attended an all-day hackfest on data sonification, led by William Denton of York University and Katie Legere of Queen’s University. Data sonification is the translation of data into sound, much as data visualization transforms data into a graph or image. You can read about the workshop and see some examples of data sonification at Music, Code and Data: Hackfest and Happening at Access 2015.
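To make the idea of translating data into sound concrete, here is a minimal sketch in Python. It is not taken from the hackfest; it simply assumes one common mapping, in which each data value is rescaled linearly onto a range of MIDI note numbers and then converted to a frequency. The function names, the note range, and the sample data are all hypothetical.

```python
def to_midi_pitches(values, low_note=48, high_note=84):
    """Linearly rescale numeric values onto a range of MIDI note numbers."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Flat data: every point maps to the middle of the range.
        return [(low_note + high_note) // 2] * len(values)
    span = high_note - low_note
    return [low_note + round((v - lo) / (hi - lo) * span) for v in values]


def midi_to_hz(note):
    """Convert a MIDI note number to hertz (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Hypothetical data series, e.g. monthly checkout counts.
checkouts = [120, 95, 143, 210, 180, 160]
pitches = to_midi_pitches(checkouts)
frequencies = [round(midi_to_hz(p), 1) for p in pitches]
```

With this mapping, the largest value in the series lands on the highest note and the smallest on the lowest, so rises and falls in the data are heard as rises and falls in pitch.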
The Twitter Streaming API is very powerful, allowing the harvesting of tweets that are not readily available from the other APIs. However, recall from our previous post that the Twitter Streaming API does not behave like the REST APIs that are typical of social media platforms (see Twitter’s description of the differences). A single HTTP response is potentially huge and may be collected over the course of hours, days, or weeks. This is a poor fit both for the normal web harvesting model, in which a single HTTP response is recorded as a single WARC response record in a single WARC file, and for most web archiving tools, which store HTTP responses in memory and don’t write them to the WARC file until the response is complete.
This post describes an approach we’ve developed for harvesting from the Twitter Streaming API and recording the results in WARC files. We will also show how the tweets can be extracted from the WARC files for use by a researcher.
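The buffering problem described above can be illustrated with a short Python sketch. This is not the approach developed in the post; it only shows the general idea of consuming a long-lived response line by line and handing it off in bounded segments, rather than holding the whole response in memory until it ends. It assumes the stream is newline-delimited JSON with occasional blank lines, and the function name, segment size, and sample data are hypothetical. An in-memory byte stream stands in for the HTTP response body.

```python
import io
import json


def segment_stream(stream, max_lines_per_segment):
    """Yield lists of parsed JSON objects from a line-delimited stream.

    Lines are read one at a time, so memory use is bounded by the segment
    size rather than by the total length of the response.
    """
    segment = []
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            # Blank lines carry no data; skip them.
            continue
        segment.append(json.loads(line))
        if len(segment) >= max_lines_per_segment:
            yield segment
            segment = []
    if segment:
        # Emit whatever is left when the stream ends.
        yield segment


# Stand-in for a long-lived HTTP response body (hypothetical sample data).
fake_body = io.BytesIO(
    b'{"id": 1, "text": "first"}\r\n'
    b'\r\n'
    b'{"id": 2, "text": "second"}\r\n'
    b'{"id": 3, "text": "third"}\r\n'
)

segments = list(segment_stream(fake_body, max_lines_per_segment=2))
```

Each yielded segment could then be written out as its own record or file as soon as it is full, which is the property that in-memory buffering of the complete response lacks.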