Sources and Sensibility

So, call me melodramatic, but OpenRefine might just be my villain origin story.

An accurate depiction of me using OpenRefine

Not to sound arrogant in any way, but it has been a hot, hot minute since I came across a program that I couldn’t pick up within the first hour or two of using it. But somehow, OpenRefine has thwarted me. Quite effectively. I despair for my future as a data organizer.

However, despite my mildly appalling lack of skill with this technical task, the exercise still accomplished the goal of really forcing me to think about data, how we as academics collect it, and how we utilize it. Before the internet age we live in today, historical research was done in person, most often in an archival setting. Work like this was slow and imprecise in that historians often looked through a lot of side information as they searched for their primary query. However, this excess information served an important function: it provided deeper context and understanding of other sources, ultimately expanding the researcher’s perception of their chosen subject matter. Lara Putnam terms this “side-glancing,” and it is this peripheral vision in research that for many years allowed greater context of sources and of topics as a whole.

However, as the internet advanced, more and more of these sources were digitized and put online. Researchers no longer had to travel the sometimes long distances to regional archives that, previously, were the only places to access sources on that region. With just a few keywords and a search bar, researchers are now able to cross international boundaries in pursuit of knowledge. This new ease of access does not come without cost. As Putnam explains in her article, to use the search box is to rely on algorithms and other technicalities to pull the information most relevant to your query. Though this allows the researcher to quickly find relevant sources, the process cuts out a lot of the side-glancing that provided invaluable context and peripheral information to research itself. This opens up new blind spots in research across the board, not just in history. Researchers need to become alert to these blind spots.

Putnam thoroughly and eloquently raises questions about our modern methods of research and, indeed, their limits. I know “limits” is not a word we as a society tend to apply to technology that is constantly advancing, but for our research purposes, especially as historians, it is highly relevant. We often get so caught up in ease of access and the sheer number of sources we can pull these days that we don’t stop to consider the wider implications of what we’re reading. Just as doors were opened by technology, there are others that have also been closed. One of the side effects of digitization and digital access is that we as researchers lose much of the side-glancing that often provided us with just as vital information, even if it wasn’t directly what we were searching for. This, of course, does not even touch on the crucial discussion regarding inequalities in digitization and internet access in general. It has become very apparent that advancing technology in research has raised just as many questions as it has answered, and will probably continue to do so.

6 comments

  1. You bring up some really great points about side-glancing. I can’t tell you how many times I found a book through the Mason library system, went to get it, and then left with a bunch of others from the area surrounding that one book because they looked interesting or relevant. The readings from this week really made me think about how much information we are missing from physical or non-digitized sources when we only conduct digital research. I totally understand your struggles with OpenRefine. It is definitely not an easy program to use, and I applaud anyone who learned it quickly. I found myself just having to poke around a bunch and apply many filters/facets just to see what they did.

  2. Hayley, I enjoyed reading your blog. I can certainly relate to your frustration with OpenRefine!

    Your comments on side-glancing (Putnam) were spot-on. Digital-only research prevents the researcher from taking advantage of on-site resources. It’s like reading a newspaper article online instead of on a newspaper page. You miss out on all the other articles, etc. on the page beside the article that you read. The question of who chooses what gets digitized is also an important point that every researcher needs to be aware of.

  3. Great post! I really appreciate your emphasis on side-glancing here. I feel like this point was somewhat lost on me in my initial read of Putnam’s article, but I now have a much better sense of it. I wonder if this issue could be potentially solved by more archival digitization, or at least, better organized archival digitization. Could online databases perhaps be organized in such a way to allow for a form of “digital” side-glancing? This seems realistic to me, though of course it would require an excellently structured database.

    Also, I completely sympathize with you regarding OpenRefine! I’m definitely in the same boat in terms of feeling confident in my abilities to navigate digital programs, that is, until I met OpenRefine!

    1. The question of implementing digital side-glancing is definitely a big one. I feel like on some level we would have to reevaluate how we organize digital databases, as well as how we utilize them. One of the reasons we lose the ability to side-glance in a digital arena is because we become dependent on search box algorithms that pull material by relevance, so I feel like we would have to maybe reconstruct those algorithms. Ahaha who knows!

  4. I’m so glad you brought up side-glancing! Like Olivia, pretty much any time I’m grabbing a book off the shelf in a library, I walk away with an armload of things that are also relevant! Unfortunately, a lot of libraries are moving towards closed stacks or off-site storage that will make this sort of side-glancing impossible. Many library catalogs have added a digital shelf browse, but it doesn’t feel the same with only a title and not being able to look through a lot of the text (maybe with eBooks in the future?). If you’re ever doing newspaper research, the Library of Congress’ Chronicling America newspaper database is pretty great for side-glancing. It’ll highlight the terms you want, but you can also see the entire page and know what else was going on in that time/area. Lara Putnam was lamenting this loss with newspaper research, but I think this is one that is more easily solved, especially since you can look at a headline, story, etc. and keyword search to answer questions about how much a term was used. I also had trouble with OpenRefine and just couldn’t get the transformations right. It’s not my arch-nemesis, but there were times when I got pretty frustrated about not figuring things out. (I’m not sure if a gif will work in the comments, but I’m going to try!)

  5. Hahaha, I am so glad I am not the only one who feels this way about OpenRefine! I honestly wanted to pull out my hair because, for some reason, the installation process wasn’t working for me and I ended up spending 30 minutes trying to even get to the main grid. But as with any program, the more we use OpenRefine, the more comfortable we will get sorting through it!

    I definitely agree with you that we get so caught up in ease of access and the number of sources we gather that we don’t stop to consider the wider implications of what we’re reading. As historians, I think it is also crucial to acknowledge the inequalities that come with digitization, not only on a national level but also on a global scale.
