Digital Databases

I first became aware of the possibilities of digital databases for historical research when using the Samuel J. May Anti-Slavery Collection to establish the reception and impact of a particular text, Emancipation in the West Indies by James A. Thome and J. Horace Kimball. A full-text word search instantaneously called forth the minutes of the American Anti-Slavery Society, the public addresses and private correspondence of many prominent abolitionists like William Lloyd Garrison and Theodore Dwight Weld, as well as the letters of James G. Birney and other politicians. Although I also used non-digitized archival materials and hard copies of multivolume edited primary source collections, this research experience stood in stark contrast to the painstaking research of historians of an earlier generation, like Gilbert Hobbs Barnes, who had traveled to dozens of archives in order to construct an argument about the impact of antislavery literature like Emancipation in the West Indies.[1]

This example, and countless others like it, points to some of the many changes that the digitization of archival materials and the introduction of optical character recognition (OCR) have wrought on the study of the past. Today, the largest and best-known digital databases are Google Books, the HathiTrust Digital Library, the Internet Archive, and Open Library. The largest collections of digitized materials to date are in English, though Google Books is digitizing materials in at least seven other languages and there are other major projects digitizing materials in Latin, for example, and in French.[2] Tools such as the Google Ngram Viewer and Bookworm have made it possible to do frequency analysis of particular words over time.
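To make concrete what "frequency analysis of particular words over time" involves, here is a minimal sketch in Python. It is not how the Ngram Viewer or Bookworm is implemented; it only assumes that you already have OCR'd texts tagged with a publication year. The function name `relative_frequency_by_year` and the three sample sentences are invented for illustration.

```python
import re
from collections import defaultdict


def relative_frequency_by_year(documents, term):
    """Return {year: occurrences of term / total words that year}.

    `documents` is an iterable of (year, text) pairs. Matching is
    case-insensitive and counts exact single-word matches only.
    """
    term = term.lower()
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(words)
        hits[year] += words.count(term)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}


# Invented sample texts standing in for a real digitized corpus.
sample_documents = [
    (1836, "Emancipation in the West Indies proved that emancipation was safe."),
    (1836, "The society met to discuss the petition."),
    (1838, "Immediate emancipation was the demand of the convention."),
]

frequencies = relative_frequency_by_year(sample_documents, "emancipation")
for year, share in frequencies.items():
    print(year, round(share, 3))
```

Dividing by the total number of words per year matters because digitized collections hold far more text for some years than for others; raw counts would mostly track the size of the archive rather than the career of the word.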

Accessibility has been and will remain a major concern with digital archives. While the above collections and digitization initiatives like the Library of Congress’ American Memory Project or the University of Michigan’s Making of America are open access, many other collections remain accessible only through paid subscription or institutional affiliation. Some of the largest early curated collections – like the Early American Imprints series, Early English Books Online, or Eighteenth Century Collections Online – and some smaller and more tailored collections – like Alexander Street’s Black Thought and Culture or their North American Immigrant Letters, Diaries, and Oral Histories database – still require subscription access. Similarly, the Gale Slavery and Anti-Slavery transnational archive, one of the largest of its kind, is available only through paid subscription. Many universities are moving to make their collections open access, however. The Harvard University Library Open Collections Program provides such databases as Immigration to the United States, 1789-1930, for example, and digital collections like their Latin American Pamphlets are also openly accessible.[3] Similarly, access to resources like Brown University’s Latin American Travelogues is completely free and open to all and is part of a broader collaboration with the Open Library project mentioned above.

In addition to collections of digitized archival material, there are also ever-increasing numbers of newspaper databases. The same issues of accessibility apply here, with such collections as America’s Historical Newspapers (including early newspapers, 1690-1922, African American Newspapers, 1827-1998, Ethnic American Newspapers, 1799-1971, and Hispanic American Newspapers, 1808-1980) and 19th Century Newspapers requiring paid access while other collections like NewsLibrary provide free access. Many historians have also long been accustomed to using the powerful digitized archives of government and legal historical documents provided through companies like LexisNexis or ProQuest, but we have yet to see how much of this material becomes freely available.

Although I have focused these introductory remarks on digital archives containing mostly primary source archival texts, we might also discuss the rapid proliferation of audio and visual archives. The online version of the Alan Lomax Collection of folk music, for example, premiered this year.[4] And it is not just primary sources but also scholarly commentary that is increasingly going digital. The University of Wisconsin-Madison’s Havens Center Audio Archive, for example, contains podcasts from many renowned historians, including Thavolia Glymph’s most recently posted “Turned into the Streets: Black Women and Children Refugees in the Civil War.”

So, with that brief and necessarily selective introduction to some of the digital databases that are out there (apologies for also skewing the survey toward my own particular research interests in U.S. and Latin American history), we may want to discuss some of the new methodologies of historical practice these resources enable as well as the questions they inevitably raise.

One example is the way in which these databases open possibilities for doing concept history and the history of ideas in the field of intellectual history more generally. While diverse works of history – such as Raymond Williams’ Keywords: A Vocabulary of Culture and Society (1976), Daniel Rodgers’ Contested Truths: Keywords in American Politics Since Independence (1987), Eric Foner’s The Story of American Freedom (1998), Walter Mignolo’s The Idea of Latin America (2006), Andrew Sartori’s Bengal in Global Concept History: Culturalism in the Age of Capital (2008), and João Feres Júnior’s The Concept of Latin America in the United States (2010) – might broadly be thought of as interested in charting the history of particular words, ideas, or concepts, there is a well-developed field of concept history which takes its inspiration from the work of Reinhart Koselleck and others originally begun in Germany (with similar national projects taking place in France, Finland, the Netherlands, Denmark, Sweden, Italy, and Spain).[5] The journal Contributions to the History of Concepts has become the flagship publication for this research, and the most exciting research projects currently underway are seeking to write a transnational history of concepts.[6] As David Armitage has recently pointed out, the use of digital databases may help to bridge the gap between diachronic and synchronic approaches to intellectual history.[7] To me, the question of where to properly locate conceptual change remains open and should be productively contested; digital databases allow historians to analyze previously unimaginable amounts of material in previously impossible ways. (Here is an in-progress review essay I am working on about the field of concept history in relation to the field of intellectual history more generally, for those interested in reading more.)

Some potential questions for discussion:

- What other fields of historical inquiry is the proliferation of digital databases rapidly transforming, and with what kinds of benefits and consequences?

- How have other people been using digitized databases in their own work, and what possibilities and problems has this raised?

- In what ways do digitized databases transform our pedagogy as well as our research?


[1] See Gilbert Hobbs Barnes, The Anti-Slavery Impulse, 1830-1844 (1933; 1964). See also Barnes and Dwight L. Dumond, eds., Letters of Theodore Dwight Weld, Angelina Grimké Weld and Sarah Grimké (2 Vols., 1934; 1965); Dumond, ed., Letters of James Gillespie Birney, 1831-1857 (2 Vols., 1938), which have since been digitized.

[3] For a selection of other web-accessible Harvard Collections see, http://digitalcollections.harvard.edu/.

[4] Larry Rohter, “Folklorist’s Global Jukebox Goes Digital,” (New York Times, Jan. 30, 2012), accessed 11/18/12, http://www.nytimes.com/2012/01/31/arts/music/the-alan-lomax-collection-from-the-american-folklife-center.html?pagewanted=all; Michael Martin, “Major U.S. Folk Music Archive Makes Online Debut,” (National Public Radio, May 9, 2012), accessed 11/18/12, http://www.npr.org/2012/05/09/152341534/major-us-folk-music-archive-makes-online-debut.

[5] See, Otto Brunner, Werner Conze, and Reinhart Koselleck, eds., Geschichtliche Grundbegriffe: Historisches Lexikon zur Politisch-sozialen Sprache in Deutschland, 8 vols., (Stuttgart, 1972-1990).

[6] See, Contributions to the History of Concepts, http://journals.berghahnbooks.com/choc/;  Iberconceptos project, http://iberconceptos.net

[7] See, David Armitage, “What’s the Big Idea? Intellectual History and the Longue Durée,” (History of European Ideas, 2012). See also, Martin J. Burke, “Conceptual History in the United States: a Missing ‘National Project,’ ” (Contributions to the History of Concepts, 2005); Hartmut Lehmann and Melvin Richter, eds., The Meaning of Historical Terms and Concepts: New Studies on Begriffsgeschichte (1996).  

______________________________

Some suggestions of further reading:

Armitage, David. “What’s the Big Idea? Intellectual History and the Longue Durée,” (History of European Ideas, 2012).

Bamman, David and David Smith. “Extracting Two Thousand Years of Latin from a Million Book Library,” (Journal on Computing and Cultural Heritage, 2012).

Burke, Martin J. “Conceptual History in the United States: a Missing ‘National Project,’ ” (Contributions to the History of Concepts, 2005).

Foner, Eric. The Story of American Freedom, (New York: W.W. Norton, 1998).

Ifversen, Jan. “About Key Concepts and How to Study Them,” (Contributions to the History of Concepts, 2011).

Júnior, João Feres. The Concept of Latin America in the United States (2010)

Koselleck, Reinhart. The Practice of Conceptual History: Timing History, Spacing Concepts, trans. Todd Samuel Presner, (Stanford University Press, 2002).

--------. Futures Past: on the Semantics of Historical Time, trans. Keith Tribe (Columbia University Press, 2004).

--------. “Introduction and Preface to the Geschichtliche Grundbegriffe,” (Contributions to the History of Concepts, 2011).

Michel, Jean-Baptiste and others. “Quantitative Analysis of Culture Using Millions of Digitized Books,” (Science, 2011).

Mignolo, Walter. The Idea of Latin America (2006)

Rodgers, Daniel T. “Republicanism: The Career of a Concept,” Journal of American History 79, no. 1 (1992): 11–38.

--------. Contested Truths: Keywords in American Politics Since Independence, New York, 1987.

Sartori, Andrew. Bengal in Global Concept History: Culturalism in the Age of Capital (University of Chicago Press, 2008).

Sebastián, Javier Fernández, and others. Diccionario Político y Social Iberoamerico: Conceptos politicos en la era de las Independencias, 1750-1850, Vol.1 (Centro de Estudios Politicos y Constitucionales, 2008-).

Williams, Raymond. Keywords: A Vocabulary of Culture and Society (1976). 

agmullen

September 11 Archive

Another type of digital database I'd throw into the mix is the collection of born-digital artifacts. I recently heard a talk by Dan Cohen, in which he showed the Center for History and New Media's collection, the September 11 Archive, which gathers all kinds of born-digital remembrances and artifacts of 9/11/01. He mentioned that people who are not historians have been using the database for non-historical purposes, such as mapping cell phone usage and analyzing teen slang.

What that example, as well as your many excellent examples, shows us is that we should feel free to approach these huge databases with unorthodox questions, perhaps questions that tell us more about the culture than about our specific historical subject. 

Digital databases of texts also provide opportunities for breadth of inquiry that would not be possible without digitization. I, for one, have used America's Historical Newspapers to look at public interest in the navy frigate Philadelphia, a task that would not have been possible without keyword searches. (I will say, though, researching a ship named the Philadelphia is not all fun and games: a keyword search for the term Philadelphia is naturally problematic when you're looking for a ship and not a city. There wasn't a uniform code of reference for ships, not even a USS, so I had to get creative with my search keywords. Because of that difficulty, I'm sure I didn't actually see all the newspaper articles that referenced the Philadelphia, because I skipped over ones that appeared to be only about the city. The problems of the keyword search, though, don't negate the amount of time saved by not doing all the searches by hand.)
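One way to cope with this kind of ambiguous keyword can be sketched in a few lines of Python. This is only an illustration and not the search I actually ran: it assumes the article text has already been exported from the database, and both the list of ship-related context words and the function name `mentions_ship` are hypothetical. It keeps an article only when the name appears within a few words of a ship-related term.

```python
import re

# Assumed context vocabulary; a real search would need a period-appropriate list.
SHIP_TERMS = frozenset({"frigate", "ship", "vessel", "captain", "crew"})


def mentions_ship(text, name="philadelphia", window=5, context_terms=SHIP_TERMS):
    """Return True if `name` occurs within `window` words of any context term."""
    words = re.findall(r"[a-z]+", text.lower())
    for i, word in enumerate(words):
        if word == name:
            nearby = words[max(0, i - window): i + window + 1]
            if context_terms.intersection(nearby):
                return True
    return False


# Invented sample sentences, not real newspaper text.
articles = [
    "The frigate Philadelphia ran aground off Tripoli.",
    "Flour prices rose in Philadelphia this week.",
]

ship_articles = [article for article in articles if mentions_ship(article)]
print(ship_articles)
```

A filter like this trades one problem for another: it will still miss articles that name the ship without any of the chosen context words, and it will wrongly keep articles about ships docked in the city, so it narrows the reading pile rather than replacing the reading.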

 

bendavidweber

born-digital artifacts and archives

That is a great point about born-digital artifacts, Abby!

This also raises pressing questions about the possibilities and problems of archiving born-digital materials, like emails, which now stand in for much of the traditional primary source material historians often use (especially that of governmental and nongovernmental organizations, businesses, and corporations). This is already raising issues with the Freedom of Information Act and struggles over novel types of classification and declassification.

I heard a talk on Wednesday by Jo Guldi on “The Long Land War: A Global History of Land Reform, 1860-Present” where she explained how she uses Zotero and the plug-in “Paper Machines” to deal with the overwhelming amount of paper generated by bureaucracies like the UN’s Food and Agriculture Organization. She also mentioned a growing community of radical archivists who are working on the frontlines of born-digital preservation, including the digital archive that is being put together by the Occupy Movement. To add to the considerations of accessibility, then, it seems to me that issues of surveillance also come to the fore (as police, FBI, CIA and so forth can readily tap these digital archives that are being molded for other purposes). 

Kirsten Weld, who commented on Guldi’s paper, also raised questions about archival preservation of materials pertaining to crimes against humanity when the documentation is in danger of deteriorating before it becomes declassified, pointing to the struggles of activists and radical archivists to just get in there and preserve it in digital form. She has a forthcoming book called Paper Cadavers: The Archives of Dictatorship in Guatemala that folks should check out if they are interested (and a related article on the recently discovered police archives in Guatemala in the Radical History Review entitled “Dignifying the Guerrillero, Not the Assassin: Rewriting a History of Criminal Subversion in Postwar Guatemala.”).

raypun101

Great Points!

 

You've made a lot of great points regarding the role of digital databases in transforming the study of history. 

It's also true that many people still prefer viewing print and hard copies over e-resources. I believe it's because the physicality of the manuscript, archive, or book makes for a much more exciting research experience than a scanned copy online.

I spoke to a few professors in various fields of history, particularly East Asian history, and some told me that they love reading original manuscripts and their annotations, mainly for what these reveal about the writer's thought process: how he or she went about revising ideas over time.

Since digital copies can't really convey the density of the ink on the screen, it is the originals that reveal what was initially conceived, what was revised, and the ideas underpinning the texts.

But in any case, I do see a trend of visitors changing their research methodology through online databases. Of course, many of them (students particularly) always seem to start with Google and consider it the ultimate "database."