Topic Networks of Slave Narratives, part 1

Sometime earlier this year Documenting the American South released DocSouth Data, a data portal for their online collections. DocSouth Data makes it easy to download the entirety of their collection of slave narratives. DocSouth has long been a guiding light and inspiration for us over at the Colored Conventions Project. (We’re starting a crowd-sourcing project to create downloadable texts like DocSouth, but more about that another time.)

Their collection of slave narratives has always struck me as one of the great achievements in the practice of American history. They have created online texts of 294 slave narratives, viewable online and now easy to download via DocSouth Data. Bill Andrews has an introduction to the collection over there that’s really worth a read. I think most people probably know about slave narratives through the movie adaptation of 12 Years a Slave by Solomon Northup, but that is hardly a representative sample of the 294 narratives. If you’re looking for something to read, why not check one out? Here are a few of my favorites:

Given the massive size (~18,000 pages) of DocSouth’s collection of slave narratives and its national importance in American history, I’ve wondered why more people aren’t visualizing and studying such a large corpus.

Well, I used my Sunday to see what I could do.

I downloaded all of the narratives from this page. After I unzipped them, I generated 40 topics from the entire collection using this Topic Modeling Tool. A topic is, at its most basic, “a recurring pattern of co-occurring words.” More simply, it’s a set of words that often appear together.
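To make "co-occurring words" a little more concrete, here is a toy sketch in Python. It is not what the Topic Modeling Tool does internally (the tool fits a statistical topic model). It only counts how often two words show up in the same document, which is the kind of raw signal a topic model picks up on. The three sample documents and the function name are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(docs):
    """Count, for each unordered word pair, how many documents contain both words."""
    counts = Counter()
    for text in docs:
        # Use the set of distinct words so a pair counts once per document.
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts

# Made-up sample documents, purely illustrative (not from the corpus).
docs = [
    "master plantation overseer field",
    "plantation field cotton overseer",
    "church prayer lord faith",
]

counts = cooccurrence_counts(docs)
# Pairs are stored in alphabetical order, e.g. ("overseer", "plantation").
print(counts[("overseer", "plantation")])
print(counts.most_common(3))
```

Words that keep landing in the same documents ("plantation", "overseer", "field") would tend to end up in the same topic, while "church" and "prayer" would tend to form another.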

After getting 40 topics, I used the TopicsInDocs file as the basis for the edges table I imported into Gephi. That gave me a table of links, each connecting a narrative to one of the 40 topics. Then I used the multimode network projection plugin in Gephi to create links between the narratives that are most affiliated with each topic. The results are somewhat interesting to browse. Click the preview below to see the full working display (warning: the page takes a minute to load).
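The two steps above (narrative-to-topic links, then narrative-to-narrative links) can be sketched roughly in Python. This is only an illustration under several assumptions: the rows are invented and already reduced to (narrative, topic, weight) triples, which may not match the real TopicsInDocs layout; the `min_weight` cutoff is my own placeholder; and counting shared topics is just one simple way to project a two-mode network, not necessarily what the Gephi plugin computes. The `Source`, `Target`, and `Weight` headers are the column names Gephi looks for in an edges table.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations

# Hypothetical rows: (narrative file, topic id, topic weight in that narrative).
# The real TopicsInDocs layout may differ; these values are invented.
rows = [
    ("narrative_a.txt", 3, 0.42),
    ("narrative_b.txt", 3, 0.37),
    ("narrative_c.txt", 3, 0.05),
    ("narrative_c.txt", 7, 0.51),
    ("narrative_a.txt", 7, 0.20),
]

def bipartite_edges(rows, min_weight=0.1):
    """Keep narrative-topic links whose weight meets the (assumed) threshold."""
    return [(doc, f"topic_{topic}", w) for doc, topic, w in rows if w >= min_weight]

def project_onto_narratives(edges):
    """Link two narratives once for every topic they share."""
    docs_by_topic = defaultdict(set)
    for doc, topic, _ in edges:
        docs_by_topic[topic].add(doc)
    shared = defaultdict(int)
    for docs in docs_by_topic.values():
        for a, b in combinations(sorted(docs), 2):
            shared[(a, b)] += 1
    return dict(shared)

def to_gephi_csv(projected):
    """Render the projected links as a Source,Target,Weight edges table."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), n in sorted(projected.items()):
        writer.writerow([a, b, n])
    return buf.getvalue()

edges = bipartite_edges(rows)
projected = project_onto_narratives(edges)
print(to_gephi_csv(projected))
```

With these invented rows, narrative_c's weak link to topic 3 is dropped, so narrative_a ends up linked to narrative_b through topic 3 and to narrative_c through topic 7.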

Screenshot 2014-12-14 16.56.55

 

Credits for guides, instructions, and inspiration:

http://electricarchaeology.ca/2011/11/11/topic-modeling-with-the-java-gui-gephi/

http://www.scottbot.net/HIAL/?p=221

https://dhs.stanford.edu/comprehending-the-digital-humanities/

http://www.ics.uci.edu/~asuncion/pubs/TIST_11.pdf