Voyant Data
This project can be applied to multiple disciplines and implemented with or without the DH emphasis. The project template’s topic modeling feature contains data intended for history students by default, but this data can be supplemented with, or exchanged for, whatever the instructor prefers.
About the demo data used for this project:
The State of the Union Addresses and Party Platforms used for the text analysis and topic modeling in this project template were obtained from the University of California Santa Barbara’s American Presidency Project. This data can be found in multiple formats in the project template’s GitHub repository in the /assets/data/ folder.
Only the Democratic and Republican parties are included in our 20th-century Party Platforms data.
The stopword list used for both the State of the Union Address and Party Platform text analysis is the Buckley-Salton stopword list, retrieved from Alan Liu’s workshop at http://dhworkshop.pbworks.com/w/file/105416844/Buckley-Salton-stopword-list.txt. For the Party Platforms, three words were added to the list: america, american, and americans.
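As a rough illustration of how an extended list like the Party Platforms one could be assembled, the following Python sketch appends extra words to a base stopword list. It assumes the downloaded list contains one word per line; the file names and the functions `extend_stopwords` and `write_stopword_file` are hypothetical and not part of the project template.

```python
from pathlib import Path


def extend_stopwords(base_words, extra_words):
    """Return the base words followed by the extras, lowercased and de-duplicated."""
    seen = set()
    merged = []
    for word in list(base_words) + list(extra_words):
        word = word.strip().lower()
        # Skip blank lines and anything already in the list.
        if word and word not in seen:
            seen.add(word)
            merged.append(word)
    return merged


def write_stopword_file(base_path, extra_words, out_path):
    """Read a one-word-per-line stopword file (an assumption) and write the extended list."""
    base_words = Path(base_path).read_text(encoding="utf-8").splitlines()
    merged = extend_stopwords(base_words, extra_words)
    Path(out_path).write_text("\n".join(merged) + "\n", encoding="utf-8")
    return merged


# Example call (hypothetical file names):
# write_stopword_file(
#     "Buckley-Salton-stopword-list.txt",
#     ["america", "american", "americans"],
#     "party-platforms-stopwords.txt",
# )
```

The resulting text file can then be loaded into Voyant as a custom stopword list.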
To prepare the demo corpus for text analysis with Voyant Tools, we split each State of the Union Address and Party Platform into its own text file, then uploaded all of the State of the Union text files into one instance of Voyant, and all of the Party Platform text files into another instance of Voyant. To view these text files, see the /assets/data/state-of-the-union/txt/ folder and /assets/data/party-platforms/txt/ folder in the project template’s GitHub repository.
We then added links to these Voyant instances (one with State of the Union files and one with Party Platform files) to the documentation’s Voyant page as buttons, so that students can link out to them easily.
As an alternative to pre-loading the text into Voyant, you could, if time allows, teach students how to upload the requisite text files to the Voyant home page themselves.
Prepare New Voyant Data
- Obtain and clean a body of text that you’d like your students to analyze.
- Break the text into “documents” according to how you’d like to analyze it (documents are “segments of text”: they can be paragraphs, chapters, books, speeches, etc.). Each document should be saved as its own text file.
- Either teach students how to load the documents into the Voyant Tools Home Page, or upload the documents yourself, then copy the URL of the visualization page that Voyant generates and distribute it to your students. (Note that if you’ve already copied this documentation repository to create a customized version of this learning sequence, you can replace the current button links on the Voyant page of this documentation site with your new links. Simply locate the three “button includes” on the page, which look like this:
{% include button.html text="20th-Century State of the Union Addresses" link="https://voyant-tools.org/?corpus=3331b9ec3186b714ca53835d5b3ed722" color="success" %}
and replace the “text” and “link” values with your own.)
- If necessary, prepare a stopword list for students to upload to Voyant (or pre-load it to your Voyant instance). We recommend appending your stopwords to the Buckley-Salton Stopword List.
- Once students are viewing the requisite text in Voyant, they can proceed through this documentation’s Text Analysis with Voyant and Export an Image from Voyant pages.
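The step of breaking a text into one file per document can be scripted when the corpus is large. The Python sketch below is one possible approach, not the method used to build the demo data: it assumes you have marked the boundary between documents with a line containing only `#####` (a hypothetical marker you would add while cleaning the text), and the function names, output folder, and file prefix are likewise illustrative.

```python
from pathlib import Path

SEPARATOR = "#####"  # hypothetical marker line placed between documents


def split_documents(text, separator=SEPARATOR):
    """Split text on lines that contain only the separator; drop empty segments."""
    documents = []
    current = []
    for line in text.splitlines():
        if line.strip() == separator:
            documents.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    # Keep whatever follows the last separator.
    documents.append("\n".join(current).strip())
    return [doc for doc in documents if doc]


def write_documents(documents, out_dir, prefix="document"):
    """Write each document to its own numbered .txt file and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for index, doc in enumerate(documents, start=1):
        file_path = out_path / f"{prefix}-{index:03d}.txt"
        file_path.write_text(doc + "\n", encoding="utf-8")
        written.append(file_path)
    return written


# Example call (hypothetical paths):
# text = Path("cleaned-corpus.txt").read_text(encoding="utf-8")
# write_documents(split_documents(text), "my-corpus/txt", prefix="speech")
```

The resulting folder of text files can then be uploaded to the Voyant Tools Home Page as described in the steps above.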