Text analysis using Voyant Tools

Voyant Tools is one of my favorite text analysis tools because it is fast and easy to use, even for people who have no background in text analysis. Although Voyant offers a lot of options—which can be overwhelming—the interface presents basic results that any user can easily customize. The results of Voyant’s analysis can be downloaded as visualizations or in tab-separated or JSON data formats, and Voyant also generates embed codes for its tools (which I’m using for this blog post), as well as citations for specific analyses. This post will cover basic Voyant functions, including inputting texts for analysis, working with and understanding basic Voyant tools, and downloading data.

Voyant’s tagline is “see through your text.” Computer-based text analysis is a helpful supplement to close reading: for example, it can provide quantitative confirmation of patterns that you notice in a text, allow you to quickly locate interesting words or phrases within a large corpus, and help to contextualize trends in word usage. Using a tool like Voyant at the beginning of a project may also help you to find interesting trends in a text that you’d like to research further.

Voyant accepts texts in a number of ways. You can paste text or URLs (including URLs to PDFs posted online) directly, or you can upload files from your computer. These can be plain text, MS Word, or PDF files. You can upload a single file to analyze or multiple files as a corpus. For this blog post, I’m using one of Voyant’s built-in sample corpora: the novels of Jane Austen.

The basic interface includes five panes: Cirrus, Reader, Trends, Summary, and Contexts. Each pane is interactive, and selections made in one pane may affect the display of another pane. Below you’ll see an embedded version of each tool; for full functionality, click on the image to open this Voyant corpus in your browser.

The Cirrus tool is a familiar word cloud; hover over each word to reveal the count in the corpus. If you click on a word in the word cloud, the other panes will change to feature that word. Use the scale at the bottom of the tool to limit the word cloud to only specific texts in your corpus. Click on the question mark in the upper right corner for help, and hover just left of the question mark for options for embedding or downloading an image of your results. 
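If you are curious what sits behind a word cloud like this, the counting step can be sketched in a few lines of Python. This is only an illustration, not Voyant's actual code: the tokenizing rule, the tiny stop-word list, and the function names are all my own assumptions.

```python
import re
from collections import Counter

# Hypothetical mini stop-word list for illustration only;
# a real tool would use a much longer list.
STOP_WORDS = {"the", "a", "and", "of", "to", "in", "is", "it", "that", "be"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent words that are not stop words."""
    counts = Counter(w for w in tokenize(text) if w not in stop_words)
    return counts.most_common(n)


sample = "Emma liked Harriet, and Harriet admired Emma. Emma was pleased."
print(top_words(sample, n=2))
```

A word cloud then simply draws the words with higher counts in a larger font.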

The Reader tool provides the entire text of the corpus. Click on any word to highlight every occurrence in the reader and also to show that word in the other panes, or use the search box at the bottom of the pane to look for specific words.

The Trends tool shows the relative frequency of words throughout the corpus. It will automatically display the five most common words in the corpus, but you can add more words using the text box at the bottom of the pane. Clicking on a word in the Cirrus or Reader panes will display frequency in the Trends pane, while clicking on a point in the Trends pane will bring up the specific text in the Reader and Contexts panes, with all instances of the word highlighted. 
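Relative frequency just means a word's count divided by the total number of words in a text, which makes long and short texts comparable. Here is a minimal sketch of that idea in Python. The two-document corpus and the function name are made up for illustration, and Voyant may scale or segment its numbers differently.

```python
import re


def relative_frequency(word, text):
    """Share of the tokens in `text` that match `word` (case-insensitive)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return tokens.count(word.lower()) / len(tokens)


# Hypothetical two-document corpus, standing in for real novels.
corpus = {
    "doc_a": "Emma was happy. Emma was clever.",
    "doc_b": "Elizabeth walked to Netherfield alone.",
}

for title, text in corpus.items():
    print(title, round(relative_frequency("emma", text), 3))
```

Plotting one such value per document (or per segment of a document) gives a line much like the ones in the Trends pane.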

The Summary tool gives an overview of the corpus, including length, vocabulary density, and distinctive words (you’ll notice that these are generally proper names). If you open the corpus in Voyant itself rather than using the embed below, you’ll see options for documents and phrases next to the tool name at the top of the pane. These options bring up information about the corpus that can be downloaded as a tab-separated file and opened in a spreadsheet program for further analysis. The Cirrus and Trends tools provide a similar option at the word level.
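Vocabulary density is commonly calculated as the number of unique words divided by the total number of words, so a higher value suggests a more varied vocabulary. The sketch below assumes that definition and a very simple tokenizer of my own; it is not taken from Voyant's source.

```python
import re


def vocabulary_density(text):
    """Unique word forms divided by total words (0.0 for empty text)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Five words, four of them distinct ("the" repeats).
print(vocabulary_density("The cat and the hat"))
```

Keep in mind that longer texts tend to score lower on this measure simply because common words repeat more often, so it is most useful for comparing texts of similar length.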

Finally, the Contexts tool shows each occurrence of a word in the corpus along with the words to its left and right. Expand each entry for a longer view of the context.
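This kind of display is often called a keyword-in-context (KWIC) view. A rough Python sketch of the idea follows. The function name and the `width` parameter are hypothetical, and this is my own simplification rather than how Voyant builds its table.

```python
import re


def keyword_in_context(text, keyword, width=5):
    """Return (left, keyword, right) tuples for each match of `keyword`.

    `width` is the number of words of context kept on each side.
    """
    tokens = re.findall(r"[\w']+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


sample = "She was happy and she was kind"
for left, word, right in keyword_in_context(sample, "was", width=2):
    print(f"{left:>12} | {word} | {right}")
```

Increasing `width` corresponds loosely to expanding an entry for a longer view of the context.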

Are you interested in learning more about Voyant? Contact Beth Platte at eplatte@reed.edu or stop by the Language Lab!
