Data Mining in the Humanities

Francesca Giannetti (Rutgers Libraries)

Popular media often portray "big data" as the exclusive province of information scientists, but data collected in the humanities can swiftly exceed what a human reader can analyze unaided. Increasingly, humanists turn to digital tools to conduct quantitative research on literary texts, websites, tweets, images, and sound recordings. How does one create or reuse a humanities data set? What tools are used to store, manipulate, and process that data? How does one begin to analyze data using visualizations? This course will explore the methodologies of both quantitative and qualitative analysis in the humanities, using free and open-source digital tools to yield new insights that would otherwise be difficult to obtain from the data. Through lectures, discussion, labs, and a digital final project, students will familiarize themselves with the tools of digital scholarship and form complex arguments on the basis of a few simple computational techniques.