MIT HISTORY SERIES ON DIGITAL HUMANITIES | 3
Computation and the practice of 21st-century history
In a talk at MIT, Professor William Turkel, PhD'04, described the techniques and tools he uses in his study of global 21st-century history.
"When something is in machine-readable form it can be continuously read and used and expanded upon in new ways by machines."
— William J. Turkel, PhD'04 Professor of History at the University of Western Ontario
Digital History at MIT — Introducing the Digital Humanities series, Jeffrey Ravel, Head of MIT History, explained that the series is part of the History faculty's ongoing exploration of computational methods for research and teaching — a growing practice that is referred to as "digital history." Over the past 15 years MIT historians have led several substantial projects in digital history ranging from the Comédie-Française Registers Project, an analysis of French theater in the 18th century, to Visualizing Japan and Visualizing Imperialism and the Philippines, 1898-1913, two MOOCs (Massive Open Online Courses) developed by the department. The current seminars on innovative digital history projects are especially timely given the recent founding of the MIT Stephen A. Schwarzman College of Computing, which will equip MIT faculty and students in any discipline to use computing and AI tools and also, equally, will enable faculty from all MIT disciplines to inform and guide the development of new technological tools. "Writ large," Ravel said, "this seminar series is a space for us to reflect on our forthcoming engagement with the new college."
• • •
In a wide-ranging talk on ways that computation is altering the practice of 21st-century history, William J. Turkel, PhD'04, a professor of history at the University of Western Ontario, described examples of the phenomenon, including his own experience in publishing two books.
The first book took him seven years to complete. The second, ten months.
The difference, Turkel said, is directly related to his embrace of new computational methods. Turkel also described the techniques and tools he uses in his study of global 21st-century history. Several of these allow him to amass data and sources in real time, which he says is important “if we want to stay on top of a topic that is changing every day.”
Turkel’s talk was the third in a series of three seminars on digital history sponsored by the MIT History section.
Toward better automation
In 2007, Turkel published his first book, The Archive of Place: Unearthing the Pasts of the Chilcotin Plateau. Before starting his second, he spent time analyzing exactly how he wrote and researched the earlier work. “That first project was insufficiently computational,” he said. “My goal for the second book was to automate everything that could possibly be automated. I wanted to save my own care and attention for things that people are really good at, such as close reading, interpretation, and a choice turn of phrase.”
When he’d finished streamlining his methods, he used them to write and publish his second book. Spark from the Deep: How Shocking Experiments with Strongly Electric Fish Powered Scientific Discovery, published in 2013, took an order of magnitude less time to complete than The Archive of Place.
To share his digital research techniques, Turkel and colleagues developed a virtual machine dubbed "the HistoryCrawler" that includes a suite of open source tools compiled for web crawling, text mining, and visualization. A virtual machine is essentially a second computer that runs on the user’s desktop, regardless of the main computer’s hardware and software. That makes Turkel’s tools available to anyone, including his students.
“It means that everybody can be using exactly the same machine for their coursework,” he said. “So you don’t have to worry about a student with an old laptop that’s running a version of Windows that doesn’t get along with a particular software package. Everybody has the same consistent environment.”
Figure 1, from Benjamin Schmidt, "Stable random projection: lightweight, general-purpose dimensionality reduction for digitized libraries," Journal of Cultural Analytics. October 3, 2018.
For ongoing projects on the history of electronics, Turkel is applying both relatively standard and cutting-edge computational techniques to the collection and analysis of millions of documents and images. These sources range from the latest white papers on integrated circuits to entire runs of journals (Turkel emphasizes the importance of getting permission first).
“I’m most interested in tools that can automatically grow a collection like this, help me navigate it, and help me develop and answer research questions,” he said. He described several of these tools, many of which use machine learning to “categorize [huge amounts of data] in ways that enable us to find them and make sense of them in an ever-changing context.”
The most important characteristic of algorithms used for real-time computing, Turkel said, is that they have what computer scientists call linear or sublinear computational complexity. That is to say: the time and memory needed to solve a problem grow no faster than the size of the problem, and ideally more slowly. Turkel gave a number of examples.
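To make the idea concrete, here is a minimal sketch of one well-known technique of this kind, reservoir sampling. It is offered as an illustration only and is not drawn from Turkel's talk or from the HistoryCrawler. The function reads each item in a stream exactly once, so its running time grows linearly with the number of documents, while its memory use stays fixed at k items no matter how large the collection becomes. The names reservoir_sample and fake_document_stream, and the made-up document labels, are assumptions introduced for this example.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return a uniform random sample of up to k items from a stream.

    Each item is examined once (linear time) and at most k items are
    held in memory at any moment (constant memory for a fixed k).
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    sample: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Pick a slot in 0..i (inclusive); keep the new item only
            # if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample


def fake_document_stream(n: int) -> Iterator[str]:
    """Hypothetical stand-in for a growing collection of sources."""
    for i in range(n):
        yield f"document-{i}"


sample = reservoir_sample(fake_document_stream(100_000), k=5)
print(sample)
```

Because the stream is a generator, the 100,000 placeholder documents are never held in memory at once. The same loop would work unchanged on a collection that keeps growing, which is the property that matters for staying on top of a topic in real time.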
He noted that these powerful computational techniques often unearth sources he hadn’t known existed. “This kind of serendipity is something that you feel a lot when you’re working with systems like this,” he said.
Turkel also observed that "text on paper" is dead except for the moments when someone is reading it. From the perspective of transmitting knowledge to future generations, “Writing something down is better than being illiterate, but it’s worse than having machine-readable sources. When something is in machine-readable form it can be continuously read and used and expanded upon in new ways by machines.”
Online Courses and Resources from MIT History
Visualizing Imperialism and the Philippines, 1898-1913, an MITx, edX MOOC
Visualizing Japan, an MITx, edX MOOC
Story prepared by SHASS Communications
Editorial and Design Director: Emily Hiestand
Writer: Elizabeth Thomson
Published 24 April 2019