COMPUTING AND AI: HUMANISTIC PERSPECTIVES FROM MIT

Linguistics | The MIT Linguistics Faculty

The Stata Center, home of MIT Linguistics

“Crucially, nearly all transformative new tools have come from researchers at institutions where linguists work side-by-side with computational researchers who are able to translate back and forth between computational properties of linguistic grammars and of other systems.”

Series | Computing and AI: Humanistic Perspectives from MIT

The MIT Linguistics Group has been engaged in the study of language since the 1950s, and the first class of PhD students was admitted in 1961. The group's research aims to discover the rules and representations underlying the structure of particular languages and what they reveal about the general principles that determine the form and development of language in the individual and the species. The program covers the traditional subfields of linguistics (phonetics, phonology, morphology, syntax, semantics, and psycholinguistics) as well as interfaces with philosophy and logic, speech science and technology, computer science, artificial intelligence, and study of the brain and cognition.

• • •

Q: What benefits do you expect to emerge from integrating the domain knowledge, perspectives, and methods from linguistics into the research and curriculum of the Schwarzman College of Computing?

The scientific mission of Linguistics at MIT is to understand the nature of human language. The excitement of this mission reflects the centrality of language to our species. Language is one of the most characteristic properties of humans, and many aspects of human language are unique to us. Moreover, language is a central component in attempts to explain the richness and complexity of human thought.

In service of our scientific mission, linguists at MIT tackle many diverse but closely related questions: What laws underlie the workings of individual languages and language in general? How is that system embodied in the mind and brain, and how does it interact with other cognitive faculties? How is it acquired by children? How is it used to communicate and to think?

Ever since the inception of modern linguistics by Noam Chomsky, whose decades-spanning work at MIT has shaped and transformed the field in fundamental and unparalleled ways, it has been understood that the scientific investigation of these questions needs to be explicitly formal and computational: Language is one of the most complex symbolic systems of the human mind, with unique and intricate computational properties — of which the most important, perhaps, is the ability to express an infinite set of ideas by finite means.

Opportunities

The core problem of linguistic theory, then, is to figure out the appropriate computational vehicle for this: the type of recursive procedures involved and their computational properties. Meeting this challenge requires understanding the logic of the abstract system that we have in our heads, charting the predictions made by contrasting hypotheses about this system, and comparing these predictions to the data of actual language use (whether gathered via linguistic fieldwork, corpora study, or laboratory experimentation).
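
To make the idea of infinite expression by finite means concrete, here is a minimal sketch in Python. It is an illustration only, not a model of any actual grammar: the function name `embed`, the two-name lexicon, and the base clause are all hypothetical. A single recursive rule (a clause may contain another clause) is enough to produce a distinct sentence at every depth of embedding.

```python
# Toy illustration: a finite rule set with one recursive rule
# yields an unbounded set of distinct sentences.
# The lexicon and the rule are hypothetical, chosen only for the example.

SUBJECTS = ["Mary", "John"]  # finite toy lexicon


def embed(depth: int) -> str:
    """Return a sentence with `depth` levels of clausal embedding."""
    if depth == 0:
        # Base case: a simple clause with no embedding.
        return "it rains"
    # Recursive case: a clause that contains another clause.
    subject = SUBJECTS[depth % len(SUBJECTS)]
    return f"{subject} thinks that {embed(depth - 1)}"


if __name__ == "__main__":
    for depth in range(3):
        print(embed(depth))
```

Every increase in `depth` gives a new, longer sentence, although the rule set and the lexicon never grow.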

The wealth of discoveries that has accumulated over the years in pursuit of meeting these challenges presents many natural points of contact with the Schwarzman College of Computing — and we are especially excited by the new opportunities for discovery that the College can provide for our field. Perhaps the most obvious opportunities concern the interrelation between specific hypotheses about the formal properties of language and their computational implementation in the form of systems that learn, parse, and produce human language. These include areas of great and equal importance to linguistic researchers and to researchers interested in theoretical computer science or the practical side of language technologies.

"Recent developments in phonetics and phonology provide us with an indication of the kind of progress we might expect from integrating the knowledge and skillset of formal linguistics with the ever-increasing power and sophistication of modelling techniques from machine learning/AI."

Linguistics and machine learning

There are two broad areas of linguistic study in which we see particularly exciting possibilities and where distinct, complex, and unresolved computational questions arise, both theoretical and practical. The first concerns the structure of sentences and their basic units (words), how structures create meaning, and how sentence meanings contribute to thought and communication: fields that linguists call syntax, semantics, and formal pragmatics. The second concerns the production and perception of sounds and gestures, their basic units, and the laws governing their combination: fields that linguists call phonetics and phonology.

While the application of tools developed in machine learning/artificial intelligence (AI) to problems in syntax, semantics, and formal pragmatics is still very much in its infancy, recent developments in phonetics and phonology provide us with an indication of the kind of progress we might expect from integrating the knowledge and skillset of formal linguistics with the ever-increasing power and sophistication of modelling techniques from machine learning/AI.

For instance, computationally implemented learning algorithms for specific phonological grammars have increased our understanding of core questions in learnability; algorithms for calculating the set of possible languages that a particular theory predicts have allowed us to develop more precise empirical tests. These tools permit automated analysis, inductive learning from corpora or dictionaries, and statistical comparisons of how well the theory fits the range of attested languages.
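
As a rough sketch of what calculating the set of possible languages can look like, the Python example below assumes a ranked-constraint theory in the style of Optimality Theory. The text above does not name a particular theory, so this choice is an assumption, and the constraint names, the input /pat/, the candidates, and their violation counts are hypothetical toy data. The code tries every ranking of the constraints and records which candidate wins under each one.

```python
from itertools import permutations

# Hypothetical constraint set for a toy ranked-constraint grammar.
CONSTRAINTS = ("NoCoda", "Max", "Dep")

# Hypothetical violation counts for candidate outputs of the input /pat/.
CANDIDATES = {
    "pat":   {"NoCoda": 1, "Max": 0, "Dep": 0},  # faithful, keeps the coda
    "pa":    {"NoCoda": 0, "Max": 1, "Dep": 0},  # deletes the final consonant
    "pa.ta": {"NoCoda": 0, "Max": 0, "Dep": 1},  # inserts a vowel
    "pat.a": {"NoCoda": 1, "Max": 0, "Dep": 1},  # keeps the coda and inserts
}


def winner(ranking):
    """Return the candidate with the best violation profile under `ranking`.

    Profiles are compared lexicographically, highest-ranked constraint first.
    """
    return min(
        CANDIDATES,
        key=lambda cand: tuple(CANDIDATES[cand][c] for c in ranking),
    )


def factorial_typology():
    """Map each possible winner to the list of rankings that select it."""
    predicted = {}
    for ranking in permutations(CONSTRAINTS):
        predicted.setdefault(winner(ranking), []).append(ranking)
    return predicted


if __name__ == "__main__":
    for output, rankings in factorial_typology().items():
        print(output, "wins under", len(rankings), "rankings")
```

In this toy setup the candidate "pat.a" never wins under any ranking, so the theory as sketched predicts that no language treats /pat/ that way. Predictions of this kind are what can then be compared against attested languages.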

Crucially, nearly all transformative new tools have come from researchers at institutions where linguists work side-by-side with computational researchers who are able to translate back and forth between computational properties of linguistic grammars and of other systems.

For the areas of syntax, semantics, and formal pragmatics we envision similarly transformative progress. For instance, more precise and computationally implemented models of how syntactic and semantic knowledge interacts with other subsystems of mind (e.g., number, social cognition, and object recognition) are necessary to make fine-grained quantitative predictions about the deployment of linguistic knowledge in real time, or the growth of linguistic knowledge as a function of linguistic experience and social and cognitive development.

"Advances in these areas will invariably lead to a deeper understanding of the specific nature and origin of computational properties of human language."

A 21st-century leap forward

Advances in these areas will allow us to develop better grounded and more integrated theories of linguistic knowledge and will invariably lead to a deeper understanding of the specific nature and origin of computational properties of human language. Only the historical disconnect between theoretical and computational work holds us back in these domains. The time is clearly ripe for a 21st-century leap forward in these areas, and there are researchers emerging who can help.

From the computational side, there is a parallel disconnect, with even clearer consequences for progress. Much work to date in natural language processing systems has not been concerned with how humans solve similar tasks, such as understanding a sentence. The last 10 years have seen incredible improvements in the performance of these systems, but these advances have been disproportionately great for English, which offers enormous resources for training models, and which lacks many of the complexities in terms of conjugation, word order, and lexical ambiguity that are found in many other languages. What’s more, even with this advantage, basic aspects of the meaning of English sentences that all native speakers effortlessly understand have remained elusive to computers.

Explainable AI models

Ultimately, we believe that an important ingredient in diagnosing and improving natural language processing will be "explainable AI models" — transparent modes of computing that provide an interpretable representation of the knowledge that they use to accomplish their tasks. This knowledge can be compared to what we know about the form of human knowledge; after all, humans are the most sophisticated and accurate language machines out there, providing the highest bar for assessing and evaluating the achievements of artificial systems.

In order to carry out this comparison successfully, a computational researcher must be well versed in ways of characterizing what humans know about language — that is, in linguistic theory. Of course, such work also frequently leads to surprises, as when a computational model discovers unexpected ways of describing the language or uncovers facts that linguists had not previously noticed. So, we have much to learn from each other.

MIT Schwarzman College of Computing

Related publications

Decoding the meaning of language
Linguist Kai von Fintel engages in research at the intersection of science and the humanities.

3 Questions: What is linguistics?
MIT Professor David Pesetsky describes the science of language and how it sheds light on deep properties of the human mind.

Donca Steriade: Searching for the building blocks of language
The most comprehensive survey of rhyme ever made reveals a new possibility for one of the essential units of language.

Cold case: A linguistic mystery yields clues in Russian
MIT professor’s new book explains how the quirks of Russian numerals can tell us something deep about the universal properties of grammar.

The complexities of cognitive comparisons
In experiments, linguists examine how we make everyday judgments about groups of objects.

The sound and the query
Why do questions take the form they do? An MIT linguist explains how the noises we make help to shape the sentences we speak.

Ask a linguist
Professor Pesetsky discusses 'Universal Grammar,' and the relationship between language and music

Syntax and Semantics Lab generates knowledge about meaning formation
Role of linguistic form stronger than previously realized.

DeGraff awarded $1M NSF grant for linguistics research in Haiti
Funding will help develop classroom tools to teach science and math in Creole for the first time.

The rapid rise of human language
Paper by Shigeru Miyagawa suggests people quickly started speaking in a now-familiar form.

Why children confuse simple words
Study: Kids have “and/or” problem despite sophisticated reasoning.

Code of the humans
New book by Noam Chomsky and Robert Berwick explores how people acquired unique language skills.

From contemporary syntax to human language’s deep origins
New paper amplifies hypothesis that human language builds on birdsong and speech forms of other primates.

Decoding ‘noisy’ language in daily life
Study shows how people rationally interpret linguistic input.

How human language could have evolved from birdsong
Linguistics and biology researchers propose a new theory on the deep roots of human speech.

Global agreements
In new book, MIT linguist expands the horizons of language analysis.

Sound system
Norvin Richards explores how our voices shape the rules of grammar.

Series prepared by SHASS Communications
Office of Dean Melissa Nobles
Series Editor and Designer: Emily Hiestand, Communications Director
Series Co-Editor: Kathryn O'Neill, Associate News Manager
Published 22 September 2019
