With the help of an artificial language network, MIT neuroscientists discovered what kinds of sentences are most likely to activate the brain’s key language processing centers.
The new study reveals that sentences that are more complex, either because of unusual grammar or unexpected meaning, generate stronger responses in these language processing centers. Sentences that are too simple barely engage these areas, and nonsensical sequences of words don’t do much for them either.
For example, the researchers found that this brain network was most active when reading unusual sentences such as “The sell signal market remains special,” drawn from a publicly available language dataset called C4. However, the network was quieter when reading something very straightforward, such as “We were sitting on the couch.”
“The input has to be linguistic enough to engage the system,” says Evelina Fedorenko, Associate Professor of Neuroscience at MIT and a member of MIT’s McGovern Institute for Brain Research. “And then within that space, if things are really easy to process, then you don’t get a lot of response. But if things get tricky or weird, if there’s an unusual construction or an unusual set of words that maybe you’re not very familiar with, then the network has to work harder.”
Fedorenko is the senior author of the study, which appears today in Nature Human Behaviour. MIT graduate student Greta Tuckute is the paper’s lead author.
Processing language
In this study, the researchers focused on the language processing areas located in the left hemisphere of the brain, which includes Broca’s area as well as other parts of the left frontal and temporal lobes of the brain.
“This language network is highly selective for language, but it’s been harder to understand what is actually going on in those language regions,” says Tuckute. “We wanted to find out what kinds of sentences, what kinds of linguistic inputs, drive the left-hemisphere language network.”
The researchers began by compiling a pool of 1,000 sentences drawn from a wide variety of sources—fiction, spoken word transcriptions, web text, and scholarly articles, among many others.
Five people read each of the sentences while the researchers measured the activity of their language network using functional magnetic resonance imaging (fMRI). The researchers then fed those same 1,000 sentences to a large language model—a model similar to ChatGPT, which learns to generate and understand language from predicting the next word in vast amounts of text—and measured the patterns of activation of the model in response to each sentence.
Once they had all this data, the researchers trained a mapping model, known as an “encoding model”, which correlates the activation patterns seen in the human brain with those seen in the artificial language model. Once trained, the model could predict how the human language network would respond to any new sentence based on how the artificial language network responded to those 1,000 sentences.
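The mapping step described above can be illustrated with a small sketch. This is not the study’s actual pipeline; the data below are simulated, and the encoding model is shown as a simple ridge regression from language-model activations to a measured response, which is one common way such encoders are built:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the real data: one row per sentence.
n_sentences, n_features = 1000, 64   # illustrative LLM hidden-state dimensionality
llm_activations = rng.normal(size=(n_sentences, n_features))

# Simulated brain response: a linear function of the activations plus noise,
# so the encoding model has a real signal to recover.
true_weights = rng.normal(size=n_features)
brain_response = llm_activations @ true_weights + rng.normal(scale=0.5, size=n_sentences)

X_train, X_test, y_train, y_test = train_test_split(
    llm_activations, brain_response, test_size=0.2, random_state=0)

# The encoding model: regularized linear regression from language-model
# activations to the measured fMRI signal.
encoder = Ridge(alpha=1.0).fit(X_train, y_train)

# Once fitted, the encoder predicts the brain response to unseen sentences
# directly from the language model's activations.
predicted = encoder.predict(X_test)
r = np.corrcoef(predicted, y_test)[0, 1]
print(f"held-out prediction correlation: {r:.2f}")
```

The key property this sketch shows is generalization: the encoder is evaluated on sentences it never saw during fitting, which is what lets the researchers predict responses to entirely new sentences.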
The researchers then used the encoding model to identify 500 new sentences predicted to elicit maximal activity in the human brain (the “drive” sentences), as well as sentences predicted to elicit minimal activity in the brain’s language network (the “suppress” sentences).
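Once an encoder is fitted, selecting “drive” and “suppress” sentences reduces to ranking candidate sentences by their predicted response. A minimal sketch, using simulated activations and a stand-in linear encoder (none of these numbers come from the study):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical candidate pool: language-model activations for 200 sentences.
n_candidates, n_features = 200, 64
candidate_activations = rng.normal(size=(n_candidates, n_features))

# Stand-in for a fitted encoder's weight vector.
encoder_weights = rng.normal(size=n_features)
predicted_response = candidate_activations @ encoder_weights

# Sort candidates by predicted brain response, ascending.
order = np.argsort(predicted_response)
suppress_ids = order[:5]    # lowest predicted activity -> "suppress" sentences
drive_ids = order[-5:]      # highest predicted activity -> "drive" sentences
```

Because the selection is just a sort over predictions, every chosen “drive” sentence is guaranteed a higher predicted response than every chosen “suppress” sentence; whether the real brain follows suit is exactly what the closed-loop experiment tests.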
In a new group of three participants, the researchers found that these new sentences did indeed drive and suppress brain activity as predicted.
“This ‘closed-loop’ modulation of brain activity during language processing is novel,” says Tuckute. “Our study shows that the model we use (which maps between language model activations and brain responses) is accurate enough to do this. This is the first demonstration of this approach in brain regions involved in higher level cognition, such as the language network.”
Linguistic complexity
To understand what made certain sentences promote activity more than others, the researchers analyzed the sentences based on 11 different linguistic properties, including grammaticality, plausibility, emotional valence (positive or negative), and how easy it is to visualize the content of the sentence.
For each of these qualities, the researchers asked participants recruited through crowd-sourcing platforms to rate the sentences. They also used a computational technique to quantify each sentence’s “surprisal,” or how unexpected it is compared with other sentences.
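Surprisal is typically computed as the negative log probability a language model assigns to each word given the words before it. As a self-contained illustration (the study used a large neural language model, not the toy bigram counts shown here):

```python
import math
from collections import Counter

# Toy corpus standing in for the large text corpora used in practice.
corpus = [
    "we were sitting on the couch",
    "we were sitting on the floor",
    "the cat sat on the couch",
]

# Build bigram counts with add-one smoothing: a minimal stand-in for a
# language model's next-word probabilities.
unigrams, bigrams = Counter(), Counter()
vocab = set()
for sent in corpus:
    words = ["<s>"] + sent.split()
    vocab.update(words)
    for prev, word in zip(words, words[1:]):
        unigrams[prev] += 1
        bigrams[(prev, word)] += 1

def surprisal(sentence):
    """Sum of -log2 P(word | previous word) over the sentence, in bits."""
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        p = (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))
        total += -math.log2(p)
    return total

familiar = surprisal("we were sitting on the couch")
unusual = surprisal("couch the on sitting were we")
```

A sentence made of familiar word sequences gets low surprisal; a scrambled or odd one gets high surprisal, which is the quantity the analysis relates to brain responses.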
This analysis revealed that sentences with higher surprisal elicited stronger responses in the brain. This is consistent with previous studies showing that people have more difficulty processing sentences with higher surprisal, the researchers say.
Another linguistic property correlated with language network responses was linguistic complexity, which is measured by how well a sentence conforms to the rules of English grammar and how plausible it is, that is, how much the content makes sense, apart from the grammar.
Sentences at either end of the spectrum—either extremely simple, or so complex as to make no sense—elicited very little activation in the language network. The strongest responses came from sentences that make some sense but require work to understand, such as “Jiffy Lube of — of therapies, yes,” which came from the Corpus of Contemporary American English dataset.
“We found that the sentences that elicit the highest brain response have something grammatically odd and/or an odd meaning,” says Fedorenko. “There is something slightly unusual about these sentences.”
The researchers now plan to see if they can extend these findings to speakers of languages other than English. They also hope to investigate what type of stimuli can activate language processing areas in the right hemisphere of the brain.
The research was funded by an Amazon Fellowship from the Science Hub, an International Doctoral Fellowship from the American Association of University Women, the MIT-IBM Watson AI Lab, the National Institutes of Health, the McGovern Institute, the Simons Center for the Social Brain, and the MIT Department of Brain and Cognitive Sciences.