Two preprints: separating incremental composition from predictability and localizing dependency-processing across languages

Update: Both of these papers are now in print! Stanojević et al. in Cognitive Science and Dunagan et al. in Neurobiology of Language.

I’m still catching up on some of our efforts from before the new year. These include two preprints of papers that make use of the Little Prince datasets.

First up: Miloš Stanojević (DeepMind) and I co-led a project in which we model incremental composition using a CCG parser to (1) test whether the more expressive and directly compositional CCG parser better captures neural signals than a less expressive context-free grammar, (2) evaluate which of several CCG parsing variants shows the best fit, and (3) tease apart parser-correlated neural signals from predictability as estimated by the Chinchilla large language model. This is a dream project for me: it is the first time, to my knowledge, that a directly compositional grammar has been used to model neural signals, and it also offers the best effort to date to isolate parsing effort from predictability in naturalistic data.

Stanojević, M., Brennan, J. R., Dunagan, D., Steedman, M., & Hale, J. T. (2022). Modeling structure-building in the brain with CCG parsing and large language models (arXiv:2210.16147). arXiv. https://doi.org/10.48550/arXiv.2210.16147

Second: Donald Dunagan (UGA) leads a project identifying fMRI correlates of wh- and object-relative dependencies in English and Mandarin (these languages, familiarly, linearize these dependencies quite differently). Despite these surface differences, both dependency classes show remarkably similar neural correlates across the two languages. Interestingly, the correlates differ between dependency types, consistent with the hypothesis that these dependencies are resolved using different (yet cross-linguistically shared) strategies.

Dunagan, D., Stanojević, M., Coavoux, M., Zhang, S., Bhattasali, S., Li, J., Brennan, J., & Hale, J. (2022). Long-distance linguistic dependencies in Chinese and English brains (p. 2022.09.12.507571). bioRxiv. https://doi.org/10.1101/2022.09.12.507571

Computational Neurolinguistics Lab