ChatGPT and linguistic theory, with a focus on morphology
Stela Manova
June 2024
 

With the rise of Large Language Models (LLMs), much attention in Linguistic Theory (LT) has been paid to whether LLMs can serve as theories of language (Piantadosi 2023 and the replies to other scholars therein). However, the discussion has been one-sided: only linguists have participated in it (Chomsky et al. 2023, Haider 2023, Katzir 2023, Moro et al. 2023, Rawski and Baumont 2023, Sauerland 2023, among others). Consequently, the approach adopted has been an abstract one, with an emphasis on semantics and syntax, even though LLMs understand and generate language based exclusively on form. The technical organization of LLMs, and the question of what this organization means for LT, have remained unaddressed. The goal of my contribution is to provide the missing pieces. Because ChatGPT understands and generates language based on subword units (tokens), this chapter focuses on morphology. Since the algorithms of different LLMs may differ, I restrict the analysis to ChatGPT. I demonstrate that ChatGPT's internal organization cannot be explained by a single linguistic theory, but only by a combination of insights from different theories. I pay special attention to Distributed Morphology (Halle and Marantz 1993, Harley and Noyer 1999, Embick and Noyer 2007, Bobaljik 2017) and Paradigm Function Morphology (Stump 2001, 2016, Stump and Finkel 2013, Bonami and Stump 2017). ChatGPT can not only serve as a theory of language but can also provide a novel perspective on existing theories and reveal their shortcomings.
Reference: lingbuzz/008600
(please use that when you cite this article)
Published in: Submitted for inclusion in José-Luis Mendívil-Giró (ed.), Artificial Knowledge of Language. A Linguists’ Perspective on its Nature, Origins and Use. Wilmington, DE: Vernon Press.
keywords: natural language processing, large language models, chatgpt, linguistic theory, complexity of analysis, byte-pair encoding, tokens, words, subword units, meaning, form, form-meaning mapping, word-formation, syntax, phonology, semantics, morphology
Downloaded: 869 times

 
