Emojis & the metalinguistic performance of LLMs
John David Storment
November 2024
 

In this short paper I show that, although ChatGPT (GPT-4o) can provide accurate linguistic judgments for many types of sentences (Cai et al. 2023; Ortega-Martín et al. 2023; J. Wang et al. 2023; Collins 2024a, b), it does not give accurate grammaticality judgments for sentences that contain pro-text emojis (Tieu et al. 2023; Storment 2024). I demonstrate this with three distinct experiments performed on GPT-4o. This work builds on prior research showing that the combinatorics of pro-text emojis are sensitive to the morphosyntactic constraints of the language in which the emojis appear. It connects the poor performance of GPT-4o in this respect to two factors: (i) LLMs lack an internal hierarchical syntax (Linzen & Baroni 2021; Contreras Kallens et al. 2023; Zhou et al. 2023; Hale & Stanojević 2024; Manova 2024, a.o.), and (ii) LLMs lack the means of directly processing iconic and pictorial content in the way that human cognition allows. This paper establishes a precedent for research on the intersection of generative AI systems and linguistic utterances that contain pictorial elements as morphosyntactic constituents.
Reference: lingbuzz/008597
(please use that when you cite this article)
keywords: emoji, emojis, pro-text emojis, llm, gpt-4o, chatgpt, generative ai, experimental linguistics, syntax, morphology, computational linguistics
previous versions: v1 [November 2024]