Metaphor identification using large language models: A comparison of RAG, prompt engineering, and fine-tuning

Research output: Contribution to journal › Article › peer-review

Abstract

Metaphor is a pervasive feature of discourse and a powerful lens for examining cognition, emotion, and ideology. Large-scale analysis, however, has been constrained by the need for manual annotation due to the context-sensitive nature of metaphor. This study investigates the potential of large language models (LLMs) to automate metaphor identification in full texts. We compare three methods: (i) retrieval-augmented generation (RAG), where the model is provided with a codebook and instructed to annotate texts based on its rules and examples; (ii) prompt engineering, where we design task-specific verbal instructions; and (iii) fine-tuning, where the model is trained on hand-coded texts to optimize performance. Within prompt engineering, we test zero-shot, few-shot, and chain-of-thought strategies. Our results show that state-of-the-art closed-source LLMs can achieve high accuracy, with fine-tuning yielding a median F1 score of 0.79. A comparison of human and LLM outputs reveals that most discrepancies are systematic, reflecting well-known grey areas and conceptual challenges in metaphor theory. We propose that LLMs can be used to at least partly automate metaphor identification and can serve as a testbed for developing and refining metaphor identification protocols and the theory that underpins them.
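
As a rough illustration of the reported metric, the sketch below computes an F1 score for the metaphor class in each text and then takes the median across texts. The abstract does not state the unit of annotation or how scores were aggregated, so the token-level 0/1 labels, the per-text aggregation, and the names `f1_score` and `median_f1` are all assumptions made for this example, not the authors' procedure.

```python
from statistics import median


def f1_score(gold, pred):
    """F1 for the positive (metaphor) class over parallel 0/1 label lists.

    Assumption: one label per token, 1 = metaphor-related, 0 = not.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def median_f1(texts):
    """Median of per-text F1 scores; `texts` is a list of (gold, pred) pairs."""
    return median(f1_score(gold, pred) for gold, pred in texts)


if __name__ == "__main__":
    # Toy labels invented for illustration; not data from the study.
    toy = [
        ([1, 0, 1, 1], [1, 1, 0, 1]),
        ([1, 1], [1, 1]),
        ([0, 0], [0, 0]),
    ]
    print(median_f1(toy))
```

Taking the median rather than the mean keeps a few very short or atypical texts from dominating the summary figure, which is one plausible reason to report it.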
Original language: English
Article number: 100204
Number of pages: 12
Journal: Applied Corpus Linguistics
Volume: 6
Issue number: 2
Early online date: 15 Mar 2026
Publication status: E-pub ahead of print - 15 Mar 2026

Keywords

  • metaphor identification
  • large language models
  • metaphor detection
  • linguistic annotation
  • artificial intelligence
