Authorial language models for AI authorship verification

Research output: Contribution to journal › Conference article › peer-review

Abstract

In this paper, we introduce the use of Authorial Language Models (ALMs) for AI Authorship Verification (AIAV). Given two texts, where one is written by a human and one is written by a machine, AIAV is the task of determining which text was written by the machine (or alternatively by the human). Our approach to resolving this task involves using a support vector machine to predict which text is written by a machine based on perplexity scores for a set of language models that were each fine-tuned on texts generated by a set of language models. We submitted our method as Docker-containerised software for independent evaluation on the main test dataset and on variants of it that are obfuscated against detection. On the main dataset, we have been informed that our method achieved a score of approximately 0.979 on all proposed evaluation measures, including ROC-AUC, Brier, C@1, F1, and F0.5, beating all baseline methods. On the variants of the main dataset, we achieve a median score of 0.935, which also beats all baselines. We attribute the success of ALMs in this context to the power of using many fine-tuned authorial language models, which we believe improves the resilience of our approach by maximising the amount of potentially discriminating information drawn from the underlying textual data.
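
The pipeline described in the abstract (perplexity scores from several authorial language models fed to a support vector machine) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the perplexity values are invented rather than produced by fine-tuned models, the pair features are differences of log-perplexities, and the classifier is a hand-rolled linear SVM without an intercept, trained by subgradient descent on the hinge loss. The names `perplexity`, `pair_features`, `train_linear_svm`, and `predict` are hypothetical.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(-mean natural-log probability per token)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pair_features(ppl_a, ppl_b):
    """One feature per model: log-perplexity of text A minus that of text B.

    Swapping A and B negates every feature (an assumed design, not the paper's).
    """
    return [math.log(a) - math.log(b) for a, b in zip(ppl_a, ppl_b)]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM without intercept: subgradient descent on the
    L2-regularised hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * sum(wi * xi for wi, xi in zip(w, x))
            if margin < 1:
                # Hinge loss is active: shrink w and step towards label * x.
                w = [wi - lr * (lam * wi - label * xi) for wi, xi in zip(w, x)]
            else:
                # Only the regulariser contributes to the subgradient.
                w = [wi - lr * lam * wi for wi in w]
    return w


def predict(w, x):
    """+1: text A is predicted machine-written; -1: text B is."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Invented perplexities under three hypothetical ALMs: two fine-tuned on
# machine-generated text, one on human-written text.
machine_texts = [[8.0, 9.0, 30.0], [10.0, 7.0, 26.0]]
human_texts = [[25.0, 28.0, 12.0], [22.0, 30.0, 14.0]]

X, y = [], []
for m, h in zip(machine_texts, human_texts):
    X.append(pair_features(m, h))  # machine text in position A
    y.append(1)
    X.append(pair_features(h, m))  # machine text in position B
    y.append(-1)

w = train_linear_svm(X, y)

# A new invented pair, scored in both orders.
new_machine, new_human = [9.0, 8.0, 28.0], [24.0, 26.0, 13.0]
pred_ab = predict(w, pair_features(new_machine, new_human))
pred_ba = predict(w, pair_features(new_human, new_machine))
```

The intercept is omitted here because the features are antisymmetric under swapping the two texts, so a decision function through the origin gives opposite answers for the two orderings of the same pair. In a real system the perplexities would come from scoring each text with each fine-tuned model, and the kernel and hyperparameters of the SVM would need to be chosen on held-out data.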
Original language: English
Pages (from-to): 2645-2652
Number of pages: 8
Journal: CEUR Workshop Proceedings
Volume: 3740
Publication status: Published - 5 Aug 2024
Event: Conference and Labs of the Evaluation Forum: Information Access Evaluation meets Multilinguality, Multimodality, and Visualization - Grenoble, France
Duration: 9 Sept 2024 - 12 Sept 2024

Keywords

  • LLM Detection
  • Large Language Model
  • Perplexity
  • Authorship Verification
