Large Language Models and Literary Analysis

Recent work in natural language processing applies large language models (LLMs) to literary analysis. Studies report that LLMs capture and represent stylistic features of texts, which supports tasks such as authorship attribution and literary theme modeling. LLMs have also been shown to compress cumulative information from an entire text into individual embeddings. A study of LLM use in academic writing finds that authors use LLMs uniformly, which reduces the risk of introducing hallucinations into academic preprints. Other work explores methods for detecting cultural differences in images and for watermarking LLM-generated text, the latter with potential applications in preserving the integrity of academic peer review.

Noteworthy papers in this area include 'What's in a prompt?', which shows that LLMs encode stylistic features in their embeddings, and 'Frankentext', which introduces a new task of generating long-form narratives by stitching random text fragments together. 'The Arabic AI Fingerprint' presents a comprehensive investigation of Arabic machine-generated text and demonstrates that LLM-generated text can be detected with high accuracy.

Sources

What's in a prompt? Language models encode literary style in prompt embeddings

GPT Editors, Not Authors: The Stylistic Footprint of LLMs in Academic Preprints

Frankentext: Stitching random text fragments into long-form narratives

The Feasibility of Topic-Based Watermarking on Academic Peer Reviews

Detecting Cultural Differences in News Video Thumbnails via Computational Aesthetics

Tell, Don't Show: Leveraging Language Models' Abstractive Retellings to Model Literary Themes

The Arabic AI Fingerprint: Stylometric Analysis and Detection of Large Language Models Text
