Chau Minh Pham
chautmpham.bsky.social
PhD student @umdcs | Long-form Narrative Generation & Analysis | Intern @AdobeResearch @MSFTResearch | https://chtmp223.github.io
We release code to facilitate future research on fine-grained detection of mixed-origin texts and human-AI cowriting.

GitHub: github.com/chtmp223/Fra...
Paper: arxiv.org/abs/2505.18128

Work done with @jennajrussell, @dzungvietpham, and @MohitIyyer!
GitHub - chtmp223/Frankentext: Frankentext: Stitching random text fragments into long-form narratives
June 3, 2025 at 3:09 PM
Room for improvement:

🔧 Frankentexts struggle with smooth narrative transitions and grammar, as noted by human annotators.
🔩 Non-fiction versions are coherent and faithful but tend to be overly anecdotal and lack factual accuracy.
June 3, 2025 at 3:09 PM
Takeaway 2: Our controllable generation process provides a sandbox for human-AI co-writing research, with adjustable proportion, length, and diversity of human excerpts.

👫 Models can follow copy constraints, which serve as a proxy for the percentage of human writing in co-authored texts.
June 3, 2025 at 3:09 PM
Takeaway 1: Frankentexts don’t fit into the "AI vs. human" binary.

📉 Binary detectors misclassify them as human-written
👨‍👩‍👧 Humans detect AI involvement more often than binary detectors do
🔍 Mixed-authorship tools (Pangram) help, but still catch only 59% of Frankentexts

We need better tools for this gray zone.
June 3, 2025 at 3:09 PM
Automatic evaluation on 100 Frankentexts using LLM judges, text detectors, and a ROUGE-L-based metric shows that:

💪 Gemini-2.5-Pro, Claude-3.5-Sonnet, and R1 can generate Frankentexts that are up to 90% relevant, 70% coherent, and 75% traceable to the original human writings.
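
For readers unfamiliar with ROUGE-L, here is a minimal sketch of how a longest-common-subsequence score could measure how much of a generated text is traceable to a human source. This is an illustration only, not the metric from the paper: the function names, whitespace tokenization, and the choice of precision over the candidate are all assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_precision(candidate, reference):
    """Fraction of candidate tokens that lie on an LCS with the reference.

    Hypothetical helper: lowercased whitespace tokens, no stemming.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    return lcs_length(cand, ref) / len(cand)
```

A score near 1.0 means almost every token of the generated passage appears, in order, in the human source.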
June 3, 2025 at 3:09 PM
Frankentext generation presents an instruction-following task that challenges the limits of controllable generation, requiring each model to:

1️⃣ Produce a draft by selecting & combining human-written passages.
2️⃣ Iteratively revise the draft while maintaining a copy ratio.
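
The two steps above can be sketched as a small loop. This is a rough illustration under stated assumptions, not the released implementation: `draft_fn` and `revise_fn` are hypothetical stand-ins for model calls, and `copy_ratio` is an assumed definition (share of tokens covered by verbatim n-grams from the human excerpts), since the thread does not say how the ratio is computed.

```python
def copy_ratio(text, excerpts, n=4):
    """Assumed definition: share of tokens in `text` covered by some
    n-gram that appears verbatim in one of the human excerpts."""
    tokens = text.split()
    if len(tokens) < n:
        return 0.0
    source_ngrams = set()
    for excerpt in excerpts:
        words = excerpt.split()
        for i in range(len(words) - n + 1):
            source_ngrams.add(tuple(words[i:i + n]))
    covered = [False] * len(tokens)
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) in source_ngrams:
            for k in range(i, i + n):
                covered[k] = True
    return sum(covered) / len(tokens)


def draft_and_revise(draft_fn, revise_fn, excerpts, min_copy_ratio, max_rounds=3):
    """Draft from excerpts, then accept revisions only while the
    copy ratio stays at or above `min_copy_ratio`."""
    text = draft_fn(excerpts)  # hypothetical model call: select and combine
    for _ in range(max_rounds):
        revised = revise_fn(text, excerpts)  # hypothetical model call: edit
        if copy_ratio(revised, excerpts) < min_copy_ratio:
            break  # revision drifted too far from the human text; keep last draft
        text = revised
    return text
```

Rejecting a revision that falls below the threshold is one simple way to enforce the constraint; a real system might instead ask the model to retry.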
June 3, 2025 at 3:09 PM