Giulia Taurino
@giuliataurino.bsky.social
AI+Cultural Heritage (OCR, CV, LLMs), Digital Humanities, Archival Science, Media & Cultural Studies.

Member of NULab & Turing Institute AI+Arts Group / Editor at The Programming Historian.
Reposted by Giulia Taurino
"There’s a whole generation that grew up entirely digitally but is now developing a different kind of interest in analog. She is hosting analog color darkroom workshops at gOlab, which are always fully booked out with young students, and she is in demand for lectures at art schools."
The Future of Photography: A Roundtable
Fugitive Processes: A Roundtable with Jeff Wall, Thomas Demand, Roxana Marcoci, Florian Ebner, Ketuta Alexi-Meskhishvili, and Christian Scheidemann
www.artforum.com
November 25, 2025 at 12:45 AM
Reposted by Giulia Taurino
New issue of my newsletter: "The Writing Is on the Wall for Handwriting Recognition" — One of the hardest problems in digital humanities has finally been solved, and it's a good use of AI newsletter.dancohen.org/archive/the-...
The Writing Is on the Wall for Handwriting Recognition
One of the hardest problems in digital humanities has finally been solved
newsletter.dancohen.org
November 25, 2025 at 4:35 PM
Reposted by Giulia Taurino
New post on an important project for building archives of contemporary social movements being undertaken by Fondazione Feltrinelli

jeffreyschnapp.com/2025/11/12/a...
Archives of the Present | Jeffrey Schnapp
Last year, the Milan-based Giangiacomo Feltrinelli Foundation, on whose advisory board I serve, launched its important Archives of the Present project which sets out to collect and preserve documentat...
jeffreyschnapp.com
November 15, 2025 at 1:28 AM
Reposted by Giulia Taurino
Building datasets to train smaller, task-focused models used to be incredibly time-consuming.

Very excited to see SAM3 massively lower that barrier. Describe the class you want to detect and get annotated datasets automatically!

Try it yourself: huggingface.co/datasets/uv-...!
November 21, 2025 at 1:30 PM
Reposted by Giulia Taurino
What remains of medieval texts? Why do some survive while others fade away?

👉 A seminar session devoted to the challenges of textual transmission, as part of the ERC project LostMA, with @jbcamps.bsky.social

Join us tomorrow from 2 p.m.

Registration 🔽
Modelling the transmission and survival of texts and manuscripts | médialab Sciences Po
The médialab seminar welcomes Jean-Baptiste Camps for a session on the LostMA project on the transmission and survival of ancient texts. He will analyze how and why certain manuscript texts...
medialab.sciencespo.fr
November 24, 2025 at 10:35 AM
Reposted by Giulia Taurino
The Public Interest Corpus has completed the last of three planning workshops. A diverse group of stakeholders sharpened our implementation plan by contributing expert insights on users, uses, and managing legal risk, data development and access, multi-stakeholder governance, and sustainability.
The Public Interest Corpus Update – Oakland Edition
Center for Library & Instructional Computing Services, Undergraduate Library, 1986 The Public Interest Corpus recently completed the last of three planning workshops. The final workshop was hos…
www.authorsalliance.org
November 24, 2025 at 4:09 PM
Reposted by Giulia Taurino
I curated some readings for class on "data tensions" and the list felt worth sharing. Come on a tour of datasets, books, the web, and AI with me...

We'll start with this piece on the Google Books project: the hopes, dreams, disasters, and aftermath of building a public library on the internet.

1/n
Torching the Modern-Day Library of Alexandria
“Somewhere at Google there is a database containing 25 million books and nobody is allowed to read them.”
www.theatlantic.com
November 14, 2025 at 4:39 PM
I am delighted to share that I am a member of the Scientific Committee for the research project ∀ISION_E. The project's call for abstracts on "extended intelligences" in the field of drawing and architecture is open until November 14.

www.visioneuid.com/call-for-vis...
Call for Visions
Context & Mission of the Call for Visions
www.visioneuid.com
November 8, 2025 at 8:34 AM
Reposted by Giulia Taurino
Transformer LMs get pretty far by acting like ngram models, so why do they learn syntax? A new paper by sunnytqin.bsky.social, me, and @dmelis.bsky.social illuminates grammar learning in a whirlwind tour of generalization, grokking, training dynamics, memorization, and random variation. #mlsky #nlp
Sometimes I am a Tree: Data Drives Unstable Hierarchical Generalization
Language models (LMs), like other neural networks, often favor shortcut heuristics based on surface-level patterns. Although LMs behave like n-gram models early in training, they must eventually learn...
arxiv.org
December 20, 2024 at 5:56 PM
Reposted by Giulia Taurino
Join us in Charleston on November 4 for our preconference, "How Can Libraries and Publishers Collaborate to Make Backlist Monographs Open Access?", which is free to attend through the support of the California Digital Library, the De Gruyter eBound Foundation, and University of Michigan Library.
How Can Libraries and Publishers Collaborate to Make Backlist Monographs Open Access?
Join us in Charleston this November for a Preconference on making backlist monographs open access! Tuesday, November 4, 2025, 1pm-4pm ET Cost: $0 Presenters: Dave Hansen, Executive Director, Author…
www.authorsalliance.org
October 9, 2025 at 2:01 PM
Reposted by Giulia Taurino
In principle, open access means that anyone, anywhere, can read and reuse scholarly work. In practice, many works labeled as “open” are constrained by restrictions that limit how they can be used. These constraints dilute the value of openness and conflict with its foundational definitions.
Open? When Site Restrictions and Clauses Undermine Open Access
Open access publishing has transformed the way research circulates. In principle, open access means that anyone, anywhere, can read and reuse scholarly work without financial, legal, or technical b…
www.authorsalliance.org
October 29, 2025 at 1:24 PM
Reposted by Giulia Taurino
Computationally, whitespace gets little attention—it’s usually standardized or stripped.

But in poetry, whitespace matters!

Yet actually *preserving* that poetic whitespace is v tough. Its slipperiness points to bigger issues w/ text processing & LLMs.

New paper ⬜️ aclanthology.org/2025.emnlp-m...
November 3, 2025 at 3:14 PM
Reposted by Giulia Taurino
As DH grows, it’s increasingly important to publish conference papers, but there hasn’t been a clear venue for that.

So I’m thrilled to share this new home for DH proceedings, which will include CHR papers & more.

Thanks to @taylor-arnold.bsky.social for leading this effort!

bit.ly/ach-anthology
October 29, 2025 at 3:39 PM
Reposted by Giulia Taurino
New issue of my newsletter: "The Index and the Vector" — Converting ambiguity into precision can help a broader audience discover and learn from collections newsletter.dancohen.org/archive/the-...
The Index and the Vector
Converting ambiguity into precision can help a broader audience discover and learn from collections
newsletter.dancohen.org
October 20, 2025 at 3:30 PM
Reposted by Giulia Taurino
New issue of my newsletter: “The Library’s New Entryway” — An interface that combines the advantages of the traditional index with the power of LLMs is the path forward newsletter.dancohen.org/archive/the-...
The Library’s New Entryway
An interface that combines the advantages of the traditional index with the power of LLMs is the path forward
newsletter.dancohen.org
October 10, 2025 at 7:32 PM
Reposted by Giulia Taurino
highly recommend!
October 6, 2025 at 6:29 PM
Reposted by Giulia Taurino
Bartz v. Anthropic has had a couple of major developments. Though the lawsuit was initially brought to address the legality of using copyrighted materials for training AI, the suit now focuses on Anthropic’s storage—without training use—of copies of books downloaded from LibGen and PiLiMi.
Bartz v. Anthropic: A Preliminary Look at What LibGen Books May Be Included in the Class Action
The LibGen Logo For this post, we relied heavily on the help of Charles Horn, self-described “metadata wrangler,” for data analysis.  As readers are likely aware, the Bartz v. Anthropic AI law…
www.authorsalliance.org
September 5, 2025 at 1:08 PM
Reposted by Giulia Taurino
Anthropic’s copyright settlement is historic, but it’s also not what many authors and publishers think. Check out our latest on what’s inside the proposed settlement:
The Anthropic Settlement – what it is and isn’t (and who could get paid)
www.anthropiccopyrightsettlement.com EDIT: On Sunday evening, Judge Alsup granted the motion for a hearing on Monday, September 8th, but expressed disappointment over lack of details, mostly on the…
www.authorsalliance.org
September 8, 2025 at 11:10 AM