Sterling Fluharty
sterlingfluharty.bsky.social
Independent Scholar. Working in history and data.
I’m getting excited. I’ve lined up recommendation letter writers. I’ve requested transcripts. I’ve drafted personal statements, updated my CV, and dusted off my master’s thesis. I’ve narrowed down possible doctoral programs to half a dozen. Here’s hoping that I land at one of them in Fall 2026.
September 25, 2025 at 2:35 AM
This essay proposes using backprop to determine when content contributes to an LLM response and compensating creators accordingly. It reminded me of how many academics want credit & payment when their work is used to power chatbots. There would be challenges, but this sort of mechanism could emerge.
September 15, 2025 at 9:33 PM
This is a great academic analysis of the settlement. It mentions the database of 7.5M LibGen titles compiled by The Atlantic. I’ll add that, according to the settlement, the 7M pirated downloads by Anthropic from LibGen and PiLiMi yielded only 465k titles. Something does not add up.
September 5, 2025 at 11:05 PM
I have questions (below is NYTimes today). Why did Turvey go to publishers vs research libraries? If Anthropic took 500k pirated books, did they buy millions from publishers? How does this impact Public Interest Corpus? Is $3k per pirated book insulting? What kind of response if 500k list released?
September 5, 2025 at 8:12 PM
I can understand the accidental framing, given the discourse around LLMs and their hallucinations, but I think the odds are decent for the model encountering and recalling this set of facts from the more than 7,000 Victorian-era documents it was trained on. Perhaps we will see more historical LLMs.
College student’s “time travel” AI experiment accidentally outputs real 1834 history
Hobbyist training AI on Victorian texts gets an unexpected history lesson from his own creation.
arstechnica.com
August 22, 2025 at 11:40 PM
This week I'm reading the new book _Artificial Historians_ and getting more ideas for the conference paper I'm working on about history and AI.
Artificial Historians
This book offers readers an introduction to the world of artificial histories and historians. It looks behind the interfaces of AI and explores everyday platforms and prize-winning history books to id...
www.routledge.com
August 12, 2025 at 2:32 AM
The AHA has just reported that it filed yesterday, along with ACLS and MLA, a motion for preliminary injunction on behalf of NEH, in response to the harms done by DOGE. I asked OpenAI’s o3 model to evaluate the motion and the model gives it a 70% chance of success. I hope we prevail in the courts.
May 15, 2025 at 10:00 PM
A historian of science ponders the future of the humanities after feeding documents into AI and hearing an impressive podcast. After assigning students to have a conversation with AI on the history of attention, the author finds that chatbots have perhaps become better than humans at getting attention.
Will the Humanities Survive Artificial Intelligence?
Maybe not as we’ve known them. But, in the ruins of the old curriculum, something vital is stirring.
www.newyorker.com
April 26, 2025 at 6:33 PM
I just received this email that the AHA is exploring possible legal action to protect historical funding from the NEH and other agencies. With some help from ChatGPT, I found that similar advocacy at the federal level to protect the work of historians has been successful before. I hope we prevail.
April 17, 2025 at 7:47 PM
I worked with ChatGPT on some estimates. About 60B words published in univ press books since 1969. And > 250T words avail for scraping on the web. That’s a 1 to 4k ratio. How would comp work for LLM training on academic books if they’re a drop in the bucket? What criteria would work for licensing?
March 21, 2025 at 3:20 AM
I'm writing a paper for this year's Social Science History Association conference. I decided to read the call for papers against the grain. My working title is "Emergent History: When AI Models Rewrite the Past." If the paper is accepted, I'm hoping I don't get called Foolhardy during my session ;-)
SSHA2025
ssha2025.ssha.org
March 11, 2025 at 4:11 AM
It has been 16 years since I've attended an AHA meeting (www.historians.org/perspectives...). Today I dug up an old password, successfully logged into historians.org, and renewed my membership. I'll be in Bozeman next month at an AHA event to learn more about large-scale historical research with AI.
The Future of Research in the Age of AI – AHA
This symposium will take place on Thursday, March 27, and Friday, March 28 at the Museum of the Rockies at Montana State University (MSU) in Bozeman, Montana. The event is supported by MSU’s College o...
www.historians.org
February 27, 2025 at 11:09 PM
I read "Provocations from the Humanities for Generative AI Research" (arxiv.org/abs/2502.19190). This got me wondering about adding social sciences and focusing on history. I prompted o1 Deep Research to reimagine the paper (chatgpt.com/share/67c0e7...) and generate more provocations (see below):
February 27, 2025 at 10:40 PM
I gave the new copyright guidance a quick scan and it appears there was little to no discussion of public domain or Creative Commons licensing. Will digital libraries built by AI that summarize and extract insights from primary sources have the option of online open access or become walled gardens?
copyright.gov
January 30, 2025 at 3:24 PM
This post (resobscura.substack.com/p/the-leadin..., HT @nic221.bsky.social) by @resobscura.bsky.social caught my attention. Perhaps taking a cue from the author, I enlisted the help of the o1 model in a close reading of the text. You can peruse the remaining response at chatgpt.com/share/6791ca....
January 23, 2025 at 4:51 AM
This is about research presented at NeurIPS. I agree that LLMs need better coverage of history across time periods and regions. But are multiple-choice tests the best way to measure how LLMs understand history? When will we measure how well they reason when given primary sources & relevant context?
AI struggles to understand human history and fails miserably when tested
The study, which is the first of its kind, evaluates the historical knowledge of leading AI models such as ChatGPT-4, Llama, and Gemini.
www.earth.com
January 22, 2025 at 9:45 PM
Yesterday, I saw a request for info on a county founder in a Facebook genealogy group. I asked Google Deep Search to generate a bio. It read nearly 40 websites, wrote an essay, and dropped 14 footnotes to cite its sources. I wonder how this capability will improve as more sources are digitized.
December 18, 2024 at 5:42 PM
I’m enjoying today’s episode of the History in Focus podcast, which is a discussion with historians of what AI means for teaching and research:
December 18, 2024 at 2:16 PM
Today I used Transkribus for a first pass of transcribing historical court records, which was handy because it provides an editor with line-by-line comparison to the image. I then gave the image and first transcription to ChatGPT and asked it to produce a final transcription, which it did well.
November 14, 2024 at 3:52 AM
I presented a paper at SSHA recently. I tried to put critical data studies into conversation with U.S. census data record linkage 1850-1880. I was excited our outgoing president spoke about applying source criticism to census data analysis and that colleagues from IPUMS asked for copies of my paper.
November 12, 2024 at 10:02 PM
I’m working on a history of Wetzel County. It was in the borderland region of the Upper Ohio River Valley and sent men to both Union and Confederate forces. I’m building a population database to match the genealogical profiles of these individuals with their military service and pension records.
November 11, 2024 at 6:02 PM
I had an interesting chat with Claude today that started with this prompt: “If nineteenth-century American sources could be made machine readable through OCR and handwriting recognition, why isn’t there much of a coordinated effort to construct this sort of public domain corpora for training LLMs?”
November 11, 2024 at 12:40 AM
I’ll be testing out some digital public history this month. I’m using public domain sources from Internet Archive, organizing their chapters by chronology and themes, generating podcasts via NotebookLM, creating historical images via ChatGPT, and assembling narrated slideshows for AI YouTube videos.
November 10, 2024 at 1:28 AM
With white-collar jobs for recent college graduates declining this year, it appears ChatGPT is still bullish about what AI agents, which can further automate many tasks, will mean for the prospects of job seekers. I hope students and workers will get ready to utilize AI and not be left behind.
October 23, 2024 at 9:09 PM
As someone who grew up with the Internet in the 1990s, I hope the Internet Archive isn’t wiped out by lawsuits. It reminds me of how Roy Rosenzweig anticipated the outcomes of either digital abundance or scarcity of historical sources in the 21st century: www.wired.com/story/intern...
The Internet Archive’s Fight to Save Itself
The web’s collective memory is stored in the servers of the Internet Archive. Legal battles threaten to wipe it all away.
www.wired.com
October 2, 2024 at 8:15 PM