Aaron Sterling
@aaronsterling.bsky.social
440 followers 1.5K following 710 posts
CEO, Thistleseeds. Personal account. Current primary project: tech for substance use disorder programs.
aaronsterling.bsky.social
"Boyfriend" is important. Women spend more per OnlyFans transaction per capita than men do. More men subscribe to OF than women, of course. But perhaps the text/story primacy of ChatGPT will mean there are more female paid users than male, while OF is more visual.
aaronsterling.bsky.social
There's a months-long Trust and Safety whack-a-mole where goon communities post porn jailbreak prompts to GitHub, OpenAI renders the jailbreak ineffective, and the cycle continues. Might be easier to manage if there's an official front door.
Reposted by Aaron Sterling
werner.social
No data, no AI, no progress. My @AmazonScience article explores how multi-layered mapping + petabyte-scale cloud infrastructure helps save lives in times of crisis. Building AI without addressing the fundamental data divide means solving the wrong problems. amazon.science/blog/why-ai-...
Why AI for good depends on good data
New technologies are helping vulnerable communities produce maps that integrate topographical, infrastructural, seasonal, and real-time data — an essential tool for many humanitarian endeavors.
amazon.science
aaronsterling.bsky.social
My "original proposal" was a recommendation to an experienced medical researcher. Research maturity gives you the ability to prima facie check that a methodology appears evidence-based. Hire a subject matter expert to verify, if you want to deploy to clinic. (My last post here, best wishes.)
aaronsterling.bsky.social
What you describe is better done by humans. The advantage of LLMs is the ability to do the same work at scale: finding useful references in a field you have no experience in, processing petabytes of health data, etc.
aaronsterling.bsky.social
That is a proven technique for getting more useful responses: fewer errors, and the errors that remain are easier to spot. You still need to check the work, just as you would with a human assistant. It's much like supervising someone with a lot of intelligence and ego but little practical experience.
aaronsterling.bsky.social
I'm not having a rhetorical discussion; I'm telling you what's happening in real life. Risk Management departments are managing LLM error similarly to how they manage human error. Legal red teaming is a tool for this. Management of risk is possible; elimination is impossible, and always has been.
aaronsterling.bsky.social
One of the core techniques of legal red teaming is to depose models. Think of it like a variation of moot court. Even when attorneys are not directly involved in such depositions (as that particular group often is), it's a useful metaphor, as discussed in the screenshot I posted earlier.
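To give a hypothetical flavor of what a deposition-style exchange looks like (the questions are invented for illustration, not taken from the white paper):

"You stated the consent form satisfies 42 CFR Part 2. Quote the exact language you relied on. Were you given that document, or are you recalling it? If the redisclosure clause were removed, would your answer change? Answer yes or no, then explain."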
aaronsterling.bsky.social
A cautious use is: cover the attached file with automated tests according to (project testing standards document). You can get tests for dozens of weird edge cases in a few minutes. You can specify unit tests, integration tests, etc., depending on whether you want dependencies run or mocked. A sketch of the pattern is below.
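A minimal sketch of that workflow in Python, assuming the openai SDK; the file paths and model name are placeholders, not our actual setup:

# Sketch: ask a model to cover a file with tests per a standards doc.
# Assumptions: openai Python SDK (>=1.0), plus hypothetical files
# testing_standards.md and src/intake.py.
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

standards = Path("testing_standards.md").read_text()
source = Path("src/intake.py").read_text()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whatever your org has approved
    messages=[
        {
            "role": "system",
            "content": (
                "You write pytest unit tests. Follow these project "
                "testing standards exactly:\n\n" + standards
            ),
        },
        {
            "role": "user",
            "content": (
                "Cover this file with unit tests. Mock all external "
                "dependencies; do not hit the network or a database.\n\n"
                + source
            ),
        },
    ],
)

print(response.choices[0].message.content)  # review before committing

The output still gets reviewed and run like any contractor's work.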
aaronsterling.bsky.social
Creators and users of LLMs potentially could. My own work with LLMs is bound by HIPAA and a "super-HIPAA" requirement called 42 CFR Part 2. A negligent data breach can mean prison time for the person responsible. That's part of what motivates deposition of LLMs: to reduce corporate legal risk.
aaronsterling.bsky.social
Like any powerful tool, it's easy to use incorrectly. It requires a fair amount of training and time investment to write strong prompts, to write tests that verify accuracy of those prompts, etc. Some of my prompts are multi-page standards documents.
aaronsterling.bsky.social
Most software programmers are not managers. Good prompting requires skills very similar to writing requirements for contract work. It's a learnable skill, but most people using LLMs have not yet learned it.
aaronsterling.bsky.social
I have a meeting in three minutes. But here is one white paper that appeared in that search. I attended a talk by these folks earlier this year. Teams of attorneys and data scientists depose LLMs as part of a legal red team. www.jdsupra.com/legalnews/le...
Legal Red Teaming, One Year In: Reports from the Field | JD Supra
Introduction - In our June 2024 white paper, Legal red teaming: A systematic approach to assessing legal risk of generative AI models,...
www.jdsupra.com
aaronsterling.bsky.social
But accurate, given the applications I work with.
aaronsterling.bsky.social
I imagine it depends on the field. I've published academic papers, and I'm in medical software writing clinic-ready applications. But don't take my word for it. You could look at the AI deployment department of the Mayo Clinic for a cutting-edge example.
aaronsterling.bsky.social
There's a close connection to legal rules of evidence, if that reassures you. The search terms "legal red team" or "deposing LLMs to verify correctness" will give you more background.
aaronsterling.bsky.social
I'm reporting lived experience; there's nothing to argue with. But to respond: grad students, professors, and doctors all make mistakes. If the error rate is on par with human error, you just have to perform the same verifications you would anyway.
aaronsterling.bsky.social
Common sense has limits. E.g.: academic writing is specialized. A sociologist can craft a prompt telling the LLM to do a deep dive and respond as an economist or an anthropologist on recent papers that bear on sociology question X. Or: write a survey paper on subject Y, with cross-disciplinary references. An invented example is below.
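An invented example of that kind of prompt (the field and topic are placeholders):

"Respond as a labor economist. Do a deep dive on papers from the last five years on wage-setting that bear on [sociology question X]. For each paper, give the full citation, a short summary, and how an economist would critique the sociological framing."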
aaronsterling.bsky.social
Our code quality has improved, our automated tests cover more sophisticated edge cases, and we can stand up new features that are robust for clinical deployment far faster. It helps to have advanced skills for writing productive prompts, e.g. technical project management. Done correctly, it adds value.
aaronsterling.bsky.social
It helps to think of it like talking to a kid. Instead of saying "Don't do X," say "Do Y, and here is an example of Y." Demonstrating good behavior reduces the opportunities for bad behavior. An invented example is below.
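An invented example of the rewrite (the coding rule and function name are placeholders):

Instead of: "Don't write tests that hit the real database."
Say: "Mock the database layer in every test. For example, patch get_connection() with a stub that returns canned rows."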
aaronsterling.bsky.social
I've found it helpful to say "Answer in two parts. In the first part, provide only quotes and citations from XYZ source. In the second part, write conclusions based on the cited quotes in the first part."
Reposted by Aaron Sterling
brittanytrang.com
Scoop at STAT: Amazon, a founding partner of the Coalition for Health AI, has dropped out of the org.

The Trump admin also seems to be mounting an attack on CHAI, with the FDA and HHS Secretary RFK Jr. warning that CHAI must be stopped from becoming a "cartel."

Read more:
🩺🖥️
Coalition for Health AI faces escalating attacks by Trump officials, loss of founding member Amazon
The Trump administration is escalating its efforts to weaken a tech industry-funded group seeking to help shape the nation’s use of AI in health care.
www.statnews.com