Lightnews — Scholar-powered news

Reposted by Avijit Ghosh

EvalEval Coalition

@eval-eval.bsky.social

🚨 AI keeps scaling, but social impact evaluations aren’t–and the data proves it 🚨

Our new paper, 📎“Who Evaluates AI’s Social Impacts? Mapping Coverage and Gaps in First and Third Party Evaluations,” analyzes hundreds of evaluation reports and reveals major blind spots ‼️🧵 (1/7)

November 13, 2025 at 1:59 PM

Avijit Ghosh

@evijit.io

Extremely thrilled to talk about our new paper: "Who Evaluates AI’s Social Impacts? Mapping Coverage And Gaps In First And Third Party Evaluations".

This is the first big project output from the @eval-eval.bsky.social coalition! Thread below:

November 13, 2025 at 2:35 PM

Avijit Ghosh

@evijit.io

We have a call for posters out! Please submit your extended abstracts, it should be quick and easy. And just like last year, provocative work is especially encouraged as it makes for such interesting conversation 😈

EvalEval Coalition @eval-eval.bsky.social · 7d

📮 We are inviting students and early-stage researchers to submit an Abstract (Max 500 words) to be presented as posters during an interactive session. Submit here: tinyurl.com/AbsEval

We have a rock-star lineup of AI researchers and an amazing program. Please RSVP at the earliest! Stay tuned!

November 6, 2025 at 9:22 PM

Avijit Ghosh

@evijit.io

This. Copyright is a tool for protection but it’s not everything. In fact, there’s research showing that it is possible to create competitive language models using public domain data only. The proliferation of copyright respecting models would not solve the labor impact policy problem.

Ted Underwood @tedunderwood.com · 12d

Imagine you time travel back to 1847 and you find the left response to industrialization is a) machines will never be as good as human weavers or b) we need to copyright loom patterns or c) it’s a speculative bubble.

You’d say “Y’all. Not helping. What you need is obviously a labor movement.”

November 2, 2025 at 12:25 PM

Avijit Ghosh

@evijit.io

Going to San Diego for NeurIPS? We at @eval-eval.bsky.social, along with the UK AISI, are hosting a closed-door state of evals workshop at @ucsandiego.bsky.social on Dec 8th.

Request to join below! :)

evaleval.github.io/events/works...

2025 Workshop on Evaluating AI in Practice

EvalEval, UK AI Security Institute (AISI), and UC San Diego (UCSD) are excited to announce the upcoming Evaluating AI in Practice workshop, happening on December 8, 2025, in San Diego, California.

evaleval.github.io

November 1, 2025 at 4:46 PM

Avijit Ghosh

@evijit.io

Datasets are the backbone of AI for Science, and we want to support scientific data natively on Hugging Face. The amazing @lhoestq.hf.co started a discussion on GH for this! Please engage (better still, submit a PR) so we can start supporting your 🫵 dataset:

github.com/huggingface/...

Support scientific data formats · Issue #7804 · huggingface/datasets

List of formats and libraries we can use to load the data in datasets: DICOMs: pydicom; NIfTIs: nibabel; WFDB: wfdb. cc @zaRizk7 for viz. Feel free to comment / suggest other formats and libs you'd lik...

github.com

October 28, 2025 at 4:33 PM

Avijit Ghosh

@evijit.io

Random off the cuff observation about American AI: LLM folks seem to be concentrated in SF, but AI4Science folks seem to be concentrated in Boston. Meaning as the former gets oversaturated and the latter is only getting started, I expect Boston to be the next big AI epicenter! 💪

October 24, 2025 at 6:32 PM

Reposted by Avijit Ghosh

EvalEval Coalition

@eval-eval.bsky.social

🌟 Weekly AI Evaluation Spotlight 🌟

🤖 Did you know malicious actors can exploit trust in AI leaderboards to promote poisoned models in the community?

This week's paper 📜"Exploiting Leaderboards for Large-Scale Distribution of Malicious Models" by @iamgroot42.bsky.social explores this!

October 24, 2025 at 4:44 PM

Avijit Ghosh

@evijit.io

+1000. I miss life pre-AI hype when the discourse around AI was more scientific and people used to attribute papers and opinions to scientists instead of to their companies. Not all orgs block research papers and sanity check their papers via legal teams, and HF, especially so, is very distributed.

Margaret Mitchell @mmitchell.bsky.social · 29d

Hugging Face (thankfully) doesn't do a groupthink too much -- e.g., "Hugging Face thinks this". We're generally able to have different opinions and thoughts, which is part of the open/collaborative ethos. I don't feel strongly about this conference myself, I see pros and cons. 🧵

October 20, 2025 at 7:01 PM

Avijit Ghosh

@evijit.io

Hey, so can someone tell me why ChatGPT generating erotica is bad, any more so than it generating anything else? Obviously anything non-consensual or age-inappropriate is bad, but I don't see why some researchers in my timeline are up in arms about it, while Grok already does this.

October 20, 2025 at 1:44 PM

Avijit Ghosh

@evijit.io

We're starting a weekly paper spotlight series! Come engage with the posts and let's improve evals together! :)

First up: Do Large Language Model Benchmarks Test Reliability?

EvalEval Coalition @eval-eval.bsky.social · 27d

✨Weekly AI Evaluation Paper Spotlight✨

🕵️ Are benchmark noise and label errors masking the true fragility of LLMs?

🖇️"Do Large Language Model Benchmarks Test Reliability?" - This paper by @joshvendrow.bsky.social provides insights!

October 17, 2025 at 4:56 PM

Avijit Ghosh

@evijit.io

I had never seen people fighting with co-authors via colored LaTeX comments on Overleaf in a developing paper draft until today. One of them even pasted a Google Calendar link saying "can we hop on a call and hash this out". Paper writing is actually exciting sometimes!

October 16, 2025 at 7:53 PM

Avijit Ghosh

@evijit.io

Trying to start a new hobby and the internet is useless. Maybe AI will finally kill unstructured information retrieval for good and then we will be forced to call or visit friends for help again

October 12, 2025 at 9:17 PM

Avijit Ghosh

@evijit.io

More of such research please! Chatbots are not the future of science, science is

Allen Institute @alleninstitute.org · Oct 7

Introducing CellTransformer, a new AI tool developed with UCSF that makes it easier to explore massive neuroscience datasets and identify important subregions of the brain. 🧵

October 8, 2025 at 8:08 PM

Avijit Ghosh

@evijit.io

Some of these new gpt/claude wrapper startups make me wonder how much the founders are paying themselves in salary because there is no way they expect their horrible idea to actually be sustainably profitable

October 8, 2025 at 12:46 AM

Avijit Ghosh

@evijit.io

AI for scientific discovery is a social problem: In our new position paper, @cgeorgiaw.bsky.social and I show that culture, incentives, and coordination are the main obstacles to progress, and we are launching the Hugging Science Initiative to address this!

October 6, 2025 at 4:28 PM

Avijit Ghosh

@evijit.io

At a panel a couple of weeks ago I exclaimed in despair: “who decided that the only mode of interacting with AI is via chatbots?” And this trend is relentless still.

www.theverge.com/news/787076/...

Microsoft launches ‘vibe working’ in Excel and Word

Vibe working is all about Office’s new Agent Mode.

www.theverge.com

October 1, 2025 at 5:56 PM

Reposted by Avijit Ghosh

Jordan Jamboree!

@jordanjamboree.bsky.social

Wife brought her 100 year old film camera to that pirates game from a few weeks ago and she got this shot that's just perfect

A black and white photo of a baseball game, you can see that it is mid pitch with a pirate at bat

September 28, 2025 at 9:36 PM

Avijit Ghosh

@evijit.io

So fascinating (not really) to me that company execs and tier 1 AI conferences have gone in completely opposite directions as it relates to AI usage. Surely the best minds actually developing AI models know something about overreliance, productivity, and quality? Surely?

September 28, 2025 at 4:03 PM

Avijit Ghosh

@evijit.io

This phenomenon needs a nice catchy name because every research paper (including my own) that refers to it uses as many words but it’s an insidious problem

Redneck of the Edmund Fitzgerald @amagire.bsky.social · Sep 25

"of course ChatGPT is full of errors when it comes to my specific area of professional or technical expertise, that's why I only use it for other stuff, where I can't tell if it's bullshit or not"

📰🗞💥

September 25, 2025 at 10:45 PM

Avijit Ghosh

@evijit.io

I’m back at my alma mater today to talk about personal anecdotes of how I have experienced AI, how those experiences fueled my research, and the kind of questions that still remain. Looking forward to it! 🤗

www.eventbrite.com/e/personal-a...

Personal Anecdotes of AI Bias (and where do we go from here?)

BostonCHI in partnership with NU Center for Design presents a hybrid talk by Avijit Ghosh

www.eventbrite.com

September 23, 2025 at 9:07 PM

Avijit Ghosh

@evijit.io

I’m just going to play that South Park episode on a loop at people now. It was so spot on

Maggie Harrison Dupré @mharrisondupre.bsky.social · Sep 18

NEW: ChatGPT is causing chaos in marriages, as one spouse becomes deeply fixated on AI therapy/advice/spiritual wisdom — alienating the other spouse and, often, resulting in divorce.

In some cases, ChatGPT-enmeshed spouses are using the tech to bully their partners.

futurism.com/chatgpt-marr...

Futurism

Poison Tongue

Sep 18, 11:05 AM EDT by Maggie Harrison Dupré

ChatGPT Is Blowing Up Marriages as Spouses Use AI to Attack Their Partners

"My family is being ripped apart, and I firmly believe this phenomenon is central to why."

Image: A frustrated-looking woman holding up her phone to a man who looks confused.

September 19, 2025 at 1:45 AM

Avijit Ghosh

@evijit.io

This continues to be a recurring theme -_- Why is consumer agency so scary to the market? Noticed the same trend when I worked at Twitter: the personalization team cared for engagement metrics way more than actually showing users what they wanted to follow: “people don’t know what they like”

Dr Juliette @ferrydanini.bsky.social · Sep 17

I discussed with a colleague and their student in UX design that they should add something in their design that mentions the uncertainty of AI results and I got the same reply "users don't like uncertainty so no"

🌈Dr. Frizzle @swilua.bsky.social · Sep 16

you don’t understand bro, if we didn’t lie, no one would buy our product 😩

September 17, 2025 at 9:26 AM

Reposted by Avijit Ghosh

Margaret Mitchell

@mmitchell.bsky.social

🤖 As AI-generated content is shared in movies/TV/across the web, there's one simple low-hanging fruit 🍇 to help know what's real: Visible watermarks. With others @hf.co, I've made sure it's trivially easy to add this disclosure to images, video, chatbot text. See how:
huggingface.co/blog/waterma...

September 16, 2025 at 4:29 PM

Avijit Ghosh

@evijit.io

I wish there was a way for arxiv to take me to the abstract page by default instead of the html page 😬 it is strictly worse

September 14, 2025 at 7:05 PM
