Lightnews — Scholar-powered news

ehudreiter.bsky.social

@ehudreiter.bsky.social

New blog: Let's use AI to help people manage illness

I am excited by the idea of using AI to help people manage illness and health conditions. This isn't very sexy, but I think there is real potential to improve health outcomes and quality of life.

ehudreiter.com/2026/01/19/l...

January 19, 2026 at 9:22 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Other CS academics I know have done very different things in retirement: remained active in academia as emeritus, joined a startup, done charitable work, moved to a remote spot in the Scottish Highlands, written novels, etc. We did similar things as academics (research and teaching), but very different things in retirement!

January 16, 2026 at 9:16 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

AI hallucination is in the UK political news. Israeli fans were banned from a football match, and this ban was based on a report which included hallucinated material made up by MS Copilot

www.theguardian.com/uk-news/2026...

West Midlands police chief apologises after AI error used to justify Maccabi Tel Aviv ban

Craig Guildford says he gave incorrect evidence to MPs and mistake arose from ‘use of Microsoft Copilot’

www.theguardian.com

January 14, 2026 at 3:20 PM

Reposted

Emily M. Bender

@emilymbender.bsky.social

Health experts: Your synthetic text "AI" overviews are misleading, for example see this about liver function tests.
Google: Okay, we'll block "AI" overviews on that query.

The product is fundamentally flawed and cannot be "fixed" by patching query by query.

A short 🧵>>

‘Dangerous and alarming’: Google removes some of its AI summaries after users’ health put at risk

Guardian investigation finds AI Overviews provided inaccurate and false information when queried over blood tests

www.theguardian.com

January 11, 2026 at 2:27 PM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Nice chat with some of my soon-to-submit PhD students. They all know how to conduct and write up research, have lots of ideas for future work, and have developed networks of collaborators. So they are ready to "leave the nest", which is a good feeling for me as a supervisor.

January 8, 2026 at 9:54 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

New blog (personal): Retirement Plans: Travel and some academics

I hope to retire soon, and many people are asking about my plans. Basically I want to do lots of travel, stay involved in academia, and perhaps do some writing.

ehudreiter.com/2026/01/06/r...

January 6, 2026 at 8:25 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

One nice thing about 2025 was that the two publications I was proudest of were single-author! Also many good papers with my students, but I get a special buzz from single-author papers

January 1, 2026 at 1:46 PM

ehudreiter.bsky.social

@ehudreiter.bsky.social

New blog: Do a sanity check on your experiments

Researchers should do a “sanity” check on experiments. That is, manually inspect some (A) test/train data, (B) model/system output, and (C) evaluation output, looking for anything that seems strange.

ehudreiter.com/2025/12/22/d...

Do a sanity check on your experiments

I strongly recommend that researchers do “sanity checks” on data, model outputs, and evaluation results, looking for anomalies. This can help detect data errors, model cheating, softwar…

ehudreiter.com

December 22, 2025 at 9:05 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

One of my main goals for 2025-26 is to get 6 PhD students to submit before I retire in summer 2026. So very happy that Nikolay Babakov has submitted and passed his viva, and Iniakpokeikiye Thompson has submitted. Getting there...

December 16, 2025 at 10:11 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

A colleague has discovered many bugs (e.g., incorrect annotations) in a respected 8-year-old dataset he is using. Nobody warned him, and it is hard for him to warn others. Maybe most people just don't care if a dataset is deeply flawed, as long as they can compute numbers and beat SOTA...

December 15, 2025 at 9:02 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Making a good LLM benchmark is hard. Avoid data contamination, reward hacking, saturation; ensure construct validity; rigorously test and validate, etc.

Unfortunately, the community places little value on the above. People want to beat SOTA or competitors, and don't care if the BM used means anything...

December 10, 2025 at 7:55 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

New blog: Do LLMs cheat on benchmarks

LLMs often “cheat” on benchmarks via data contamination and reward hacking. This problem is getting worse, perhaps because of perverse incentives. Need to move beyond benchmarks and start measuring real-world impact.

ehudreiter.com/2025/12/08/d...

Do LLMs cheat on benchmarks

LLMs often “cheat” on benchmarks via data contamination and reward hacking. Unfortunately, this problem seems to be getting worse, perhaps because of perverse incentives. If we want to …

ehudreiter.com

December 8, 2025 at 6:50 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Interesting chat about hallucination in patient information dialogues. When we ask domain experts to check statements such as "X increases likelihood of Y", the response is often "depends on context" or "we don't know, need more experiments". Does this make the statement a hallucination?

November 27, 2025 at 9:42 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

New blog: Hard to Change Poor Research Culture

Research culture is very important but also very hard to change. I suspect this is one reason why it is so difficult to get people to do more rigorous and meaningful experiments.

ehudreiter.com/2025/11/24/h...

November 24, 2025 at 9:11 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Aberdeen CS is hiring a new lecturer for its "Joint Institute" with South China Normal University. Basically you would be based and do research in Aberdeen, but would be expected to go to China a few times a year and teach at SCNU.

Closing 28 Nov

www.abdn.ac.uk/jobs/vacanci...

Lecturer in Computing Science, Natural & Computing Sciences (NCS253A) | The University of Aberdeen

University of Aberdeen Research Jobs

www.abdn.ac.uk

November 12, 2025 at 9:14 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

I'm disturbed by reports about chatbots encouraging children to kill themselves, such as www.bbc.co.uk/news/article... . Shame that the AI Safety community in general, and the @AISecurityInst in particular, seem to have little interest in this, very disappointing...

Mothers say AI chatbots encouraged their sons to kill themselves

In her first UK interview Megan Garcia speaks to Laura Kuenssberg about the death of her teenage son.

www.bbc.co.uk

November 10, 2025 at 8:51 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

New blog: Understanding what users want from NLG

When building an NLG system, it really helps to understand what users want; this came up several times at the recent INLG conference. I discuss some of our work in this space, and give a few suggestions.

ehudreiter.com/2025/11/06/u...

Understanding what users want from NLG

When building an NLG system, it really helps to understand what users want; this came up several times at the recent INLG conference. I discuss some of our work in this space, and give a few sugges…

ehudreiter.com

November 6, 2025 at 7:26 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

I'm trying to understand OpenAI's HealthBench. "HealthBench: Evaluating Large Language Models Towards Improved Human Health" doesn't say much about the BM (e.g., very few examples). Are there other papers? I don't care how well model X performs, I want to judge if I can trust the BM.

November 5, 2025 at 2:27 PM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Just back from INLG. Nice event as always, but I am concerned that it is losing its uniqueness. Maybe for 2026 I'll suggest some special tracks which are interesting to the INLG community but not ARR types (e.g., user requirements/eval, non-LLM techniques).

November 5, 2025 at 9:15 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

New blog: Most common uses of AI in Healthcare

Data on usage of AI in healthcare suggests that the most common uses in 2025 are probably (A) giving personalised health information to patients and (B) helping clinicians write documents.

ehudreiter.com/2025/10/21/m...

Most common uses of AI in Healthcare

I review some data on usage of AI in healthcare, and conclude that the most common uses in 2025 are probably (A) giving personalised health information to patients and (B) helping clinicians write …

ehudreiter.com

October 21, 2025 at 6:21 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

One of my main goals for 2025-26 is to help my 6 senior PhD students submit their PhDs before I retire. Glad to say that Nikolay Babakov has now done so, with viva scheduled for Dec. The other five students seem to be on track, which is encouraging.

October 15, 2025 at 9:13 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Somewhat frustrated yesterday to once again read an ACL paper which did all sorts of complex things (including the usual results tables showing best approach) on garbage data, with minimal ack of this in limitations. The most fundamental rule of CS is Garbage In, Garbage Out.

October 9, 2025 at 8:46 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

New blog: Good diagrams for research papers

I've seen a number of diagrams recently which are too complicated and difficult to understand. I explain some of the problems I see and give advice.

ehudreiter.com/2025/10/08/g...

October 8, 2025 at 8:27 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Really interesting paper on real-world evaluation in IR. I should learn more about eval in IR, it's not something I've ever properly looked at.

dl.acm.org/doi/10.1145/...

What Matters in a Measure? A Perspective from Large-Scale Search Evaluation | Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

dl.acm.org

September 30, 2025 at 8:27 AM

ehudreiter.bsky.social

@ehudreiter.bsky.social

Several people have asked me recently if I will still be able to contribute to research projects after I retire in summer 2026. Absolutely! I will have emeritus status, and am very happy to remain involved in research projects at Aberdeen and elsewhere.

September 26, 2025 at 10:21 AM
