Aaron Tay
@aarontay.bsky.social
I'm a librarian + blogger from Singapore Management University. Social media, bibliometrics, analytics, academic discovery tech.
Quick test that the PubMed MCP/Connector works like normal PubMed. The query below shows the top 5 results (can be changed to top X) by PMID, plus the total number of results. Tested the same query in PubMed advanced mode: same number of results, and the top 5 PMIDs match when sorted by most recent.
November 25, 2025 at 4:50 AM
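The check above can be sketched against NCBI's public E-utilities `esearch` endpoint (the same index normal PubMed serves), then compared by eye with what the MCP/Connector reports. The endpoint and parameters below are standard E-utilities; the example query is a hypothetical stand-in, since the post doesn't show the actual one.

```python
import json
import urllib.parse
import urllib.request

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(query: str, top_n: int = 5) -> str:
    """Build an esearch URL returning the top_n most recent PMIDs as JSON."""
    params = urllib.parse.urlencode({
        "db": "pubmed",
        "term": query,
        "retmax": top_n,   # "top 5 (can be changed to top X)"
        "sort": "date",    # most recent first, matching the post
        "retmode": "json",
    })
    return f"{EUTILS}?{params}"

def pubmed_count_and_pmids(query: str, top_n: int = 5):
    """Return (total result count, list of top_n PMIDs) for a PubMed query."""
    with urllib.request.urlopen(build_esearch_url(query, top_n)) as resp:
        result = json.load(resp)["esearchresult"]
    return int(result["count"]), result["idlist"]

# Example with a placeholder query (not the one from the post):
# total, pmids = pubmed_count_and_pmids('"exercise therapy"[MeSH] AND neoplasms[MeSH]')
# Compare `total` and `pmids` against the MCP/Connector's answer for the same string.
```

If the counts or the top PMIDs diverge, the MCP server is probably parsing the Boolean string differently from PubMed itself.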
Here's an example of Claude "pilot testing" search strategies. claude.ai/share/f411ca... How bad is it? I am a noob at pilot testing strategies in PubMed, but this doesn't look like the way an expert would do it? I think it does better if you give it different variants and it tests them out on the top 20.
Boolean search strategy optimization for exercise and cancer
Shared via Claude, an AI assistant from Anthropic
claude.ai
November 25, 2025 at 1:20 AM
Have to check if the PubMed MCP server actually works correctly for long Boolean strings, but I've always thought human vs GPT competitions on Boolean string crafting were a bit unfair cos humans could test their strategies. This evens things out. Maybe add MeSH browser access via MCP?
November 25, 2025 at 12:49 AM
Yeah, it will be a hybrid world. And some will use MCP + their own local RAG, etc.
November 24, 2025 at 4:44 AM
And I often find people who feel the need to harp on this don't actually understand the tech as well as they think. Cos half the webinar was showing use of LLM/GPT to screen for relevant papers, aka "tech that generates content" IS used for evidence retrieval. Such a dumb comment.
November 23, 2025 at 6:51 PM
That said, generic deep research or even ChatGPT are more "agentic". Try requests like "look at paper X, find papers it should cite but didn't" & you see specialised tools fall down. Mostly because they are more controlled flow-wise.
November 23, 2025 at 10:04 AM
For generic deep research tools that use the web, god knows how the LLM chooses... What little empirical research exists does suggest they favour higher-cited papers. Though Ai2 suggests their tools are less skewed towards top papers than human citations are. share.google/UFYMbMP0Bjnv...
Making AI citations count with Asta | Ai2
We're releasing data that shows which scientific papers our agentic platform for research and discovery, Asta, relies on most when answering questions.
share.google
November 23, 2025 at 9:48 AM
Of course the big bomb would be Google Scholar coming in. They did just launch Scholar Labs, but that's deep search, not deep research... But if they change their mind...
aarontay.substack.com/p/scholar-la...
Scholar Labs Early Review: Google Scholar Finally Enters the AI Era
Generated by Nano-Banana Pro from text of this blog post
aarontay.substack.com
November 23, 2025 at 9:26 AM
Already you can kinda see it in certain topics like economics & law (public policy), where Gemini deep research is as good if not better. I haven't checked for humanities-type subjects, but I suspect they actually do better there than tools relying only on OpenAlex/Semantic Scholar.
November 23, 2025 at 9:00 AM
I think though, as models get smarter, open access levels rise, and support for MCP increases, index-wise relying just on Semantic Scholar/OpenAlex/open sources may actually become a disadvantage due to their limitations, e.g. grey lit.
November 23, 2025 at 8:57 AM
In terms of access to full text behind paywalls, both specialised & general DR are typically on par. Though the rise of MCP, e.g. the Wiley AI gateway, which allows Claude to plug into MCP servers, might change the game. Elsevier Leapspace, which claims 15M full texts from Elsevier & partners, is the other model.
November 23, 2025 at 8:56 AM
General deep research tools are also slower cos many open a virtual browser and query it like a human. The other disadvantage of using things like OpenAI/Gemini DR is that it isn't customized for academia in the way it does citations & it seems to create more "ghost references" for some reason.
November 23, 2025 at 8:52 AM
The main advantage has always been that these specialised academic AI tools directly search academic content, while general deep research searches the web and needs to be prompted and be smart enough to know where to search and what to prefer to cite.
November 23, 2025 at 8:49 AM
Pretty funny that this post was right above yours. Some journals are desk-rejecting work using open datasets that are commonly used by paper mills to generate papers.

bsky.app/profile/pete...
Update. In response to this problem (previous post, this thread), some publishers are desk-rejecting papers based on open health datasets. The problem is not the quality of the data, but the absence of additional work to validate the findings.

Two reports:

1. "Journals and publishers crack […]
Original post on fediscience.org
fediscience.org
November 21, 2025 at 3:16 PM
Reposted by Aaron Tay
Update. In response to this problem (previous post, this thread), some publishers are desk-rejecting papers based on open health datasets. The problem is not the quality of the data, but the absence of additional work to validate the findings.

Two reports:

1. "Journals and publishers crack […]
Original post on fediscience.org
fediscience.org
October 19, 2025 at 3:58 PM
Reposted by Aaron Tay
Update. #socarxiv (@socarxiv) is dealing with a similar problem by requiring submitters to have #orcids and tightening its focus on the social sciences.
https://socopen.org/2025/11/19/socarxiv-submission-rule-changes/

#greenOA #preprints #repositories
SocArXiv submission rule changes
**Context**

SocArXiv is experiencing record high submission rates. In addition, now that we have paper versioning – which is great – our moderators have to approve every paper revision. As a result, our volunteer workload is increasing. In addition we are receiving many non-research, spam, and AI-generated submissions. We do not have a technological way of identifying these, and it is time-consuming to read and assess them according to our moderation rules. We also don’t have moderation workflow tools that allow us to, for example, sort incoming papers by subject, to get them to specific expert moderators. So all our moderators look at all papers as they come in. That encourages us to think about narrowing the range of subjects we accept. The two rule changes below are intended to help manage the increased moderator burden. More policy changes may follow if the volume keeps increasing.

**1. ORCID requirement**

We require the submitting author to have a publicly accessible ORCID linked from the OSF profile page, with a name that matches that on the paper and the OSF account. In the case of non-bibliographic submitters (e.g., a research assistant submitting for a supervisor), the first author must have an ORCID. We can make exceptions for institutional submitters upon request, such as journals that upload their papers for authors. At present we are not requiring additional verification or specific trust markers on the ORCID (such as email or employer verification), just the existence of an account that lists the author’s name. It’s not a foolproof identity verification, obviously, but it adds a step for scammers, and also helps identify pseudonymous authors, which we do not permit. We may take advantage of ORCID’s trust markers program in the future and require additional elements on the ORCID record.

We are happy to host papers by independent scholars, but a disproportionate share of non-research, spam, and AI-generated submissions come from independent scholars, many of whom do not have ORCIDs. For those scholars with institutional affiliations, we urge you to get an ORCID. This is a good practice that we should all endorse.

**2. Focus on social sciences**

At its founding, SocArXiv did not want to maintain disciplinary boundaries. It was our intention to be the big paper server for all of social sciences, and we couldn’t draw an easy line between social sciences and some humanities subjects, especially history, philosophy, religious studies, and some area studies, which are humanities in the taxonomy we use, but have significant overlap with social sciences. It was more logical just to accept them all. As the volume has increased, this has become less practical. In addition, a lot of junk and AI submissions are in the areas of religion, philosophy, and various language studies. We also don’t have moderators working in arts and humanities, and our moderators trained in social sciences are not expert at reviewing these papers. Finally, there is an excellent, open humanities archive: Knowledge Commons (KC Works), which is freely available for humanities scholars. With approval from that service, we will now direct authors to their site for papers we are rejecting in arts and humanities subjects. We continue to accept papers in education and law, which are also generally adjacent to social science. For a limited time we will accept revisions of papers we already host in arts and humanities, but urge those authors to include links to Knowledge Commons or somewhere else that can host their work in the future. We will assess papers that include arts/humanities as well as social science subject identifiers, and if we determine they are principally in art/humanities, reject them. We will continue to host all work we have already accepted.
socopen.org
November 21, 2025 at 3:06 PM
I've always been skeptical of "AI Scientist" claims, but maybe there is something there, e.g. Kosmos and Google's AI co-scientist. blog.google/feed/google-....
We’re launching a new AI system for scientists.
Today Google is launching an AI co-scientist, a new AI system built on Gemini 2.0 designed to aid scientists in creating novel hypotheses and research plans. Researchers…
blog.google
November 21, 2025 at 12:05 PM
A while ago I fell down the rabbit hole reading about literature-based discovery, which makes connections in literature or suggests hypotheses, & last month Elicit wrote a very detailed post on this. elicit.com/blog/literat...
Looking for Hidden Gems in Scientific Literature - Elicit
Scientific literature is vast and contains within it as yet unnoticed connections. Literature-based discovery is an attempt to bring them to light.
elicit.com
November 21, 2025 at 12:02 PM