Martin Elstner
@martin.elstner.dev
AI and ML engineering. Search and recommendation. Founder of Elstner Analytics. Helping companies solve their data issues
https://elstner.dev
Cool, cool. I have one of those Brother hobby machines, where you work with a 5 mL dropper.
November 10, 2025 at 10:37 AM
But that one doesn't need the whole bag, does it?
November 10, 2025 at 9:51 AM
Yeah, in a 1:1 comparison Qwen is typically stronger. Granite is best for “non-Chinese, permissive license” environments.
October 29, 2025 at 7:17 AM
The model family in its entirety is really good. As many businesses don’t like to build on Qwen base models (something along the lines of “Chinese models will steal our data”), they are our go-to starting point for local deployment now.
October 29, 2025 at 7:10 AM
They excluded the exact prompts, as this is probably relevant IP, but you’d adapt them to your specific use case anyway. All in all, a great find!
October 16, 2025 at 10:40 PM
LLMs Reproduce Human Purchase Intent via Semantic Similarity Elicitation of Likert Ratings
arxiv.org
October 16, 2025 at 10:40 PM
They provide a nice blog post: www.pymc-labs.com/blog-posts/A...
AI-based Customer Research: Faster & Cheaper Surveys with Synthetic Consumers
www.pymc-labs.com
October 16, 2025 at 10:40 PM
This works much better than the direct approach, but is still less effective than the proposed SSR technique.
October 16, 2025 at 10:40 PM
They also found that directly prompting for a Likert-scaled candidate evaluation results in poor reliability and an over-sampling of ‚unsure‘ votes. Interestingly, they also tested a second LLM call to map generated written responses onto a defined scale.
October 16, 2025 at 10:40 PM
and calculating the vector similarity to a set of pre-defined reference answers (semantic similarity rating, SSR). They worked with Colgate, so they could test against a large set of real consumer surveys, and found very good correlation between the LLM-generated answers and observed consumer behaviour.
October 16, 2025 at 10:40 PM
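A minimal sketch of the SSR idea described above, not the authors' implementation. The reference statements, the function names, and the way similarities are turned into a distribution (subtract the lowest similarity, then normalize) are all assumptions for illustration. The `embed` function is a toy bag-of-words stand-in so the example runs without dependencies; in practice you would swap in a real sentence-embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical reference statements, one per Likert point (1 = no, 5 = yes).
REFERENCE_ANSWERS = {
    1: "I would definitely not buy this product",
    2: "I would probably not buy this product",
    3: "I am not sure whether I would buy this product",
    4: "I would probably buy this product",
    5: "I would definitely buy this product",
}


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ssr_distribution(answer):
    """Map an open-ended answer onto a probability distribution over Likert points."""
    vec = embed(answer)
    sims = {k: cosine(vec, embed(ref)) for k, ref in REFERENCE_ANSWERS.items()}
    # Assumed normalization: shift by the lowest similarity, then rescale to sum to 1.
    lowest = min(sims.values())
    shifted = {k: s - lowest for k, s in sims.items()}
    total = sum(shifted.values())
    if total == 0:
        # All references are equally similar: fall back to a uniform distribution.
        return {k: 1 / len(sims) for k in sims}
    return {k: s / total for k, s in shifted.items()}


def expected_rating(dist):
    """Collapse the distribution into a single mean Likert score."""
    return sum(k * p for k, p in dist.items())
```

Calling `ssr_distribution("I would definitely buy this, it looks great")` puts the most weight on point 5. Note that the toy embedding handles negation poorly (the "definitely not" reference still scores high), which is exactly where a proper embedding model earns its keep.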
This worked for all our previous use cases, but we had to put quite a bit of effort into the classification part.
PyMC Labs (@pymc-labs.bsky.social) published a study that addresses this challenge by embedding the open-ended model answer
October 16, 2025 at 10:40 PM
But you are still facing the problem of getting your numeric score. In the past, we worked with clustering, topic modeling and classic ML techniques to map these answers onto defined categories.
October 16, 2025 at 10:40 PM
Models tend to average everything out and will produce ‚not sure‘ answers in most cases. If you ask for written explanations (prompting like ‚explain in three sentences why you would like to buy the presented product‘) in an open-ended fashion, you can trigger much more valuable results.
October 16, 2025 at 10:40 PM
Conventional in-person surveys typically measure that on a Likert scale (ranging from ‚definitely no‘ to ‚definitely yes‘). The obvious idea here is to ask the LLM for that score directly, but unfortunately this doesn’t work.
October 16, 2025 at 10:40 PM
First, define a persona and let the model role-play this persona. So we need some demographic parameters to pack into a prompt to construct our synthetic consumer. Second, we need to quantify user intent.
October 16, 2025 at 10:40 PM
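To make the persona step concrete, here is a rough sketch of packing demographic parameters into a prompt. The `Persona` fields, the `build_prompt` helper, and the prompt wording are all hypothetical; the study's exact prompts were not published, so treat this as one possible shape rather than the real thing.

```python
from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical demographic parameters for one synthetic consumer."""
    age: int
    gender: str
    income_bracket: str
    location: str


def build_prompt(persona, product_description):
    """Assemble a role-play prompt that asks for an open-ended, written answer."""
    return (
        f"You are a {persona.age}-year-old {persona.gender} living in "
        f"{persona.location} with a {persona.income_bracket} household income. "
        "Answer strictly from this person's point of view.\n\n"
        f"Product: {product_description}\n\n"
        "Explain in three sentences why you would or would not buy this product."
    )
```

For example, `build_prompt(Persona(34, "woman", "middle", "Hamburg"), "A whitening toothpaste")` yields a prompt that fixes the role first and only then presents the product. The question deliberately asks for a written explanation instead of a number, so the scoring can happen in a separate step.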
What would you expect? Single line prompt gave me this:
October 15, 2025 at 6:41 PM
And a super fresh one 😉

bsky.app/profile/chem...
New in Chemical Science!

"Towards Large-scale Chemical Reaction Image Parsing via a Multimodal Large Language Model" by Hanyu Gao et al. from the Hong Kong University of Science and Technology.

Read it for free here: doi.org/10.1039/D5SC...
October 15, 2025 at 3:23 PM
And we should always know when not to use AI/ML:

bsky.app/profile/geof...
I mean, the worst part is that there are actual deterministic #openscience tools to do name => SMILES (or other chemical format) + depict in 2D. Combine OPSIN with RDKit.. maybe check PubChem or ChEMBL or other open database if people use an informal term.
Can generative AI be trusted to draw chemical structures? Not yet, according to two chemists who want to see the community take a tough stance against its use.
October 15, 2025 at 3:07 PM
There are some open models, e.g. huggingface.co/AI4Chem/Chem... (but the focus is on the inverse task). There is also quite some activity behind corporate doors.
AI4Chem/ChemVLM-8B · Hugging Face
huggingface.co
October 15, 2025 at 2:58 PM
Business opportunity: sawing as a service
Also gives a snappy acronym
October 9, 2025 at 12:42 PM
And then there is the AI content that we don't recognize as slop (anyone who puts in even a little effort can now generate texts that don't stand out). It's really hard to estimate the extent at the moment. We obviously only see the poorly made part.
October 6, 2025 at 6:32 AM