Lightnews — Scholar-powered news

Nanne van Noord

@nanne.bsky.social

Wait, you get summaries of your own papers? That seems like a step up from the "I see you work on <insert topic I've not touched in my life>" emails at least

November 19, 2025 at 8:39 AM

Reposted by Nanne van Noord

EurIPS Conference

@euripsconf.bsky.social

And lastly, if @neuripsconf.bsky.social would choose to reverse the decisions on the papers affected by space constraints, we would be happy and able to accommodate their presentation

September 19, 2025 at 10:01 AM

Nanne van Noord

@nanne.bsky.social

You're arguing in bad faith, so this will be my last reply.

But yes, if you actually want to learn about multimodality then you shouldn't read about MLLM.

July 27, 2025 at 8:02 PM

Nanne van Noord

@nanne.bsky.social

I'm not sure what the point here is, but if you're going to believe Gemini over actual research done by AI researchers there isn't much more to discuss.

If you're willing to actually learn about this then you can start here: arxiv.org/abs/2505.19614, or even here: academic.oup.com/dsh/article/...

July 27, 2025 at 7:14 PM

Nanne van Noord

@nanne.bsky.social

That's a bit sealion-y, but I'll bite - *artificial* neural networks are a poor analogy.

Those different details also matter a lot; especially because the brain isn't just floating in a jar, it's part of an embodied system.

July 27, 2025 at 7:05 PM

Nanne van Noord

@nanne.bsky.social

This is where your misunderstanding is happening, as they are not elementary pieces. For the visual tokens a lot of the semantics have already been determined, and hence the interpretations it can arrive at are limited.

Brain analogy really doesn't hold here. NN != Brains.

July 27, 2025 at 2:59 PM

Nanne van Noord

@nanne.bsky.social

It's clearly not; neural nets are a poor analogy for the brain, and clearly don't work the same way.

July 27, 2025 at 2:54 PM

Nanne van Noord

@nanne.bsky.social

This, plus the (initial) interpretation of the modalities should not be independent - even at the pixel/word-level we may want to interpret differently depending on the other modalities (e.g., sense disambiguation)

Partial Information Decomposition has been used to formalise some of this

July 27, 2025 at 8:49 AM

Nanne van Noord

@nanne.bsky.social

No.. that's not how any of that works 😵‍💫

July 27, 2025 at 8:13 AM

Nanne van Noord

@nanne.bsky.social

It means I said 'mix' to explain the process, but I obviously know this involves attention - so the Gemini explanation is not meaningfully different.

Potential is limited: if key visual info is missing, then attention won't recover that. So a lot of 'decisions' about visual are made before fusion

July 26, 2025 at 11:00 PM

Nanne van Noord

@nanne.bsky.social

Ah, I see how you and Gemini misunderstood. I was talking about extracting visual tokens, and mix referred to attention.

That doesn't make it meaningfully multimodal; potential of visual tokens is still limited by visual encoder.

Anyway, if I wanted to talk to an LLM I would do that directly

July 26, 2025 at 10:37 PM

Nanne van Noord

@nanne.bsky.social

Please do explain then how whatever you're referring to is different and actually meaningfully multimodal.

July 26, 2025 at 10:08 PM

Nanne van Noord

@nanne.bsky.social

*all semantic information* is quite the claim; in our experiments they miss a lot of semantics from visual

'text space' in that after the image encoder the visual information is fixed, and mixed with text tokens for seq2text - which is not how multimodality works..

July 26, 2025 at 8:41 PM

Nanne van Noord

@nanne.bsky.social

Natively is a bit of an exaggeration, as it's mostly just other modalities mapped to text space as input - but this makes their 'understanding' rather shallow

July 26, 2025 at 7:51 PM

Nanne van Noord

@nanne.bsky.social

If the priority is to dunk on people that know less about AI, instead of being accurate, that could be a conclusion I guess.

July 18, 2025 at 4:09 PM

Nanne van Noord

@nanne.bsky.social

It would be weird to describe this 2012 system, that is doing search, as an SVM classifier doing search: www.robots.ox.ac.uk/%7Evgg/publi...

Similarly, I wouldn't describe an LLM that translates a query to a destination for a Waymo as an 'LLM driving a car'

Visual Geometry Group - University of Oxford

Computer Vision group from the University of Oxford

www.robots.ox.ac.uk

July 18, 2025 at 3:41 PM

Nanne van Noord

@nanne.bsky.social

I'm not questioning your definition of searching, I'm questioning your use of "LLMs".

I don't think defining an LLM as a transformer-based NN is inaccurate, in which case it isn't doing search by itself, and then it would be fine to argue that it can only hallucinate.

July 18, 2025 at 3:41 PM

Nanne van Noord

@nanne.bsky.social

That statement mostly seems to apply to hosted commercial systems. It takes more than just downloading an LLM from Hugging Face to have a system that does this.

Sure, an LLM can be trained to formulate queries and process results, but the system doing the searching is more than 'just' an LLM.

July 18, 2025 at 2:51 PM

Nanne van Noord

@nanne.bsky.social

Fair, but still meaningful to make the distinction between LLMs and reasoning models, as not all LLMs are reasoning models. Especially if the point is to communicate across silos.

July 18, 2025 at 1:52 PM

Nanne van Noord

@nanne.bsky.social

Do LLMs do search? Afaik there have been systems built around LLMs that do search, and then send these results back to them (i.e., RAG-like) - but that isn't the same as an LLM doing search.

July 18, 2025 at 1:30 PM

Nanne van Noord

@nanne.bsky.social

I couldn't find EurIPS registration costs; hopefully they can address this by lowering costs for authors

But yes - this has been absurd; especially for those with visa issues - and I do think for that group this is a (minor) improvement

July 17, 2025 at 8:52 AM

Nanne van Noord

@nanne.bsky.social

Not my intention to defend the requirement for a full registration, but this has been common practice for a while across multiple conferences.

The main change with the new locations seems to be that those with US visa issues will be able to present somewhere. But it doesn't really change costs

July 17, 2025 at 8:31 AM
