Lightnews — Scholar-powered news

Reposted by Raphael Pisoni

hardmaru

@hardmaru.bsky.social

The US government should subsidize Open AI rather than OpenAI

November 7, 2025 at 6:43 AM

Reposted by Raphael Pisoni

Yuki Asano

@yukimasano.bsky.social

On the occasion of the 1000th citation of our Sinkhorn-Knopp self-supervised representation learning paper, I've written a whole post about the history and the key bits of this method that powers the state-of-the-art SSL vision models.

Read it here :): docs.google.com/document/d/1...

October 15, 2025 at 10:00 AM

Raphael Pisoni

@4rtemi5.bsky.social

We're ready!

roon @tszzl.bsky.social · Sep 20

might be time

September 21, 2025 at 6:39 AM

Raphael Pisoni

@4rtemi5.bsky.social

The single most undervalued property of neural networks is self-consistency. We should change that!

September 6, 2025 at 12:58 PM

Reposted by Raphael Pisoni

asker the gauche, glycojohn destroyer of carbs

@johnbender.bsky.social

August 8, 2025 at 3:56 AM

Raphael Pisoni

@4rtemi5.bsky.social

You've been researching for a while!
Time to have some SOTA!

#aislop

July 26, 2025 at 12:51 PM

Raphael Pisoni

@4rtemi5.bsky.social

You and Adam keep beating Sota? Stop doing that! Poor Sota!

July 26, 2025 at 9:50 AM

Raphael Pisoni

@4rtemi5.bsky.social

Have a cool idea but only evaluated it on small models? Tough luck, buddy. You only get your paper accepted if your experimental results are 0.2% above SOTA and too expensive to falsify!

Is academic publishing pay to win yet?

July 26, 2025 at 9:45 AM

Raphael Pisoni

@4rtemi5.bsky.social

Is there a reason why none of the recent models use RBF-kernel Attention to get rid of the softmax-bottleneck for long context?
I tried replacing dot-product attention with the negative squared KQ-distance and was able to remove the softmax without issues or loss in performance!

July 23, 2025 at 8:14 PM
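
A minimal sketch of one possible reading of the RBF-kernel attention idea in the post above: each query-key pair is scored by the negative squared distance, the score is passed through an exponential, and the resulting weights are used directly without softmax normalization. The function name `rbf_attention` and the bandwidth parameter `gamma` are assumptions made for illustration. This is not the author's implementation, and the post does not say whether any normalization was kept.

```python
import math


def rbf_attention(queries, keys, values, gamma=1.0):
    """Unnormalized RBF-kernel attention over plain Python lists.

    queries: list of query vectors, keys: list of key vectors,
    values: list of value vectors (one per key).
    gamma is a hypothetical bandwidth parameter, not taken from the post.
    """
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Weight per key: exp(-gamma * ||q - k||^2), always in [0, 1].
        weights = [
            math.exp(-gamma * sum((qi - ki) ** 2 for qi, ki in zip(q, k)))
            for k in keys
        ]
        # Weighted sum of the value vectors; no softmax is applied.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


# A query that coincides with the first key and is far from the second
# attends almost exclusively to the first value vector.
out = rbf_attention(
    queries=[[0.0, 0.0]],
    keys=[[0.0, 0.0], [100.0, 0.0]],
    values=[[1.0, 2.0], [5.0, 5.0]],
)
```

Since -||q - k||^2 = 2 q·k - ||q||^2 - ||k||^2, this score is the usual dot product plus a penalty on the query and key norms, which keeps every weight bounded by 1 even without normalization.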

Reposted by Raphael Pisoni

NeurIPS Conference

@neuripsconf.bsky.social

NeurIPS is endorsing EurIPS, an independently-organized meeting which will offer researchers an opportunity to additionally present NeurIPS work in Europe concurrently with NeurIPS.

Read more in our blog post and on the EurIPS website:
blog.neurips.cc/2025/07/16/n...
eurips.cc

A NeurIPS-endorsed conference in Europe held in Copenhagen, Denmark

July 16, 2025 at 10:05 PM

Raphael Pisoni

@4rtemi5.bsky.social

Has anyone experimented with "conditional gradients"?
Thinking about a setup where, within a specific activation range (e.g., right before a ReLU), you'd only permit positive or negative gradients.

July 8, 2025 at 5:59 AM
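
A rough sketch of how the "conditional gradients" question above could be tried out, under stated assumptions: the forward pass is left untouched, and in the backward pass any gradient of the disallowed sign is zeroed wherever the pre-activation lies inside a chosen range. The helper name `gate_gradients` and the parameters `lo`, `hi` and `allow` are hypothetical. In an autodiff framework this logic would live in a custom backward rule; here it is applied to plain lists to keep the example self-contained.

```python
def gate_gradients(pre_activations, upstream_grads, lo, hi, allow="positive"):
    """Zero out gradients of the disallowed sign inside [lo, hi].

    pre_activations: values right before the nonlinearity (e.g. a ReLU).
    upstream_grads: gradients arriving from the layer above, same length.
    allow: "positive" keeps only g >= 0 inside the range,
           "negative" keeps only g <= 0 inside the range.
    All names here are illustrative assumptions, not an established API.
    """
    if allow not in ("positive", "negative"):
        raise ValueError("allow must be 'positive' or 'negative'")
    gated = []
    for x, g in zip(pre_activations, upstream_grads):
        in_range = lo <= x <= hi
        blocked = g < 0 if allow == "positive" else g > 0
        # Outside the range, gradients pass through unchanged.
        gated.append(0.0 if in_range and blocked else g)
    return gated


# Only units with pre-activations in [-0.1, 0.1] are affected.
grads = gate_gradients([-0.05, 0.5, 0.02], [-1.0, -2.0, 3.0], lo=-0.1, hi=0.1)
```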

Raphael Pisoni

@4rtemi5.bsky.social

Quick question to the SSL experts out there: Usually you evaluate an SSL model by freezing it and training a linear probing layer. Would it be fair to instead learn a final layer with more dimensions than classes and do a nearest-neighbor evaluation?

June 29, 2025 at 11:17 AM

Reposted by Raphael Pisoni

David Picard

@davidpicard.bsky.social

There is an oak forest in central France that was planted 400 years ago by Colbert so that France would have quality hard wood by the 2000s to build ships for its navy.
This is the type of long-term planning that Seldonian predictions can help improve.

June 17, 2025 at 8:17 AM

Reposted by Raphael Pisoni

Nafnlaus 🇮🇸 🇺🇦 🇬🇪

@nafnlaus.bsky.social

New anti-censorship jailbreak just dropped ;)

May 13, 2025 at 2:17 AM

Raphael Pisoni

@4rtemi5.bsky.social

Currently on my way to #ICLR in Singapore where we'll present our latest paper on space folding in neural networks.
Would be happy to meet some people there, so if you're at ICLR as well and want to hang out, feel free to PM! 🙂

April 18, 2025 at 11:19 AM

Raphael Pisoni

@4rtemi5.bsky.social

Grok this! What a roller-coaster of emotions...🤪

April 16, 2025 at 7:01 PM

Reposted by Raphael Pisoni

Wissam Antoun

@wissamantoun.bsky.social

ModernBERT or DeBERTaV3?

What's driving performance: architecture or data?

To find out we pretrained ModernBERT on the same dataset as CamemBERTaV2 (a DeBERTaV3 model) to isolate architecture effects.

Here are our findings:

April 14, 2025 at 3:41 PM

Reposted by Raphael Pisoni

Dmytro Mishkin

@ducha-aiki.bsky.social

Just assembled a slide about local feature training time/dataset size.
Anything wrong/missing?

April 13, 2025 at 11:20 AM

Raphael Pisoni

@4rtemi5.bsky.social

Is the project even still worth doing when wandb runs out of funny names or am I cooked?🫠

April 11, 2025 at 11:11 PM

Reposted by Raphael Pisoni

Jeremy Morrell

@jeremymorrell.dev

Meta introduced Llama 4 models and added this section near the very bottom of the announcement 😬

“[LLMs] historically have leaned left when it comes to debated political and social topics.”

ai.meta.com/blog/llama-4...

Meta
Addressing bias in LLMs

It's well-known that all leading LLMs have had issues with bias: specifically, they historically have leaned left when it comes to debated political and social topics. This is due to the types of training data available on the internet.

Our goal is to remove bias from our AI models and to make sure that Llama can understand and articulate both sides of a contentious issue. As part of this work, we're continuing to make Llama more responsive so that it answers questions, can respond to a variety of different viewpoints without passing judgment, and doesn't favor some views over others.

We have made improvements on these efforts with this release—Llama 4 performs significantly better than Llama 3 and is comparable to Grok:

• Llama 4 refuses less on debated political and social topics overall (from 7% in Llama 3.3 to below 2%).
• Llama 4 is dramatically more balanced with which prompts it refuses to respond to (the proportion of unequal response refusals is now less than 1% on a set of debated topical questions).
• Our testing shows that Llama 4 responds with strong political lean at a rate comparable to Grok (and at half of the rate of Llama 3.3) on a contentious set of political or social topics. While we are making progress, we know we have more work to do and will continue to drive this rate further down.
We're proud of this progress to date and remain committed to our goal of eliminating overall bias in our models.

April 5, 2025 at 10:08 PM

Reposted by Raphael Pisoni

ETH CS Department

@csateth.bsky.social

🚀Hello, world! We are now live on Bluesky. This is the official account of the Department of Computer Science at ETH Zurich. Follow us for cutting-edge research, the latest innovations, event updates and insights into the future of technology. inf.ethz.ch
@csateth.bsky.social @ethzurich.bsky.social

Department of Computer Science

Computer Science Department at ETH Zurich. The department offers highest quality in computer science research and education and adds to business and industry growth.

March 24, 2025 at 9:24 AM

Raphael Pisoni

@4rtemi5.bsky.social

Recently had the pleasure of helping @miclew.bsky.social with a couple of his papers in exchange for him helping me with a couple of mine!
This is the first fruit of our joint work. We quantify space folding in ReLU neural networks with a range-based measure. Lots of fun to write and read!😉

TMLR Published Papers @tmlr-pub.bsky.social · Mar 10

On Space Folds of ReLU Neural Networks

Michal Lewandowski, Hamid Eghbalzadeh, Bernhard Heinzl, Raphael Pisoni, Bernhard A. Moser

Action editor: Petar Veličković

https://openreview.net/forum?id=RfFqBXLDQk

#cantornet #similarity #relu

March 24, 2025 at 12:19 PM

Raphael Pisoni

@4rtemi5.bsky.social

x''= 0

March 24, 2025 at 6:55 AM

Reposted by Raphael Pisoni

Gabriele Berton

@berton-gabri.bsky.social

🚀 Paper Release! 🚀
Curious about image retrieval and contrastive learning? We present:

📄 "All You Need to Know About Training Image Retrieval Models"
🔍 The most comprehensive retrieval benchmark—thousands of experiments across 4 datasets, dozens of losses, batch sizes, LRs, data labeling, and more!

March 18, 2025 at 10:41 PM
