Georg Bökman
@bokmangeorg.bsky.social
930 followers 410 following 230 posts
Geometric deep learning + Computer vision
Reposted by Georg Bökman
Do you trust the RANSAC stopping criterion?
- Yes, confidence=0.99 FTW
- No, max_iter FTW
- WTF are you talking about?
Answer in comments.
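For reference, a minimal sketch of the standard confidence-based stopping rule the poll is asking about (the inlier ratio and sample size below are made-up illustration values, not from the post):

```python
import math

def ransac_max_iterations(confidence, inlier_ratio, sample_size):
    """Iterations needed so that, with probability `confidence`, at least one
    sampled minimal set consists entirely of inliers."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - inlier_ratio ** sample_size))

# e.g. homography estimation (4-point samples), 50% inliers, confidence 0.99
print(ransac_max_iterations(0.99, 0.5, 4))  # 72
```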
Not even in the list of references...?
Pro tip: For good Halloween vibes, use non-normalized RoPE on images larger than your training resolution and larger than the composite period of some of the RoPE-rotations. You might get scary ghost structures in your features.
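A toy 1-D sketch of the aliasing effect (assumed setup with two hand-picked RoPE periods, not the exact ViT configuration from the post): once positions exceed the composite period, the rotations repeat exactly, so far-apart patches become positionally indistinguishable.

```python
import numpy as np

# Two non-normalized RoPE frequencies with periods 4 and 6 (in patches),
# so the composite period is lcm(4, 6) = 12.
periods = np.array([4.0, 6.0])
freqs = 2 * np.pi / periods

def rope_angles(pos):
    """Rotation angles applied to the two feature pairs at position `pos`."""
    return (pos * freqs) % (2 * np.pi)

# Position 13 gets exactly the same rotations as position 1, so on images
# longer than 12 patches the model sees "ghost" copies of earlier positions.
print(rope_angles(1))   # [1.5708 1.0472]
print(rope_angles(13))  # [1.5708 1.0472]
```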
scipy deprecates `sph_harm` and replaces it with `sph_harm_y` where `n` and `m` have switched order and `theta` and `phi` have switched meaning 🙃
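For concreteness, here is how I believe the migration looks (argument order and angle conventions as I read the SciPy 1.15 docs; double-check against your installed version):

```python
import numpy as np
from scipy.special import sph_harm, sph_harm_y  # sph_harm deprecated as of SciPy 1.15

n, m = 3, 2          # degree n, order m
theta_polar = 0.5    # polar angle in [0, pi]
phi_azimuth = 1.2    # azimuthal angle in [0, 2*pi]

# Old (deprecated): sph_harm(m, n, azimuthal, polar)
y_old = sph_harm(m, n, phi_azimuth, theta_polar)
# New: sph_harm_y(n, m, polar, azimuthal)
y_new = sph_harm_y(n, m, theta_polar, phi_azimuth)

assert np.isclose(y_old, y_new)
```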
Reviewer chat is possible on OpenReview; it's just generally not activated. For NLDL it is activated; not sure I like it, though. docs.openreview.net/getting-star...
Using Fourier theory of finite groups, we can block-diagonalize these group-circulant matrices. Hence, incorporating symmetries (group equivariance) in neural networks can make the networks faster. We used this to obtain 𝑞𝑢𝑖𝑐𝑘𝑒𝑟 𝑉𝑖𝑇𝑠. arxiv.org/abs/2505.15441
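A minimal sketch of the idea for the simplest case, the cyclic group C_8, where the Fourier transform is the ordinary DFT and "block-diagonal" means fully diagonal; for the dihedral group of order 8 discussed in this thread, some blocks are 2x2, but the speedup argument is the same:

```python
import numpy as np
from scipy.linalg import circulant

rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)   # filter over the cyclic group C_8
x = rng.standard_normal(n)   # signal over C_8

C = circulant(c)             # group-circulant matrix: C @ x is the group convolution c * x

# The Fourier transform of C_8 is the DFT; it diagonalizes C, so the dense
# O(n^2) matmul becomes an O(n) pointwise product (plus FFTs).
y_dense = C @ x
y_fourier = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

assert np.allclose(y_dense, y_fourier)
```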
Mapping such 8-tuples to new 8-tuples that permute in the same way under transformations of the input is done by convolutions over the transformation group, or (equivalently) multiplication with group-circulant matrices.
Images (or image patches) are secretly multi-channel signals over groups. Below, the dihedral group of order 8: reflecting/rotating the image permutes the values in the magenta vector. So we can reshape the image into 8-tuples that all permute according to the dihedral group (pixels on the diagonals are an edge case).
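A small sketch of that reshaping for one pixel (hypothetical helper with assumed indexing conventions; pixels on the diagonals or at the center have smaller orbits):

```python
import numpy as np

def d4_orbit_values(img, i, j):
    """Values of the square image `img` at the 8 positions that pixel (i, j)
    visits under the dihedral group (4 rotations + 4 reflections).
    Rotating/reflecting `img` permutes this 8-tuple."""
    n = img.shape[0]
    coords = [
        (i, j), (n - 1 - j, i), (n - 1 - i, n - 1 - j), (j, n - 1 - i),  # rotations
        (j, i), (i, n - 1 - j), (n - 1 - j, n - 1 - i), (n - 1 - i, j),  # reflections
    ]
    return np.array([img[r, c] for r, c in coords])

img = np.arange(16).reshape(4, 4)
print(d4_orbit_values(img, 0, 1))            # 8-tuple from the original image
print(d4_orbit_values(np.rot90(img), 0, 1))  # same values, permuted
```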
Had a skim of Kostelec-Rockmore. At the end there are some interesting pointers suggesting that fast implementations of asymptotically fast FFTs are non-trivial. 🙃 Also, there seems to be a version that uses three 1D FFTs, but it is not asymptotically optimal.
At least some FFTs for SO(3) work by separation of variables and a sequence of 1D FFTs, right? So is the butterfly decomposition "straightforward" for them? Regarding small finite groups, the entire FFT might be unnecessary and can simply be a dense Fourier transform matrix.
Do you have good examples from other areas of taking the hardware as the prior?
Also quite generous to cite the paper as a generic reference for the term "FLOPs" 😅
Nice LLM generated citation found by @davnords.bsky.social. I wonder who M. Lindberg and A. Andersson are...
Got to honor the traditions. "In Sweden, the west coast city of Gothenburg is known for its puns."
Reposted by Georg Bökman
The opportunities and risks of the entry of LLMs into mathematical research in one screenshot. I think it is clear that LLMs will make trained researchers more effective. But they will also lead to a flood of bad/wrong papers, and I'm not sure we have the tools to deal with this.
Nice perspective, you look like a giant! And congrats!
If you were working at Meta, you could have called the paper "Mental rotation capabilities emerge at scale with DINOv3" :)
I see, yeah plots of proportions over the layers would be cool!
Also, I think it is possible to argue for equivariance at scale from a purely computational perspective. bsky.app/profile/bokm...
A simple argument for equivariance at scale: 1) At scale, token-wise linear layers dominate compute. 2) Token-wise linear equivariant layers implemented in the Fourier domain are block-diagonal and hence fast.
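A back-of-the-envelope version of that argument (illustrative numbers; the even split into blocks is exact for abelian groups acting via their regular representation and only approximate otherwise):

```python
def dense_flops(d):
    """Multiply-adds per token for a dense d x d token-wise linear layer."""
    return d * d

def block_diag_flops(d, group_order):
    """Multiply-adds per token if the layer is block-diagonal in the Fourier
    domain, with the d channels split evenly into `group_order` blocks."""
    block = d // group_order
    return group_order * block * block

d = 768                        # hypothetical ViT width
print(dense_flops(d))          # 589824
print(block_diag_flops(d, 8))  # 73728 -> 8x fewer multiply-adds per token
```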