Lightnews — Scholar-powered news

Birchlabs

@birchlabs.co.uk

ML Engineer at Anlatan (NovelAI). co-author of HDiT (Hourglass Diffusion Transformers). works on diffusion models and LLMs. studying Japanese.


Birchlabs

@birchlabs.co.uk

pytorch 2.6 is out!
highlights:
- flex attention: better compilation of blockmask creation, better support for dynamic shapes
- cuDNN SDPA: fixes for memory layout
- CUDA 12.6
- python 3.13
- MaskedTensor memory leak fix

January 30, 2025 at 12:41 AM

Birchlabs

@birchlabs.co.uk

Claude does SVG memes

January 14, 2025 at 1:41 AM

Birchlabs

@birchlabs.co.uk

drink cups should put the hole in the bottom.
heat rises. "the top is cool enough to drink" implies "everything below it is colder".
drinking from the bottom lets us access safe temperatures earlier and before the whole cup cools.

January 4, 2025 at 11:37 PM

Birchlabs

@birchlabs.co.uk

when the standard library comments out std::experimental::observer_ptr just to stop you having fun

December 22, 2024 at 2:09 AM

Birchlabs

@birchlabs.co.uk

NovelAI v4 makes dreams come true

December 21, 2024 at 9:49 PM

Birchlabs

@birchlabs.co.uk

Claude's alright

December 17, 2024 at 2:51 AM

Birchlabs

@birchlabs.co.uk

if you care about multiprocess debugging in VSCode please upvote this issue
so we don't have to click terminate a hundred times
github.com/microsoft/vs...

December 16, 2024 at 10:26 PM

Birchlabs

@birchlabs.co.uk

December 10, 2024 at 2:02 PM

Birchlabs

@birchlabs.co.uk

Box2D 3.0.0 in WebAssembly+TypeScript starting to work

December 9, 2024 at 1:32 AM

Birchlabs

@birchlabs.co.uk

incidentally the inductor problem is here.
it could be fixed by omitting the guard altogether, or perhaps eliding the function grid wrapper entirely
github.com/pytorch/pyto...

November 30, 2024 at 9:57 PM

Birchlabs

@birchlabs.co.uk

if you don't do this, then inductor codegen emits invalid python

November 30, 2024 at 9:54 PM

Birchlabs

@birchlabs.co.uk

mood: adding unused variables to make torch inductor compile my triton kernel

triton autotune configs with empty dicts of meta-parameters cannot be compiled by inductor because it will emit an empty-string guard clause, which is not valid Python syntax

we can work around the inductor codegen error by adding an unused meta-parameter to our triton autotune config, but triton codegen will then complain… until we also declare it in the kernel function signature, where it stays unused in our implementation

November 30, 2024 at 9:53 PM

Birchlabs

@birchlabs.co.uk

November 27, 2024 at 5:47 PM

Birchlabs

@birchlabs.co.uk

born too late to learn maths from touhou pre-fight cutscenes
www.youtube.com/watch?v=tuDA...

November 23, 2024 at 1:42 AM

Birchlabs

@birchlabs.co.uk

if you get KO'd in smash, do you die?
what's the safest stage to be KO'd on?
Great Bay looks alright if you're a confident swimmer…

November 21, 2024 at 5:30 PM

Birchlabs

@birchlabs.co.uk

good algorithm

November 20, 2024 at 7:08 PM

Birchlabs

@birchlabs.co.uk

pytorch 2.5.0 bug:
counting flops makes your compiled model slower.
benchmark your model first, count flops after.
github.com/pytorch/pyto...

November 17, 2024 at 6:40 PM

Birchlabs

@birchlabs.co.uk

bluesky is trending on the other place

November 11, 2024 at 12:26 PM

Birchlabs

@birchlabs.co.uk

fuck offfff

November 9, 2024 at 2:30 PM

Birchlabs

@birchlabs.co.uk

okay where has tensor_split been my whole life
pytorch.org/docs/stable/...

November 4, 2024 at 4:09 PM
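
For readers who have not met it: `torch.tensor_split` accepts either a number of sections (which need not divide the length evenly) or a list of split indices. The sketch below imitates that behaviour on plain Python lists so it runs without PyTorch. `tensor_split_like` is a hypothetical helper written for illustration, not part of any library, and it only covers the one-dimensional case as I understand the documented semantics.

```python
def tensor_split_like(items, indices_or_sections):
    """Illustrative 1-D imitation of torch.tensor_split on a Python list."""
    if isinstance(indices_or_sections, int):
        n = indices_or_sections
        base, extra = divmod(len(items), n)
        # The first `extra` sections get one more element than the rest.
        sizes = [base + 1] * extra + [base] * (n - extra)
        bounds, start = [], 0
        for size in sizes[:-1]:
            start += size
            bounds.append(start)
    else:
        # A list of indices marks where each new section begins.
        bounds = list(indices_or_sections)
    edges = [0] + bounds + [len(items)]
    return [items[a:b] for a, b in zip(edges, edges[1:])]


values = list(range(7))
by_sections = tensor_split_like(values, 3)      # uneven split into 3 parts
by_indices = tensor_split_like(values, [1, 6])  # split before index 1 and 6
print(by_sections)
print(by_indices)
```

The appeal is that neither form requires the length to be divisible by the number of sections.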

Birchlabs

@birchlabs.co.uk

that still leaves you with another problem: attention entropy.
if you run inference at a smaller resolution than you trained at, the attention probability distribution is sharper than in training. you can scale the logits to compensate.
left = orig
right = entropy-scaled.
arxiv.org/abs/2306.08645

November 2, 2024 at 11:42 PM
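
A minimal sketch of the entropy point in the post above, in pure Python. It builds a softmax over fewer tokens than an assumed training count, then rescales the logits by a factor below 1 so the distribution softens and its entropy rises. The factor `sqrt(log(infer) / log(train))`, the token counts, and the stand-in logits are all assumptions chosen for illustration; check the linked paper for the exact form it uses.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_scale(train_tokens, infer_tokens):
    # Assumed scaling factor; below 1 when inferring with fewer tokens.
    return math.sqrt(math.log(infer_tokens) / math.log(train_tokens))


train_tokens, infer_tokens = 4096, 1024  # hypothetical token counts
logits = [3 * math.sin(i) for i in range(infer_tokens)]  # arbitrary stand-in scores

scale = entropy_scale(train_tokens, infer_tokens)
plain_entropy = entropy(softmax(logits))
scaled_entropy = entropy(softmax([scale * x for x in logits]))
print(scale, plain_entropy, scaled_entropy)
```

Multiplying the logits by a factor below 1 acts like raising the softmax temperature, which is why the scaled distribution is less sharp.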

Birchlabs

@birchlabs.co.uk

you can fight back by dilating the convolution!
arxiv.org/abs/2310.07702

November 2, 2024 at 11:39 PM
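
A toy illustration of why dilation helps, in one dimension and pure Python. It is not the method from the linked paper, only the underlying intuition: if the input is upsampled 2x, a kernel dilated by 2 sees the same neighbourhood (relative to the content) as the undilated kernel saw on the original. The signal, kernel, and function name are made up for the example.

```python
def conv1d(signal, kernel, dilation=1):
    """Same-size 1-D cross-correlation with zero padding (odd kernel length)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0
        for k, w in enumerate(kernel):
            j = i + (k - half) * dilation
            if 0 <= j < len(signal):  # taps outside the signal read zero padding
                acc += w * signal[j]
        out.append(acc)
    return out


small = [3, 1, 4, 1, 5, 9]
large = [v for v in small for _ in range(2)]  # nearest-neighbour 2x upsample
kernel = [1, -2, 1]

reference = conv1d(small, kernel)            # what the kernel "learned" to see
dilated = conv1d(large, kernel, dilation=2)  # dilated kernel on the larger input
undilated = conv1d(large, kernel)            # same kernel without dilation

print(reference)
print(dilated[::2])
print(undilated[::2])
```

Every second output of the dilated convolution matches the reference exactly, edges included, while the undilated kernel responds differently on the larger input.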

Birchlabs

@birchlabs.co.uk

the reason stable-diffusion generalizes poorly to untrained resolutions is its implicit position embedding.

convolution padding creates an edge, which nested convolutions look for to understand position.
arxiv.org/abs/2101.12322

November 2, 2024 at 11:38 PM
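
A small sketch of how padding leaks position, using a one-dimensional zero-padded convolution in pure Python. The input is completely flat, so any variation in the output can only come from the padding. The kernel and sizes are arbitrary choices for illustration.

```python
def conv1d_zero_pad(signal, kernel):
    """Same-size 1-D cross-correlation with zero padding (odd kernel length)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):  # taps outside the signal read zero padding
                acc += w * signal[j]
        out.append(acc)
    return out


flat = [1] * 8  # featureless input: no positional cue of its own
once = conv1d_zero_pad(flat, [1, 1, 1])
twice = conv1d_zero_pad(once, [1, 1, 1])
print(once)   # border values differ from the interior
print(twice)  # the difference spreads one step further inward
```

After one layer only the outermost positions differ from the interior; after two layers the next positions in differ as well. Stacked layers can use this to tell how far a position is from the edge.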

Birchlabs

@birchlabs.co.uk

she just showed up and asked where I keep the grimoires

November 2, 2024 at 11:18 PM
