arxiv.org/abs/2411.14257
arxiv.org/abs/2406.04093
openreview.net/forum?id=WCR...
They simplify tuning with k-sparse autoencoders, and the results show many improvements in explainability. Code, models (not all!), and a visualizer are included. openreview.net/forum?id=tcs...