Ahmed Imtiaz Humayun,
Utku Evci
@suvinay.bsky.social
Amir Yazdanbakhsh,
@dalistarh.bsky.social
@gkdziugaite.bsky.social
Project/Code/Models: sparsellm.com
Paper: arxiv.org/abs/2501.12486
Session: April 24 Poster Session #2 (Hall 3 + Hall 2B #342)
2/N
Joint work with @ellieyhc.bsky.social
@zackankner.bsky.social
Nikunj Saunshi,
Blake M. Elias,
Amir Yazdanbakhsh,
Jonathan Ragan-Kelley,
Suvinay Subramanian,
@mcarbin.bsky.social
- Decoding latency (how fast and parallel is the decoding?)
- Response quality (evaluated by another LLM)
8/N