Michael Hecht
@calabi-and-yau.bsky.social
#DataScientist | @kaggle GM off-duty | #Physics PhD | #deeplearning #ai #nlproc 🤖 #datascience 🔬 #stringtheory 🔭 | Views=own 😀 | GPU poor
Reposted by Michael Hecht
distributed learning for LLM?

recently, @primeintellect.bsky.social announced that they have finished their 10B distributed learning run, trained across the world.

what is it exactly?

🧵
November 25, 2024 at 12:02 PM
Reposted by Michael Hecht
very interesting work and it reminds me a bit of this paper. Tokenizers and ROPE must die. after samplers, i am on to those next ...
arxiv.org/abs/2407.036...
November 25, 2024 at 2:20 AM
Reposted by Michael Hecht
TIL that there's a Gemini @gradio-hf.bsky.social library that lets you automatically build Python chat bots and web apps with just a few lines of code, then (optionally) deploy them as apps or in @huggingface.bsky.social Spaces.

✨🙌 Amazing work, @_akhaliq!!

🔗 github.com/AK391/gemini...
November 25, 2024 at 5:17 AM
Reposted by Michael Hecht
fresh fresh, DuckDB now has a llms dot txt at duckdb.org/duckdb-docs.md
November 20, 2024 at 2:03 PM
Reposted by Michael Hecht
A new paper, "Let Me Speak Freely" has been spreading rumors that structured generation hurts LLM evaluation performance.

Well, we've taken a look and found serious issues in this paper, and shown, once again, that structured generation *improves* evaluation performance!
November 21, 2024 at 6:33 PM
Reposted by Michael Hecht
Google released a new LLM today - gemini-exp-1121, hot on the heels of last week's gemini-exp-1114

It's currently at the top of the Chatbot Arena. I've updated my llm-gemini plugin to support it and used that to run my pelican on a bicycle SVG benchmark

My notes: simonwillison.net/2024/Nov/22/...
November 22, 2024 at 6:18 AM
Reposted by Michael Hecht
This is neat. I added inline dependency metadata so you can run it using `uv run` without having to install it first:

uv run https://gist.githubusercontent.com/simonw/848a3b91169a789bc084a459aa7ecf83/raw/44fe7e0b326832e88beb83748b50104e5e7f70d0/follow_theirs.py

gist.github.com/simonw/848a3...
November 24, 2024 at 4:22 PM
Reposted by Michael Hecht
We created a new dashboard for @zeit.de showing the current state of election polls in Germany ahead of the upcoming election in February. Also shows probabilities for govt. coalitions based on the polls. The full article version also features my favorite sparklines. www.zeit.de/politik/deut...
November 21, 2024 at 2:08 PM
Reposted by Michael Hecht
A lot of gems and insights in this fresh unreleased slide-deck by Niels Rogge (well now it's released hahah - sorry Niels)

=> docs.google.com/presentation...
November 25, 2024 at 10:44 AM
Reposted by Michael Hecht
✨ Jina AI just released Jina-CLIP-v2: A multimodal (images and texts) & multilingual embedding model. Details in 🧵

Model: huggingface.co/jinaai/jina-...

📈 Jina-CLIP-v2 outperforms Jina-CLIP-v1 (by 3% on text-image and text-text tasks)

🧵
November 25, 2024 at 9:43 AM