Thanks @jfcalvo.hf.co and @ameeelie.bsky.social for the feature!
By giving models more "time to think," Llama 1B outperforms Llama 8B in math—beating a model 8x its size. The full recipe is open-source!
We applied the same data-driven approach that led to SOTA English performance in 🍷 FineWeb to thousands of languages.
🥂 FineWeb2 has 8TB of compressed text data and outperforms other datasets.
FineWeb 2 extends the data-driven approach to pre-training dataset design introduced in FineWeb 1 and now covers 1893 languages/scripts.
Details: huggingface.co/datasets/Hug...
A detailed open-science tech report is coming soon
Chai does structure predictions at AlphaFold3 levels of accuracy and is able to handle multi-peptide or peptide-ligand complexes rather than just single chains.
Apache 2.0 on HF huggingface.co/chaidiscover...
#HuggingFace
#ClementDelangue
#ArtificialIntelligence
www.cosmico.org/6-ai-predict...
At @huggingface.bsky.social we'll launch a huge community sprint soon to build high-quality training datasets for many languages.
We're looking for Language Leads to help with outreach.
Find your language and nominate yourself:
forms.gle/iAJVauUQ3FN8...
My notes here: simonwillison.net/2024/Nov/29/...
The primary use case for the datasets that people are losing their shit over isn't ChatGPT, it's social science research and developing systems that improve Bluesky.
The same 99% will happen here too, but if AI researchers continue to get perma-banned for making available the datasets needed to filter it, it’s going to make this platform unusable.
jessbpeck.com/posts/bluesk...
anytime i spend less than 5 months on a thing you know i have opinions.
anyway, as always, feel free to argue with me/complain at me/point out errors.
Anyway, buckle up this is about to be a VERY long thread with lots of thoughts and links to papers. 🧵
📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect to experiment with using ML for Bluesky 🤗
huggingface.co/datasets/blu...
🔗 Read more: venturebeat.com/ai/hugging-f...