Junbum Lee
junbumlee.bsky.social
AI/ML GDE. LLM/Continued Pretraining.
Reposted by Junbum Lee
🙅‍♀️ No-code end-to-end example to train your model

1️⃣ Use the Synthetic Data Generator to create your custom dataset

2️⃣ Use AutoTrain to train your model on the generated dataset

Check it here: huggingface.co/blog/synthet...
December 18, 2024 at 11:28 AM
Reposted by Junbum Lee
A new free tier of GitHub Copilot in Visual Studio Code.

✅ 2,000 code completions per month
💬 50 chat messages per month
💫 Models like Claude 3.5 Sonnet or GPT-4o
♥️ More fun for you

Check it out today!

Oh yeah, and we passed 150M developers on GitHub 💅 github.blog/news-insight...
Announcing 150M developers and a new free tier for GitHub Copilot in VS Code
Come and join 150M developers on GitHub that can now code with Copilot for free in VS Code.
github.blog
December 18, 2024 at 6:19 PM
Reposted by Junbum Lee
The FineWeb team is happy to finally release "FineWeb2" 🥂🥳

FineWeb 2 extends the data-driven approach to pre-training dataset design that was introduced in FineWeb 1 to now cover 1893 languages/scripts

Details: huggingface.co/datasets/Hug...

A detailed open-science tech report is coming soon
December 8, 2024 at 9:08 AM