It's called "Master Machine Learning with scikit-learn: A Practical Guide to Building Better Models with Python"
Download the first 3 chapters right now:
👉 dataschool.kit.com/mlbook 👈
Thanks for your support 🙏
My solution is short (48 LOC) and relatively general-purpose – I used skrub to preprocess string and date columns, and pytabkit to create an ensemble of RealMLP and TabM models. Link below👇
• The paper: arxiv.org/html/2502.05...
• The Python package: pypistats.org/packages/tab... (try it out 🐍)
• The source code: github.com/soda-inria/t... (100% open source, including pre-training 💞)
Longer read (5 min): gael-varoquaux.info/science/tabi...
8/9
With Jingang Qu, @dholzmueller.bsky.social, and Marine Le Morvan
TL;DR: a well-designed architecture and pretraining give the best tabular learner, and a more scalable one
On top, it's 100% open source
1/9
www.dataschool.io/ai-progress-...
If you are new to reinforcement learning, this article has a generous intro section (PPO, GRPO, etc)
Also, I cover 15 recent articles focused on RL & Reasoning.
🔗 magazine.sebastianraschka.com/p/the-state-...
Models that use late interaction, like ColBERT, ColPali, and ColQwen, gain significant benefits from this pooling technique! By integrating token pooling methods, the number of vectors to store can be reduced.
Blog: www.answer.ai/posts/colber...
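To make the storage saving concrete, here is a minimal sketch of the idea. It assumes the simplest possible variant, averaging consecutive groups of token vectors, which may not be the method the blog uses (clustering similar tokens before averaging is another option). The names `pool_tokens`, `maxsim_score`, and `pool_factor` are made up for illustration.

```python
import numpy as np

def pool_tokens(token_vectors, pool_factor=2):
    """Average each run of `pool_factor` consecutive token vectors (assumed scheme)."""
    pooled = []
    for start in range(0, len(token_vectors), pool_factor):
        mean = token_vectors[start:start + pool_factor].mean(axis=0)
        # Re-normalize so dot products stay comparable to cosine similarity.
        pooled.append(mean / max(np.linalg.norm(mean), 1e-12))
    return np.stack(pooled)

def maxsim_score(query_vectors, doc_vectors):
    """Late-interaction score: best-matching doc vector per query token, summed."""
    sims = query_vectors @ doc_vectors.T
    return float(sims.max(axis=1).sum())

rng = np.random.default_rng(0)
doc = rng.normal(size=(10, 16))    # 10 token vectors of dimension 16 (toy data)
query = rng.normal(size=(3, 16))

pooled_doc = pool_tokens(doc, pool_factor=2)   # 10 vectors stored -> 5 vectors stored
score = maxsim_score(query, pooled_doc)
```

With `pool_factor=2` the index stores half as many vectors per document, and scoring works unchanged on the pooled vectors.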
Kaggle Discussion: www.kaggle.com/competitions...
A good read for building secure AI!
arxiv.org/pdf/2503.18813
But with LangChain & LangGraph, you can build a chatbot that integrates web search into ANY model you like!
You'll learn how to do that (and much more) in my new AI course...
Sign up for EARLY ACCESS:
👉 dataschool.kit.com/agents 👈
It's still important to learn fundamentals from scratch for growth and problem-solving (e.g., being able to fix things)! 😁
My company has a bunch of unused T4 GPUs because the LLMs are too big for the AI teams to run experiments on them. Now the data science team finally has a reason to ask for them! 🤣
developer.nvidia.com/blog/nvidia-...
www.dataschool.io/pandas-strea...
Learn how to identify & analyze scoring streaks using pandas operations:
- shift()
- cumsum()
- boolean math
- groupby()
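A rough sketch of how those four operations can fit together. The data and column names (`player`, `scored`, `run_id`) are made up for illustration and are not taken from the linked post.

```python
import pandas as pd

# Hypothetical data: one row per game, in order, True if the player scored.
games = pd.DataFrame({
    "player": ["A"] * 6 + ["B"] * 4,
    "scored": [True, True, False, True, True, True, False, True, True, False],
})

# Boolean math: treat True/False as 1/0.
scored = games["scored"].astype(int)

# shift(): look at the same player's previous game (NaN for their first game).
prev = scored.groupby(games["player"]).shift()

# A new run starts wherever the value changes (NaN never equals anything).
new_run = scored != prev

# cumsum(): number the runs so that every run gets its own ID.
games["run_id"] = new_run.cumsum()

# groupby(): measure the length of each scoring run, then take the max per player.
streaks = (
    games[games["scored"]]
    .groupby(["player", "run_id"])
    .size()
    .rename("streak_length")
    .reset_index()
)
longest = streaks.groupby("player")["streak_length"].max()
```

Here `longest` holds each player's longest scoring streak, and `streaks` keeps every individual streak for further analysis.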
I now have a much deeper appreciation for Data School's course and regard it as the best scikit-learn course.
Master Machine Learning with scikit-learn: courses.dataschool.io/master-machi...
-- Andrew Ng, legendary AI researcher
Source: www.deeplearning.ai/the-batch/is...
www.youtube.com/watch?v=hdWW...
Super powerful for assembling production-ready pipelines with simple syntax
self-attention → parameterized self-attention → causal self-attention → multi-head self-attention
www.youtube.com/watch?v=-Ll8...
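For readers who want to see that progression in code, here is a compact NumPy sketch, not taken from the video. It assumes single-sequence inputs, random stand-in weights, and no output projection or dropout; the function names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the row max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    # Parameterized: queries, keys, and values come from learned projections.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Causal: position i may only attend to positions <= i.
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = softmax(scores)
    return weights @ v, weights

def multi_head_attention(x, heads):
    # Multi-head: run several independent heads and concatenate their outputs.
    return np.concatenate(
        [causal_self_attention(x, *head)[0] for head in heads], axis=-1
    )

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # 4 tokens, embedding size 8 (toy values)
heads = [tuple(rng.normal(size=(8, 4)) for _ in range(3)) for _ in range(2)]

_, attn_weights = causal_self_attention(x, *heads[0])
out = multi_head_attention(x, heads)   # shape (4, 8): 2 heads x 4 dims each
```

Dropping the mask gives plain parameterized self-attention, and replacing the three weight matrices with the identity gives the unparameterized version.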
Happy reading!
The tool is called Typing Mind and I decided to pay $30 for lifetime access. It was well worth it.
Kevin's post 👇
www.dataschool.io/save-money-o...
OpenAI was the clear winner 🏆
Neat study by @binarybits.bsky.social, read more here: www.understandingai.org/p/these-expe...
- Tokenizing raw text and converting tokens into token IDs
- Applying byte pair encoding
- Setting up data loaders in PyTorch for efficient training
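A small pure-Python sketch of the first and last steps, under simplifying assumptions: it uses a word-level regex tokenizer rather than byte pair encoding, and plain lists rather than a PyTorch `DataLoader`. All function names here are illustrative.

```python
import re

def tokenize(text):
    # Split into words and single punctuation marks, dropping whitespace.
    return re.findall(r"\w+|[^\w\s]", text)

def build_vocab(tokens):
    # Map each unique token to an integer ID, plus a fallback for unknown tokens.
    vocab = {tok: i for i, tok in enumerate(sorted(set(tokens)))}
    vocab["<unk>"] = len(vocab)
    return vocab

def encode(text, vocab):
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in tokenize(text)]

def sliding_windows(token_ids, context_length, stride):
    # Each target sequence is the input sequence shifted one token to the right,
    # which is the usual setup for next-token prediction.
    pairs = []
    for i in range(0, len(token_ids) - context_length, stride):
        inputs = token_ids[i:i + context_length]
        targets = token_ids[i + 1:i + context_length + 1]
        pairs.append((inputs, targets))
    return pairs

text = "the cat sat on the mat. the dog sat too."
vocab = build_vocab(tokenize(text))
ids = encode(text, vocab)
pairs = sliding_windows(ids, context_length=4, stride=2)
```

In a real training setup, the `(inputs, targets)` pairs would typically be wrapped in a dataset class and batched by a data loader.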