Our long-term goal is to directly connect database tables to machine learning estimators.
https://skrub-data.org
https://discord.gg/ABaPnm7fDC
SelectCols and DropCols can be used as "filtering blocks" in a pipeline.
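To make the "filtering block" idea concrete, here is a dependency-free sketch. It does not use skrub: `select_cols`, `drop_cols` and `run_pipeline` are hypothetical stand-ins, and a table is assumed to be a plain dict mapping column names to lists of values.

```python
from functools import partial

# Hypothetical stand-ins for the idea of a filtering block (not skrub's API).
# A "table" here is assumed to be a dict of column name -> list of values.

def select_cols(table, cols):
    """Keep only the named columns, in the order given."""
    return {name: table[name] for name in cols}

def drop_cols(table, cols):
    """Remove the named columns and keep everything else."""
    return {name: values for name, values in table.items() if name not in cols}

def run_pipeline(table, steps):
    """Pass the table through each step in turn."""
    for step in steps:
        table = step(table)
    return table

table = {"id": [1, 2], "name": ["a", "b"], "price": [9.5, 12.0]}

without_id = drop_cols(table, ["id"])
filtered = run_pipeline(
    table,
    [partial(drop_cols, cols=["id"]), partial(select_cols, cols=["price"])],
)
```

Each step takes a table and returns a table, which is what lets filtering steps sit between any other steps of a pipeline.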
ApplyToCols lets you select a subset of columns in your dataframe, then applies a transformer to each selected column separately.
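The "select some columns, transform each one separately" behaviour can be sketched without any dependency. This is not skrub code: `apply_to_cols` and `min_max_scale` are hypothetical helpers, and the table is assumed to be a dict of column name to list.

```python
# Hypothetical sketch of "apply a transformer to each selected column"
# (not skrub's API). A table is assumed to be a dict of name -> list.

def min_max_scale(column):
    """Rescale one numeric column to the [0, 1] range."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # avoid dividing by zero on constant columns
    return [(x - lo) / span for x in column]

def apply_to_cols(table, transform, cols):
    """Apply `transform` to each column in `cols`; pass the others through."""
    out = dict(table)
    for name in cols:
        out[name] = transform(table[name])  # one column at a time
    return out

table = {
    "age": [20, 30, 40],
    "income": [1000, 3000, 2000],
    "city": ["Paris", "Rome", "Oslo"],
}
scaled = apply_to_cols(table, min_max_scale, cols=["age", "income"])
```

Because each column is transformed on its own, the statistics of `age` never leak into `income`, and `city` is left untouched.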
Our talk was very well received, and we got a lot of great questions, especially about scalability and how to interface with other libraries in production environments.
To address this, skrub generates a parallel coordinate plot that visualizes all runs and the parameters used to achieve specific results.
Then you might want to try the skrub SquashingScaler. The SquashingScaler behaves like scikit-learn's RobustScaler, but smoothly clips outliers to predefined boundaries.
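Here is a rough, dependency-free sketch of "robust scaling plus smooth clipping". It is an illustration under assumptions, not skrub's implementation: the function name `squashing_scale`, the use of the median and interquartile range, the bound of 3 and the soft-clipping formula `z / sqrt(1 + (z / bound)**2)` are all choices made for this example.

```python
import math
import statistics

def squashing_scale(values, max_absolute_value=3.0):
    """Center on the median, scale by the IQR, then softly clip.

    Assumed formula for the soft clip: z / sqrt(1 + (z / bound)**2).
    It is close to z for small z and approaches +/- bound for large z.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    median = statistics.median(values)
    scale = (q3 - q1) or 1.0  # fall back to 1.0 if the IQR is zero
    out = []
    for x in values:
        z = (x - median) / scale  # robust scaling step
        out.append(z / math.sqrt(1.0 + (z / max_absolute_value) ** 2))
    return out

values = [1, 2, 3, 4, 5, 6, 1000]  # 1000 is an obvious outlier
scaled = squashing_scale(values)
```

The outlier ends up just below the bound instead of dominating the range, while the ordering of the values is preserved.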
Tune hyperparameters where they're defined, and explore the resulting space with a parallel coordinate plot
🚀 Major update! Skrub DataOps, various improvements for the TableReport, new tools for applying transformers to the columns, and a new robust transformer for numerical features are only some of the features included in this release.
This time we will focus on how expressions can simplify the construction of complex hyperparameter grids.
As this is a preview of an upcoming feature, we are looking for your thoughts and feedback before release.
✅ Filter columns
🔎 Look at each column's distribution
📊 Get a high level view of the distributions through stats and plots, including correlated columns
🌐 Export the report as HTML
skrub-data.org/skrub-materi...
◼ Encode strings faster and better with StringEncoder!
StringEncoder applies a tf-idf vectorization followed by SVD to produce high quality and FAST embeddings of textual and categorical features.
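The "tf-idf, then SVD" recipe can be sketched in plain Python. This is a toy illustration, not skrub's implementation: the character 3-gram tokenization, the smoothed idf formula and all function names are assumptions, and a simple power iteration with deflation stands in for a real truncated SVD routine.

```python
import math
from collections import Counter

def char_ngrams(text, n=3):
    """Split a string into overlapping character n-grams (assumed tokenizer)."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def tfidf_matrix(strings, n=3):
    """Build a dense tf-idf matrix with a smoothed idf (one row per string)."""
    counts = [Counter(char_ngrams(s, n)) for s in strings]
    vocab = sorted({gram for c in counts for gram in c})
    doc_freq = Counter(gram for c in counts for gram in c)
    n_docs = len(strings)
    idf = {g: math.log((1 + n_docs) / (1 + doc_freq[g])) + 1 for g in vocab}
    return [[c[g] * idf[g] for g in vocab] for c in counts]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_components(rows, n_components=2, n_iter=100):
    """Project rows on their leading singular directions.

    Power iteration with deflation, used here in place of a library SVD.
    """
    n_features = len(rows[0])
    residual = [row[:] for row in rows]
    embeddings = [[] for _ in rows]
    for _ in range(n_components):
        v = [1.0 / math.sqrt(n_features)] * n_features
        for _ in range(n_iter):
            scores = [dot(row, v) for row in residual]
            v_new = [sum(s * row[j] for s, row in zip(scores, residual))
                     for j in range(n_features)]
            norm = math.sqrt(dot(v_new, v_new))
            if norm == 0:
                break
            v = [x / norm for x in v_new]
        scores = [dot(row, v) for row in residual]
        for emb, s in zip(embeddings, scores):
            emb.append(s)
        # Remove the direction just found before looking for the next one.
        residual = [[x - s * vj for x, vj in zip(row, v)]
                    for row, s in zip(residual, scores)]
    return embeddings

strings = ["london", "londres", "paris", "london"]
embeddings = top_components(tfidf_matrix(strings), n_components=2)
```

Each string becomes a short numeric vector, and identical strings get identical vectors, which is what makes this kind of encoding usable as features for a downstream estimator.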
skrub.patch_display() adds the TableReport as a default representation for all dataframes
skrub.column_associations to check which columns are linked...
Check out the changelog:
skrub-data.org/stable/CHANG...
◼ tighter layout
◼ support any script (any alphabet حب माया) in the plots
◼ robust to outliers
It works without dependencies, in any HTML-based environment (Jupyter notebooks, @vscode.dev, a simple web page...)
Check it out on skrub-data.org
4/5
As always, the TableVectorizer is very handy for preparing dataframes, and it now comes with an option to drop those pesky columns
skrub-data.org/stable/refer...
3/5
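The idea of dropping columns with too many missing values can be sketched without any dependency. This is not skrub code: `drop_mostly_null`, the 0.5 threshold, the use of `None` as the missing marker and the "strictly above the threshold" rule are assumptions made for this example.

```python
# Hypothetical sketch (not skrub's API): drop columns whose fraction of
# missing values is strictly above a threshold. None marks a missing value.

def null_fraction(values):
    """Share of entries in a column that are missing."""
    return sum(v is None for v in values) / len(values)

def drop_mostly_null(table, threshold=0.5):
    """Keep only the columns with a null fraction at or below `threshold`."""
    return {
        name: values
        for name, values in table.items()
        if null_fraction(values) <= threshold
    }

table = {
    "age": [31, None, 45, 52],           # 25% missing, kept
    "notes": [None, None, None, "vip"],  # 75% missing, dropped
    "city": ["Paris", "Rome", "Oslo", "Lima"],
}
cleaned = drop_mostly_null(table, threshold=0.5)
```

Setting the threshold to 1.0 in this sketch keeps every column, since no fraction can be strictly above 1.0.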
for pipelines that predict great on dataframes of mixed types.
skrub ensures the language model is downloaded, cached, and picklable: everything needed for easy ops
2/5
◼ Easily use deep learning for text entries
◼ TableVectorizer can remove columns with too many missing values
◼ TableReport more robust and prettier
...
1/5
AKA: 📈 we heard you liked plots so we put plots in your tables 📈