Matthaus Krzykowski
@matthausk.bsky.social
CEO/co-founder dltHub, the makers of OSS Python library dlt. At the intersection of single node compute, open storage, Python/in-memory & some others
Reposted by Matthaus Krzykowski
TLDR; we launched on ProductHunt!

After months packed with community-driven features our AI memory tool @cognee.bsky.social has evolved to new heights. Finally AI agents meet the memory they deserve - structured, accurate, and reliable - in 5 lines of code.
April 11, 2025 at 7:30 AM
Meet our own #dlthub Violetta at the #Python workshop at the upcoming #IcebergSummit in SF on April 9th
🚨 New Video Alert! 🚨

Did you know we're having workshops at #icebergSummit? 😱 Hear from Kevin Liu about the workshop he's putting together with Violetta Mishechkina and Rushan Jiang on getting started with #apacheIceberg using #python.
March 24, 2025 at 3:04 PM
🚀 In the last 6 months we have seen early adopters in the dlt community take advantage of AI code editors such as Cursor. Check our initial assistants and building blocks for custom workflows such as Anthropic MCP servers for dlt on the recently launched hub.continue.dev/dlthub from @continue.dev 🚀
March 13, 2025 at 6:40 PM
Reposted by Matthaus Krzykowski
dlt+ Cache is an example of offline processing that I think is going to become a data engineering + analytics standard.

@mmullins.coginiti.co has shown similar ideas for Coginiti on processing Iceberg data and caching it locally.
Data teams don’t move fast. They wait.

They wait for dbt runs. They wait for queries. They wait to see if something breaks.

Stop testing in production. Start testing like an engineer.

🔗 dlthub.com/blog/staging

#databs #dataengineering
Data Engineers, stop testing in production!
Software engineers don’t test in production. Why are data engineers still doing it? How to fix it? Read on...
dlthub.com
February 26, 2025 at 5:08 PM
Reposted by Matthaus Krzykowski
Got a demo of this today, super cool stuff

Reminded me of this

www.datacouncil.ai/talks/proces...
February 26, 2025 at 8:23 PM
Reposted by Matthaus Krzykowski
It’s critical that an open architecture underlies the future of AI-enhanced software development. This is why we are launching Continue 1.0 with hub.continue.dev today: techcrunch.com/2025/02/26/c...
Continue wants to help developers create and share custom AI coding assistants | TechCrunch
Continue helps developers create customized, contextual coding assistants that can connect with any model and development environment.
techcrunch.com
February 26, 2025 at 4:57 PM
Reposted by Matthaus Krzykowski
Continue 1.0 is here! Combining our open-source IDE extensions with hub.continue.dev makes it frictionless to use custom AI code assistants. Discover the models, rules, prompts, docs, and other building blocks you need to become an amplified developer ✨
February 26, 2025 at 4:37 PM
We at dltHub are releasing the initial two features of dlt+, our framework for running dlt in production, in early access:
👉dlt+ Project: A declarative YAML collaboration point for teams
👉dlt+ Cache: A database-like compute layer for developing, testing & running transformations
February 20, 2025 at 6:48 AM
Thanks for letting me hang in your NYC office @aaazzam.bsky.social
January 16, 2025 at 8:53 PM
Going to NYC again this week! Ping me if you want to grab coffee.
January 13, 2025 at 7:46 PM
notesfrompoland.com/2025/01/08/p... Clearly we at dltHub did not go far enough 4 years ago, when we played around with the technology.
January 10, 2025 at 1:14 PM
One of my favourite electronic music labels in the world is KOMPAKT from Cologne. If you like good electronic work background music, consider their yearly compilations, especially from 12 onwards.

open.spotify.com/playlist/2u5...
KOMPAKT Total [1-24]
Playlist · matt · 438 items · 3.7K saves
open.spotify.com
January 8, 2025 at 8:23 AM
Reposted by Matthaus Krzykowski
Our co-founder @datancoff.ee is wondering whether Python can compete with DuckDB and Spark as the "query engine" for Iceberg open lakehouses. Can it?

tower.dev/blog/buildin...
December 12, 2024 at 6:09 PM
Anyone else thinking that the emergence of the MCP server layer on top of Claude signals that LLMs are coming to data engineering for real? Who's interacting with an MCP server & agent framework already? Any hot takes? github.com/punkpeye/awe...
GitHub - punkpeye/awesome-mcp-servers: A collection of MCP servers.
A collection of MCP servers. Contribute to punkpeye/awesome-mcp-servers development by creating an account on GitHub.
github.com
December 11, 2024 at 7:04 AM
Reposted by Matthaus Krzykowski
This is what I think could be possible with dlt and AI. I basically did this with the Bluesky API docs and the dlt docs and built a new connector.
December 8, 2024 at 5:07 PM
Reposted by Matthaus Krzykowski
👉 The 10x data team at Taktile

How Taktile used Tower & dltHub to enable everyone in the org to contribute to high-quality data sets.

“Tower is like Docker, Dagster, and Jenkins having a baby 🐳+ 🐙+ 🤵🏼‍♂️ = 💜”

– Simon Rosenberger, Head of Data @ Taktile

🍒 Watch: youtu.be/aiKyeo6ZeBA
The 10x data team at Taktile, enabled by Tower and dltHub
YouTube video by Tower
youtu.be
December 4, 2024 at 2:53 PM
Reposted by Matthaus Krzykowski
I've had a crazy few days with my advent calendar of code, looking at Tobiko's SQLMesh with @duckdb.org.

Yesterday, I mentioned in my post that I needed to bring in some data from @bsky.app's HTTP endpoints, and I was going to try using dltHub.

davidsj.substack.com/p/dlt-windsu...
dlt windsurfing
Trying out dlt with DuckDB
davidsj.substack.com
December 4, 2024 at 7:54 PM
Going to AWS re:Invent in Las Vegas Mon-Wed morning at the last minute. If you want to talk anything #dlt, #dltHub, #Python, #Iceberg and #datalake, ping me!
December 1, 2024 at 1:03 PM
Reposted by Matthaus Krzykowski
Great things happen when you strip away unnecessary complexity. Analytics stacks are overbuilt: ETL, warehouses, and fragmented
🔹 Automating metadata management w/ code/AI
🔹 Connecting to databases and data lakes declaratively
🔹 Defining BI as code + using dltHub/Evidence
🔗 shorturl.at/V8SZ1
Your Entire Analytics Stack, Just a Few Lines of Code Away
A Simpler Way to Work with Data
open.substack.com
November 15, 2024 at 6:32 PM
Our dlt roadshow is coming to Paris today. Come by if you are around! lu.ma/gsf3mjbz
dlt Paris Community Meetup #1 w Stellantis, Nao, 42 & Modeo · Luma
About the event: We're partnering with local users of the OSS Python library dlt to run the inaugural dlt Paris Community Meetup. Connect with fellow…
lu.ma
November 19, 2024 at 6:37 AM
Reposted by Matthaus Krzykowski
kinda admire the boldness of a blog on 'robust generative AI agents' w/no actual solutions to make these robust

Lots of monitoring & observability to quantify how flaky your system is, but still just a "prompt and pray" approach to having LLMs execute tasks
aws.amazon.com/blogs/machin...
Best practices for building robust generative AI applications with Amazon Bedrock Agents – Part 2 | Amazon Web Services
In this post, we dive into the architectural considerations and development lifecycle practices that can help you build robust, scalable, and secure intelligent agents.
aws.amazon.com
November 15, 2024 at 9:53 AM
Sneaked in watching a Broadway show - Hadestown - on the trip.
November 14, 2024 at 1:59 AM
Was pretty exhausted emotionally by a great work week in SF. Spending 2h looking at the creativity inside MoMA NYC today (including studying this Rauschenberg in person after only seeing digital + print copies of it before) was all I needed to revitalise.
November 11, 2024 at 2:25 AM
More on the ML data infra side - we at dltHub are doing an event with LanceDB, Dosu & Continue in SF today. Come by if you are around. Our own Akela will be talking about "Building smarter agents with AI-ready data" lu.ma/y3sh3tpj
Composable AI Infra for Agents · Luma
Join us for an evening of talks about the composable AI infrastructure for agents, hosted by dltHub & LanceDB at the Continue office. Learn about the different…
lu.ma
November 7, 2024 at 5:51 PM