Kostas Pardalis
@cpard.bsky.social
1.1K followers 22 following 79 posts
Building https://typedef.ai | host @ https://techontherocks.show | Done some cool stuff with trinodb | ex-RudderStack | previously CEO @ Blendo
That’s a different concept though, right? As you said, here you have a proxy and you pick a different query engine over the same storage. I think using the term federation in this case will confuse people. I can see how this pattern can work.
Full scans on different data sources that then need to be joined are much closer to an ETL workload. This will kill every federated query engine.

Plus, what do you do when different query engines have different semantics? For example, how do you handle decimal overflows?
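To make the decimal overflow point concrete, here is a minimal sketch of two overflow policies applied to the same addition. The 38-digit limit, the function name `add_decimals`, and the policy names are assumptions for illustration; they do not describe any specific engine.

```python
from decimal import Context, Decimal
from typing import Optional

MAX_PRECISION = 38  # assumed DECIMAL precision limit, for illustration only

# Work with extra headroom so the sum is computed exactly before we check it.
_EXACT = Context(prec=MAX_PRECISION + 2)


def add_decimals(a: Decimal, b: Decimal, on_overflow: str) -> Optional[Decimal]:
    """Add two decimals, then apply the chosen overflow policy."""
    result = _EXACT.add(a, b)
    digits = len(result.as_tuple().digits)
    if digits <= MAX_PRECISION:
        return result
    if on_overflow == "error":
        # Policy A: fail the query.
        raise OverflowError(f"result needs {digits} digits, limit is {MAX_PRECISION}")
    if on_overflow == "null":
        # Policy B: silently return NULL.
        return None
    raise ValueError(f"unknown overflow policy: {on_overflow}")


largest = Decimal("9" * MAX_PRECISION)
null_result = add_decimals(largest, Decimal(1), on_overflow="null")
```

The same query gives a failure under one policy and a NULL under the other, which is why a proxy that swaps engines over the same storage cannot hide the difference.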
Oh no. Trino tried to do that. You really can’t do it. The problem with federated queries is that they work well when you can push computation down to the query engine you federate to and get back a highly reduced dataset. That’s not the case with ETL though.
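A toy sketch of why pushdown matters, in plain Python. The table `REMOTE_ORDERS` and both functions are hypothetical stand-ins for a remote source, not any engine's API. It only compares how many rows cross the boundary in each case.

```python
from collections import defaultdict

# Hypothetical remote table of (customer_id, amount) rows held by another engine.
REMOTE_ORDERS = [(i % 10, 1.0) for i in range(10_000)]


def remote_scan():
    """No pushdown: the remote source ships every row to the federating engine."""
    return list(REMOTE_ORDERS)


def remote_aggregate():
    """Pushdown: the remote source computes SUM(amount) GROUP BY customer_id itself."""
    totals = defaultdict(float)
    for customer_id, amount in REMOTE_ORDERS:
        totals[customer_id] += amount
    return sorted(totals.items())


rows_without_pushdown = len(remote_scan())    # every row is transferred
rows_with_pushdown = len(remote_aggregate())  # one row per customer is transferred
```

An ETL-style job looks like `remote_scan` on every source, followed by a join, so there is little for the federating engine to push down.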
fenic 0.4.0 brings fenic and its expressive API for working with data to agents.

With tooling becoming a catalog artifact, and MCP servers and toolsets available with just a CLI command, you can turn any dataset you have into well-curated context for your agents.

Check it out!
fenic 0.4.0 is live: declarative tools for agents, a production-ready MCP server, and direct reads from HuggingFace, plus big DX & reliability gains.

Highlights:

Declarative tools: define function-calling tools as data (type-safe, reviewable, reusable).
Reposted by Kostas Pardalis
New episode: chatting with bauplan founders Jacopo Tagliabue and Ciro Greco on shipping AI with real-world data constraints.

Why listen

1. Data pipelines determine model effectiveness, far more than most teams admit.
Semantic join is one of Fenic’s AI-native DataFrame functions. They operate over whole tables and relationships—not just individual rows.
7/7

Give it a try, ⭐ the repo, open issues and join the community!

👉 https://github.com/typedef-ai/fenic
6/7

Performance & DX

Rust optimizations plus leaner default configs deliver performance gains and a frictionless setup experience, so you spend less time tuning and more time building.
5/7

New Functions & Models

Access built-in summarization, new semantic APIs, and multiple embedding providers (e.g. Cohere, Google Gemini) out of the box.

This broadens your toolkit, so you can prototype and productionize a wider range of AI workflows quickly.
4/7

Composable Pipelines

Save intermediate DataFrames as persistent views in the fenic catalog.

Reuse and chain complex transformations across jobs without rewriting or rerunning upstream logic, accelerating iteration and collaboration.
3/7

Typed Semantics

Define your output schema once with Pydantic and get back validated, strongly typed results.

This enforces consistency, surfaces errors early, and eliminates manual parsing of LLM responses.
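A rough sketch of the idea, using a standard-library dataclass as a stand-in for a Pydantic model so it runs without extra dependencies. The `Ticket` schema and the `parse_typed` helper are hypothetical and are not fenic's API.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Ticket:
    # Hypothetical output schema, for illustration only.
    title: str
    priority: int


def parse_typed(raw: str, schema):
    """Parse a JSON string and check each field against the schema's declared type."""
    data = json.loads(raw)
    kwargs = {}
    for field in fields(schema):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        value = data[field.name]
        if not isinstance(value, field.type):
            raise TypeError(f"{field.name} should be {field.type.__name__}")
        kwargs[field.name] = value
    return schema(**kwargs)


# A raw model response, assumed to be JSON text.
ticket = parse_typed('{"title": "Fix login", "priority": 2}', Ticket)
```

A response with `"priority": "high"` fails at parse time instead of leaking a bad value into later steps.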
2/7

Robust Fuzzy Text Matching

Ground LLM outputs against your existing data: record linkage, deduplication, and typo-tolerant joins become first-class operations.

This improves precision in extraction pipelines and slashes downstream error rates.
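As an illustration of a typo-tolerant join, here is a small sketch built on Python's standard `difflib`. It is not how fenic implements matching; the `fuzzy_join` helper, the sample names, and the 0.8 cutoff are assumptions.

```python
import difflib


def fuzzy_join(left, right, cutoff=0.8):
    """Pair each left value with its closest right value, or None if nothing is close enough."""
    pairs = []
    for value in left:
        matches = difflib.get_close_matches(value, right, n=1, cutoff=cutoff)
        pairs.append((value, matches[0] if matches else None))
    return pairs


extracted = ["Acme Corp", "Globex Inc", "Initech"]        # e.g. names pulled out by an LLM
known = ["Acme Corp.", "Globex Inc.", "Umbrella LLC"]     # names already in your data

linked = fuzzy_join(extracted, known)
```

Values with no close counterpart come back paired with `None`, so ungrounded extractions are easy to filter out.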
Here's a bit more information on each of the new 🦊 fenic 🦊 features.

1/7 🧵

Dynamic Templating

Turn any column struct or array into a live prompt fragment. No more string-concatenation hacks. You get per-row, data-driven prompts with minimal code, boosting relevance and reducing boilerplate.
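To show what per-row templating means, here is a tiny sketch using the standard library's `string.Template` as a stand-in for Jinja. The row fields and the prompt text are made up for illustration, and this is not fenic's API.

```python
from string import Template

# One template, filled in from each row's own values.
PROMPT = Template("Summarize this $tone review of $product: $review")

rows = [
    {"product": "Kettle", "tone": "negative", "review": "Stopped working after a week."},
    {"product": "Lamp", "tone": "positive", "review": "Bright and easy to assemble."},
]

prompts = [PROMPT.substitute(row) for row in rows]
```

Each row gets its own prompt without any manual string concatenation.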
fenic v0.3.0 is out and it's a release I'm really excited about!

Here are a few of the things this release introduces.

Jinja as a column function
Robust Fuzzy Text Matching
Full Pydantic support in all semantic operators
Persistent views
More Functions & Models
Perf & DX improvements
Reposted by Kostas Pardalis
@steveklabnik.com joined us for an episode where we discussed

Why:
• Cargo & friendly errors > benchmarks
• 6-week releases > years-long committees
• How Rust united Ruby, FP & C++ devs
• Next-gen picks

and much more!

Check the episode on your favorite platform!
Everyone’s heads down on AI these days, but please take a break and soak in some deep systems wisdom from Josh Howards.

He’s one of the folks behind R2 at Cloudflare.

After all, whatever you build in AI will sit on top of these foundations.

Check @totrrocks.bsky.social for the episode link.
Reposted by Kostas Pardalis
Startups and new products increasingly prioritize serverless models to reduce user friction and accelerate adoption.

@philippemnoel.bsky.social from ep.12
Reposted by Kostas Pardalis
The value proposition of formal methods becomes clear when dealing with complex distributed transactions involving multiple independent services.

Jayaprabhakar (JP) Kadarkarai from ep.5
Reposted by Kostas Pardalis
User experience and developer interaction with complex data abstractions remain a significant challenge beyond the technical integration.

Nikhil Simha & Varant Zanoyan from ep.2
Reposted by Kostas Pardalis
Successful AI developer tools must balance synchronous co-pilot style assistance with asynchronous autonomous agent workflows.

@ivanburazin.bsky.social from ep.9
Reposted by Kostas Pardalis
Managing AI access and permissions requires careful role-based controls to prevent over-privileged AI actions in enterprise environments.

Well said, even before #MCP was as popular as today.
@ivanburazin.bsky.social from ep.9
I had the rare opportunity to sit down and chat with someone who helped shape the story of Splunk, co-founder Erik Swan.

There's a lot to learn from him, but what inspired me the most is his energy. Even after a success like Splunk, he is still learning and building.

Listen here: @totrrocks.bsky.social
Incremental materialization has stumped the industry for decades.

Epsio, led by Gilad, is changing that: product-first, real-world incremental views.

If real-time data infra matters to you, check out my chat with Gilad on @totrrocks.bsky.social