Data Code 101 (@datacode101.bsky.social) - Data / Software Engineering
RAG is not just an integration problem. It’s a design problem. Each layer of this stack requires deliberate choices that impact latency, quality, explainability, and cost.

If you're serious about GenAI, it's time to think in terms of stacks—not just models.
Evaluation

Tools like Ragas, TruLens, and Giskard bring much-needed observability, measuring hallucinations, relevance, grounding, and model behavior under pressure.
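The core idea behind a groundedness metric can be illustrated without any of these libraries. A minimal sketch using raw token overlap — real evaluators such as Ragas and TruLens use LLM judges or NLI models, not this crude proxy:

```python
def grounding_score(answer: str, context: str) -> float:
    """Crude groundedness proxy: fraction of answer tokens that
    also appear in the retrieved context. Illustrative only."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

context = "the eiffel tower is located in paris france"
grounded = grounding_score("the eiffel tower is in paris", context)
ungrounded = grounding_score("the tower is in berlin", context)
assert grounded > ungrounded
```

Even this toy version captures the shape of the measurement: compare what the model said against what it was given.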
Text Embeddings

The quality of retrieval starts here. Open-source models (Nomic, SBERT, BGE) are gaining ground, but proprietary offerings (OpenAI, Google, Cohere) still dominate enterprise use.
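Whichever model you pick, retrieval ultimately reduces to vector geometry: similar texts should map to nearby vectors. A toy sketch with hand-made 3-d vectors standing in for real model output (models like SBERT or BGE produce hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: the standard relevance score over embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy "embeddings" — not real model output.
query = [0.9, 0.1, 0.0]
doc_close = [0.8, 0.2, 0.1]   # semantically near the query
doc_far = [0.0, 0.1, 0.9]     # semantically distant
assert cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far)
```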
Open LLM Access

Platforms like Hugging Face, Ollama, Groq, and Together AI abstract away infra complexity and speed up experimentation across models.
Data Extraction (Web + Docs)

Whether you're crawling the web (Crawl4AI, FireCrawl) or parsing PDFs (LlamaParse, Docling), raw data access is non-negotiable. No context means no quality answers.
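The essence of the extraction step — turning markup into clean text — can be sketched with the standard library alone. A minimal stand-in for what tools like FireCrawl do (real crawlers also handle JS rendering, rate limits, and robots.txt):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Strip tags, keep visible text; skip script/style content."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

parser = TextExtractor()
parser.feed("<html><body><h1>Title</h1><script>x=1</script>"
            "<p>Body text.</p></body></html>")
text = " ".join(parser.chunks)
# text == "Title Body text."
```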
Vector Database

Chroma, Qdrant, Weaviate, Milvus, and others power the retrieval engine behind every RAG system. Low-latency search, hybrid scoring, and scalable indexing are key to relevance.
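What these databases do at their core fits in a few lines. A brute-force in-memory stand-in (class name is hypothetical, not any real API); production stores replace the linear scan with ANN indexes like HNSW to keep latency low at scale:

```python
import math

class ToyVectorStore:
    """Stores (id, vector) pairs; returns top-k by cosine similarity."""
    def __init__(self):
        self.items = []  # list of (doc_id, vector)

    def add(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def query(self, vector, k=3):
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)
        # Brute-force scan — real stores use ANN indexes here.
        scored = sorted(self.items, key=lambda it: cos(vector, it[1]),
                        reverse=True)
        return [doc_id for doc_id, _ in scored[:k]]

store = ToyVectorStore()
store.add("doc_a", [1.0, 0.0])
store.add("doc_b", [0.0, 1.0])
store.add("doc_c", [0.7, 0.7])
assert store.query([0.9, 0.1], k=2) == ["doc_a", "doc_c"]
```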
Frameworks

LangChain, LlamaIndex, Haystack, and txtai are now essential for building orchestrated, multi-step AI workflows. These tools handle chaining, memory, routing, and tool-use logic behind the scenes.
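The "chaining" idea these frameworks are built on is just function composition over pipeline steps. A toy sketch — the `chain` helper and the step functions are hypothetical, not the LangChain API:

```python
from functools import reduce

def chain(*steps):
    """Compose steps left to right: output of one feeds the next."""
    return lambda x: reduce(lambda acc, step: step(acc), steps, x)

# Hypothetical steps: rewrite query -> retrieve -> build prompt.
rewrite = lambda q: q.strip().lower()
retrieve = lambda q: {"query": q, "context": f"docs about {q}"}
build_prompt = lambda d: f"Answer '{d['query']}' using: {d['context']}"

pipeline = chain(rewrite, retrieve, build_prompt)
result = pipeline("  What Is RAG?  ")
# result == "Answer 'what is rag?' using: docs about what is rag?"
```

Real frameworks add memory, routing, and tool-calling on top, but the control flow is this same composition.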
LLMs (Open vs Closed)

Open models like Llama 3, Phi-4, and Mistral offer control and customization. Closed models (OpenAI, Claude, Gemini) bring powerful performance with less overhead. Your tradeoff: flexibility vs convenience.
RAG Stack

Building with Retrieval-Augmented Generation (RAG) isn't just about choosing the right LLM. It's about assembling an entire stack—one that's modular, scalable, and future-proof.
#ai #rag #dataengineering
EtLT (Extract, transform, Load, Transform) (2/2)

Best for scenarios requiring strict data security/compliance (pre-load masking) while still benefiting from the speed and flexibility of cloud data warehouse transformations.
EtLT (Extract, transform, Load, Transform) (1/2)

Attempts to balance the data governance of ETL with the speed and flexibility of ELT. A minimal transformation step is performed before loading, covering essential tasks such as data cleaning, basic formatting, and masking sensitive data for immediate compliance.
ELT (Extract, Load, Transform) (2/2)

Transformation is implemented inside the target system (e.g., a modern cloud data warehouse like Snowflake or BigQuery, or a data lake). Highly scalable for massive and diverse (structured/unstructured) datasets.
ELT (Extract, Load, Transform) (1/2)

Modern Approach: Became popular with the rise of cloud-native data warehouses offering cheap storage and elastic compute. Raw, unprepared data is loaded immediately, offering faster data ingestion and near real-time analytics.
ETL (Extract, Transform, Load) (2/2)

Transformation runs in a dedicated staging server or processing engine outside the target data warehouse. Latency is typically higher, since data must wait for transformation to complete before loading.
ETL (Extract, Transform, Load) (1/2)

Traditional Approach: Older methodology common with on-premises data warehouses where compute was limited and expensive. Data is cleaned, standardized, and sensitive information can be masked before it enters the final warehouse.
ETL vs ELT vs EtLT

All three methods begin with Extract (E) and end with Load (L), but the placement of transformation dictates their suitability for different infrastructure, data types, and business needs.

#dataengineering
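The three patterns above differ only in where the transformation step sits relative to the load. A toy sketch of that ordering difference, with hypothetical helper functions (not a real pipeline library):

```python
# Hypothetical helpers; names are illustrative only.
def extract():
    return [{"email": "a@x.com", "amount": " 10 "}]

def mask(rows):          # light pre-load transform (the "t" in EtLT)
    return [{**r, "email": "***"} for r in rows]

def transform(rows):     # heavy transform: clean and standardize
    return [{**r, "amount": int(r["amount"])} for r in rows]

def load(rows, target):  # stand-in for writing to a warehouse
    target.extend(rows)
    return target

# ETL: transform on a staging server, then load.
etl_target = load(transform(extract()), [])

# ELT: load raw data first, transform inside the warehouse.
elt_target = transform(load(extract(), []))

# EtLT: mask before load, full transform after load.
etlt_target = transform(load(mask(extract()), []))

assert etlt_target == [{"email": "***", "amount": 10}]
```

Note that only EtLT keeps the raw email out of the target entirely — which is exactly the compliance argument for it.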
Generation

The prompt consisting of the user query, instructions, and context is given to the LLM. The LLM processes the prompt and then generates a response grounded in the given context.
Augmentation

Retrieved relevant chunks are combined into a single string to form the context. The context is then combined with the query and instructions to obtain an LLM prompt.
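This step is plain string assembly. A minimal sketch — the template is illustrative, and real systems tune the formatting per model:

```python
def build_prompt(query, chunks, instructions):
    """Join retrieved chunks into one context string, then
    assemble the final LLM prompt."""
    context = "\n\n".join(chunks)
    return f"{instructions}\n\nContext:\n{context}\n\nQuestion: {query}"

prompt = build_prompt(
    query="Where is the Eiffel Tower?",
    chunks=["The Eiffel Tower is in Paris.",
            "It was completed in 1889."],
    instructions="Answer only from the provided context.",
)
assert "Paris" in prompt and "Question:" in prompt
```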
Retrieval

The user query is encoded, i.e., transformed into an embedding vector using the same embedding model used in the indexing step.

The semantic search feature in the vector database uses the query embedding to find and return the most relevant chunks.
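Both steps can be sketched end to end. The character-frequency "embedding" below is a toy stand-in for a real embedding model; the point it demonstrates is that the index and the query must go through the same encoder:

```python
import math

def embed(text):
    """Toy encoder: 26-dim letter-frequency vector. A real system
    would call an embedding model here instead."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

chunks = ["the eiffel tower stands in paris",
          "python is a programming language"]
index = [(c, embed(c)) for c in chunks]        # built at indexing time

query_vec = embed("where is the eiffel tower")  # same encoder as the index
best = max(index, key=lambda item: cosine(query_vec, item[1]))[0]
assert best == "the eiffel tower stands in paris"
```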