Data Code 101 (@datacode101.bsky.social) - Data / Software Engineering
RAG is not just an integration problem. It’s a design problem. Each layer of this stack requires deliberate choices that impact latency, quality, explainability, and cost.

If you're serious about GenAI, it's time to think in terms of stacks—not just models.
Evaluation

Tools like Ragas, TruLens, and Giskard bring much-needed observability, measuring hallucinations, relevance, grounding, and model behavior under pressure.
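The core idea behind a groundedness metric can be illustrated without any of these libraries. A minimal sketch using raw token overlap — real evaluators such as Ragas and TruLens use LLM judges or NLI models, not this crude proxy:

```python
def grounding_score(answer: str, context: str) -> float:
    """Crude groundedness proxy: fraction of answer tokens that
    also appear in the retrieved context. Illustrative only."""
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

context = "the eiffel tower is located in paris france"
grounded = grounding_score("the eiffel tower is in paris", context)
ungrounded = grounding_score("the tower is in berlin", context)
assert grounded > ungrounded
```

Even this toy version captures the shape of the measurement: compare what the model said against what it was given.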
Text Embeddings

The quality of retrieval starts here. Open-source models (Nomic, SBERT, BGE) are gaining ground, but proprietary offerings (OpenAI, Google, Cohere) still dominate enterprise use.
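Whichever model you pick, retrieval ultimately reduces to vector geometry: similar texts should map to nearby vectors. A toy sketch with hand-made 3-d vectors standing in for real model output (models like SBERT or BGE produce hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: the standard relevance score over embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy "embeddings" — not real model output.
query = [0.9, 0.1, 0.0]
doc_close = [0.8, 0.2, 0.1]   # semantically near the query
doc_far = [0.0, 0.1, 0.9]     # semantically distant
assert cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far)
```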
Open LLM Access

Platforms like Hugging Face, Ollama, Groq, and Together AI abstract away infra complexity and speed up experimentation across models.
Data Extraction (Web + Docs)

Whether you're crawling the web (Crawl4AI, FireCrawl) or parsing PDFs (LlamaParse, Docling), raw data access is non-negotiable. No context means no quality answers.
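The essence of the extraction step — turning markup into clean text — can be sketched with the standard library alone. A minimal stand-in for what tools like FireCrawl do (real crawlers also handle JS rendering, rate limits, and robots.txt):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Strip tags, keep visible text; skip script/style content."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

parser = TextExtractor()
parser.feed("<html><body><h1>Title</h1><script>x=1</script>"
            "<p>Body text.</p></body></html>")
text = " ".join(parser.chunks)
# text == "Title Body text."
```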
Vector Database

Chroma, Qdrant, Weaviate, Milvus, and others power the retrieval engine behind every RAG system. Low-latency search, hybrid scoring, and scalable indexing are key to relevance.
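What these databases do at their core fits in a few lines. A brute-force in-memory stand-in (class name is hypothetical, not any real API); production stores replace the linear scan with ANN indexes like HNSW to keep latency low at scale:

```python
import math

class ToyVectorStore:
    """Stores (id, vector) pairs; returns top-k by cosine similarity."""
    def __init__(self):
        self.items = []  # list of (doc_id, vector)

    def add(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def query(self, vector, k=3):
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)
        # Brute-force scan — real stores use ANN indexes here.
        scored = sorted(self.items, key=lambda it: cos(vector, it[1]),
                        reverse=True)
        return [doc_id for doc_id, _ in scored[:k]]

store = ToyVectorStore()
store.add("doc_a", [1.0, 0.0])
store.add("doc_b", [0.0, 1.0])
store.add("doc_c", [0.7, 0.7])
assert store.query([0.9, 0.1], k=2) == ["doc_a", "doc_c"]
```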
Frameworks

LangChain, LlamaIndex, Haystack, and txtai are now essential for building orchestrated, multi-step AI workflows. These tools handle chaining, memory, routing, and tool-use logic behind the scenes.
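The "chaining" idea these frameworks are built on is just function composition over pipeline steps. A toy sketch — the `chain` helper and the step functions are hypothetical, not the LangChain API:

```python
from functools import reduce

def chain(*steps):
    """Compose steps left to right: output of one feeds the next."""
    return lambda x: reduce(lambda acc, step: step(acc), steps, x)

# Hypothetical steps: rewrite query -> retrieve -> build prompt.
rewrite = lambda q: q.strip().lower()
retrieve = lambda q: {"query": q, "context": f"docs about {q}"}
build_prompt = lambda d: f"Answer '{d['query']}' using: {d['context']}"

pipeline = chain(rewrite, retrieve, build_prompt)
result = pipeline("  What Is RAG?  ")
# result == "Answer 'what is rag?' using: docs about what is rag?"
```

Real frameworks add memory, routing, and tool-calling on top, but the control flow is this same composition.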
LLMs (Open vs Closed)

Open models like Llama 3, Phi-4, and Mistral offer control and customization. Closed models (OpenAI, Claude, Gemini) bring powerful performance with less overhead. Your tradeoff: flexibility vs convenience.
RAG Stack

Building with Retrieval-Augmented Generation (RAG) isn't just about choosing the right LLM. It's about assembling an entire stack—one that's modular, scalable, and future-proof.
#ai #rag #dataengineering
EtLT (Extract, transform, Load, Transform) (2/2)

Best for scenarios requiring strict data security/compliance (pre-load masking) while still benefiting from the speed and flexibility of cloud data warehouse transformations.
EtLT (Extract, transform, Load, Transform) (1/2)

Attempts to balance the data governance of ETL with the speed and flexibility of ELT. A minimal transformation step is performed before loading, covering essential tasks such as data cleaning, basic formatting, and masking sensitive data for immediate compliance.
ELT (Extract, Load, Transform) (2/2)

Transformation is implemented inside the target system (e.g., a modern cloud data warehouse like Snowflake or BigQuery, or a data lake). Highly scalable for massive and diverse (structured/unstructured) datasets.
ELT (Extract, Load, Transform) (1/2)

Modern Approach: Became popular with the rise of cloud-native data warehouses offering cheap storage and elastic compute. Raw, unprepared data is loaded immediately, offering faster data ingestion and near real-time analytics.
ETL (Extract, Transform, Load) (2/2)

Transformation runs in a dedicated staging server or processing engine outside the target data warehouse. Latency is typically higher, since data must wait for transformation to complete before loading.
ETL (Extract, Transform, Load) (1/2)

Traditional Approach: Older methodology common with on-premises data warehouses where compute was limited and expensive. Data is cleaned, standardized, and sensitive information can be masked before it enters the final warehouse.
ETL vs ELT vs EtLT

All three methods begin with Extract (E) and end with Load (L), but the placement of transformation dictates their suitability for different infrastructure, data types, and business needs.

#dataengineering
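The three patterns above differ only in where the transformation step sits relative to the load. A toy sketch of that ordering difference, with hypothetical helper functions (not a real pipeline library):

```python
# Hypothetical helpers; names are illustrative only.
def extract():
    return [{"email": "a@x.com", "amount": " 10 "}]

def mask(rows):          # light pre-load transform (the "t" in EtLT)
    return [{**r, "email": "***"} for r in rows]

def transform(rows):     # heavy transform: clean and standardize
    return [{**r, "amount": int(r["amount"])} for r in rows]

def load(rows, target):  # stand-in for writing to a warehouse
    target.extend(rows)
    return target

# ETL: transform on a staging server, then load.
etl_target = load(transform(extract()), [])

# ELT: load raw data first, transform inside the warehouse.
elt_target = transform(load(extract(), []))

# EtLT: mask before load, full transform after load.
etlt_target = transform(load(mask(extract()), []))

assert etlt_target == [{"email": "***", "amount": 10}]
```

Note that only EtLT keeps the raw email out of the target entirely — which is exactly the compliance argument for it.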
Generation

The prompt consisting of the user query, instructions, and context is given to the LLM. The LLM processes the prompt and then generates a response grounded in the given context.
Augmentation

Retrieved relevant chunks are combined into a single string to form the context. The context is then combined with the query and instructions to obtain an LLM prompt.
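This step is plain string assembly. A minimal sketch — the template is illustrative, and real systems tune the formatting per model:

```python
def build_prompt(query, chunks, instructions):
    """Join retrieved chunks into one context string, then
    assemble the final LLM prompt."""
    context = "\n\n".join(chunks)
    return f"{instructions}\n\nContext:\n{context}\n\nQuestion: {query}"

prompt = build_prompt(
    query="Where is the Eiffel Tower?",
    chunks=["The Eiffel Tower is in Paris.",
            "It was completed in 1889."],
    instructions="Answer only from the provided context.",
)
assert "Paris" in prompt and "Question:" in prompt
```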
Retrieval

The user query is encoded, i.e., transformed into an embedding vector using the same embedding model used in the indexing step.

The semantic search feature in the vector database uses the query embedding to find and return the most relevant chunks.
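Both steps can be sketched end to end. The character-frequency "embedding" below is a toy stand-in for a real embedding model; the point it demonstrates is that the index and the query must go through the same encoder:

```python
import math

def embed(text):
    """Toy encoder: 26-dim letter-frequency vector. A real system
    would call an embedding model here instead."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

chunks = ["the eiffel tower stands in paris",
          "python is a programming language"]
index = [(c, embed(c)) for c in chunks]        # built at indexing time

query_vec = embed("where is the eiffel tower")  # same encoder as the index
best = max(index, key=lambda item: cosine(query_vec, item[1]))[0]
assert best == "the eiffel tower stands in paris"
```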