Ankur Kumar
@ankurkumar.bsky.social
📝 A Techie, Blogger, Educator & Mentor
🖥️ Shares Learnings on Cloud, GenAI, Platform Engineering, Microservices, DevOps, Data Engineering
💙 Loved Husband, Proud Dad
💡 Founder Vedcraft.com - a platform for enabling Cloud, GenAI
🎯 Sports🏎️ 🎾 🏏🏒, Loves ☕🎦📘
🧮 Information retrieval from internal sources (unstructured)
🧭 Information retrieval from internal sources (structured)
📚 Vector storage & database deployment patterns
🔏 Pre- and post-filtering of content per security guardrails
September 26, 2025 at 1:48 PM
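To make the pre- and post-filtering item concrete, here is a minimal sketch, assuming a role-based access list on each chunk (pre-filter) and a keyword blocklist standing in for a guardrail policy (post-filter). The `Chunk` type, `BLOCKED_TERMS`, the toy word-overlap ranking, and the sample corpus are all hypothetical and not taken from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    allowed_roles: frozenset  # hypothetical ACL: roles permitted to read this chunk


# Hypothetical blocklist standing in for a real guardrail policy.
BLOCKED_TERMS = ("ssn", "password")


def pre_filter(chunks, user_role):
    """Drop chunks the user may not see, before any ranking happens."""
    return [c for c in chunks if user_role in c.allowed_roles]


def post_filter(chunks):
    """Drop retrieved chunks whose text trips the guardrail blocklist."""
    return [c for c in chunks
            if not any(term in c.text.lower() for term in BLOCKED_TERMS)]


def retrieve(chunks, query, user_role, top_k=3):
    candidates = pre_filter(chunks, user_role)
    words = set(query.lower().split())
    # Toy relevance score: number of query words present in the chunk.
    ranked = sorted(candidates,
                    key=lambda c: len(words & set(c.text.lower().split())),
                    reverse=True)
    return post_filter(ranked[:top_k])


CORPUS = [
    Chunk("Leave policy for all employees", frozenset({"employee", "hr"})),
    Chunk("Salary bands by grade", frozenset({"hr"})),
    Chunk("Admin password rotation policy", frozenset({"employee", "hr"})),
]

if __name__ == "__main__":
    for chunk in retrieve(CORPUS, "policy", "employee"):
        print(chunk.text)
```

Pre-filtering keeps restricted content out of the candidate set entirely, while post-filtering is a last check on what is about to be handed to the model; a real deployment would likely push the pre-filter into the vector store's metadata query.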
📐 Reranking of retrieved information
📌 Compression of retrieved information
📝 Structured information retrieval optimization
🔎 Information retrieval from public sources
September 26, 2025 at 1:48 PM
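The reranking and compression items can be illustrated with a small sketch, assuming a first-stage retriever has already returned candidate passages. The term-overlap scorer is a deliberately simple stand-in for a real reranker (for example a cross-encoder), and the function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rerank(query, passages, top_k=2):
    """Re-order first-stage passages by the share of their tokens found in the query."""
    q = set(tokenize(query))

    def score(passage):
        terms = tokenize(passage)
        return sum(1 for t in terms if t in q) / (len(terms) or 1)

    return sorted(passages, key=score, reverse=True)[:top_k]


def compress(query, passage):
    """Keep only the sentences that share at least one term with the query."""
    q = set(tokenize(query))
    sentences = re.split(r"(?<=[.!?])\s+", passage.strip())
    return " ".join(s for s in sentences if q & set(tokenize(s)))


if __name__ == "__main__":
    docs = [
        "Vector databases store embeddings. Lunch is at noon.",
        "The cafeteria menu changes weekly.",
        "Embeddings map text to vectors.",
    ]
    for doc in rerank("embeddings", docs):
        print(compress("embeddings", doc))
```

Reranking decides which passages reach the prompt; compression then trims each surviving passage so fewer tokens are spent on irrelevant sentences.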
An illustrative maturity model (image generated with Nano Banana)
September 20, 2025 at 4:14 AM
Stage 3 (Run): Build or leverage bespoke, enterprise-specific autonomous AI agents that scale organizational efficiency and open up new possibilities
September 20, 2025 at 4:14 AM
Stage 2 (Walk): Build enterprise- or product-specific AI assistants for individual and organizational productivity (e.g. Google Cloud Assist, Amazon Q, Microsoft Copilot, Salesforce Einstein, ServiceNow Now Assist, Oracle AI Agents, IBM watsonx Assistant, Adobe Firefly Assistant, Workday AI, Zoom AI)
September 20, 2025 at 4:14 AM
Stage 1 (Crawl): Build the foundation layer and agentic AI platform on which AI assistants and autonomous agents will be built
September 20, 2025 at 4:14 AM
✔️Berkeley Function/Tool Calling: gorilla.cs.berkeley.edu/leaderboard....
✔️LMArena: lmarena.ai/leaderboard
✔️SWE Bench: https://
September 17, 2025 at 1:13 AM
2️⃣ Apply Auto-Reasoning and Auto-Selection of Models using solutions such as Semantic Routing instead of static binding
3️⃣ Reference existing industry LLM benchmarks for initial guidance and gradually build an enterprise-specific benchmark for a diverse set of scenarios. Industry benchmarks:
September 17, 2025 at 1:13 AM
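As a rough illustration of semantic routing versus static binding, the sketch below picks a model per query by comparing the query with example utterances for each route. Everything here is an assumption for illustration: the model names are placeholders, and the bag-of-words `embed` function is a stand-in for a real embedding model.

```python
import math
from collections import Counter

# Hypothetical routes: model names and example utterances are placeholders.
ROUTES = {
    "code-model": ["fix this python bug", "write a sql query", "refactor this function"],
    "reasoning-model": ["plan a migration strategy", "compare trade-offs of two architectures"],
    "small-fast-model": ["summarize this email", "translate this sentence"],
}
DEFAULT_MODEL = "small-fast-model"


def embed(text):
    """Stand-in embedding: bag-of-words counts. A real router would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def route(query, threshold=0.3):
    """Return the model whose examples are most similar to the query, else the default."""
    q = embed(query)
    best_model, best_score = DEFAULT_MODEL, 0.0
    for model, examples in ROUTES.items():
        score = max(cosine(q, embed(example)) for example in examples)
        if score > best_score:
            best_model, best_score = model, score
    return best_model if best_score >= threshold else DEFAULT_MODEL


if __name__ == "__main__":
    print(route("please fix this bug"))
    print(route("plan a migration strategy for the database"))
```

The threshold gives a fallback for queries that match no route well, which is the main practical difference from hard-wiring one model per application.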
1️⃣ Based on the LLM capabilities and Enterprise alignment, build a decision matrix to help drive LLM selection within the enterprise (Reference research: arxiv.org/html/2402.06...)
www.swebench.com/
September 17, 2025 at 1:13 AM
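One simple way to express such a decision matrix is a weighted score per candidate model. The sketch below is only an assumption about what the matrix could look like: the criteria, weights, 1-5 scores, and model names are invented placeholders, not values from the referenced research.

```python
# Hypothetical criteria weights (summing to 1) and 1-5 scores; replace with your own assessment.
WEIGHTS = {"capability": 0.4, "cost": 0.2, "latency": 0.1, "compliance": 0.3}

SCORES = {
    "model-a": {"capability": 5, "cost": 2, "latency": 3, "compliance": 4},
    "model-b": {"capability": 3, "cost": 5, "latency": 5, "compliance": 3},
}


def weighted_score(scores, weights=WEIGHTS):
    """Sum of weight * score over every criterion."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_models(score_table, weights=WEIGHTS):
    """Model names ordered from highest to lowest weighted score."""
    return sorted(score_table,
                  key=lambda model: weighted_score(score_table[model], weights),
                  reverse=True)


if __name__ == "__main__":
    for name in rank_models(SCORES):
        print(name, round(weighted_score(SCORES[name]), 2))
```

Changing the weights (for example, raising `cost` for high-volume workloads) reorders the candidates without touching the scores, which keeps the enterprise-alignment judgment separate from the capability assessment.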
3️⃣ Reasoning engine (aka the "Brain"): receives feedback from the environment, regulates itself, and adapts its actions
4️⃣ Actuators: the agent acts on the environment through them, and action results can be fed back into the model
Reference: aima.cs.berkeley.edu
September 14, 2025 at 2:31 PM
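The sense, reason, act loop described here can be sketched with a toy thermostat agent. This is a minimal illustration under assumed names (`Room`, `sense`, `decide`, `act`), not an implementation from the referenced book.

```python
class Room:
    """Toy environment: a room whose temperature the agent can nudge."""

    def __init__(self, temperature):
        self.temperature = temperature


def sense(room):
    """Sensor: read the current state of the environment."""
    return room.temperature


def decide(observation, target):
    """Reasoning engine: choose an action from the latest observation."""
    if observation < target:
        return "heat"
    if observation > target:
        return "cool"
    return "wait"


def act(room, action):
    """Actuator: apply the action to the environment and report what happened."""
    if action == "heat":
        room.temperature += 1
        return "changed"
    if action == "cool":
        room.temperature -= 1
        return "changed"
    return "noop"


def run_agent(room, target, max_steps=10):
    """Loop until an action has no effect; each step re-senses the changed environment."""
    trace = []
    for _ in range(max_steps):
        action = decide(sense(room), target)
        result = act(room, action)
        trace.append((action, result))
        if result == "noop":
            break
    return trace


if __name__ == "__main__":
    print(run_agent(Room(18), target=20))
```

The feedback path is the important part: the actuator changes the room, the next `sense` call observes that change, and the reasoning step adapts its next action accordingly.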
3️⃣ Continued massive GPU procurement, along with redesigned data centres/AI factories featuring innovative power infrastructure (liquid cooling, high-density racks, PDUs, etc.), low-latency, ultra-fast networking (with tech like InfiniBand and NVIDIA's NVLink for communication between GPUs), and storage systems.
September 13, 2025 at 4:27 AM
2️⃣ Training and inference costs (with an exponential surge in demand for token throughput) will remain the largest budget item in the coming years
September 13, 2025 at 4:27 AM
3️⃣ Cost per token = (annualized capex + energy costs + other opex) / annual token output
4️⃣ Revenue = tokens × dollars per token
siliconangle.com/2025/08/30/r...
September 7, 2025 at 2:27 PM
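A quick worked example of the two formulas, reading cost per token as total annual cost divided by annual token output (that grouping is an assumption about the intended meaning). All dollar and token figures below are made-up illustrative numbers.

```python
def cost_per_token(annualized_capex, energy_costs, other_opex, annual_token_output):
    """Total annual cost in dollars divided by tokens produced per year."""
    if annual_token_output <= 0:
        raise ValueError("annual_token_output must be positive")
    return (annualized_capex + energy_costs + other_opex) / annual_token_output


def revenue(tokens, dollars_per_token):
    """Revenue = tokens sold x price per token."""
    return tokens * dollars_per_token


if __name__ == "__main__":
    # Hypothetical figures: $50M capex, $20M energy, $10M other opex, 40 trillion tokens/year.
    tokens = 4e13
    cost = cost_per_token(50e6, 20e6, 10e6, tokens)
    income = revenue(tokens, 3e-6)  # assumed price: $3 per million tokens
    print(f"cost per million tokens: ${cost * 1e6:.2f}")
    print(f"annual revenue: ${income:,.0f}")
    print(f"annual margin: ${income - cost * tokens:,.0f}")
```

Seen this way, cost per token falls either by cutting the numerator (capex, energy, opex) or by raising annual token output, which is why utilization matters as much as hardware price.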
1️⃣ Jensen’s law ("Buy more, make more"): revenue goes up as the focus shifts to reducing token cost. The framing is geared towards selling more Blackwell GPUs, so an enterprise’s AI strategy needs to find the right balance for its current and predicted workload
2️⃣ Tokens per year = power × efficiency × utilization
September 7, 2025 at 2:27 PM
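The tokens-per-year metric can be sketched as a one-line calculation. The units are an assumption made so the product works out: power in megawatts, efficiency as tokens produced per megawatt-year at full load, and utilization as a fraction between 0 and 1. The example numbers are invented.

```python
def tokens_per_year(power_mw, tokens_per_mw_year, utilization):
    """Tokens per year = power x efficiency x utilization (units assumed as described above)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be a fraction between 0 and 1")
    return power_mw * tokens_per_mw_year * utilization


if __name__ == "__main__":
    # Hypothetical facility: 100 MW, 5e11 tokens per MW-year, 80% utilized.
    baseline = tokens_per_year(100, 5e11, 0.8)
    improved = tokens_per_year(100, 5e11, 0.9)
    print(f"baseline: {baseline:.3e} tokens/year")
    print(f"at 90% utilization: {improved:.3e} tokens/year")
```

With power usually fixed by the facility, the levers an enterprise can actually track are efficiency (hardware and software) and utilization.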
Tracking these new metrics in your enterprise helps measure the ROI and effectiveness of your enterprise-specific AI strategy 👇
September 7, 2025 at 2:27 PM
📚 Claude Sonnet is the most admired AI model
📉 66% of developers are frustrated with AI solutions that are almost right
📈 More than one third of respondents use AI-enabled tools to learn AI this year
August 2, 2025 at 3:28 PM
✅ Operational Efficiency: Distributed Cloud, Hybrid Cloud Compute/Storage, Edge Data Management Capabilities
✅ Security: XDR, EDR, ZTNA, ITDR
emt.gartnerweb.com/ngw/globalas...
June 23, 2025 at 2:11 AM