elvis
eos.bsky.social
Building with AI agents • Prev: Meta AI, Elastic, Galactica LLM, PhD • Prompting Guide (~6M+ learners) • I also teach how to build with AI: https://dair-ai.thinkific.com/
In addition, hallucinations generated by GPT-4o provide the most consistent improvements across models.
January 24, 2025 at 1:56 PM
A new paper claims that LLMs perform better on drug discovery tasks when the input prompt includes hallucinated text than when it does not.

Llama-3.1-8B achieves an 18.35% gain in ROC-AUC compared to the baseline without hallucination.
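
For readers less familiar with the metric, here is a minimal sketch of how ROC-AUC and a "gain over baseline" can be computed. The post does not say whether the 18.35% figure is a relative change or an absolute difference in points, so the sketch shows both. The function names and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def relative_gain_pct(baseline, improved):
    """Gain expressed as a percentage of the baseline value."""
    return 100.0 * (improved - baseline) / baseline


def absolute_gain_points(baseline, improved):
    """Gain expressed as a difference in percentage points."""
    return 100.0 * (improved - baseline)


# Toy example (made-up numbers, for illustration only).
labels = [1, 0, 1, 0, 1, 0]
baseline_scores = [0.6, 0.5, 0.4, 0.7, 0.8, 0.3]   # hypothetical baseline prompt
improved_scores = [0.9, 0.5, 0.6, 0.7, 0.8, 0.3]   # hypothetical augmented prompt

auc_base = roc_auc(labels, baseline_scores)
auc_new = roc_auc(labels, improved_scores)
print(f"baseline AUC: {auc_base:.4f}")
print(f"new AUC:      {auc_new:.4f}")
print(f"relative gain: {relative_gain_pct(auc_base, auc_new):.2f}%")
print(f"absolute gain: {absolute_gain_points(auc_base, auc_new):.2f} points")
```

The two readings can differ a lot for the same pair of scores, which is why it helps to check the paper for which one a headline number refers to.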
January 24, 2025 at 1:56 PM
- metrics to assess the efficiency of o1-like models
- several strategies to tackle overthinking and reduce token generation

Very informative paper.
January 2, 2025 at 4:03 PM
January 2, 2025 at 3:15 PM
• 🌍 Flexible Environment Configuration: Define custom environments with YAML configuration files
• 🛠️ Extensible Architecture: Easy to extend and customize for your specific needs
December 31, 2024 at 3:22 PM
• 🔄 Robust Interaction Management: Coordinate complex interactions between agents
• 💾 Checkpoint System: Save and restore agent states and interactions
• 📊 Data Generation: Generate synthetic data through agent interactions
• ⚡ Performance Optimized: Built for efficiency and scalability
December 31, 2024 at 3:22 PM
December 17, 2024 at 2:24 PM
December 16, 2024 at 3:10 PM
December 13, 2024 at 3:20 PM
The authors claim that "With AUC scores of 0.871 and 0.854 on harmful content and RAG-hallucination-related benchmarks respectively, Granite Guardian is the most generalizable and competitive model available in the space."

arxiv.org/abs/2412.07724
Granite Guardian
We introduce the Granite Guardian models, a suite of safeguards designed to provide risk detection for prompts and responses, enabling safe and responsible use in combination with any large...
arxiv.org
December 11, 2024 at 2:28 PM