artificialanalysis.bsky.social
@artificialanalysis.bsky.social
Compare Llama 3.3 70B to other models: artificialanalysis.ai/models/llama...

Follow our Llama 3.3 provider page to stay up to date as more hosting providers launch endpoints: artificialanalysis.ai/models/llama...
December 6, 2024 at 8:37 PM
Llama 3.3 70B takes a leap forward on all evals we benchmark.

It now leads Llama 3.1 405B on MATH-500 and almost matches it on MMLU, GPQA Diamond and HumanEval.
December 6, 2024 at 8:37 PM
➤ Biggest increases in MATH-500 (64% to 76%), GPQA Diamond (43% to 49%) and HumanEval (80% to 85%)
➤ Smaller increase in MMLU (84% to 86%)
➤ Llama 3.3 70B now leads Llama 3.1 405B in MATH-500, and scores nearly equal to 405B in MMLU, GPQA Diamond and HumanEval
December 6, 2024 at 8:37 PM
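A rough sketch of the arithmetic behind the "biggest" and "smaller" increases quoted above, using only the before/after scores from the thread. The `scores` dictionary and the `gains` helper are illustrative names, not part of any Artificial Analysis tooling, and the "before" figures are assumed to be the earlier baseline the posts compare against.

```python
# Before/after scores (percent) as quoted in the thread.
# Assumption: "before" is the earlier baseline the posts compare against.
scores = {
    "MATH-500": (64, 76),
    "GPQA Diamond": (43, 49),
    "HumanEval": (80, 85),
    "MMLU": (84, 86),
}


def gains(before, after):
    """Return (percentage-point gain, relative gain in percent)."""
    points = after - before
    relative = 100.0 * points / before
    return points, relative


# Rank benchmarks by absolute percentage-point gain, largest first.
ranked = sorted(scores.items(), key=lambda kv: gains(*kv[1])[0], reverse=True)

for name, (before, after) in ranked:
    points, relative = gains(before, after)
    print(f"{name}: +{points} points ({relative:.1f}% relative)")
```

Ranking by percentage points rather than relative change matches how the posts phrase the comparison; the relative figure is printed only for context.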