DeepSeek-V3 Technical Report Walkthrough

📖Paper: https://github.com/deepseek-ai/DeepSeek-V3/blob/main/DeepSeek_V3.pdf
🏫Institutes: DeepSeek AI

DeepSeek V3 Unveiled🚀✨

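• 671B total params 🤯 but only 37B activated per token for max efficiency ⚡
• Strong results on MMLU, MATH-500, LiveCodeBench 💡
• Trained on a budget: about $5.57M 🤑 (the report's estimate from rented H800 GPU hours), using advanced techniques like MLA (Multi-head Latent Attention) & MTP (Multi-Token Prediction) 🛠️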
• Open-source for all 🤝
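
Quick sanity check on the numbers above 🧮 A minimal sketch, not anything from the DeepSeek codebase: the GPU-hour breakdown and the $2 per H800 GPU-hour rental rate are the figures the report itself assumes (they are not stated in this post), and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the headline figures.
# GPU hours and the $2/GPU-hour rate are the report's own assumptions;
# the helper names below are hypothetical, chosen for this sketch only.

TOTAL_PARAMS_B = 671    # total parameters, in billions
ACTIVE_PARAMS_B = 37    # parameters activated per token, in billions

# Thousands of H800 GPU hours per training stage, as listed in the report.
GPU_HOURS_K = {
    "pre-training": 2664,
    "context extension": 119,
    "post-training": 5,
}
PRICE_PER_GPU_HOUR = 2.0  # USD, assumed rental price


def active_fraction(active_b, total_b):
    """Share of the model's parameters used for any single token."""
    return active_b / total_b


def training_cost_usd(gpu_hours_k, price_per_hour):
    """Estimated cost: total GPU hours times the assumed hourly price."""
    return sum(gpu_hours_k.values()) * 1_000 * price_per_hour


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    cost = training_cost_usd(GPU_HOURS_K, PRICE_PER_GPU_HOUR)
    print(f"Active per token: {frac:.1%}")
    print(f"Estimated training cost: ${cost / 1e6:.3f}M")
```

So roughly 1 in 18 parameters is active for each token, and the cost figure covers only the final training run at the assumed rental price, not earlier research or ablation experiments.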


Want to discover more AI papers like this? 🚀 Head over to https://RibbitRibbit.co 🐸 — Discover Research The Fun Way!

#llm #deepseek