Learn the DeepSeek Architecture from Scratch: a Series of 20 Tutorial Videos
This series consists of 20 lessons that cover concepts such as Multi-Head Latent Attention and Mixture of Experts in full detail.
1️⃣ DeepSeek Series Introduction
https://youtu.be/QWNxQIq0hMo
2️⃣ DeepSeek Basics
https://youtu.be/WjhDDeZ7DvM
3️⃣ Journey of a Token into the LLM Architecture
https://youtu.be/rkEYwH4UGa4
4️⃣ Attention Mechanism Explained in 1 Hour
https://youtu.be/K45ze9Yd5UE
5️⃣ Self Attention Mechanism - Handwritten from Scratch
https://youtu.be/s8mskq-nzec
6️⃣ Causal Attention Explained: Don't Peek into the Future
https://youtu.be/c6Kkj6iLeBg
7️⃣ Multi-Head Attention Visually Explained
https://youtu.be/qbN4ulK-bZA
8️⃣ Multi-Head Attention Handwritten from Scratch
https://youtu.be/rvsEW-EsD-Y
9️⃣ Key Value Cache from Scratch
https://youtu.be/IDwTiS4_bKo
🔟 Multi-Query Attention Explained
https://youtu.be/Z6B51Odtn-Y
1️⃣1️⃣ Understand Grouped Query Attention (GQA)
https://youtu.be/kx3rETIxo4Q
1️⃣2️⃣ Multi-Head Latent Attention From Scratch
https://youtu.be/NlDQUj1olXM
1️⃣3️⃣ Multi-Head Latent Attention Coded from Scratch in Python
https://youtu.be/mIaWmJVrMpc
1️⃣4️⃣ Integer and Binary Positional Encodings
https://youtu.be/rP0CoTxe5gU
1️⃣5️⃣ All About Sinusoidal Positional Encodings
https://youtu.be/bQCQ7VO-TWU
1️⃣6️⃣ Rotary Positional Encodings
https://youtu.be/a17DlNxkv2k
1️⃣7️⃣ How DeepSeek Implemented Latent Attention | MLA + RoPE
https://youtu.be/m1x8vA_Tscc
1️⃣8️⃣ Mixture of Experts (MoE) Introduction
https://youtu.be/v7U21meXd6Y
1️⃣9️⃣ Mixture of Experts Hands-on Demonstration
https://youtu.be/yw6fpYPJ7PI
2️⃣0️⃣ Mixture of Experts Balancing Techniques
https://youtu.be/nRadcspta_8
