28) How DeepSeek Rewrote Quantization Part 2 Accumulation Precision Online Quantization (5 views, 10 days ago)
27) How DeepSeek Rewrote Quantization Part 1 Mixed Precision Fine-grained quantization (3 views, 10 days ago)
20) Mixture of Experts Balancing Techniques Auxiliary Loss Load Balancing Capacity Factor (5 views, 11 days ago)
15) All about Sinusoidal Positional Encodings What’s with the weird sin-cos formula (1 view, 12 days ago)
14) Integer and Binary Positional Encodings Journey towards Rotary Positional Encodings (RoPE) (3 views, 12 days ago)