💾 trunk/3224e469d0eb3fa8a3a02dafdb68d6254c4b3509: [inductor] Decompose small dot-shaped bmm (#183911)
News section: 💾 Downloads
🔗 Source: github.com
Lower CUDA/XPU bmm of shape (B, 1, K) @ (B, K, 1) with small K to a fused multiply/reduction when the dtype is one that eager bmm supports. This avoids the tiny extern bmm launches produced by vmap(dot).
Fixes...
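The identity behind this lowering can be sketched in a few lines. The snippet below is a pure-Python illustration, not Inductor's actual implementation: `naive_bmm` and `dot_shaped_bmm` are hypothetical helper names, and nested lists stand in for tensors. It only shows that a (B, 1, K) @ (B, K, 1) product equals an elementwise multiply followed by a sum over K, which is what allows a single fused kernel to replace a separate bmm launch.

```python
def naive_bmm(a, b):
    """Reference batched matmul on nested lists: (B, M, K) @ (B, K, N) -> (B, M, N)."""
    out = []
    for a_mat, b_mat in zip(a, b):
        k_dim = len(b_mat)
        n_dim = len(b_mat[0])
        out.append([
            [sum(row[k] * b_mat[k][n] for k in range(k_dim)) for n in range(n_dim)]
            for row in a_mat
        ])
    return out


def dot_shaped_bmm(a, b):
    """Hypothetical decomposition for (B, 1, K) @ (B, K, 1) inputs.

    Each batch entry is a plain dot product, so multiply elementwise
    and reduce over K instead of running a general matmul.
    """
    out = []
    for a_mat, b_mat in zip(a, b):
        lhs = a_mat[0]                    # the single row, length K
        rhs = [col[0] for col in b_mat]   # the single column, length K
        out.append([[sum(x * y for x, y in zip(lhs, rhs))]])
    return out


a = [[[1.0, 2.0, 3.0]], [[4.0, 5.0, 6.0]]]          # shape (2, 1, 3)
b = [[[1.0], [0.0], [2.0]], [[1.0], [1.0], [1.0]]]  # shape (2, 3, 1)

# Both paths agree and produce a (2, 1, 1) result.
assert dot_shaped_bmm(a, b) == naive_bmm(a, b)
```

The commit restricts the rewrite to small K and to dtypes that eager bmm supports. The sketch does not model those guards, and it ignores floating-point accumulation order, which can differ between a real reduction kernel and a library bmm.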