Jesus Rodriguez — Understanding FlashAttention-3: One of the Most Important Algorithms to Make Transformers Fast. The new version takes full advantage of H100 capabilities to improve attention in transformer models. Jul 15, 2024

Dr. Ashish Bamania, in Level Up Coding — Superfast Matrix-Multiplication-Free LLMs Are Finally Here. A deep dive into matrix-multiplication-free LLMs that could drastically reduce AI's current reliance on GPUs. Jun 20, 2024

Jacky, in AI Advances — NVIDIA GPUs Now Support Copilot+! Is AI PC Set for a Revolution? Exploring the impact of Microsoft's latest AI service and NVIDIA's support on the future of personal computing. Jun 13, 2024

Daniel Warfield, in Intuitively and Exhaustively Explained — CUDA for AI — Intuitively and Exhaustively Explained. Parallelized AI from scratch in CUDA. Jun 14, 2024