In TDS Archive, by Dmitrii Eliuseev: 16, 8, and 4-bit Floating Point Formats — How Does it Work? Let's go into bits and bytes. (Sep 30, 2023)
In AIGuys, by Vishal Rajput: RetNet: Transformer killer is here. Can RetNet replace the Transformer? Early results look very promising. (Sep 14, 2023)
Joe El Khoury (GenAI Engineer): Introducing Deformable Attention Transformer. This post is based on the findings of this paper. (Jun 21, 2023)
Zain ul Abideen: Attention Is All You Need: The Core Idea of the Transformer. An overview of the Transformer model and its key components. (Jun 26, 2023)
In TDS Archive, by Chen Margalit: Simplifying Transformers: State of the Art NLP Using Words You Understand — Part 2: Input. A deep dive into how transformers' inputs are constructed. (Jul 26, 2023)
Fareed Khan: Understanding Transformers: A Step-by-Step Math Example — Part 1. I understand that the transformer architecture may seem scary, and you might have encountered various explanations on YouTube or in blogs… (Jun 5, 2023)
In Towards AI, by Quadric: (Vision) Transformers: Rise of the Chimera. It's 2023, and transformers are having a moment. No, I'm not talking about the latest installment of the Transformers movie franchise… (Jun 21, 2023)
Hunter Phillips: Overview: The Implemented Transformer. The transformer is a state-of-the-art model introduced in "Attention is All You Need" in 2017. There are great articles describing various… (May 8, 2023)