Pytorch implementation of GAIL, VAIL, AIRL, VAIRL, EAIRL, and SQIL
Implementation of GateLoop Transformer in Pytorch and Jax
A concise but complete full-attention transformer with a set of promising experimental features from various papers
RichardMinsooGo / LM_2023_Simple-hierarchical-transformer
Forked from lucidrains/simple-hierarchical-transformer. Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Implementation of RQ Transformer, proposed in the paper "Autoregressive Image Generation using Residual Quantization"
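As a rough illustration of the residual quantization behind the RQ Transformer, the sketch below approximates a vector by a sum of codebook entries, each chosen against the residual left at the previous depth. The function name, codebook sizes, and depth are illustrative assumptions, not the repository's API.

```python
# Hedged sketch of residual quantization: quantize, subtract, and
# re-quantize the residual with the next codebook.
import torch

def residual_quantize(x, codebooks):
    # x: (batch, dim); codebooks: list of (codebook_size, dim) tensors
    residual = x
    quantized = torch.zeros_like(x)
    indices = []
    for codebook in codebooks:
        dists = torch.cdist(residual, codebook)        # (batch, codebook_size)
        idx = dists.argmin(dim=-1)                     # nearest code per vector
        chosen = codebook[idx]
        quantized = quantized + chosen
        residual = residual - chosen
        indices.append(idx)
    return quantized, torch.stack(indices, dim=-1)     # codes: (batch, depth)

x = torch.randn(4, 8)
codebooks = [torch.randn(16, 8) for _ in range(3)]
quantized, codes = residual_quantize(x, codebooks)
```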
RichardMinsooGo / LM_2022_Memorizing-transformers-pytorch
Forked from lucidrains/memorizing-transformers-pytorch. Implementation of Memorizing Transformers (ICLR 2022), an attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in Pytorch
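A minimal sketch of the retrieval step this entry describes: stored key/value memories are searched per query and the top-k nearest keys are attended over. For simplicity the sketch uses exact top-k search rather than an approximate index, and all names and shapes are illustrative.

```python
# Hedged sketch of kNN memory attention: retrieve the k most similar
# stored keys for each query, then attend only over those memories.
import torch

def knn_memory_attention(q, mem_k, mem_v, top_k=4):
    # q: (batch, seq_len, dim); mem_k, mem_v: (num_memories, dim)
    scale = q.shape[-1] ** -0.5
    sims = torch.einsum('b n d, m d -> b n m', q, mem_k)   # similarity to memories
    top_sims, top_idx = sims.topk(top_k, dim=-1)            # k nearest memories
    top_v = mem_v[top_idx]                                  # (batch, seq_len, top_k, dim)
    attn = torch.softmax(top_sims * scale, dim=-1)
    return torch.einsum('b n k, b n k d -> b n d', attn, top_v)

q = torch.randn(2, 8, 64)
mem_k, mem_v = torch.randn(512, 64), torch.randn(512, 64)
out = knn_memory_attention(q, mem_k, mem_v)    # (2, 8, 64)
```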
Implementation of Mega, the single-head attention with multi-headed EMA architecture that reported state-of-the-art results on the Long Range Arena benchmark
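To make the "multi-headed EMA" part concrete, here is a heavily simplified sketch: each head smooths the sequence with its own learned decay, and heads are combined by a plain average (Mega itself uses learned projections and a more elaborate recurrence). Class name, head count, and the sequential loop are illustrative only.

```python
# Hedged sketch of a multi-headed exponential moving average over a sequence.
import torch
from torch import nn

class MultiHeadEMA(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.heads = heads
        self.decay_logits = nn.Parameter(torch.randn(heads, dim))

    def forward(self, x):
        # x: (batch, seq_len, dim)
        alpha = torch.sigmoid(self.decay_logits)            # per-head decay in (0, 1)
        state = x.new_zeros(x.shape[0], self.heads, x.shape[-1])
        outputs = []
        for t in range(x.shape[1]):
            # EMA update per head: state <- alpha * x_t + (1 - alpha) * state
            state = alpha * x[:, t, None, :] + (1 - alpha) * state
            outputs.append(state.mean(dim=1))               # simplification: average heads
        return torch.stack(outputs, dim=1)                  # (batch, seq_len, dim)

x = torch.randn(2, 16, 32)
out = MultiHeadEMA(dim=32)(x)    # (2, 16, 32)
```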
Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways
Implementation of Multistream Transformers in Pytorch
Implementation of Feedback Transformer in Pytorch
Implementation of Linformer for Pytorch
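A single-head sketch of the Linformer idea follows: keys and values are projected along the sequence axis down to a fixed length before attention, giving complexity linear in sequence length. The class name, projection parameterization, and sizes are assumptions for illustration.

```python
# Hedged sketch of Linformer-style attention with compressed keys/values.
import torch
from torch import nn

class LinformerSelfAttention(nn.Module):
    def __init__(self, dim, seq_len, k=64):
        super().__init__()
        self.scale = dim ** -0.5
        self.to_qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj_k = nn.Parameter(torch.randn(seq_len, k))   # key projection
        self.proj_v = nn.Parameter(torch.randn(seq_len, k))   # value projection

    def forward(self, x):
        # x: (batch, seq_len, dim)
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        k = torch.einsum('b n d, n k -> b k d', k, self.proj_k)   # compress keys
        v = torch.einsum('b n d, n k -> b k d', v, self.proj_v)   # compress values
        attn = torch.softmax(q @ k.transpose(-1, -2) * self.scale, dim=-1)
        return attn @ v

x = torch.randn(2, 256, 128)
out = LinformerSelfAttention(dim=128, seq_len=256)(x)   # (2, 256, 128)
```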
A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering
Implementation of Fast Transformer in Pytorch
Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch
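A rough sketch of what "all-MLP replacement" means here: a gMLP block has no attention, only channel projections plus one learned mixing along the token axis inside a spatial gating unit. Class names and dimensions below are illustrative assumptions.

```python
# Hedged sketch of a gMLP block with a spatial gating unit.
import torch
from torch import nn
import torch.nn.functional as F

class SpatialGatingUnit(nn.Module):
    def __init__(self, dim_ff, seq_len):
        super().__init__()
        self.norm = nn.LayerNorm(dim_ff // 2)
        self.spatial_proj = nn.Linear(seq_len, seq_len)   # mixes tokens
        nn.init.zeros_(self.spatial_proj.weight)          # start close to identity gate
        nn.init.ones_(self.spatial_proj.bias)

    def forward(self, x):
        # x: (batch, seq_len, dim_ff)
        u, v = x.chunk(2, dim=-1)
        v = self.norm(v)
        v = self.spatial_proj(v.transpose(1, 2)).transpose(1, 2)
        return u * v

class gMLPBlock(nn.Module):
    def __init__(self, dim, dim_ff, seq_len):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.proj_in = nn.Linear(dim, dim_ff)
        self.sgu = SpatialGatingUnit(dim_ff, seq_len)
        self.proj_out = nn.Linear(dim_ff // 2, dim)

    def forward(self, x):
        residual = x
        x = F.gelu(self.proj_in(self.norm(x)))
        x = self.sgu(x)
        return self.proj_out(x) + residual

x = torch.randn(2, 64, 128)
out = gMLPBlock(dim=128, dim_ff=512, seq_len=64)(x)   # (2, 64, 128)
```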
RichardMinsooGo / LM_2021_hourglass-transformer-pytorch
Forked from lucidrains/hourglass-transformer-pytorch. Implementation of Hourglass Transformer, in Pytorch, from Google and OpenAI
Implementation of Metaformer, but in an autoregressive manner
Implementation of Nyström Self-attention, from the paper Nyströmformer
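For context on what the Nyström approximation does, here is a minimal sketch: landmarks are segment means of the queries and keys, and the full softmax attention matrix is approximated by three smaller kernels joined with a pseudo-inverse. Landmark count, shapes, and the assumption that sequence length divides evenly are illustrative simplifications.

```python
# Hedged sketch of Nyström-approximated self-attention.
import torch

def nystrom_attention(q, k, v, num_landmarks=8):
    # q, k, v: (batch, seq_len, dim); seq_len assumed divisible by num_landmarks
    b, n, d = q.shape
    scale = d ** -0.5
    q_land = q.reshape(b, num_landmarks, n // num_landmarks, d).mean(dim=2)
    k_land = k.reshape(b, num_landmarks, n // num_landmarks, d).mean(dim=2)

    kernel_1 = torch.softmax(q @ k_land.transpose(-1, -2) * scale, dim=-1)       # (b, n, m)
    kernel_2 = torch.softmax(q_land @ k_land.transpose(-1, -2) * scale, dim=-1)  # (b, m, m)
    kernel_3 = torch.softmax(q_land @ k.transpose(-1, -2) * scale, dim=-1)       # (b, m, n)

    return kernel_1 @ torch.linalg.pinv(kernel_2) @ (kernel_3 @ v)

q = k = v = torch.randn(2, 64, 32)
out = nystrom_attention(q, k, v)   # (2, 64, 32)
```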
Implementation of Perceiver, General Perception with Iterative Attention, in Pytorch
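The core move of the Perceiver, sketched below: a small learned latent array cross-attends to a large input array, so cost scales with the latent size rather than the input size. The module name, residual update, and array sizes are illustrative assumptions, not the repository's interface.

```python
# Hedged sketch of a single Perceiver-style cross-attention step.
import torch
from torch import nn

class PerceiverCrossAttention(nn.Module):
    def __init__(self, latent_dim, input_dim):
        super().__init__()
        self.to_q = nn.Linear(latent_dim, latent_dim, bias=False)
        self.to_k = nn.Linear(input_dim, latent_dim, bias=False)
        self.to_v = nn.Linear(input_dim, latent_dim, bias=False)

    def forward(self, latents, inputs):
        # latents: (batch, num_latents, latent_dim); inputs: (batch, num_inputs, input_dim)
        q, k, v = self.to_q(latents), self.to_k(inputs), self.to_v(inputs)
        attn = torch.softmax(q @ k.transpose(-1, -2) * q.shape[-1] ** -0.5, dim=-1)
        return latents + attn @ v                  # residual update of the latents

latents = torch.randn(2, 32, 128)                  # small latent array
inputs = torch.randn(2, 1024, 64)                  # large input array
out = PerceiverCrossAttention(128, 64)(latents, inputs)   # (2, 32, 128)
```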
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
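A minimal sketch of the ReLA idea from the linked paper: the softmax over attention scores is replaced with a ReLU, so attention weights are sparse and need not sum to one. The normalization variants discussed in the paper are omitted, and shapes are illustrative.

```python
# Hedged sketch of rectified linear attention (ReLU in place of softmax).
import torch
import torch.nn.functional as F

def rela_attention(q, k, v):
    # q, k, v: (batch, heads, seq_len, dim_head)
    scale = q.shape[-1] ** -0.5
    scores = torch.einsum('b h i d, b h j d -> b h i j', q, k) * scale
    weights = F.relu(scores)                      # ReLU instead of softmax
    return torch.einsum('b h i j, b h j d -> b h i d', weights, v)

q = k = v = torch.randn(1, 8, 16, 64)
out = rela_attention(q, k, v)                     # (1, 8, 16, 64)
```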