RoPerformer: Rotary Positional Embedding Mechanism on Sparse Attention Architecture
Dec 10, 2024
·
1 min read
This project was advised by Prof. Krzysztof Choromanski as the final project for the course IEOR6617 Machine Learning and High-Dimensional Data at Columbia University. It focuses on the rotary positional embedding (RoPE) mechanism applied to both the classical Transformer and the sparse-attention-based Transformer (Performer). We conducted thorough experiments with both models and several state-of-the-art positional embedding mechanisms on the CIFAR-100 dataset, and we provide a detailed analysis of the results in our final report.
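
As background, here is a minimal NumPy sketch of the RoPE mechanism the project studies (illustrative only, not the project's code; it uses the split-half pairing convention, and the function name `rope` is ours):

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Apply rotary positional embedding to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    half = dim // 2
    # One rotation frequency per feature pair, decaying geometrically.
    freqs = base ** (-np.arange(half) / half)           # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by its position-dependent angle.
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)

# Rotating queries and keys this way makes their dot product depend only
# on the relative offset between positions, the property examined in both
# softmax attention and Performer-style kernelized attention.
q = rope(np.random.randn(16, 64))
k = rope(np.random.randn(16, 64))
```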