Here are my slides from today's talk at Datafest Yerevan.
The talk was about non-transformer architectures, e.g., good old MLPs, CNNs, and RNNs, as well as brand-new SSMs. It may be too dense, with too many model names, but I think it can serve as a reference for further exploration.
https://docs.google.com/presentation/d/19jpt6sSScUb1yKnlO3a47SsMRIL7UmqQZKkuADyI7nM/edit#slide=id.g2f6fb83b821_0_15
