Projects
Transformer From Scratch
A full implementation of the Transformer architecture from the "Attention Is All You Need" paper, including multi-head attention, positional encoding, and the encoder-decoder structure.
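As a small illustration of one of the components named above, here is a minimal sketch of the sinusoidal positional encoding defined in the paper. It is not taken from the project's repository; the function name `positional_encoding` and the pure-Python list representation are assumptions made for this example.

```python
import math


def positional_encoding(seq_len: int, d_model: int) -> list[list[float]]:
    """Sinusoidal positional encoding (hypothetical helper, not the repo's code).

    Even dimensions hold sin(pos / 10000^(i / d_model)) and the following
    odd dimension holds the cosine of the same angle.
    """
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for i in range(0, d_model, 2):
            # i is the even index, so both entries of the pair share one frequency.
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)
            row[i + 1] = math.cos(angle)
        table.append(row)
    return table


pe = positional_encoding(seq_len=4, d_model=8)
```

Each row of `pe` would be added to the token embedding at that position before the first attention layer.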
GitHub →
~/ currently learning
- Diffusion Models
- RNN Models
~/ currently building
- Grammar-Typed Deep Symbolic Regression: Does enforcing dimensional consistency via a typed CFG improve symbolic regression on physics datasets, compared to an untyped baseline?
