Bochkov's Collections

Progressive Growth Transformers (PGT) [pretrain]

Transformers grown layer by layer on top of frozen embeddings. The collection explores how capabilities emerge as depth increases.
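As a rough illustration of what layer-by-layer growth on frozen embeddings could look like, here is a minimal sketch of a growth schedule in plain Python. It tracks only which parameter groups would be trainable at each stage and contains no real weights or training. The names `Block`, `GrowingModel`, and `growth_schedule` are hypothetical, and freezing earlier layers when a new one is added is an assumption, not something the collection description states.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Stand-in for one Transformer layer (hypothetical, no real weights)."""
    name: str
    trainable: bool = True


@dataclass
class GrowingModel:
    """Toy model: a frozen embedding table plus a growing stack of blocks."""
    embedding_trainable: bool = False  # embeddings stay frozen throughout
    blocks: list = field(default_factory=list)

    def grow(self, freeze_previous: bool = True) -> Block:
        """Append one new trainable block; optionally freeze earlier ones."""
        if freeze_previous:  # assumption: not stated in the source description
            for block in self.blocks:
                block.trainable = False
        new_block = Block(name=f"layer_{len(self.blocks)}")
        self.blocks.append(new_block)
        return new_block

    def trainable_names(self) -> list:
        """List the parameter groups an optimizer would update right now."""
        names = ["embedding"] if self.embedding_trainable else []
        return names + [b.name for b in self.blocks if b.trainable]


def growth_schedule(num_layers: int) -> list:
    """Return the trainable parameter groups at each growth stage."""
    model = GrowingModel()
    stages = []
    for _ in range(num_layers):
        model.grow()
        # A real run would train the trainable groups here before growing again.
        stages.append(model.trainable_names())
    return stages
```

Calling `growth_schedule(3)` yields one entry per stage, each listing only the newest layer, while the embedding group never appears because it stays frozen. Passing `freeze_previous=False` to `grow` keeps earlier layers trainable, which is the other plausible reading of the description.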