Publications
A list of publications. My name appears in bold; * denotes equal contribution.
2026
- [OSDI] VTC: DNN Compilation with Virtual Tensors for Data Movement Elimination. USENIX Symposium on Operating Systems Design and Implementation, 2026.
- [ISC] MegaFold: System-Level Optimizations for Accelerating Protein Structure Prediction Models. International Supercomputing Conference, 2026.
- [ICLR] AutoSP: Unlocking Long-Context LLM Training Via Compiler-Based Sequence Parallelism. International Conference on Learning Representations, 2026.
2025
- [SC] X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms. International Conference for High Performance Computing, Networking, Storage, and Analysis, 2025.
- [OOPSLA] SPLAT: A Framework for Optimised GPU Code-Generation for Sparse Regular Attention. ACM Conference on Object-Oriented Programming, Systems, Languages and Applications, 2025.
In submission
- [Preprint] A Self-Pruning Transformer: Extreme KV-Cache Compression with Universal Attention. Under review.
- [Preprint] FLuRKA: Fast Fused Low-Rank & Kernel Attention. Under review.