Publications
* Equal Contribution, † Corresponding Author(s)
2025
- [ICLR] Sparse Learning for State Space Models on Mobile. International Conference on Learning Representations, 2025
- [AAAI] Numerical Pruning for Efficient Autoregressive Models. Association for the Advancement of Artificial Intelligence, 2025
2024
- [TCAD] HotaQ: Hardware Oriented Token Adaptive Quantization for Large Language Models. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024