About
I am the Principal Research SDE Manager at Microsoft Research Asia, leading the Systems and Engineering Group in Shanghai. My work focuses on advancing large-scale model systems, multimodal systems, and intelligent agents. I specialize in efficient computation techniques, long-context inference, and real-world applications of large language models (LLMs). My research bridges cutting-edge AI innovations with practical applications, with publications in top-tier conferences such as OSDI, SOSP, NeurIPS, EuroSys, ATC, CVPR, and ICCV. I hold B.S. and Ph.D. degrees from Fudan University, earned in 2006 and 2011, respectively.
We are currently working on:
- Resource scheduling and compiler optimization to accelerate large-scale, sparse, and dynamic DNN models, e.g., a topology-aware GPU scheduler and a sparsity compilation stack
- Hardware efficiency studies (e.g., latency, energy, and carbon footprint) of diverse DNN models, and prediction-based automatic design of efficient models
- Real-time DNN models and systems for cloud gaming and video streaming
- Wireless sensing for healthcare, environments, and human-computer interaction
We have various open positions (Researcher, Research SDE, and Intern), and you are welcome to join us. Please contact me at yuqyang@microsoft.com.