About Me
I am a software engineer at NAVER Cloud, working on large-scale LLM serving systems and AI infrastructure.
My interests center on efficient LLM inference and distributed systems for AI services at scale.
Programming Languages: C/C++, CUDA, Python
Simulation Tools: gem5, MARSSx86, DRAMsim3
Backend Development: Kubernetes
Open Source Contributions: FasterTransformer v5.0, vLLM, llmperf
Experience
NAVER Cloud
Software Engineer
Education
Yonsei University
2019.03 - 2021.02
M.S., School of Electrical and Electronic Engineering
Yonsei University
2013.03 - 2019.02
B.S., School of Electrical and Electronic Engineering
Projects
OmniServe
Scalable Multimodal LLM Serving System
https://github.com/NAVER-Cloud-HyperCLOVA-X/OmniServe
HyperCLOVA, HyperCLOVA X
Large Language Models from NAVER Specialized in Korean
https://clova.ai/hyperclova
Presentations
Serving the Hyperscale AI ‘HyperCLOVA’ (DEVIEW 2021)
Publications
S. Woo, J. Kil, H. Kim, M. Kim, J. Kim, A. Seo, S. Lee, M. Jo, J. Ryu, B. Park, S. J. Kwon, and D. Lee, “ICaRus: Identical Cache Reuse for Efficient Multi-Model Inference,” International Conference on Learning Representations (ICLR), Apr. 2026.
G. Park, B. Park, M. Kim, S. Lee, J. Kim, B. Kwon, S. Kwon, B. Kim, Y. Lee, and D. Lee, “LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models,” International Conference on Learning Representations (ICLR), May 2024.
H. Park, A. Cho, H. Jeon, H. Lee, Y. Yang, S. Lee, H. Lee, and J. Choo, “HPCClusterScape: Increasing Transparency and Efficiency of Shared High-Performance Computing Clusters for Large-scale AI Models,” IEEE Visualization in Data Science (VDS), Oct. 2023.
S. Hong, S. Moon, J. Kim, S. Lee, M. Kim, D. Lee, and J. Kim, “DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation,” IEEE/ACM International Symposium on Microarchitecture (MICRO), Oct. 2022.
B. Kim, H. Kim, S. Lee, G. Lee, D. Kwak, D. H. Jeon, S. Park, S. Kim, S. Kim, D. Seo, H. Lee, M. Jeong, S. Lee, M. Kim, S. H. Ko, S. Kim, T. Park, J. Kim, S. Kang, N. Ryu, K. M. Yoo, M. Chang, S. Suh, S. In, J. Park, K. Kim, H. Kim, J. Jeong, Y. G. Yeo, D. Ham, D. Park, M. Y. Lee, J. Kang, I. Kang, J. Ha, W. Park, and N. Sung, “What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers,” Conference on Empirical Methods in Natural Language Processing (EMNLP), Nov. 2021.
S. Lee, “Exploiting Large and Small Page Sizes in Two-Tiered Memory System,” Master’s Thesis, School of Electrical and Electronic Engineering, Yonsei University, Dec. 2020.
B. Kim, S. Lee, A. Trivedi, and W. Song, “Energy-Efficient Acceleration of Deep Neural Networks on Realtime-Constrained Embedded Edge Devices,” IEEE Access, Nov. 2020.
B. Kim, S. Lee, C. Park, H. Kim, and W. Song, “The Nebula Benchmark Suite: Implications of Lightweight Neural Networks,” IEEE Transactions on Computers, Oct. 2020.
Patents
W. Song, B. Kim, and S. Lee, “Operation Device of Artificial Neural Network, Operation Method of Artificial Neural Network and Computer Program Stored in a Recording Medium to Execute the Method,” 10-2021-0071874, June 2021.
Scholarships and Awards
Scholarship for Combined B.S./M.S. Program (Yonsei University, full tuition for the master's program, Mar. 2019 - June 2020)
4th Place Award (Korea Auto-Vehicle Safety Association (KASA), Autonomous Car Racing at the 2017 International Student Car Competition)
Honors Student Award (Yonsei University, Mar. 2013 / Sep. 2017)