Sungjae Lee

Software Engineer at NAVER Cloud

About Me

I am a software engineer at NAVER Cloud, working on large-scale LLM serving systems and AI infrastructure.

My interests lie in efficient LLM inference and distributed systems for large-scale AI services.

Programming Languages: C/C++, CUDA, Python

Simulation Tools: gem5, MARSSx86, DRAMsim3

Backend Development: Kubernetes

Open Source Contributions: FasterTransformer v5.0, vLLM, llmperf

Experience

NAVER Cloud

Software Engineer

2023.01 - present

www.navercloudcorp.com

NAVER Corp.

Software Engineer

2020.12 - 2022.12

www.navercorp.com

Education

Yonsei University

2019.03 - 2021.02

M.S., School of Electrical and Electronic Engineering

Yonsei University

2013.03 - 2019.02

B.S., School of Electrical and Electronic Engineering

Projects

OmniServe

Scalable Multimodal LLM Serving System

https://github.com/NAVER-Cloud-HyperCLOVA-X/OmniServe

HyperCLOVA, HyperCLOVA X

Large Language Model from NAVER, Specialized in Korean

https://clova.ai/hyperclova

NSML

ML Training Platform on Kubernetes

https://www.ncloud.com/product/aiService/clovaNsml

Nebula

Lightweight Neural Network Benchmarks

https://github.com/yonsei-icsl/nebula

Presentations

Hyperscale AI ‘HyperCLOVA’ Serving (DEVIEW 2021)

Publications

S. Woo, J. Kil, H. Kim, M. Kim, J. Kim, A. Seo, S. Lee, M. Jo, J. Ryu, B. Park, S. J. Kwon, and D. Lee, “ICaRus: Identical Cache Reuse for Efficient Multi-Model Inference,” International Conference on Learning Representations (ICLR), Apr. 2026.

G. Park, B. Park, M. Kim, S. Lee, J. Kim, B. Kwon, S. Kwon, B. Kim, Y. Lee, and D. Lee, “LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models,” International Conference on Learning Representations (ICLR), May 2024.

H. Park, A. Cho, H. Jeon, H. Lee, Y. Yang, S. Lee, H. Lee, and J. Choo, “HPC ClusterScape: Increasing Transparency and Efficiency of Shared High-Performance Computing Clusters for Large-scale AI Models,” IEEE Visualization in Data Science (VDS), Oct. 2023.

S. Hong, S. Moon, J. Kim, S. Lee, M. Kim, D. Lee, and J. Kim, “DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation,” IEEE/ACM International Symposium on Microarchitecture (MICRO), Oct. 2022.

B. Kim, H. Kim, S. Lee, G. Lee, D. Kwak, D. H. Jeon, S. Park, S. Kim, S. Kim, D. Seo, H. Lee, M. Jeong, S. Lee, M. Kim, S. H. Ko, S. Kim, T. Park, J. Kim, S. Kang, N. Ryu, K. M. Yoo, M. Chang, S. Suh, S. In, J. Park, K. Kim, H. Kim, J. Jeong, Y. G. Yeo, D. Ham, D. Park, M. Y. Lee, J. Kang, I. Kang, J. Ha, W. Park, and N. Sung, “What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers,” Conference on Empirical Methods in Natural Language Processing (EMNLP), Nov. 2021.

S. Lee, “Exploiting Large and Small Page Sizes in Two-Tiered Memory System,” Master’s Thesis, School of Electrical and Electronic Engineering, Yonsei University, Dec. 2020.

B. Kim, S. Lee, A. Trivedi, and W. Song, “Energy-Efficient Acceleration of Deep Neural Networks on Realtime-Constrained Embedded Edge Devices,” IEEE Access, Nov. 2020.

B. Kim, S. Lee, C. Park, H. Kim, and W. Song, “The Nebula Benchmark Suite: Implications of Lightweight Neural Networks,” IEEE Transactions on Computers, Oct. 2020.

Patents

W. Song, B. Kim, and S. Lee, “Operation Device of Artificial Neural Network, Operation Method of Artificial Neural Network, and Computer Program Stored in a Recording Medium to Execute the Method,” 10-2021-0071874, Jun. 2021.

Scholarships and Awards

Scholarship for Combined B.S./M.S. Program (Yonsei University, full tuition for the master’s course, Mar. 2019 - Jun. 2020)

4th Place Award (Korea Auto-Vehicle Safety Association (KASA), Autonomous Car Racing at the 2017 International Student Car Competition)

Honors Student Award (Yonsei University, Mar. 2013 / Sep. 2017)