I am a second-year Ph.D. student at KAIST, advised by Professor Joon Son Chung. My research centers on multimodal learning, with a particular focus on deepening the understanding and reasoning capabilities of audio-visual large language models (AV-LLMs).
I am also passionate about generative modeling for audio, including text-to-audio generation, and about speech tasks such as speech enhancement, source separation, and lip-to-speech synthesis.
ICASSP
NeurIPS
Interspeech
Interspeech
CVPR
ICASSP
Interspeech
ICASSP
ICASSP