Chaeyoung Jung

About Me

I am a second-year Ph.D. student at KAIST, advised by Professor Joon Son Chung. My research centers on multimodal learning, with a particular focus on deepening the understanding and reasoning capabilities of audio-visual large language models (AV-LLMs).

I am also passionate about generative modeling in audio, including text-to-audio generation, as well as speech-related tasks such as speech enhancement, source separation, and lip-to-speech synthesis.

Work Experience

Research Intern at Meta Reality Labs, Burlingame, CA (2026.06 - 2026.11, Expected)
- Supervised by Zhaojiang Lin

Education

Korea Advanced Institute of Science and Technology (KAIST), South Korea (2025 - Present)
- Ph.D. in Electrical Engineering (advisor: Prof. Joon Son Chung)
Korea Advanced Institute of Science and Technology (KAIST), South Korea (2023 - 2025)
- M.S. in Electrical Engineering (advisor: Prof. Joon Son Chung)
Korea Advanced Institute of Science and Technology (KAIST), South Korea (2018 - 2023)
- B.S. in Electrical Engineering

News

[Jan. 2026] One paper was accepted to ICASSP 2026.
[Sep. 2025] One paper was accepted to NeurIPS 2025.
[May 2025] Two papers were accepted to Interspeech 2025.
[Mar. 2025] I started my Ph.D. program in MMAI, KAIST!

Publications

ICASSP

FastAV: Efficient Token Pruning for Audio-Visual Large Language Model Inference

Chaeyoung Jung, Youngjoon Jang, Seungwoo Lee, Joon Son Chung

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026.

PDF

Fork-Merge Decoding: Enhancing Multimodal Understanding in Audio-Visual Large Language Models

Chaeyoung Jung*, Youngjoon Jang*, Jongmin Choi, Joon Son Chung

Preprint, 2025.

PDF

NeurIPS

AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding

Chaeyoung Jung*, Youngjoon Jang*, Joon Son Chung

Conference on Neural Information Processing Systems (NeurIPS), 2025.

PDF

Interspeech

InfiniteAudio: Infinite-Length Audio Generation with Consistent Acoustic Attributes

Chaeyoung Jung, Hojoon Ki, Ji-Hoon Kim, Junmo Kim, Joon Son Chung

Conference of the International Speech Communication Association (Interspeech), 2025.

PDF Oral

Interspeech

SEED: Speaker Embedding Enhancement Diffusion Model

KiHyun Nam, Jungwoo Heo, Jee-weon Jung, Gangin Park, Chaeyoung Jung, Ha-Jin Yu, Joon Son Chung

Conference of the International Speech Communication Association (Interspeech), 2025.

PDF

CVPR

From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech

Ji-Hoon Kim, Jeongsoo Choi, Jaehun Kim, Chaeyoung Jung, Joon Son Chung

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.

PDF Highlight

ICASSP

VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis

Jaemin Jung*, Junseok Ahn*, Chaeyoung Jung, Tan Dat Nguyen, Youngjoon Jang, Joon Son Chung

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025.

PDF Code

Interspeech

FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching

Chaeyoung Jung, Suyeon Lee, Ji-Hoon Kim, Joon Son Chung

Conference of the International Speech Communication Association (Interspeech), 2024.

PDF Code

ICASSP

Seeing Through the Conversation: Audio-Visual Speech Separation Based on Diffusion Model

Suyeon Lee*, Chaeyoung Jung*, Youngjoon Jang, Jaehun Kim, Joon Son Chung

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024.

PDF Code Oral

ICASSP

TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning

Chaeyoung Jung*, Suyeon Lee*, Kihyun Nam, Kyeongha Rho, You Jin Kim, Youngjoon Jang, Joon Son Chung

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024.

PDF Code Oral