[Paper Review] Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization
Audio Conditioned Diffusion Models - Talking Face generation
Sound-guided video generation using a GAN, leveraging the CLIP latent space
Adds Korean to a previously proposed paper on unit-based multilingual audio translation
HuBERT, [speech audio unit encoding] conditioning, diffusion video generation