[Paper Analysis] E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR
Improving performance by integrating a VAD (Voice Activity Detector) with streaming end-to-end (E2E) ASR models
Using an LLM as the ASR decoder, fine-tuned with LoRA
A sound enhancement model that runs in real time even on a CPU
Context biasing with a neural contextual adapter