Key Achievements
AVVA achieves significant improvements in video-to-audio retrieval on the AudioCaps, VALOR, and VGGSound datasets using only 192 hours of LLM-curated training data, demonstrating that data quality can effectively substitute for data quantity.