Abstract: This paper proposes a novel framework utilizing multimodal large language models (MLLMs) for referring video object segmentation (RefVOS). Previous MLLMbased methods commonly struggle with ...
Abstract: Array-geometry-agnostic speech separation (AGA-SS) aims to develop an effective separation method regardless of the microphone array geometry. Conventional methods rely on permutation-free ...
一些您可能无法访问的结果已被隐去。
显示无法访问的结果