Abstract: Multimodal Named Entity Recognition (MNER) uses multimodal information, such as text and images, to identify named entities in social media text. Traditional MNER ...