Abstract: Remote sensing image interpretation is a crucial process for Earth observation and geoscience research, and multimodal fusion techniques are key to improving semantic segmentation accuracy.