1st MOSE Challenge at CVPR 2024

Introduction

The 1st MOSE Challenge will be held in conjunction with the CVPR 2024 PVUW Workshop in Seattle, USA. In this edition of the workshop and challenge, we focus on video object segmentation in complex environments. MOSE contains 2,149 video clips and 5,200 objects, with 431,725 high-quality object segmentation masks. The video resolution is 1920×1080, and clips are generally 5 to 60 seconds long. The most notable feature of MOSE is its complex scenes, which include objects that disappear and reappear, inconspicuous small objects, heavy occlusions, and crowded environments. The goal of the MOSE dataset is to provide a platform that promotes the development of more comprehensive and robust video object segmentation algorithms. The workshop will culminate in a round-table discussion in which speakers will debate the future of video object representations.

Leaderboard

**TABLE 1. Top 3 Leaderboard of MOSE Challenge in CVPR 2024 PVUW Workshop.**
| Team Name | Team Members | Organization | Technical Report | *J&F* | *J* | *F* |
| --- | --- | --- | --- | --- | --- | --- |
| PCL_VisionLab | Deshui Miao¹,², Xin Li², Zhenyu He¹,², Yaowei Wang², Ming-Hsuan Yang³ | ¹Harbin Institute of Technology (ShenZhen), ²Peng Cheng Laboratory, ³University of California at Merced | PDF, Video | 84.5 | 81.0 | 87.9 |
| Yao_Xu_MTLab | Zhensong Xu¹, Jiangtao Yao¹, Chengjing Wu¹, Ting Liu¹, Luoqi Liu¹ | ¹MT Lab, Meitu Inc | PDF | 83.5 | 80.1 | 86.8 |
| ISS | Xinyu Liu¹, Jing Zhang¹, Kexin Zhang¹, Yuting Yang¹, Licheng Jiao¹, Shuyuan Yang¹ | ¹Intelligent Perception and Image Understanding Lab, Xidian University | PDF | 82.2 | 78.8 | 85.6 |

Dates

● 1 Feb 2024: Release the training and validation dataset, check [here].
● 1 Feb 2024: Set up the submission server on CodaLab and open submission of the validation results.
● 8 Apr 2024: Workshop paper submission deadline.
● 12 Apr 2024: Notification to authors of workshop paper.
● 15 May 2024: Release the test dataset and open the submission of the test results.
● 25 May 2024: Challenge submission ends.
● 30 May 2024: The final competition results will be announced and top-performing teams will be invited.
● 17 Jun 2024: The workshop begins.

Rules

● Extra training datasets besides MOSE are allowed, but contestants must disclose any extra datasets used.
● There are no limitations on the models; large models such as SAM can be used, but contestants must report the models used.

Call for Papers

This workshop invites papers covering, but not limited to, the following topics:
● Semantic/panoptic segmentation for images/videos
● Video object/instance segmentation
● Efficient computation for video scene parsing
● Object tracking
● Language-guided segmentation
● Semi-supervised recognition in videos
● New metrics to evaluate the quality of video scene parsing results
● Real-world video applications, including autonomous driving, indoor robotics, visual navigation, etc.

Submission: We invite authors to submit unpublished papers (8-page CVPR format) to our workshop, to be presented at a poster session upon acceptance. All submissions will go through a double-blind review process. Accepted papers will be published in the official CVPR Workshops proceedings and the Computer Vision Foundation (CVF) Open Access archive. All contributions must be submitted (along with supplementary materials, if any) at this link.
Paper Submission Dates:
● Workshop paper submission deadline: 8 April 2024 (23:59 PST)
● Notification to authors: 12 April 2024
● Camera-ready deadline: 14 April 2024

MOSE Dataset Examples

Evaluation

Online Evaluation (🔥ready now!)

● Following DAVIS, we use Region Jaccard J, Boundary F measure F, and their mean J&F as the evaluation metrics.
● For the validation set, the first-frame annotations are released to indicate which objects are considered in evaluation.
● The validation set online evaluation server is [here] for daily evaluation.
● The test set online evaluation server will be open during the competition period only (TBD).
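
As a rough illustration of the metrics named above, the following is a minimal sketch of J, F, and J&F for a single object in a single frame. It is not the official evaluation code: the function names, the nested-list mask format, and the pixel tolerance `tol` are assumptions made for this example, and the boundary matching is a simplified stand-in for the procedure used by the DAVIS toolkit. Scores on the evaluation server are additionally averaged over objects and frames, which this sketch does not do.

```python
from typing import List, Set, Tuple

Mask = List[List[int]]  # binary mask stored as rows of 0/1 (assumed format)


def jaccard(pred: Mask, gt: Mask) -> float:
    """Region similarity J: intersection-over-union of two binary masks."""
    inter = 0
    union = 0
    for prow, grow in zip(pred, gt):
        for p, g in zip(prow, grow):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def boundary(mask: Mask) -> Set[Tuple[int, int]]:
    """Foreground pixels with a 4-neighbour that is background or off-image."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    points = set()
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    points.add((y, x))
                    break
    return points


def boundary_f(pred: Mask, gt: Mask, tol: int = 1) -> float:
    """Simplified boundary F: harmonic mean of boundary precision and recall.

    A boundary pixel counts as matched if the other boundary has a pixel
    within `tol` pixels in both row and column (an assumption of this sketch).
    """
    bp, bg = boundary(pred), boundary(gt)
    if not bp and not bg:
        return 1.0
    if not bp or not bg:
        return 0.0

    def near(p: Tuple[int, int], others: Set[Tuple[int, int]]) -> bool:
        return any(abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol
                   for q in others)

    precision = sum(near(p, bg) for p in bp) / len(bp)
    recall = sum(near(g, bp) for g in bg) / len(bg)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def j_and_f(pred: Mask, gt: Mask, tol: int = 1) -> Tuple[float, float, float]:
    """Return (J, F, J&F), where J&F is the mean of J and F."""
    j = jaccard(pred, gt)
    f = boundary_f(pred, gt, tol)
    return j, f, (j + f) / 2


# Toy 4x4 example: a 2x2 ground-truth square and a prediction shifted by one column.
gt_mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
pred_mask = [[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 0]]
print(j_and_f(pred_mask, gt_mask, tol=0))
```

The toy masks overlap in two of six occupied pixels, so J is 1/3 here; F depends on the tolerance chosen. For numbers comparable to the leaderboard, use the official evaluation server rather than this sketch.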

MOSE Challenge Organizers


● Henghui Ding (Primary Organizer), Fudan University
● Chang Liu (Primary Organizer), Nanyang Technological University
● Shuting He, Nanyang Technological University
● Xudong Jiang, Nanyang Technological University
● Philip H.S. Torr, University of Oxford
● Song Bai, ByteDance

BibTeX

Please consider citing MOSE if it helps your research.

@inproceedings{MOSE,
  title={{MOSE}: A New Dataset for Video Object Segmentation in Complex Scenes},
  author={Ding, Henghui and Liu, Chang and He, Shuting and Jiang, Xudong and Torr, Philip HS and Bai, Song},
  booktitle={ICCV},
  year={2023}
}

License

MOSE is licensed under a CC BY-NC-SA 4.0 License. The MOSE data is released for non-commercial research purposes only.

CVPR 2024 Complex Video Object Segmentation Challenge