Synthmantic LiDAR: A Synthetic Dataset for Semantic Segmentation on LiDAR Imaging

Synthmantic LiDAR is a synthetic dataset built with a modified version of the Carla simulator. It is designed for training and evaluating semantic segmentation models on LiDAR imaging, and is set up to match the SemanticKITTI dataset. To achieve this, we added new classes to the simulator and modified many existing classes whose labels were inconsistent with SemanticKITTI. The added classes, which Carla did not previously support, include car, motorcycle, motorcyclist, bicycle, bicyclist and truck; these were previously grouped into the vehicle and person classes.
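
The class split described above can be illustrated with a small sketch. The mapping below is an assumption made for illustration only: the text states that the six classes were grouped under vehicle and person, but not which fine class fell under which coarse class, and `FINE_TO_COARSE` and `coarsen` are hypothetical names rather than part of the simulator or the dataset tooling.

```python
# Hypothetical mapping (illustration only): fine-grained class -> coarse class
# it is assumed to have been grouped under before the simulator was modified.
FINE_TO_COARSE = {
    "car": "vehicle",
    "truck": "vehicle",
    "motorcycle": "vehicle",
    "bicycle": "vehicle",
    "motorcyclist": "person",
    "bicyclist": "person",
}


def coarsen(labels):
    """Collapse fine-grained labels back to the assumed coarse classes.

    Labels that are not in the mapping (e.g. "road") are returned unchanged.
    """
    return [FINE_TO_COARSE.get(label, label) for label in labels]


# Per-point labels of a toy scan, using the fine-grained classes.
fine_labels = ["car", "bicyclist", "road", "truck", "motorcyclist"]
coarse_labels = coarsen(fine_labels)
print(coarse_labels)
```

The collapse only works in one direction: a point labeled vehicle cannot be turned back into car or truck, which is why the finer classes have to be produced by the simulator itself.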

The dataset consists of 8 sequences of 6,000 scans each, recorded on 7 unique maps, for a total of 48,000 scans, more than double the number of labeled scans in the SemanticKITTI train subset. We also define a smaller subset, SynthmanticLiDAR-LT, composed of the first 2,000 scans of each sequence.
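
The scan counts above follow from simple arithmetic, sketched below together with one possible way to select the LT subset. The sequence names and the `lt_subset` helper are hypothetical and only stand in for however the scans are actually indexed on disk.

```python
NUM_SEQUENCES = 8
SCANS_PER_SEQUENCE = 6_000
LT_SCANS_PER_SEQUENCE = 2_000

# Hypothetical index: sequence name -> ordered list of scan ids.
scans_by_sequence = {
    f"seq{s:02d}": list(range(SCANS_PER_SEQUENCE)) for s in range(NUM_SEQUENCES)
}


def lt_subset(index, scans_per_sequence=LT_SCANS_PER_SEQUENCE):
    """Keep only the first `scans_per_sequence` scans of every sequence."""
    return {name: scans[:scans_per_sequence] for name, scans in index.items()}


full_total = sum(len(scans) for scans in scans_by_sequence.values())
lt_total = sum(len(scans) for scans in lt_subset(scans_by_sequence).values())

print(full_total)  # 8 * 6,000 scans
print(lt_total)    # 8 * 2,000 scans
```

Under these numbers the LT subset holds one third of the full dataset.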

To validate our dataset, we trained SqueezeSegV3 and Cylinder3D using a naive approach: pre-train on synthetic data, then train on real data. We found that the models pre-trained on Synthmantic LiDAR outperformed the models trained on SemanticKITTI alone.

Method	car	bicycle	motorcycle	truck	other-vehicle	person	bicyclist	motorcyclist	road	parking	sidewalk	other-ground	building	fence	vegetation	trunk	terrain	pole	traffic-sign	mIoU
SPVCNN	95.7	47.7	49.5	47.2	48.3	64.1	66.7	48.2	88.5	57.7	70.7	23.2	90.1	63.9	84.5	67.7	69.0	53.1	62.1	63.0
SPVCNN-F	95.8	47.7	47.2	48.4	49.0	63.2	69.7	49.0	88.9	58.4	71.4	24.0	89.9	63.6	84.4	67.2	68.7	54.0	62.6	63.3
SPVCNN-LT	95.7	45.6	44.5	48.0	47.6	62.6	68.6	59.1	88.8	58.3	71.1	26.8	90.3	64.7	84.2	66.6	68.1	53.2	62.4	63.5

SSV3	81.4	16.0	25.3	3.7	13.3	34.0	33.1	13.5	88.8	52.8	68.4	21.9	76.1	43.3	75.6	44.1	59.9	30.3	30.6	42.7
SSV3-F	84.2	22.8	28.8	4.2	15.6	38.2	33.4	9.0	88.1	51.2	68.9	21.8	76.7	44.6	76.6	44.9	61.9	31.0	35.3	44.1
SSV3-LT	84.4	20.7	26.8	6.2	17.1	35.5	32.4	19.7	87.9	52.2	68.6	19.9	77.5	45.2	76.1	42.0	62.6	31.7	33.8	44.2
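
As a reading aid for the table, the sketch below recomputes the last column from the per-class columns. It assumes mIoU is the unweighted mean of the 19 per-class IoU values, which is the usual convention; the helper names are ours and not part of any evaluation toolkit.

```python
# Rows copied from the table above: method name, 19 per-class IoU values,
# then the reported mIoU as the last number.
TABLE = """
SPVCNN 95.7 47.7 49.5 47.2 48.3 64.1 66.7 48.2 88.5 57.7 70.7 23.2 90.1 63.9 84.5 67.7 69.0 53.1 62.1 63.0
SPVCNN-F 95.8 47.7 47.2 48.4 49.0 63.2 69.7 49.0 88.9 58.4 71.4 24.0 89.9 63.6 84.4 67.2 68.7 54.0 62.6 63.3
SPVCNN-LT 95.7 45.6 44.5 48.0 47.6 62.6 68.6 59.1 88.8 58.3 71.1 26.8 90.3 64.7 84.2 66.6 68.1 53.2 62.4 63.5
SSV3 81.4 16.0 25.3 3.7 13.3 34.0 33.1 13.5 88.8 52.8 68.4 21.9 76.1 43.3 75.6 44.1 59.9 30.3 30.6 42.7
SSV3-F 84.2 22.8 28.8 4.2 15.6 38.2 33.4 9.0 88.1 51.2 68.9 21.8 76.7 44.6 76.6 44.9 61.9 31.0 35.3 44.1
SSV3-LT 84.4 20.7 26.8 6.2 17.1 35.5 32.4 19.7 87.9 52.2 68.6 19.9 77.5 45.2 76.1 42.0 62.6 31.7 33.8 44.2
"""


def parse_row(line):
    """Split one table row into (method, per-class IoUs, reported mIoU)."""
    name, *values = line.split()
    scores = [float(v) for v in values]
    return name, scores[:-1], scores[-1]


def mean_iou(per_class):
    """Unweighted mean over classes (assumed definition of mIoU)."""
    return sum(per_class) / len(per_class)


results = {}
for line in TABLE.strip().splitlines():
    name, per_class, reported = parse_row(line)
    results[name] = (len(per_class), round(mean_iou(per_class), 1), reported)
    print(f"{name}: recomputed {results[name][1]}, reported {reported}")
```

Recomputing the mean this way is also a quick consistency check when copying individual rows into another document.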

We hope that this dataset will be useful to the community for developing new models and domain adaptation techniques for semantic segmentation on LiDAR imaging. Our work was accepted at the 2024 International Conference on Image Processing (ICIP).