Teeth3DS+: An Extended Benchmark for Intraoral 3D Scans Analysis

Abstract

Teeth3DS+ offers a large variety of intraoral 3D scans annotated for several perception tasks, namely teeth detection, segmentation, and labeling. Dental landmarks are also annotated for a subset of the scans. Teeth3DS+ was used to organize two MICCAI challenges: 3DTeethSeg at MICCAI 2022 in Singapore and 3DTeethLand at MICCAI 2024 in Marrakesh.

3DTeethLand Challenge MICCAI 2024

The 3DTeethLand challenge, associated with MICCAI 2024, is the second edition of the 3DTeethSeg22 challenge. It is organized by Udini (France) in collaboration with Inria Grenoble Morpheo team (France) and the Digital Research Center of Sfax (Tunisia).

In the previous edition of the challenge, known as 3DTeethSeg22 challenge, the focus was on teeth segmentation and labeling from intraoral 3D scans. Building upon this foundation and seeking to enhance our comprehension of intraoral scans, we are thrilled to introduce a more complex task in this challenge: 3D Teeth Landmark Detection.

Main objective

The objective of the 3D Teeth Landmark Detection task is to create algorithms that can automatically identify essential landmarks on individual teeth using 3D intraoral scans. These landmarks play a vital role in orthodontic treatment planning and assessment by providing crucial anatomical references for tooth alignment and positioning.

Data Description

To facilitate accurate analysis and understanding of tooth positioning and alignment, we define specific dental landmarks on each tooth, as illustrated in the figure below.

1- Mesial (red) and Distal (green) Points:

These points are located on the proximal surfaces of the tooth. The mesial point is on the side of the tooth closer to the midline of the mouth, while the distal point is on the side farther from the midline. These points are important for determining the alignment and positioning of the tooth in the dental arch.

2- Cusp Point (blue):

The cusp is the pointed or rounded mound on the chewing surface of a tooth. The cusp point is located at the highest point of the cusp. It is significant for understanding the occlusion (bite) relationship between the upper and lower teeth.

3- Inner (yellow) and Outer (cyan) Points:

These points are located at the limits of the tooth, where the tooth meets the gingiva (gum tissue). The inner point is on the inner side of the tooth, closer to the tongue or palate, while the outer point is on the outer side of the tooth, closer to the cheek or lips. These points are important for determining the pose of the tooth, including its orientation.

4- Facial Axis (magenta) Point:

This point is located at the midpoint of the facial surface of each tooth. The facial surface is the surface of the tooth that is visible from the front of the mouth. The facial axis point is important for determining the angulation and inclination of the tooth.

The landmark-annotated dataset consists of 340 intraoral scans (IOS). It is divided into two groups: 240 scans from the Teeth3DS dataset, which serve as the training set for the 3DTeethLand Challenge and also carry segmentation and labeling annotations, and 100 additional scans without segmentation or labeling annotations, which form the hidden private test set.

The landmark annotations are provided in JavaScript Object Notation (JSON) format for each IOS scan in the Teeth3DS dataset. An example of the scanname_arch_kpt.json file format is shown below:

{
  "version":"1.0",
  "description":"landmarks",
  "key":"01A6HAN6_lower.obj",  # lower arch of scan named 01A6HAN6, which can be found in Teeth3DS files.
  "objects":[  # list of landmarks
    {
      "key":"uuid_0",  # unique id for the keypoint
      "class":"Mesial",  # the class of the keypoint
      "coord":[  # xyz coordinate of the keypoint
        2.3146634105298736,
        -14.671770076868356,
        -82.42080180486484
      ]
    },
    ...
  ]
}
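
As a minimal sketch of how such a file could be consumed, the Python snippet below parses an annotation and groups the keypoints by class. It assumes the files on disk are plain JSON without the `#` comments used above for explanation. The inline `SAMPLE` string, the second keypoint (`uuid_1`) with its coordinates, and the helper name `group_landmarks` are hypothetical and only serve the illustration.

```python
import json
from collections import defaultdict

# Hypothetical, trimmed-down annotation with the same fields as the example above.
# The second keypoint and its coordinates are made up for illustration.
SAMPLE = """
{
  "version": "1.0",
  "description": "landmarks",
  "key": "01A6HAN6_lower.obj",
  "objects": [
    {"key": "uuid_0", "class": "Mesial",
     "coord": [2.3146634105298736, -14.671770076868356, -82.42080180486484]},
    {"key": "uuid_1", "class": "Distal", "coord": [0.0, 0.0, 0.0]}
  ]
}
"""


def group_landmarks(annotation):
    """Return {class name: [(x, y, z), ...]} from a parsed annotation dict."""
    by_class = defaultdict(list)
    for obj in annotation["objects"]:
        by_class[obj["class"]].append(tuple(obj["coord"]))
    return dict(by_class)


annotation = json.loads(SAMPLE)
landmarks = group_landmarks(annotation)

# "01A6HAN6_lower.obj" -> scan name "01A6HAN6", arch "lower"
scan_name, arch = annotation["key"].rsplit(".", 1)[0].rsplit("_", 1)
```

For a real file, `json.loads(SAMPLE)` would be replaced by `json.load` on the opened scanname_arch_kpt.json file.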

Download

The landmark annotation files for the Teeth3DS scans can be downloaded from the following link: url : https://osf.io/um96h/

Evaluation metrics

Mean Average Precision

The Mean Average Precision (mAP) evaluates the accuracy of keypoint localization by assessing both the confidence of predictions and their alignment with ground truth across multiple distance thresholds. It is computed by first generating precision-recall curves at varying confidence score cutoffs, then calculating the area under these curves for each distance threshold and landmark category. The mAP is derived by averaging the AP values across all thresholds for each category. A higher mAP reflects the model's ability to make confident, accurate predictions while minimizing false positives and poorly localized detections, emphasizing the precision aspect of model performance.
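
The computation described above can be sketched in Python for a single landmark category. This is an illustrative approximation, not the challenge's evaluation code: the greedy matching rule (each prediction claims the closest unmatched ground-truth point within the threshold), the rectangle-rule integration of the precision-recall curve, and the function names `average_precision` and `mean_ap` are all assumptions. The threshold values are also placeholders, since the text does not list them.

```python
import math


def average_precision(preds, gts, threshold):
    """AP for one landmark category at one distance threshold.

    preds: list of (confidence, (x, y, z)); gts: list of (x, y, z).
    A prediction is a true positive if it lies within `threshold` of a
    ground-truth point that no higher-confidence prediction has claimed.
    """
    if not gts:
        return 0.0
    matched = set()
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    for _, point in sorted(preds, key=lambda p: p[0], reverse=True):
        # Find the closest still-unmatched ground-truth point within range.
        best, best_d = None, threshold
        for i, gt in enumerate(gts):
            d = math.dist(point, gt)
            if i not in matched and d <= best_d:
                best, best_d = i, d
        if best is None:
            fp += 1
        else:
            matched.add(best)
            tp += 1
        precision = tp / (tp + fp)
        recall = tp / len(gts)
        ap += precision * (recall - prev_recall)  # rectangle rule under the PR curve
        prev_recall = recall
    return ap


def mean_ap(preds, gts, thresholds):
    """Average the per-threshold AP values for one category."""
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)


# Toy data: two ground-truth points, three predictions with confidences.
gts = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
preds = [(0.9, (0.2, 0.0, 0.0)), (0.8, (5.0, 0.0, 0.0)), (0.6, (10.5, 0.0, 0.0))]
ap_loose = average_precision(preds, gts, 1.0)
ap_strict = average_precision(preds, gts, 0.3)
```

In this toy case the poorly localized middle prediction lowers precision at both thresholds, and the third prediction only counts as a hit at the looser threshold.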

Mean Average Recall

Mean Average Recall (mAR) measures the model's ability to detect all ground truth keypoints, providing an assessment of prediction completeness. It is computed as the average recall across various distance thresholds. Recall is calculated at each threshold as the ratio of correctly detected keypoints to the total number of ground truth keypoints, and the area under the recall curve is determined for each landmark category across all thresholds. The mAR is then derived by averaging the AR values for each category, resulting in a separate mAR value for every landmark category. A higher mAR reflects the model's capacity to detect a significant proportion of true keypoints, independent of prediction confidence, making it a valuable metric for evaluating coverage.
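
A minimal Python sketch of the recall side follows, again for a single landmark category. It averages recall over a list of distance thresholds and ignores prediction confidence, as described above. The one-to-one greedy matching, the function names `recall_at` and `average_recall`, and the threshold values are assumptions made for illustration.

```python
import math


def recall_at(preds, gts, threshold):
    """Fraction of ground-truth points with an unclaimed prediction within `threshold`."""
    unused = list(preds)
    hits = 0
    for gt in gts:
        candidates = [p for p in unused if math.dist(p, gt) <= threshold]
        if candidates:
            closest = min(candidates, key=lambda p: math.dist(p, gt))
            unused.remove(closest)  # each prediction can explain one keypoint only
            hits += 1
    return hits / len(gts) if gts else 0.0


def average_recall(preds, gts, thresholds):
    """Mean recall over the given distance thresholds."""
    return sum(recall_at(preds, gts, t) for t in thresholds) / len(thresholds)


# Toy data: three ground-truth points, two predictions (one keypoint is missed).
gts = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
preds = [(0.2, 0.0, 0.0), (10.5, 0.0, 0.0)]
ar = average_recall(preds, gts, [0.3, 1.0])
```

Here recall is 1/3 at the strict threshold and 2/3 at the loose one, so the missed third keypoint caps the average recall regardless of how confident the predictions are.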

Ranking

To ensure robust rankings, we will employ a point-based ranking method enhanced by bootstrapping. The process begins with the computation of the mAP and mAR metrics for each landmark category. Teams are then pairwise compared for each metric using the Wilcoxon Signed Rank Test. A team is awarded one point for each comparison where it is deemed statistically superior (p-value < 0.001), resulting in a "total point count" that reflects the number of comparisons won. Bootstrapping is applied by resampling 10% of the data and repeating the pairwise comparison process on the remaining data, generating a "total point count" for each resampling iteration. This process is repeated 100 times. The final point score for each team is normalized by the total number of comparisons. The normalized scores are then aggregated to produce the final ranking, ensuring a statistically robust and fair evaluation of performance.
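
The point-based procedure can be sketched in Python for a single metric. Several parts of this sketch are assumptions rather than the organizers' implementation: the per-case score lists, the normalizer (iterations times the number of opponents), and the function names. To stay free of third-party dependencies, a one-sided exact sign test is swapped in for the Wilcoxon Signed Rank Test; a real implementation would use the Wilcoxon test instead.

```python
import math
import random


def sign_test_p(a, b):
    """One-sided exact sign test p-value for 'a tends to exceed b' (paired samples).

    Stand-in for the Wilcoxon Signed Rank Test, used here only to avoid
    third-party dependencies.
    """
    wins = sum(x > y for x, y in zip(a, b))
    n = sum(x != y for x, y in zip(a, b))  # ties are dropped
    if n == 0:
        return 1.0
    return sum(math.comb(n, k) for k in range(wins, n + 1)) / 2 ** n


def rank_score(scores, n_iter=100, drop=0.10, alpha=0.001, seed=0):
    """scores: {team: [per-case metric values]}, all lists aligned case by case."""
    rng = random.Random(seed)
    teams = list(scores)
    n_cases = len(next(iter(scores.values())))
    points = dict.fromkeys(teams, 0)
    for _ in range(n_iter):
        # Leave out a random 10% of the cases and compare teams on the rest.
        kept = rng.sample(range(n_cases), n_cases - int(drop * n_cases))
        for a in teams:
            for b in teams:
                if a != b:
                    pa = [scores[a][i] for i in kept]
                    pb = [scores[b][i] for i in kept]
                    if sign_test_p(pa, pb) < alpha:
                        points[a] += 1  # one point per comparison won
    comparisons = n_iter * (len(teams) - 1)  # assumed normalizer
    return {t: points[t] / comparisons for t in teams}


# Hypothetical per-case scores for three made-up teams over 20 cases.
scores = {
    "A": [0.90 + 0.001 * i for i in range(20)],
    "B": [0.80 + 0.001 * i for i in range(20)],
    "C": [0.70 + 0.001 * i for i in range(20)],
}
result = rank_score(scores)
```

With these toy scores, team A beats both opponents in every iteration and team B beats only C. In the challenge setting the same procedure would be repeated for each metric before the normalized scores are aggregated.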

Leaderboard

Team	Rank Score	AP_cusp	AP_facial	AP_inner_outer	AP_mesial_distal	mAP	AR_cusp	AR_facial	AR_inner_outer	AR_mesial_distal	mAR
Radboud	0.9172	0.772	0.768	0.793	0.792	0.785	0.675	0.637	0.661	0.651	0.656
ChohoTech	0.8325	0.765	0.761	0.78	0.781	0.775	0.627	0.586	0.625	0.672	0.634
YY-LAB	0.6224	0.684	0.726	0.748	0.705	0.719	0.719	0.569	0.601	0.576	0.579
YN-LAB	0.3171	0.656	0.667	0.61	0.657	0.643	0.538	0.522	0.511	0.539	0.527
IGIP-LAB	0.1358	0.636	0.59	0.634	0.523	0.59	0.519	0.445	0.505	0.41	0.466
CG_sayaka	0.1253	0.574	0.553	0.531	0.529	0.541	0.55	0.476	0.501	0.481	0.498
3DIMLAND	0.0325	0.594	0.621	0.551	0.575	0.578	0.457	0.459	0.459	0.459	0.438

3DTeethSeg Challenge MICCAI 2022

The 3DTeethSeg22 challenge is the first edition, associated with MICCAI 2022. It is organized by Udini (France) in collaboration with Inria Grenoble Morpheo team (France) and the Digital Research Center of Sfax (Tunisia).

Main objective

The main objective of the 3DTeethSeg’22 challenge was to develop and evaluate algorithms for teeth localization, segmentation, and labeling from intra-oral 3D scans. This challenge aimed to address the difficulties posed by variations in dental anatomy, imaging protocols, and the limited availability of publicly accessible data, ultimately advancing automated teeth analysis for improved dental diagnostics and treatment planning.

Data description

A total of 1800 3D intra-oral scans have been collected for 900 patients covering their upper and lower jaws separately. The ground truth tooth labels and tooth instances for each vertex in the obj files are provided in JavaScript Object Notation (JSON) format. A JSON file example is shown below:

{
    "id_patient": "6X24ILNE",
    "jaw": "lower",
    "labels": [0, 0, 44, 33, 34, 0, 0, 45, 0, .. ,41,  0, 0, 37, 0, 34, 45, 0, 31, 36],
    "instances": [0, 0, 10, 2, 12, 0, 0, 9, 0, 0, .. , 10, 0, 0, 8, 0, 0, 9, 0, 1, 8, 13]
}

The "labels" and "instances" arrays have the same length as the number of vertices in the corresponding 3D scan. The label and instance value "0" is reserved for gingiva. The number of distinct non-zero values in "instances" gives the number of teeth in the 3D scan. The labels follow the FDI numbering system.
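
The conventions above can be checked with a few lines of Python. The snippet is a sketch on a hypothetical 10-vertex toy annotation. The `SAMPLE` values and the helper name `summarize` are made up for illustration, and real files contain one entry per mesh vertex.

```python
import json

# Hypothetical toy annotation for a 10-vertex "scan"; real files have one entry per vertex.
SAMPLE = """
{
  "id_patient": "6X24ILNE",
  "jaw": "upper",
  "labels":    [0, 0, 11, 11, 12, 0, 12, 13, 13, 0],
  "instances": [0, 0, 1, 1, 2, 0, 2, 3, 3, 0]
}
"""


def summarize(annotation, n_vertices):
    """Return ({instance id: FDI label}, number of gingiva vertices)."""
    labels, instances = annotation["labels"], annotation["instances"]
    if not (len(labels) == len(instances) == n_vertices):
        raise ValueError("labels/instances must have one entry per vertex")
    # Map each tooth instance id to its FDI label; id 0 is gingiva.
    teeth = {}
    for label, inst in zip(labels, instances):
        if inst != 0:
            teeth.setdefault(inst, label)
    gingiva = sum(1 for inst in instances if inst == 0)
    return teeth, gingiva


teeth, gingiva_vertices = summarize(json.loads(SAMPLE), n_vertices=10)
```

The number of teeth in the scan is then `len(teeth)`, and `n_vertices` would come from the vertex count of the matching obj file.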

Download

The dataset is split into 6 parts. All of them must be downloaded and merged into a single folder structure. url : https://osf.io/xctdy/

Dataset splits

Two train/test splits are provided, each specifying the samples to use:
3D Teeth Seg Challenge split (used during the challenge)
Teeth3DS official dataset split

Evaluation metrics

Teeth localization accuracy (TLA): calculated as the mean of normalized Euclidean distance between ground truth (GT) teeth centroids and the closest localized teeth centroid. Each computed Euclidean distance is normalized by the size of the corresponding GT tooth. In case of no centroid (e.g. algorithm crashes or missing output for a given scan) a nominal penalty of 5 per GT tooth will be given. This corresponds to a distance 5 times the actual GT tooth size. As the number of teeth per patient may be variable, here the mean is computed over all gathered GT Teeth in the two testing sets.
Teeth identification rate (TIR): computed as the percentage of true identification cases relative to all GT teeth in the two testing sets. A true identification is counted when, for a given GT tooth, the closest detected tooth centroid is localized at a distance under half of the GT tooth size and is assigned the same label as the GT tooth.
Teeth segmentation accuracy (TSA): is computed as the average F1-score over all instances of teeth point clouds. The F1-score of each tooth instance is measured as: F1=2*(precision * recall)/(precision+recall)
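
The three definitions above can be sketched in Python as follows. This is an illustrative reading of the text, not a copy of the challenge scripts. The input layout, the function names `tla_tir` and `tooth_f1`, and the handling of a scan with no predicted centroid are assumptions, and the tooth size is taken as a given number because its exact definition is not stated here.

```python
import math

PENALTY = 5.0  # nominal normalized distance when no centroid is available


def tla_tir(gt_teeth, pred_teeth):
    """gt_teeth: list of (label, centroid, size); pred_teeth: list of (label, centroid).

    Returns (TLA, TIR) over the given ground-truth teeth. `size` is whatever
    tooth-size measure the evaluation uses; it is treated here as a given number.
    """
    distances, identified = [], 0
    for label, centroid, size in gt_teeth:
        if not pred_teeth:
            distances.append(PENALTY)
            continue
        # Closest predicted centroid to this ground-truth tooth.
        p_label, p_centroid = min(pred_teeth, key=lambda p: math.dist(p[1], centroid))
        d = math.dist(p_centroid, centroid) / size  # normalized by GT tooth size
        distances.append(d)
        if d < 0.5 and p_label == label:
            identified += 1
    n = len(gt_teeth)
    return sum(distances) / n, identified / n


def tooth_f1(gt_vertices, pred_vertices):
    """F1-score between two sets of vertex indices for one tooth instance."""
    tp = len(gt_vertices & pred_vertices)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_vertices)
    recall = tp / len(gt_vertices)
    return 2 * precision * recall / (precision + recall)


# Toy example: two GT teeth of size 10, the second one gets the wrong label.
gt = [(11, (0.0, 0.0, 0.0), 10.0), (12, (10.0, 0.0, 0.0), 10.0)]
pred = [(11, (1.0, 0.0, 0.0)), (13, (12.0, 0.0, 0.0))]
tla, tir = tla_tir(gt, pred)
f1 = tooth_f1({1, 2, 3, 4}, {3, 4, 5})
```

In the toy example both teeth are localized well (normalized distances 0.1 and 0.2), but only one carries the right label, so TIR is 0.5. TSA would be the mean of `tooth_f1` over all tooth instances.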

📌 NOTE: Metrics calculation scripts are gathered in evaluation.py in the challenge GitHub repository.

Leaderboard

Team	Exp(-TLA)	TSA	TIR	SCORE	Github link
CGIP	0.9658	0.9859	0.9100	0.9539	GitHub
FiboSeg	0.9924	0.9293	0.9223	0.9480	GitHub
IGIP	0.9244	0.9750	0.9289	0.9427	GitHub
TeethSeg	0.9184	0.9678	0.8538	0.9133	GitHub
OS	0.7845	0.9693	0.8940	0.8826	GitHub
Radboud	0.6242	0.8886	0.8795	0.7974	GitHub
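
The SCORE column appears to be the arithmetic mean of the three other metric columns. This is an observation drawn from the numbers in the table, not a rule stated in the text, and the short Python check below only verifies that the listed values are consistent with it up to rounding. The function name `mean_score` is hypothetical.

```python
# (Exp(-TLA), TSA, TIR, SCORE) copied from the leaderboard above.
LEADERBOARD = {
    "CGIP": (0.9658, 0.9859, 0.9100, 0.9539),
    "FiboSeg": (0.9924, 0.9293, 0.9223, 0.9480),
    "IGIP": (0.9244, 0.9750, 0.9289, 0.9427),
    "TeethSeg": (0.9184, 0.9678, 0.8538, 0.9133),
    "OS": (0.7845, 0.9693, 0.8940, 0.8826),
    "Radboud": (0.6242, 0.8886, 0.8795, 0.7974),
}


def mean_score(exp_neg_tla, tsa, tir):
    """Assumed aggregation: plain average of the three metrics."""
    return (exp_neg_tla + tsa + tir) / 3


# Largest difference between the recomputed mean and the listed SCORE.
max_gap = max(abs(mean_score(*row[:3]) - row[3]) for row in LEADERBOARD.values())
```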

Citing us

@article{ben2022teeth3ds,
title={{Teeth3DS+: An Extended Benchmark for Intra-oral 3D Scans Analysis}},
author={Ben-Hamadou, Achraf and Neifar, Nour and Rekik, Ahmed and Smaoui, Oussama and Bouzguenda, Firas and Pujades, Sergi and  Boyer, Edmond and Ladroit, Edouard},
journal={arXiv preprint arXiv:2210.06094},
year={2022}
}

@article{ben20233dteethseg,
title={3DTeethSeg'22: 3D Teeth Scan Segmentation and Labeling Challenge},
author={Achraf Ben-Hamadou and Oussama Smaoui and Ahmed Rekik and Sergi Pujades and Edmond Boyer and Hoyeon Lim and Minchang Kim and Minkyung Lee and Minyoung Chung and Yeong-Gil Shin and Mathieu Leclercq and Lucia Cevidanes and Juan Carlos Prieto and Shaojie Zhuang and Guangshun Wei and Zhiming Cui and Yuanfeng Zhou and Tudor Dascalu and Bulat Ibragimov and Tae-Hoon Yong and Hong-Gi Ahn and Wan Kim and Jae-Hwan Han and Byungsun Choi and Niels van Nistelrooij and Steven Kempers and Shankeeth Vinayahalingam and Julien Strippoli and Aurélien Thollot and Hugo Setbon and Cyril Trosset and Edouard Ladroit},
journal={arXiv preprint arXiv:2305.18277},
year={2023}
}
