On this page, we present the results obtained during the test phase of the data challenge (January 2020).
The feedback received during this test phase helped us improve the data challenge before setting the new deadline for phase 1 to 30 September 2020.
From the 51 participants registered on CodaLab, we received 9 submissions. We publish here the results from the ADI subchallenge (subchallenge_1/) on the VLT/SPHERE-IRDIS data taken with the K1 narrowband filter (SPHERE_IRDIS_3/).
We warmly thank the participants who submitted their results for this first test phase (note that the results submitted during the test phase remain valid for the final phase 1).
1. Data: VLT/SPHERE-IRDIS narrowband
We chose this dataset from those provided for the data challenge (presented on this page) because this is the most widely used SPHERE mode for exoplanet detection (e.g. for the SHINE large survey, Chauvin et al., 2017).
The observing conditions were not stable, so the image cube shows large temporal variations.
Figure 1. First frame of the provided data cube (left), temporal median of the data cube (middle) and normalised off-axis PSF (right). The intensity is the raw contrast, expressed in magnitudes.
2. Baseline result
In order to define the detection limit around which the synthetic planetary signals are injected, we ran a classic annular PCA from the VIP toolbox. The resulting detection map is shown in Fig. 2 and the F1-score obtained for this widely used speckle-subtraction technique is 0.45.
Figure 2. Post-processed image using a classic annular PCA, as implemented in VIP.
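To make the idea behind the baseline more concrete, here is a minimal sketch of PCA-based speckle subtraction for ADI data. It is not the VIP implementation used for Fig. 2: it runs a single full-frame PCA rather than an annular one, and the function name `pca_adi`, its parameters, and the sign convention of the derotation are assumptions made for illustration only.

```python
import numpy as np
from scipy.ndimage import rotate


def pca_adi(cube, angles, ncomp=5):
    """Full-frame PCA speckle subtraction (illustrative sketch, not VIP's annular PCA).

    cube   : array of shape (n_frames, ny, nx)
    angles : parallactic angle of each frame, in degrees
    ncomp  : number of principal components removed from each frame
    """
    n_frames, ny, nx = cube.shape
    # One row per frame, with the temporal mean removed from each pixel.
    matrix = cube.reshape(n_frames, -1)
    matrix = matrix - matrix.mean(axis=0)
    # The principal components are the leading right singular vectors.
    _, _, vt = np.linalg.svd(matrix, full_matrices=False)
    basis = vt[:ncomp]
    # Subtract the projection of each frame onto the truncated basis.
    residuals = matrix - (matrix @ basis.T) @ basis
    res_cube = residuals.reshape(n_frames, ny, nx)
    # Derotate each residual frame to align the sky (assumed sign convention),
    # then combine with a temporal median.
    derotated = np.array([
        rotate(frame, -angle, reshape=False, order=1)
        for frame, angle in zip(res_cube, angles)
    ])
    return np.median(derotated, axis=0)
```

In an annular variant, the same steps would be applied separately to concentric rings of pixels, so that the speckle model can adapt to the angular separation.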
3. Results from participants
Of the submissions received on CodaLab, 6 are valid, and their results are shown below.
The six detection maps are displayed from their minimum value to the participant-provided threshold: every resel above this threshold is considered a detection. The true positives are encircled in dashed pale yellow.
From these detection maps, we extracted the true positive fraction (TPF, see definitions here) and the false positive fraction (FPF). To do so, we first applied the same binary mask to all the detection maps (keeping radii between 15 and 70 pixels, the detection maps being 159x159 pixels). We then varied the threshold from 0.1 to 10 and counted the detections per resel, comparing them with the injected signals. The FPF (red solid line) and the TPF (green solid line) as a function of the threshold are shown below the detection maps. The goal is to minimize the area under the red curve (the FPF must be as close as possible to zero, whatever the threshold), and to maximize the area under the green curve (the TPF must be as close as possible to one, whatever the threshold).
The global F1-score (defined here, to be as close as possible to 1) is indicated below each detection map.
Figure 3. Detection maps submitted (the colorbar runs from the minimum value to the given threshold): at the participant-provided threshold, true detections are encircled in pale yellow. For each image we plotted the corresponding FPF (red) and TPF (green) as a function of threshold (the vertical line is the participant-provided threshold), and the F1-score is indicated below.
More information about each algorithm used can be found at the end of this page.
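The masking and threshold sweep described above can be sketched as follows. This is a simplified illustration, not the scoring code of the challenge: detections are counted per pixel rather than per resel, the aperture radius around each injected position and the number of threshold steps are assumed values, and the function names are hypothetical. The false positives are returned as a raw count, since the normalisation used for the FPF is given in the linked definitions and is not reproduced here.

```python
import numpy as np


def annular_mask(size=159, r_in=15, r_out=70):
    """True for pixels whose distance to the image centre lies in [r_in, r_out]."""
    yy, xx = np.indices((size, size))
    centre = (size - 1) / 2.0
    radius = np.hypot(yy - centre, xx - centre)
    return (radius >= r_in) & (radius <= r_out)


def count_detections(det_map, injected_yx, threshold, mask, aperture_radius=2.0):
    """Return (true positives, false-positive pixels) at a single threshold."""
    above = (det_map > threshold) & mask
    yy, xx = np.indices(det_map.shape)
    near_injection = np.zeros(det_map.shape, dtype=bool)
    n_tp = 0
    for y, x in injected_yx:
        # Small circular aperture around the injected position (assumed size).
        aperture = np.hypot(yy - y, xx - x) <= aperture_radius
        near_injection |= aperture
        if above[aperture].any():
            n_tp += 1
    # Any pixel above threshold away from all injections is a false positive.
    n_fp = int((above & ~near_injection).sum())
    return n_tp, n_fp


def sweep_thresholds(det_map, injected_yx, thresholds, mask):
    """TPF and false-positive count as a function of threshold."""
    counts = np.array([
        count_detections(det_map, injected_yx, t, mask) for t in thresholds
    ])
    tpf = counts[:, 0] / len(injected_yx)
    return tpf, counts[:, 1]


# Toy detection map: one injected signal and one spurious bright pixel.
toy_map = np.zeros((159, 159))
toy_map[100, 80] = 5.0   # injected companion
toy_map[109, 79] = 3.0   # spurious residual
mask = annular_mask()
thresholds = np.linspace(0.1, 10, 100)
tpf, n_fp = sweep_thresholds(toy_map, [(100, 80)], thresholds, mask)
```

Plotting `tpf` and `n_fp` against `thresholds` gives curves of the same kind as the green and red lines shown below the detection maps.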
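For reference, a short sketch of how an F1-score can be computed from detection counts. It assumes the standard definition (harmonic mean of precision and recall), which may differ in detail from the linked definition used by the challenge; the function name is hypothetical.

```python
def f1_score(n_tp, n_fp, n_fn):
    """Standard F1-score from true positives, false positives and false negatives.

    Equivalent to 2 * precision * recall / (precision + recall).
    Returns 0.0 when there are no detections and no injected signals.
    """
    denominator = 2 * n_tp + n_fp + n_fn
    if denominator == 0:
        return 0.0
    return 2 * n_tp / denominator
```

With this definition, a map that recovers every injected signal without any false positive reaches a score of 1, and each missed signal or false detection lowers it.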
4. Discussions
4.1 Visual inspection

First, methods built on similar concepts produce very comparable structures in their detection maps. For instance, speckle subtraction techniques (such as the baseline, PCA_padova and PCA_mpia) show similar residuals, and inverse-problem-based techniques (such as FMMF and ANDROMEDA) show similar structures due to the correlation of the residuals with the planetary signal model. The RSM map is obtained by looking for a noise-planet regime switch along the temporal axis, leading to a high signal-to-noise ratio. The STIMca map combines the PCA-subtracted data by taking into account the temporal statistics of the residual noise to optimise the signal-to-noise ratio.

Second, some algorithms process the whole field of view (FMMF and, marginally, PCA_padova), while the others work within an inscribed circle.

Third, and similarly, some algorithms process the data as close as possible to the star (STIM+), whereas others start at a larger angular separation (FMMF), depending on the underlying concept (angular subtraction, etc.).
4.2 True Positive Fraction
In terms of TPF (green curves), the STIM+ map stands out from the other methods by always revealing the 5 injected companions, even at very high thresholds. Both FMMF and PCA_padova provide a threshold (dashed vertical line) that detects the 5 injected companions.
4.3 False Positive Fraction
In terms of FPF (red curves), the RSM map stands out from the other methods, with no false positives, whatever the threshold. In contrast, the speckle subtraction technique PCA_padova shows a high number of false positives all over the field of view. Notably, FMMF and ANDROMEDA, which are based on a very similar approach (modeling and tracking the planetary signal after a speckle subtraction), show a very similar trend of false positive fraction as a function of threshold.
In terms of trade-off, FMMF is the most effective technique at minimizing the FPF while maximizing the TPF, detecting the 5 injected signals above the participant-provided threshold (hence its larger F1-score).
4.4 Running time
Ongoing
5. Additional information about the algorithms used
 Forward Modeling Matched Filter (FMMF): available in the pyKLIP package.
 ANDROMEDA (ANDRO): available in the VIP package.
 Regime Switching Model (RSM): available in the VIP package.
 Standardized Trajectory Intensity Mean (STIM): available in the VIP package.
 Principal Component Analysis (PCA, KLIP)