Problem class, ground truth annotations and reported metrics

Last updated: November 18th, 2020

Introduction

To perform benchmarking, ground truth annotations should be encoded in a format that is specific to the associated problem class. BIA workflows are also expected to output results in the same format.

BIAFLOWS currently supports 9 problem classes. Their respective annotation formats and computed benchmark metrics are described below.

Note: each problem class has an explicit long name and a short name (e.g. Object Segmentation / ObjSeg). The same holds for metrics (e.g. DICE / DC).

A description of each benchmark is available on the workflow runs result table by clicking on the symbol.


Problem Class

Problem Class Tasks Shortname Annotation Example Metrics Tools
Object Segmentation Delineate objects or isolated regions ObjSeg
Label masks

Label masks
sample
  • DICE (DC) and AVERAGE_HAUSDORFF_DISTANCE (AHD), computed by VISCERAL executable (archived here)
  • Fraction overlap (FOVL) computed by custom Python code
  • Mean Average Precision computed by Data Science Bowl 2018 Python code
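
As an illustration of the DICE (DC) metric listed above, the following is a minimal sketch of the standard Dice coefficient, not the VISCERAL implementation. It assumes both label masks are binarized (any non-zero pixel counts as foreground) and that two empty masks score 1.0. The names `foreground` and `dice_coefficient` and the nested-list mask representation are hypothetical.

```python
def foreground(mask):
    """Return the set of (row, col) coordinates of non-zero pixels in a 2D mask."""
    return {
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value != 0
    }


def dice_coefficient(reference, prediction):
    """Standard Dice coefficient: 2 * |A & B| / (|A| + |B|) on binarized masks."""
    ref_fg = foreground(reference)
    pred_fg = foreground(prediction)
    total = len(ref_fg) + len(pred_fg)
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * len(ref_fg & pred_fg) / total


# Toy label masks: gray level encodes the object ID, 0 is background.
reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 2, 2],
]
prediction = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 3],
]

print(dice_coefficient(reference, prediction))  # 0.8
```

In this toy case the reference has 6 foreground pixels, the prediction has 4, and all 4 predicted pixels fall inside the reference foreground, giving 2 × 4 / 10 = 0.8.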
Pixel/Voxel Classification Estimate the class of each pixel PixCla
Label masks

Label masks
sample
  • F1_SCORE (F1), ACCURACY (ACC), PRECISION (PR), RECALL (RE), computed by custom Python code
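
The four pixel classification metrics above can be sketched from a per-pixel confusion matrix. This is a minimal illustration under stated assumptions, not the BIAFLOWS code: it assumes a two-class problem in which label `1` is the positive class, and it returns 0.0 when a ratio is undefined. The function name `classification_metrics` and its `positive` parameter are hypothetical.

```python
def classification_metrics(reference, prediction, positive=1):
    """Compute F1, accuracy, precision and recall from two same-sized 2D masks."""
    tp = fp = fn = tn = 0
    for ref_row, pred_row in zip(reference, prediction):
        for ref_px, pred_px in zip(ref_row, pred_row):
            ref_pos = ref_px == positive
            pred_pos = pred_px == positive
            if ref_pos and pred_pos:
                tp += 1
            elif pred_pos:
                fp += 1
            elif ref_pos:
                fn += 1
            else:
                tn += 1

    total = tp + fp + fn + tn
    # Assumption: undefined ratios (zero denominators) are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"F1": f1, "ACC": accuracy, "PR": precision, "RE": recall}


reference = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
prediction = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
]

metrics = classification_metrics(reference, prediction)
print(metrics)
```

Here 2 pixels are true positives, 1 is a false positive, 1 is a false negative and 4 are true negatives, so precision and recall are both 2/3 and accuracy is 6/8.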
Spot/Object Counting Estimate the number of objects SptCnt
2D/3D binary masks, exactly 1 spot/object per non null pixel

Binary masks
sample
  • RELATIVE_ERROR_COUNT (REC), computed by custom Python code.
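
Because each non-null pixel of a counting mask encodes exactly one spot/object, counts can be read directly from the masks. The sketch below assumes that RELATIVE_ERROR_COUNT is the absolute count difference divided by the reference count; that formula, the handling of an empty reference, and the function names are assumptions made for illustration, not taken from the BIAFLOWS code.

```python
def count_spots(mask):
    """Count non-null pixels; each one encodes exactly one spot/object."""
    return sum(1 for row in mask for value in row if value != 0)


def relative_error_count(reference, prediction):
    """Assumed definition: |n_pred - n_ref| / n_ref."""
    n_ref = count_spots(reference)
    n_pred = count_spots(prediction)
    if n_ref == 0:
        # Assumption: no error if both are empty, unbounded error otherwise.
        return 0.0 if n_pred == 0 else float("inf")
    return abs(n_pred - n_ref) / n_ref


reference = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]
prediction = [
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
]

print(relative_error_count(reference, prediction))  # 0.25
```

The reference holds 4 spots and the prediction 5, so the relative error under this assumed definition is 1 / 4 = 0.25.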
Spot/Object Detection Detect objects in an image (e.g. nucleus) ObjDet
Label masks

Binary masks
sample
  • CONFUSION_MATRIX (TP, FN, FP), F1_SCORE (F1), PRECISION (PR), RECALL (RE), Distance RMSE (RMSE), computed by Particle Tracking Challenge metric Java code (particle matching only, archived here in bin/DetectionPerformance.jar)
Filament Tree Tracing Estimate the medial axis of a connected filament tree network (one per image) TreTrc
SWC

SWC
sample
SWC format
  • UNMATCHED_VOXEL_RATE (UVR), computed by custom Python code
  • NetMets metrics: Geometric False Negative rate (FNR) and Geometric False Positive rate (FPR), computed by NetMets Python code.
  • GATING_DIST (UVR): Maximum distance between skeleton voxels in reference and prediction skeletons to be considered as matched (default = 5 pix)
  • Sigma (NetMets): tolerance in centreline position (default: 5 pix).
Filament Networks Tracing Estimate the medial axis of one or several connected filament network(s) LooTrc
Skeleton binary masks

Skeleton binary masks
sample
  • UNMATCHED_VOXEL_RATE (UVR), computed by custom Python code
  • NetMets metrics: Geometric False Negative rate (FNR) and Geometric False Positive rate (FPR), computed by NetMets Python code.
  • GATING_DIST (UVR): Maximum distance between skeleton voxels in reference and prediction skeletons to be considered as matched (default = 5 pix)
  • Sigma (NetMets): tolerance in centreline position (default: 5 pix)
  • Skeleton sampling distance (NetMets): skeletons are sampled to be converted to OBJ models (default: 3 voxels, default Z ratio: 1).
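
To make the role of GATING_DIST concrete, here is a rough sketch of an unmatched voxel rate on 2D skeleton binary masks. It assumes that a skeleton voxel is matched when any voxel of the other skeleton lies within GATING_DIST (Euclidean distance, in pixels), and that the rate is the number of unmatched voxels in both skeletons divided by the total number of skeleton voxels. The normalization, the brute-force search and all function names are assumptions for illustration and may differ from the BIAFLOWS implementation.

```python
import math


def skeleton_voxels(mask):
    """List the (row, col) coordinates of non-null pixels in a 2D skeleton mask."""
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value != 0
    ]


def count_unmatched(source, target, gating_dist):
    """Count voxels of `source` with no voxel of `target` within gating_dist."""
    return sum(
        1
        for p in source
        if not any(math.dist(p, q) <= gating_dist for q in target)
    )


def unmatched_voxel_rate(reference, prediction, gating_dist=5.0):
    """Assumed definition: unmatched voxels (both skeletons) / all skeleton voxels."""
    ref = skeleton_voxels(reference)
    pred = skeleton_voxels(prediction)
    total = len(ref) + len(pred)
    if total == 0:
        return 0.0  # Assumption: two empty skeletons have nothing unmatched.
    unmatched = count_unmatched(ref, pred, gating_dist)
    unmatched += count_unmatched(pred, ref, gating_dist)
    return unmatched / total


reference = [
    [1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
prediction = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
]

uvr_strict = unmatched_voxel_rate(reference, prediction, gating_dist=1.0)
uvr_default = unmatched_voxel_rate(reference, prediction)
print(uvr_strict, uvr_default)
```

With a gating distance of 1 pixel, 2 reference voxels and 1 predicted voxel have no counterpart, so 3 of the 7 skeleton voxels are unmatched. With the default of 5 pixels every voxel finds a counterpart and the rate drops to 0, which shows how strongly the parameter affects the reported value.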
Landmark Detection Estimate the position of specific feature points LndDet
2D/3D class masks, exactly 1 landmark per non null pixel, gray level encodes landmark class (1 to N, N is the number of landmarks)

Label masks
sample
  • Number of reference / predicted landmarks (NREF, NPRED)
  • Mean distance from predicted landmarks to closest reference landmarks with same class (MRE).
  • All metrics computed by custom Python code.
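
The landmark metrics above can be sketched directly from two class masks. This is a minimal illustration, not the BIAFLOWS code: it assumes 2D masks, Euclidean distances in pixels, and that predicted landmarks whose class is absent from the reference are skipped when averaging. The function names are hypothetical.

```python
import math
from collections import defaultdict


def landmarks_by_class(mask):
    """Group (row, col) landmark positions by class (the non-null gray level)."""
    points = defaultdict(list)
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value != 0:
                points[value].append((r, c))
    return points


def landmark_metrics(reference, prediction):
    """Return NREF, NPRED and the mean distance to the closest same-class reference."""
    ref = landmarks_by_class(reference)
    pred = landmarks_by_class(prediction)
    distances = []
    for cls, pred_points in pred.items():
        ref_points = ref.get(cls)
        if not ref_points:
            # Assumption: classes missing from the reference are ignored.
            continue
        for p in pred_points:
            distances.append(min(math.dist(p, q) for q in ref_points))
    return {
        "NREF": sum(len(v) for v in ref.values()),
        "NPRED": sum(len(v) for v in pred.values()),
        "MRE": sum(distances) / len(distances) if distances else float("nan"),
    }


# Gray level encodes the landmark class (1 to N), 0 is background.
reference = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 2],
]
prediction = [
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]

result = landmark_metrics(reference, prediction)
print(result)
```

In this example the class 1 landmark is off by 1 pixel and the class 2 landmark by 3 pixels, so MRE is 2.0 with NREF = NPRED = 2.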
Particle Tracking Estimate the tracks followed by particles (no division) PrtTrk
2D/3D label masks, exactly 1 particle per non null pixel, gray level encodes particle track ID

Label masks
sample
  • Normalized pairing score alpha (NPSA)
  • Full normalized pairing score beta (FNPSB)
  • Number of reference tracks (NRT)
  • Number of candidate tracks (NCT)
  • Jaccard Similarity Tracks (JST)
  • Number of paired tracks (NPT)
  • Number of missed tracks (NMT)
  • Number of spurious tracks (NST)
  • Number of reference detections (NRD)
  • Number of candidate detections (NCD)
  • Jaccard similarity detections (JSD)
  • Number of paired detections (NPD)
  • Number of missed detections (NMD)
  • Number of spurious detections (NSD)
  • All metrics computed by Particle Tracking Challenge Java code (archived here).

  • GATING_DIST: maximum distance between particle detections in reference / prediction tracks to be considered as matching (default = 5)
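
The two Jaccard similarity metrics (JST, JSD) can be related to the paired, missed and spurious counts listed above. The sketch below uses the standard Jaccard index, paired / (paired + missed + spurious), and assumes that this is how the counts combine. It does not reproduce the pairing step itself, which is where GATING_DIST is applied by the Particle Tracking Challenge code. The function name is hypothetical.

```python
def jaccard_similarity(paired, missed, spurious):
    """Standard Jaccard index from paired, missed and spurious counts."""
    total = paired + missed + spurious
    if total == 0:
        # Assumption: nothing to pair on either side counts as full agreement.
        return 1.0
    return paired / total


# Hypothetical counts for illustration only.
jsd = jaccard_similarity(paired=8, missed=1, spurious=1)  # from NPD, NMD, NSD
jst = jaccard_similarity(paired=3, missed=1, spurious=0)  # from NPT, NMT, NST

print(jsd, jst)  # 0.8 0.75
```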
Object Tracking Estimate object tracks and segmentation masks (with possible divisions) ObjTrk
2D/3D label masks, gray level encodes object ID + division text file (see Cell Tracking Challenge format)

Label masks + Division text file
sample
  • Segmentation measure (SEG), implementation archived here
  • Tracking measure (TRA), implementation archived here
  • All metrics computed by Cell Tracking Challenge metric command-line executables.