The Spinal Tap

Learning Full Spine Segmentations without Training Examples

Rhydian Windsor, Amir Jamaludin, Timor Kadir, Andrew Zisserman


Want to extract 3-D models of the spine from MRI scans

Intermediate step for automated analysis of spine disease e.g. multiple myeloma, ankylosing spondylitis, disc degeneration, scoliosis


Labelled full spine scans are very difficult to get (time consuming, requires clinical expertise)

However, lots of labelled lumbar (lower back) scans available

Using annotated lumbar spine scans only, can we train a model which will generalise up the spine?

Two Approaches

  1. Deformable Part Model with Vector Field Grouping
  2. The Ladder Algorithm

Deformable Part Models

DPMs model an object as a set of parts constrained by a set of spatial arrangements

e.g. Person = 2 legs + 2 arms + face connected in a star shape

Popular approach in pose estimation

Left: Fischler et al., 1973. Right: Sun et al., 2019

Deformable Part Models

Our network predicts heatmaps for the four corners of each vertebra and for its centroid

Heatmaps for top left, bottom left, bottom right, top right and centroids of vertebrae in two lumbar scans

Get individual detections by thresholding the heatmaps and finding connected regions
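
The thresholding step above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `detect_parts`, the default threshold of 0.5, the use of 4-connectivity, and the choice to summarise each blob by its intensity-weighted centre are all assumptions.

```python
from collections import deque


def detect_parts(heatmap, threshold=0.5):
    """Return one (row, col) detection per connected blob above `threshold`.

    `heatmap` is a list of rows of floats. The threshold value and the
    4-connectivity used here are assumptions, not taken from the slides.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    seen = [[False] * cols for _ in range(rows)]
    detections = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or heatmap[r][c] < threshold:
                continue
            # Breadth-first flood fill over the 4-connected blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and heatmap[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Summarise the blob by its intensity-weighted centre
            # (assumes a positive threshold, so the total is non-zero).
            total = sum(heatmap[y][x] for y, x in blob)
            cy = sum(y * heatmap[y][x] for y, x in blob) / total
            cx = sum(x * heatmap[y][x] for y, x in blob) / total
            detections.append((cy, cx))
    return detections
```

Running this once per heatmap (four corners plus centroid) gives one list of candidate points per part type.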

Vector Field Grouping

Now that we have part detections, the next goal is to group them into individual vertebrae

Vertebrae vary in size and angle, so simply assigning each part to the closest centroid gives poor grouping.

Instead, we predict a vector field for each corner.

At each detection point, the vector field must point to the centroid of the vertebra

Vector fields for top left, bottom left, bottom right and top right corners. Blue pixels are supervised signals. Yellow is resulting field

At test time, we group parts by reading the vector field value at each part detection. Each part is assigned to the centroid that its vector points closest to
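
A minimal sketch of this grouping rule is given below. It assumes that the vector field stores, at each pixel, a (row, column) offset from that pixel to the centroid. The slides only say the field "points to" the centroid, so this encoding and the name `group_parts` are illustrative assumptions.

```python
import math


def group_parts(corner_detections, vector_field, centroids):
    """Assign each corner detection to the centroid its vector points closest to.

    corner_detections: list of integer (row, col) pixel positions
    vector_field: vector_field[row][col] -> (d_row, d_col), assumed to be
                  the offset from that pixel to its vertebra's centroid
    centroids: list of (row, col) centroid detections
    Returns one centroid index per corner detection.
    """
    assignments = []
    for r, c in corner_detections:
        d_r, d_c = vector_field[r][c]
        target = (r + d_r, c + d_c)  # where this corner "thinks" its centroid is
        nearest = min(
            range(len(centroids)),
            key=lambda i: math.dist(target, centroids[i]),
        )
        assignments.append(nearest)
    return assignments
```

Under these assumptions, a corner that lies geometrically closer to a neighbouring vertebra's centroid is still assigned correctly, provided its predicted vector lands near its own centroid.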


Model Architecture

VGG U-Net (Ronneberger et al., 2015), BatchNorm between layers

L1 loss for heatmaps, L2 loss for vector fields


Generalising Up the Spine

Applying the detector directly to the full spine images does not work very well

This is likely because resampling the full spine images to squares makes the vertebrae much smaller than in the lumbar training scans.


To fix this, split each full spine scan into smaller images and run the detector on each of these images


This results in much better segmentations
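
One way to do the splitting is sketched below. The slides do not say how the scans are split, so the vertical, overlapping tiles, the helper names, and the `tile` and `overlap` parameters are all assumptions made for illustration.

```python
def tile_offsets(length, tile, overlap):
    """Start offsets of overlapping tiles of size `tile` covering [0, length)."""
    if tile >= length:
        return [0]
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile")
    offsets = list(range(0, length - tile, stride))
    offsets.append(length - tile)  # last tile sits flush with the far edge
    return offsets


def split_scan(image, tile, overlap):
    """Yield (top_row, crop) pairs along the vertical axis of a row-major image."""
    for top in tile_offsets(len(image), tile, overlap):
        yield top, image[top:top + tile]


def to_scan_coords(detections, top):
    """Map (row, col) detections from a crop back into full-scan coordinates."""
    return [(r + top, c) for r, c in detections]
```

Each crop would be passed to the detector separately, and its detections shifted back with `to_scan_coords`. Merging duplicate detections from overlapping regions is left out of this sketch.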


Attempt 2: The Ladder Algorithm

Initialise with bounding box around S1 (blue box)

Apply for a fixed number of iterations (6 in lumbar scans, 23 for full spine scans)

Train on lumbar, apply to full spine scans

Similar to proof by induction: the nth case is used to obtain the (n+1)th case


Use the previous algorithm to find the S1 vertebra.

During training, use pairs of adjacent vertebrae as training examples

At test time, apply for 23 iterations up the spine
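
The test-time loop can be sketched as below. Here `predict_next_box` is a hypothetical callable standing in for the trained model (which would also need access to the image), and the `(top, left, height, width)` box format is an assumption. Only the iteration counts (6 for lumbar, 23 for full spine) come from the slides.

```python
def climb_spine(s1_box, predict_next_box, n_steps=23):
    """Iteratively step up the spine, starting from the S1 bounding box.

    s1_box: initial box, here assumed to be (top, left, height, width)
    predict_next_box: hypothetical callable mapping the current vertebra's
                      box to the box of the vertebra above it
    n_steps: fixed number of iterations (6 for lumbar, 23 for full spine)
    Returns the list of boxes, S1 first.
    """
    boxes = [s1_box]
    for _ in range(n_steps):
        boxes.append(predict_next_box(boxes[-1]))
    return boxes


# Toy stand-in for the model: move the box up by exactly its own height.
def shift_up(box):
    top, left, height, width = box
    return (top - height, left, height, width)


full_spine = climb_spine((500, 100, 20, 30), shift_up, n_steps=23)
lumbar = climb_spine((500, 100, 20, 30), shift_up, n_steps=6)
```

Because each step depends only on the previous box, a model trained on adjacent pairs from lumbar scans can in principle be applied for more steps than it ever saw during training.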

Training set: ~6000 lumbar MRI

Test set: ~400 (badly) labelled full spine MRI

Previous state of the art for lumbar scans: 99.6% accuracy (Forsberg, 2017)

Little drop in accuracy as we move to full spine images