REVIEW ARTICLE ON AUTOMATED AI-BASED SEGMENTATION OF ARTICULAR TISSUES

Published on January 9, 2026 by Chondrometrics-admin

Chondrometrics authors publish a review on the clinical validation of automated, AI-based segmentation of articular tissues in a special issue of Osteoarthritis and Cartilage Open, with reference to two recently published original articles by the group.

Chondrometrics’ CEO Wolfgang Wirth and former team member, now SAB member, Jana Eder have just published a review that focuses on how the performance of new segmentation algorithms should be tested: by examining whether they reproduce clinical effects observed with a gold-standard approach (https://pubmed.ncbi.nlm.nih.gov/41550414/).

Automated image analysis has recently made enormous technical progress in osteoarthritis imaging science. Deep learning (DL)-based tissue segmentation has demonstrated strong accuracy metrics and reproducibility benchmarks. But technical validation alone is not enough.

The review is part of a Special Issue on “Artificial Intelligence in Osteoarthritis Imaging” in “Osteoarthritis and Cartilage Open” (OACopen), edited by Chunyi Wen, Thomas Link, Simo Saarakkala, and Chondrometrics’ CMO Felix Eckstein. The review looks beyond Dice similarity scores and asks a harder, clinically relevant question: Do fully automated methods behave like clinically meaningful biomarkers when applied longitudinally, and can they reproduce clinical effects observed with benchmark manual segmentations under expert quality control?

Of the 873 publications screened, only 9 evaluated the clinical validity of automated (peri-)articular tissue analysis, and only 5 directly compared the sensitivity to differences in osteoarthritis progression between clinically defined cohorts. The key message: clinically validated, automated methods closely match manual expert reference measurements in sensitivity to change over time and in discrimination between structural progression rates. However, most published methods stop at the technical validation level and fail to demonstrate that they can reproduce known clinical patterns of disease progression, risk stratification, or treatment response, which is the ultimate goal of these technologies.

As automated analysis moves into multi-centre studies, clinical trials, and patient-level decision-making, clinical validation is not a “nice to have”. It is a bridge from algorithm to actual evidence.

The review refers to two original articles by Chondrometrics authors:

In their original article in the same issue (A fully-automated technique for cartilage morphometry in knees with severe radiographic osteoarthritis – Method development and validation), Wolfgang Wirth and Felix Eckstein point out that AI misses what it cannot see: https://pubmed.ncbi.nlm.nih.gov/40697622/ (left infographic)

In severe osteoarthritis, cartilage disappears, leaving areas of full-thickness loss and denuded areas of subchondral bone (dAB). These exposed regions are associated with knee pain and further structural deterioration, making their accurate delineation critically important. Patients with such dAB lesions may be eligible for DMOAD trials if the drug tested shows promise in promoting their repair, so the technology must be able to detect treatment effects in precisely those areas. The tricky part: AI, like humans, tends to look for what it expects to find. This “positive detection bias” implies that absence often goes unnoticed, even though it may matter most. In practice, algorithms may thus “see” cartilage where MRI does not show any. The authors therefore challenge their previous algorithms to do better: they combine AI-based cartilage analysis with subchondral bone segmentation and a smart post-processing tool. The resulting algorithm not only finds cartilage but also recognizes when it is gone, a critical capability that conventional approaches often miss. After all, true intelligence is not just seeing what is there; it is knowing where there is nothing.
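To spell out the underlying logic: once both the subchondral bone surface and the cartilage have been segmented, the denuded area follows from a simple set difference. The sketch below is illustrative only, not the authors’ published pipeline; the boolean-map representation and all names are assumptions:

    import numpy as np

    def denuded_area(bone_surface, cartilage_cover, pixel_area_mm2):
        # bone_surface: boolean map of the segmented subchondral bone surface
        # cartilage_cover: boolean map marking where cartilage overlies that surface
        # pixel_area_mm2: area represented by one map element, in mm^2
        denuded = bone_surface & ~cartilage_cover        # bone with no cartilage above it
        dab_mm2 = denuded.sum() * pixel_area_mm2         # denuded area of bone (dAB)
        dab_percent = 100.0 * denuded.sum() / max(bone_surface.sum(), 1)
        return dab_mm2, dab_percent

In this formulation, an algorithm that hallucinates cartilage over exposed bone shrinks the denuded set, which is exactly the positive detection bias the article warns about.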

In another original article (Comparison between coronal FLASH and sagittal double echo steady state MRI in detecting longitudinal cartilage thickness change by fully automated segmentation – Data from the FNIH biomarker cohort) in the same Special Issue, Felix Eckstein et al. focus on the application of the new algorithm in a longitudinal setting: https://pubmed.ncbi.nlm.nih.gov/40822965/ (right infographic)

A useful metric for assessing the longitudinal performance of a new algorithm is the standardized response mean (SRM; mean change divided by the standard deviation of change), not the magnitude of cartilage loss. If automated analysis systematically overestimates change, the SRM is inflated, but this does not help in differentiating greater from lesser change, as required for clinical trials and quantified by Cohen’s d. To test this, the authors applied the above algorithm to the FNIH Biomarker Consortium dataset, which contains progressors and non-progressors defined a priori by change in radiography and pain. Comparing automated vs. gold-standard manual segmentation, and coronal FLASH vs. sagittal DESS MRI, the authors find that all methods differentiate progressors very well from non-progressors. Auto-FLASH slightly outperformed auto-DESS segmentation, but manually segmented DESS (with expert QC) slightly outperformed both automated methods. This is a significant step forward in applying automated segmentation to clinical trials of cartilage-preserving therapies.
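To make the distinction between the two metrics concrete, here is a minimal sketch with synthetic numbers (purely illustrative, not data from the study): a constant over-estimate added to every change score inflates the SRM but leaves Cohen’s d between progressors and non-progressors untouched.

    import numpy as np

    rng = np.random.default_rng(0)

    def srm(change):
        # standardized response mean: mean change / SD of change
        change = np.asarray(change, dtype=float)
        return change.mean() / change.std(ddof=1)

    def cohens_d(a, b):
        # Cohen's d: group mean difference / pooled SD
        a, b = np.asarray(a, float), np.asarray(b, float)
        pooled = np.sqrt(((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                         / (len(a) + len(b) - 2))
        return (a.mean() - b.mean()) / pooled

    # synthetic 2-year cartilage thickness change (mm); numbers are made up
    progressors     = rng.normal(-0.12, 0.10, 200)
    non_progressors = rng.normal(-0.03, 0.10, 200)

    bias = -0.05  # a method that systematically over-estimates cartilage loss
    print(srm(progressors), srm(progressors + bias))            # |SRM| inflates
    print(cohens_d(progressors, non_progressors),
          cohens_d(progressors + bias, non_progressors + bias)) # d is unchanged

The shared bias shifts both group means equally, so the between-group separation that matters for trial sensitivity is unaffected even as the within-group SRM grows.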

2 Comments

  1. Felix Eckstein

    Kudos, Jana and Wolfgang, for taking on the task of reviewing this somewhat neglected topic of “clinical validation” of novel automated imaging biomarkers in a Special Issue in Osteoarthritis Cartilage Open, on the use of “AI in osteoarthritis research”.

    1. Jana

      Glad to have contributed to this review. A central point is that many AI segmentation papers stop at technical validation. Demonstrating clinical validity, the ability to reproduce meaningful clinical effects, is essential if automated analysis is to support clinical research and trials.
