Table 2

Validation of models’ performance. Hold-out and external validation performance scores for multiclass and binary predictions. External validation was conducted on patients from the POMA study.

Multiclass predictions

| Validation | Model | N features | AUC-PRC (95% CI) | AUC-ROC (95% CI) | F1 score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|---|---|---|---|
| Hold-out | AP1_mu | 11 | 0.649 (0.646 to 0.652) | 0.847 (0.847 to 0.847) | 0.529 (0.528 to 0.530) | 0.559 (0.532 to 0.586) | 0.633 (0.633 to 0.633) |
| Hold-out | AP5_mu | 304 | 0.665 (0.661 to 0.669) | 0.842 (0.841 to 0.843) | 0.488 (0.485 to 0.491) | 0.500 (0.469 to 0.531) | 0.614 (0.612 to 0.616) |
| Hold-out | AP5_top5_mu | 5 | 0.650 (0.647 to 0.653) | 0.846 (0.845 to 0.847) | 0.552 (0.549 to 0.555) | 0.558 (0.537 to 0.579) | 0.629 (0.627 to 0.631) |
| External | AP1_mu | 11 | 0.727 (0.726 to 0.728) | 0.881 (0.881 to 0.881) | 0.593 (0.592 to 0.594) | 0.637 (0.631 to 0.643) | 0.670 (0.670 to 0.670) |
| External | AP5_top5_mu | 5 | 0.589 (0.583 to 0.595) | 0.838 (0.836 to 0.840) | 0.563 (0.555 to 0.571) | 0.566 (0.552 to 0.580) | 0.596 (0.585 to 0.607) |

Binary predictions

| Validation | Model | N features | AUC-PRC (95% CI) | AUC-ROC (95% CI) | F1 score (95% CI) | Precision (95% CI) | Recall (95% CI) |
|---|---|---|---|---|---|---|---|
| Hold-out | AP1_bi | 11 | 0.637 (0.636 to 0.638) | 0.700 (0.699 to 0.701) | 0.631 (0.630 to 0.632) | 0.669 (0.666 to 0.672) | 0.666 (0.666 to 0.666) |
| Hold-out | AP5_bi | 304 | 0.598 (0.590 to 0.606) | 0.675 (0.674 to 0.676) | 0.622 (0.619 to 0.625) | 0.636 (0.633 to 0.639) | 0.648 (0.645 to 0.651) |
| Hold-out | AP5_top5_bi | 5 | 0.618 (0.613 to 0.623) | 0.693 (0.689 to 0.697) | 0.660 (0.656 to 0.664) | 0.675 (0.670 to 0.680) | 0.680 (0.676 to 0.684) |
| External | AP1_bi | 11 | 0.764 (0.762 to 0.766) | 0.780 (0.780 to 0.780) | 0.714 (0.713 to 0.715) | 0.733 (0.732 to 0.734) | 0.726 (0.726 to 0.726) |
| External | AP5_top5_bi | 5 | 0.688 (0.683 to 0.693) | 0.702 (0.691 to 0.713) | 0.653 (0.615 to 0.691) | 0.669 (0.641 to 0.697) | 0.663 (0.626 to 0.700) |

  • AP, AutoPrognosis; AUC-PRC, area under the precision-recall curve; AUC-ROC, area under the receiver operating characteristic curve; POMA, Pivotal Osteoarthritis Initiative MRI Analyses.