P238 - Automated knee osteoarthritis assessment increases physicians’ agreement rate and accuracy: data from the Osteoarthritis Initiative

Corresponding Author
Christoph Goetz, ImageBiopsy Lab, Employee
Richard Ljuhar, ImageBiopsy Lab, Shareholder
Tiago Paixao, ImageBiopsy Lab, Employee
Zsolt Bertalan, ImageBiopsy Lab, Employee



The diagnosis of knee osteoarthritis depends on identifying and classifying several radiographic features, such as the presence and degree of osteophytes, sclerosis, and joint space narrowing (JSN). Here, we assess the impact of a computerized assessment system on physicians’ accuracy and agreement rate compared with unaided assessment.

Methods and Materials

A set of 124 unilateral knee radiographs from the Osteoarthritis Initiative (OAI) study was selected and analyzed by a computerized method for Kellgren-Lawrence (KL) grade and for OARSI grades of joint space narrowing (JSN), osteophytes, and sclerosis. Physicians were instructed to score all images for these features in two modalities: once when shown only the radiograph (unaided) and once when presented with the report from the computer-assisted detection system (aided). The two reading sessions were separated by an appropriate washout period. The readers were blinded to each other’s grades and to the ground-truth grading (OAI). Agreement between the physicians was quantified with the intraclass correlation coefficient (ICC) for both modalities. The physicians’ performance was compared with the ground-truth grading, and sensitivity and specificity were calculated for each feature in both modalities.
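
The two kinds of metric described above can be sketched in a few lines of Python. This is an illustration only, not the study's analysis code. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is assumed here. The function names and the toy grades are hypothetical, and binarizing a grade as "greater than a threshold" mirrors the cut-offs used in this abstract (for example KL > 1).

```python
def sensitivity_specificity(reader_grades, reference_grades, threshold):
    """Binarize both gradings as grade > threshold, then compare them."""
    pairs = [(r > threshold, t > threshold)
             for r, t in zip(reader_grades, reference_grades)]
    tp = sum(1 for pred, truth in pairs if pred and truth)
    fn = sum(1 for pred, truth in pairs if not pred and truth)
    tn = sum(1 for pred, truth in pairs if not pred and not truth)
    fp = sum(1 for pred, truth in pairs if pred and not truth)
    # Assumes the reference contains at least one positive and one negative.
    return tp / (tp + fn), tn / (tn + fp)


def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per image and one column per reader.

    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    n = len(ratings)      # number of images
    k = len(ratings[0])   # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy KL grades (hypothetical, not study data).
reference_kl = [0, 1, 2, 3, 4, 0]
reader_kl = [0, 2, 2, 3, 3, 1]
sens, spec = sensitivity_specificity(reader_kl, reference_kl, threshold=1)
print("sensitivity:", sens, "specificity:", spec)

# Toy grades from two readers on four images (hypothetical).
print("ICC(2,1):", icc_2_1([[0, 0], [1, 2], [2, 2], [3, 4]]))
```

Running both functions once on the unaided readings and once on the aided readings would give the per-modality values that the abstract compares.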


Results

Agreement rates for KL grade and for sclerosis and osteophyte OARSI grades were significantly higher in the aided modality than in the unaided modality. Readings for JSN OARSI grade showed no statistically significant difference between the two modalities. Readers’ accuracy for detecting any abnormality (KL > 0), osteoarthritis (KL > 1), sclerosis (sclerosis OARSI grade > 0), and osteophytosis (osteophyte OARSI grade > 0) was significantly higher in the aided modality. These gains in accuracy were driven by significant increases in specificity, with no statistically significant difference in sensitivity.
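
The link between accuracy and specificity can be made explicit for any binary finding such as KL > 1. Accuracy is a prevalence-weighted average of sensitivity and specificity. The symbols below are introduced here for illustration and do not appear in the study: π is the fraction of radiographs that are positive under the ground-truth grading, and Se, Sp, and Acc are sensitivity, specificity, and accuracy. No numerical values are implied. Because both modalities use the same images and the same ground truth, π is identical in both, so a change in accuracy with unchanged sensitivity must come from specificity.

```latex
\mathrm{Acc} = \pi\,\mathrm{Se} + (1-\pi)\,\mathrm{Sp}
\quad\Longrightarrow\quad
\Delta\mathrm{Acc} = \pi\,\Delta\mathrm{Se} + (1-\pi)\,\Delta\mathrm{Sp}
\approx (1-\pi)\,\Delta\mathrm{Sp}
\quad \text{when } \Delta\mathrm{Se} \approx 0 .
```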



Conclusion

These results show that the use of automated knee osteoarthritis assessment software increases consistency between physicians when grading radiographic features of OA. Furthermore, use of the software increases specificity with no loss of sensitivity.