Learning how to differ: agreement and reliability statistics in psychiatry

Can J Psychiatry. 1995 Mar;40(2):60-6.

Abstract

Whenever two or more raters evaluate a patient or student, it may be necessary to determine the degree to which they assign the same label or rating to the subject. The major problem in deciding which statistic to use is the plethora of available techniques. This paper reviews some of the more commonly used techniques, such as Raw Agreement, Cohen's kappa, and weighted kappa, and shows that, in most circumstances, they can all be replaced by the intraclass correlation coefficient (ICC). This paper also shows how the ICC can be applied in situations where the other statistics cannot, and how to select the best subset of raters.
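The abstract does not reproduce any formulas, but as a rough illustration of the statistics it names, the Python sketch below computes raw agreement, unweighted Cohen's kappa, quadratically weighted kappa, and a single-rater consistency ICC (the Shrout and Fleiss ICC(3,1) form) for a small hypothetical subjects-by-raters matrix. The data, function names, and the particular ICC variant are illustrative assumptions and are not taken from the paper itself.

```python
import numpy as np


def raw_agreement(r1, r2):
    """Proportion of subjects given the identical rating by both raters."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    return float(np.mean(r1 == r2))


def cohens_kappa(r1, r2):
    """Unweighted Cohen's kappa: chance-corrected agreement for two raters."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    cats = np.union1d(r1, r2)
    po = np.mean(r1 == r2)                              # observed agreement
    pe = sum(np.mean(r1 == c) * np.mean(r2 == c)        # expected agreement under
             for c in cats)                             # independent marginals
    return float((po - pe) / (1.0 - pe))


def weighted_kappa(r1, r2):
    """Kappa with quadratic disagreement weights (a common weighting choice).

    With quadratic weights, weighted kappa is closely related to the ICC,
    which is part of the equivalence the abstract alludes to.
    """
    r1, r2 = np.asarray(r1), np.asarray(r2)
    cats = np.union1d(r1, r2)
    k = len(cats)
    idx = {c: i for i, c in enumerate(cats)}
    n = len(r1)
    obs = np.zeros((k, k))
    for a, b in zip(r1, r2):                            # observed joint proportions
        obs[idx[a], idx[b]] += 1.0 / n
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))    # expected under independence
    i, j = np.indices((k, k))
    w = ((i - j) / (k - 1)) ** 2                        # quadratic disagreement weights
    return float(1.0 - (w * obs).sum() / (w * exp).sum())


def icc_3_1(ratings):
    """ICC(3,1): two-way mixed model, single-rater consistency.

    `ratings` is an (n_subjects, k_raters) array of numeric scores; the ICC is
    built from the mean squares of a two-way ANOVA without replication.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    ss_total = ((ratings - grand) ** 2).sum()
    ss_subj = k * ((ratings.mean(axis=1) - grand) ** 2).sum()
    ss_rater = n * ((ratings.mean(axis=0) - grand) ** 2).sum()
    ss_err = ss_total - ss_subj - ss_rater
    ms_subj = ss_subj / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return float((ms_subj - ms_err) / (ms_subj + (k - 1) * ms_err))


if __name__ == "__main__":
    # Hypothetical ratings: 6 subjects scored by 3 raters on a 1-5 scale.
    ratings = np.array([
        [4, 4, 5],
        [2, 2, 2],
        [3, 4, 3],
        [5, 5, 4],
        [1, 2, 1],
        [3, 3, 3],
    ])
    print("raw agreement (raters 1 vs 2):", raw_agreement(ratings[:, 0], ratings[:, 1]))
    print("Cohen's kappa (raters 1 vs 2):", cohens_kappa(ratings[:, 0], ratings[:, 1]))
    print("weighted kappa (raters 1 vs 2):", weighted_kappa(ratings[:, 0], ratings[:, 1]))
    print("ICC(3,1), all three raters:   ", icc_3_1(ratings))
```

Note that raw agreement and the two kappas compare one pair of raters at a time, whereas the ICC uses all raters at once, which is one practical reason the abstract argues the ICC can stand in for the others in most circumstances.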

MeSH terms

  • Humans
  • Mental Disorders / classification
  • Mental Disorders / diagnosis*
  • Mental Disorders / psychology
  • Observer Variation
  • Personality Assessment / statistics & numerical data*
  • Psychiatric Status Rating Scales / statistics & numerical data*
  • Psychometrics
  • Reproducibility of Results