The Kappa Statistic in Reliability Studies: Use, Interpretation, and Sample Size Requirements

J Sim; Christine Wright

Nursing and Midwifery

Research output: Contribution to journal › Article

2183 Citations (Scopus)

Abstract

PURPOSE: This article examines and illustrates the use and interpretation of the kappa statistic in musculoskeletal research. SUMMARY OF KEY POINTS: The reliability of clinicians' ratings is an important consideration in areas such as diagnosis and the interpretation of examination findings. Often, these ratings lie on a nominal or an ordinal scale. For such data, the kappa coefficient is an appropriate measure of reliability. Kappa is defined, in both weighted and unweighted forms, and its use is illustrated with examples from musculoskeletal research. Factors that can influence the magnitude of kappa (prevalence, bias, and non-independent ratings) are discussed, and ways of evaluating the magnitude of an obtained kappa are considered. The issue of statistical testing of kappa is considered, including the use of confidence intervals, and appropriate sample sizes for reliability studies using kappa are tabulated. CONCLUSIONS: The article concludes with recommendations for the use and interpretation of kappa.

Original language	English
Pages (from-to)	257-268
Number of pages	12
Journal	Physical Therapy
Volume	85
Issue number	3
Publication status	Published - 31 Mar 2005

Cite this

@article{14147f0cf81646839b18295550d32bc3,
  title = "The Kappa Statistic in Reliability Studies: Use, Interpretation, and Sample Size Requirements",
  abstract = "PURPOSE: This article examines and illustrates the use and interpretation of the kappa statistic in musculoskeletal research. SUMMARY OF KEY POINTS: The reliability of clinicians' ratings is an important consideration in areas such as diagnosis and the interpretation of examination findings. Often, these ratings lie on a nominal or an ordinal scale. For such data, the kappa coefficient is an appropriate measure of reliability. Kappa is defined, in both weighted and unweighted forms, and its use is illustrated with examples from musculoskeletal research. Factors that can influence the magnitude of kappa (prevalence, bias, and non-independent ratings) are discussed, and ways of evaluating the magnitude of an obtained kappa are considered. The issue of statistical testing of kappa is considered, including the use of confidence intervals, and appropriate sample sizes for reliability studies using kappa are tabulated. CONCLUSIONS: The article concludes with recommendations for the use and interpretation of kappa.",
  author = "J Sim and Christine Wright",
  year = "2005",
  month = mar,
  day = "31",
  language = "English",
  volume = "85",
  pages = "257--268",
  journal = "Physical Therapy",
  publisher = "American Physical Therapy Association",
  number = "3",
}

TY  - JOUR
T1  - The Kappa Statistic in Reliability Studies: Use, Interpretation, and Sample Size Requirements
AU  - Sim, J
AU  - Wright, Christine
PY  - 2005/3/31
Y1  - 2005/3/31
AB  - PURPOSE: This article examines and illustrates the use and interpretation of the kappa statistic in musculoskeletal research. SUMMARY OF KEY POINTS: The reliability of clinicians' ratings is an important consideration in areas such as diagnosis and the interpretation of examination findings. Often, these ratings lie on a nominal or an ordinal scale. For such data, the kappa coefficient is an appropriate measure of reliability. Kappa is defined, in both weighted and unweighted forms, and its use is illustrated with examples from musculoskeletal research. Factors that can influence the magnitude of kappa (prevalence, bias, and non-independent ratings) are discussed, and ways of evaluating the magnitude of an obtained kappa are considered. The issue of statistical testing of kappa is considered, including the use of confidence intervals, and appropriate sample sizes for reliability studies using kappa are tabulated. CONCLUSIONS: The article concludes with recommendations for the use and interpretation of kappa.
M3  - Article
C2  - 15733050
VL  - 85
SP  - 257
EP  - 268
JO  - Physical Therapy
JF  - Physical Therapy
IS  - 3
ER  - 
