Has the Reliability of the Scoring Been Assessed?

Examiners may use an instrument to assess disease levels (such as a periodontal disease index) or perform some type of rating during an investigation. This practice raises the issue of the reliability, or consistency, of the rating process. Consistency is essential because experiments and their results should be reproducible. Intra-examiner reliability, the ability of an examiner to rate the same conditions in the same way over time, is usually acceptable with training and experience. Inter-examiner reliability, consistency among different examiners, is more difficult to achieve.

When multiple examiners are involved, they should complete a training program in which they learn to apply the rating criteria consistently; this training process is often referred to as calibration. In the research report, the author should describe the efforts made to achieve a high level of reliability and, whenever possible, report the numerical value obtained. For example, Cohen's kappa coefficient of reliability (κ) can range from −1 to 1, where 1 represents perfect agreement among raters and 0 represents agreement no better than chance; in practice, values between 0 and 1 are most common. A κ value greater than 0.7 is typically considered a good level of agreement.
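
The computation behind κ is straightforward: it compares the agreement actually observed with the agreement expected by chance alone,

κ = (Po − Pe) / (1 − Pe)

where Po is the observed proportion of agreement between the raters and Pe is the proportion of agreement expected by chance. Suppose, purely for illustration, that two examiners assign the same score to 85% of the sites they rate and that chance alone would produce agreement on 50% of the sites; then κ = (0.85 − 0.50) / (1 − 0.50) = 0.70, which falls right at the threshold for good agreement.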