Reliability in assessment: what role should it play and how should we explain it?
This symposium will explore the concept of reliability and alternative ways of representing it both technically and publicly. Current concepts of reliability and their applicability to different types of assessment will be considered. These will include test-related reliability (with an emphasis on errors of measurement) and decision-related reliability (with an emphasis on comparability of judgments). The relationship between reliability and validity will be explored. The three presentations will cover: background on theories of reliability and the history of public concerns about reliability; alternative ways of framing our approach to uncertainty in assessment, some of the difficulties in explaining these matters publicly, and some possible ways of doing this more satisfactorily; and reliability as a media story, taking the case of England, with attention to public understandings, levels of tolerance, and how to manage them better.
There will be three presentations:
1. Recent reassertion and continuing concern in various countries about the importance of reliability in assessment: the nature of these concerns and their origin. Current thinking about reliability and its relationship to validity: repositioning the debate to place validity first and take steps to see consistency/certainty as a work in progress (do as well as possible).
2. Dealing with inconsistency/uncertainty: sorting out different types of reliability; allowing different levels of consistency/certainty for different types/uses of assessment; generalisation as a broader concept. Avoiding use of the term 'error'. What is it we are trying to assess and what are the relevant sources of inconsistency/uncertainty? What is their relative importance and what can we do about them?
3. Explaining reliability (and unreliability) to the public: reliability as a media story (the case of England). Why reliability is hard to explain. Public understanding and tolerance for different sources of unreliability in examination results; how these might be handled better. Some suggested ways forward.
