International assessment results increasingly inform education policy debates, yet little is known about floor effects in these assessments. Floor effects can be understood as a failure to differentiate among the worst performing students because the test is too difficult relative to their competencies. The paper explores the magnitude of these effects in widely used international assessment programmes, and their implications, for instance for the comparability of national statistics across space and time. First, the required theory and statistics are clarified; in particular, the effects of random guessing on multiple choice questions must be properly taken into account. Microdata from the international TIMSS programme and the regional SACMEQ (Southern and East Africa) and LLECE (Latin America) programmes are analysed, with reference to primary schools. Results in reading and mathematics are considered, and both item response theory (IRT) scores and the underlying classical scores are examined.
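The abstract does not specify the adjustment used, but the description is consistent with the standard classical correction for guessing, under which a purely random guesser has an expected score of zero. A minimal sketch, assuming that convention (the function name and example figures are illustrative, not taken from the paper):

```python
def corrected_score(n_correct, n_wrong, n_options):
    """Classical correction for guessing: each wrong answer is penalised by
    1/(k-1), where k is the number of response options, so that a student
    who guesses at random on every item has an expected score of zero.
    Omitted (unanswered) items contribute nothing to either count."""
    return n_correct - n_wrong / (n_options - 1)

# A random guesser on 40 four-option items answers, on average,
# 10 items correctly and 30 incorrectly:
print(corrected_score(10, 30, 4))   # expected score under pure guessing: 0.0

# A student answering all 40 items correctly keeps the full score:
print(corrected_score(40, 0, 4))    # 40.0
```

On this convention, a raw score at or below the chance level maps to an adjusted score of zero or less, which is why a test with few easy items can leave many students bunched at an adjusted score of zero.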
In TIMSS, floor effects have been greatly reduced through the introduction, in 2015, of TIMSS Numeracy, which includes a greater number of easier items than regular TIMSS. SACMEQ and LLECE, despite being specifically designed for developing countries, often display worryingly large floor effects, in part because of their almost total reliance on multiple choice questions. As a result, many students scoring zero, after adjustment for random guessing, are classified as having passed proficiency thresholds. To illustrate, a quarter of students in the worst performing LLECE countries fall into this anomalous category. This is clearly undesirable and undermines a basic purpose of the assessments. Though these floor effects do not substantially alter the rankings of countries, they are large enough to undermine proper monitoring of progress over time. They can also undermine public trust in the programmes, and leave information gaps in relation to the students who require the most support. The designers of SACMEQ and LLECE could reduce floor effects by including a greater number of easy multiple choice items. Greater use of constructed response items, as in TIMSS, is another possibility, but a more costly and complex one.
Martin Gustafsson is an education economist who has worked extensively for the South African Ministry of Education as an analyst.
Bilal Barakat is a senior policy analyst within the GEM Report team at UNESCO, with a background in statistics, demography and economics.