The aim of this paper is to understand students’ response styles in large-scale assessment studies, starting from an in-depth analysis of data from the International Civic and Citizenship Education Study (ICCS) 2016, conducted by the International Association for the Evaluation of Educational Achievement, with a focus on the process of building citizenship measures.
Various studies show the presence of response styles, distortions or misrepresentations in answering, potential methodological biases during fieldwork, and measurement errors such as social desirability or acquiescence. These aspects are already difficult to manage when the researcher personally enters the field to administer the research tools; they are even more difficult to check when the research consists of secondary data analyses.
In large-scale surveys, the administration of research tools, both cognitive (such as tests) and attitudinal or perceptual (such as questionnaires), generally takes place in one-to-many mode, sometimes online, and is carried out by an administrator who, being neither the researcher nor an interviewer, receives specific training on administration procedures and on how to ensure that responses are obtained.
If the pre-test phase does not include stages for verifying whether respondents really understand the research tools, in other words, whether the process of double hermeneutics has taken place (Giddens, 1976; Palumbo, 1992, 2004), managing this during fieldwork administration is very difficult.
Indeed, precisely because of the rationalization of research procedures, the assumption is often that respondents are able to answer questionnaires because they understand them. This assumption is usually made not only for cognitive tests, which are the focus of large-scale studies on student achievement, but also for questionnaires about opinions and attitudes.
What is students’ response style depending on age, origin (e.g. immigrant background) and context? What influence does the type of question have? Do these effects differ across cultures?
To answer these research questions, the European regional questionnaire of ICCS 2016 was studied. The European regional questionnaire is a research instrument administered to students from the participating European countries, with the aim of assessing aspects of civic and citizenship education related to the European context. The questions are measured with ordinal scales, e.g. Likert scales. The sample considered for the analyses consists of the 3,766 Italian students attending the third year of the 170 sampled lower secondary schools, representative of macro-geographical areas.
With the aim of detecting distortions due to missing data, acquiescence phenomena and response sets, and of estimating the perceived distance between the different response modes, a correspondence analysis (Benzécri, 1970; Amaturo, 1989; Marradi & Macrì, 2012) was carried out on ICCS 2016 data to study the equivalence of the categories of some questions from the European regional questionnaire. The aim is to study the influence of the type of question on respondents’ behavior and to estimate the distance among response categories (Marradi, 2002; 2012).
In international large-scale studies, ordinal scales are often treated as cardinal or quasi-cardinal scales and are used to construct factors by means of factorial techniques, estimating the reliability of the scale (generally via Cronbach’s α). However, this approach assumes that the categories are perceived by respondents as equidistant from each other, even though studies over time have shown that this is often not the case.
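As a purely illustrative sketch (not the study’s own code, which used STATA), the reliability coefficient mentioned above, Cronbach’s α, can be computed from a respondents-by-items score matrix as follows; the item scores here are hypothetical Likert codes, not ICCS data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) matrix of item scores.

    alpha = k/(k-1) * (1 - sum of item variances / variance of the sum score).
    Note the cardinality assumption: Likert codes are treated as numbers,
    which presumes equidistant categories.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical battery: 5 respondents, 4 perfectly consistent Likert items
scores = np.tile(np.array([[1], [2], [3], [4], [5]]), (1, 4))
print(cronbach_alpha(scores))  # perfectly correlated items give alpha = 1.0
```

The point of the sketch is the assumption it makes visible: summing item codes only makes sense if the response categories behave as equally spaced numbers, which is precisely what correspondence analysis allows one to check.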
In addition, when working with categories, since there are no intersubjective and replicable units of measurement, it is preferable to use the correspondence analysis approach rather than factor and principal component analyses (Marradi & Gasperoni, 2002; Di Franco, 2006, 2011). The categories of Likert scales are not perceived as equivalent by respondents, and omitted responses are meaningful, especially where there is no neutral/intermediate response category and the respondent is required to polarize either negatively or positively.
Correspondence analysis is a factorial technique that provides synthetic representations of large data matrices. Like all factorial techniques, correspondence analysis synthesizes variables through one or more combinations of them (factors) and groups cases that are homogeneous with respect to a certain group of variables (Di Franco, 2007). From a technical point of view, it can be considered a particular form of principal component analysis, with both mathematical and geometrical similarities; the difference is that principal component analysis can only be applied to cardinal variables (Di Franco & Marradi, 2003). The matrix submitted to correspondence analysis is a contingency table between two or more categorical variables. The representations obtained with this technique are geometric; the proximity between points is interpreted as semantic proximity (Amaturo, 1989). These representations are the value of correspondence analysis, as they provide a synthetic image of very large matrices. For this research I built, on the basis of the frequency distributions, matrices with the items of the Likert scales analyzed in rows and the response categories in columns. Applying correspondence analysis, two factors were extracted, interpreted as representing the agreement–disagreement dimension (first factor) and the intensity of attitude (second factor). STATA software was used for data analysis.
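The procedure described above can be sketched in a minimal, self-contained form. This is an assumption-laden illustration (the study used STATA, and the table below is invented, not ICCS data): a contingency table with Likert items in rows and response categories in columns is decomposed via the singular value decomposition of its standardized residuals, the standard computation behind correspondence analysis; the squared singular values are the principal inertias of the extracted factors:

```python
import numpy as np

def correspondence_analysis(N):
    """Simple correspondence analysis of a contingency table N (rows x columns).

    Returns principal coordinates for row and column categories and the
    principal inertia of each factor. Distances between column points can be
    read as perceived distances between response categories.
    """
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals: (P - r c') / sqrt(r c')
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates (factor scores scaled by singular values)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, sv ** 2

# Hypothetical table: 3 items (rows) x 4 response categories (columns)
N = np.array([[40., 25., 20., 15.],
              [10., 30., 35., 25.],
              [ 5., 15., 30., 50.]])
rows, cols, inertias = correspondence_analysis(N)
# First two columns of `cols` give each response category's position on the
# first two factors (e.g. agreement-disagreement, intensity of attitude).
```

A useful check on such a sketch is that the total inertia (sum of the squared singular values) equals the table’s chi-square statistic divided by the grand total, which ties the geometric representation back to the association in the contingency table.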
Results show that equivalence between the response categories holds where questions concern facts or concrete aspects, while critical issues arise where respondents are required to express an opinion on aspects more related to the value dimension. The latter is also subject to biases such as acquiescence or social desirability in the answers, as well as defects of form, such as partial semantic overlap (not across all items), even in the first question of the questionnaire, which presumably receives a greater degree of attention.
Data collection is a very important phase that the researcher conducting secondary data analyses cannot control directly. The results shed light on respondents’ comprehension of the questions and on whether there are semantic differences or similarities among items of the same battery, on the premise that comparable indicators, if properly constructed and used by researchers with caution, can support comparison between educational systems and serve as a useful policy-orienting tool.