Review Article

A Systematic Review of Reporting of Psychometric Properties in Educational Research

Simon Ntumi 1 * , Kwesi Twum Antwi-Agyakwa 2
1 Department of Educational Foundations, Faculty of Educational Studies, University of Education, Winneba, West Africa, GHANA
2 Department of Fisheries and Aquaculture, University of Cape Coast, West Africa, GHANA
* Corresponding Author
Mediterranean Journal of Social & Behavioral Research, 6(2), June 2022, 53-59
Published: 22 March 2022
OPEN ACCESS


Background: The validity and reliability of research outputs are important elements of the research trail. They promote accuracy and transparency and minimize researcher bias, contributing to rigor and dependability. This paper reviews how frequently published articles report the psychometric properties of the scales/subscales employed in educational research.
Methods: We conducted a systematic review of the reporting of psychometric properties in educational research papers published between 2010 and 2020 in 15 education-related journals. Our search included quantitative studies with primary data. Trained reviewers performed the methodological quality assessment. The search followed PRISMA 2020 to identify and screen eligible papers for inclusion. The extracted data were analyzed using SPSS v25 and reported and interpreted using descriptive statistics.
Findings: We extracted 763 papers published between 2010 and 2020 from 15 education-related journals. More than half of the articles reviewed did not report a validity statistic (n=456 of 763, 59.8%) or a reliability statistic (n=400 of 763, 52.4%). Among the articles that did report these properties, the alpha coefficient was the most widely used statistic to establish reliability (n=185, 50.9%), and the correlation coefficient was the most frequently reported statistic for validity (n=219, 71.3%).
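The proportions in the Findings can be rechecked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' SPSS analysis. It assumes that the denominators for the alpha and correlation shares are the papers that did report reliability and validity respectively (763 minus the non-reporting counts); the abstract does not state these denominators explicitly. The names `pct`, `reported_validity`, and `reported_reliability` are hypothetical.

```python
# Counts taken from the Findings paragraph of the abstract.
TOTAL_PAPERS = 763
NO_VALIDITY = 456        # papers not reporting a validity statistic
NO_RELIABILITY = 400     # papers not reporting a reliability statistic
ALPHA_USED = 185         # papers using the alpha coefficient
CORRELATION_USED = 219   # papers using a correlation coefficient


def pct(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Assumption (not stated in the abstract): the alpha and correlation shares
# are computed over the papers that DID report the relevant property.
reported_validity = TOTAL_PAPERS - NO_VALIDITY        # 307
reported_reliability = TOTAL_PAPERS - NO_RELIABILITY  # 363

shares = {
    "no validity statistic": pct(NO_VALIDITY, TOTAL_PAPERS),
    "no reliability statistic": pct(NO_RELIABILITY, TOTAL_PAPERS),
    "alpha among reliability reporters": pct(ALPHA_USED, reported_reliability),
    "correlation among validity reporters": pct(CORRELATION_USED, reported_validity),
}

for label, value in shares.items():
    print(f"{label}: {value:.1f}%")
```

Under this assumption, the two non-reporting rates and the correlation share match the published figures to one decimal place. The alpha share comes out at about 51.0% rather than the reported 50.9%, a small gap that could stem from rounding or from a slightly different denominator.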
Conclusions: To produce dependable conclusions and recommendations in educational research, it is imperative that researchers establish and report the psychometric properties of their instruments to ground their findings and take-home lessons.


Ntumi, S., & Twum Antwi-Agyakwa, K. (2022). A Systematic Review of Reporting of Psychometric Properties in Educational Research. Mediterranean Journal of Social & Behavioral Research, 6(2), 53-59.

