Quantitative Data Analysis

Quantitative data analysis is a complex topic and, in my experience, also an intimidating one - especially if you lack a strong background in mathematics or computer science. Endless spreadsheets, endless choices about which test or analysis to run, and endless questions about whether what you did was correct. On top of this, it all usually happens in more or less user-friendly and intuitive software. I have a background in physics, especially theoretical astrophysics, but I still struggled with many of these issues for a while because I did not recognise any structure - let alone a coherent one - within quantitative data analysis. In this blog post I want to share some insights and resources that have really helped me develop a better understanding of statistics and become confident in my analyses.

Theory, Design, Analysis - Repeat

What will probably help you the most is to get two things straight at the very beginning: your theory and, based on that, your research design. A well-developed theory will usually let you come up with a design that allows you to answer your research questions using relatively simple statistical tools. It helps you pinpoint what it really is you have to measure and how you should measure it. While you are thinking this through, ask yourself how you would analyse the resulting data to answer your research questions. If you can't answer that question, or the answer becomes very complicated, you might want to reconsider your design. Doing this will make your life easier and your research better. It also allows you to preregister your study.

Rethinking Your Analysis

Two books really helped me get a better sense and understanding of statistics. The first is "Data Analysis Using Regression and Multilevel/Hierarchical Models" by Andrew Gelman and Jennifer Hill. The book is thorough in its mathematical background yet accessible, and it gives many great examples of how to build and evaluate statistical models. The other title is "Statistical Rethinking" by Richard McElreath. Its main message is about how to use Bayesian statistics, but regardless of whether you want to do that, it provides great insight into statistical modelling, the role of statistical models and theories, and how to evaluate models.
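
To give you a taste of the kind of models these books cover, here is a minimal sketch of a multilevel (mixed-effects) model in R. It uses the lme4 package and its built-in sleepstudy data set; the specific model is only an illustration, not an example taken from either book.

    # Multilevel model: reaction time as a function of days of sleep
    # deprivation, with intercepts and slopes varying by subject
    library(lme4)

    fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
    summary(fit)  # fixed effects and variance components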

R

Remember how I wrote about not-so-user-friendly statistical software? R is just that. So why should you still use it? R offers practically endless flexibility, and because it lacks a point-and-click interface, it forces you to really know what you want to do before you can do it - which brings us back to my second point. While R may have a steeper learning curve than, say, SPSS, it offers more possibilities without cluttering your screen with a list of all the statistical models you may or may not want to use. I recommend "R for Data Science" by Garrett Grolemund and Hadley Wickham for an accessible introduction.
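
To show what a scripted analysis looks like, here is a minimal sketch in base R using the built-in mtcars data set; the analysis itself is only an illustration.

    # Summary statistics: mean fuel economy by number of cylinders
    aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

    # A simple linear model: fuel economy predicted by car weight
    fit <- lm(mpg ~ wt, data = mtcars)
    summary(fit)  # coefficients, standard errors, R^2

Because everything lives in a script, you can rerun, check, and share the entire analysis - something a point-and-click workflow makes much harder.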

In sum, I advocate for spending more time on thinking about your theory and research design in conjunction with the statistical analysis you want to perform before you start to collect data. That should help you to get better data and make stronger inferences.

Written by Marcus Kubsch, research assistant and doctoral student at IPN - Leibniz Institute for Science and Mathematics Education.
You can follow me on Twitter: @MarKubsch
