Data Analysis in Sociology

Academic year: 2020/2021
Language of instruction: English
ECTS credits: 4
Course type: Compulsory course
When: 2nd year, modules 3 and 4

Instructors


Vólchenko, Olesya

Course Syllabus

Abstract

This course lasts for three years. The first year is aimed at beginners: it moves from introductory topics (variable types, hypothesis testing, descriptive statistics) to applied methods (chi-square, t-test, nonparametric statistics, one-way ANOVA, and linear regression). The course covers the building blocks of quantitative data analysis, with the aim of training students to be informed producers and consumers of quantitative research. The applied part introduces R (in RStudio) for calculations and reporting. No prior knowledge of R is required, but an understanding of the key ideas in statistics and probability is an asset. This course is the starting point for students who are interested in advanced methods of data analysis or who plan to use quantitative methods in their own research.
Learning Objectives

  • Develop the skills necessary to solve typical data analysis problems with social data in the R software environment
Expected Learning Outcomes

  • Choose appropriate methods and techniques for certain types of variables and certain aims of the analysis
  • Give meaningful interpretation of statistical results: regression coefficients, tables, plots and diagrams (produced in R)
  • Perform data transformations
  • Represent graphically the results of the statistical analyses
  • Create analytical reports describing all the stages of analysis and interpreting its results
  • Conduct statistical analyses in RStudio
Course Contents

  • Central tendency measures
    Mean, median, mode. Standard normal distribution and its use. Z-scores. Moments of distributions. Distribution plots and reading them. Sources of bias in data. Interpretation of z-scores. Mean as a data model.
  • Chi-square
    Observed and expected frequencies. Measures of association for categorical variables. Reading and interpreting chi-square tests. Assumptions of chi-square. Independence. Standardised residuals. Odds ratio. Chi-square and other association measures in R.
  • Two means comparison
    Independent and paired samples. Assumptions behind the t-test. One-sample t-test. Two-sample t-tests. Nonparametric tests for two samples and for multiple samples. Reading and interpreting means comparison. Confidence intervals. Means comparison in R.
  • One-way ANOVA
    Assumptions and usage of ANOVA. Between-group and within-group variance, their ratio. Planned and non-planned comparisons; corrections. Post hoc comparisons for equal and unequal variances. Reading and interpreting ANOVA. One-way ANOVA in R. Presenting the results of ANOVA. Getting to know RMarkdown: reports and slide shows.
  • Linear regression
    Correlations. Research problems for correlational analysis. Correlation coefficients for different types of data. ANOVA, correlation, regression as linear models. Building a linear regression. Ordinary least squares. Fitting the regression line. Assumptions behind linear regression. Reading and interpreting regressions. Presenting and interpreting a linear regression. Categorical predictors in a linear regression. Dummy-coding. Linear regression in R. Plotting linear regressions in R (case studies).
  • Linear regression with multiple predictors
    The concept of interaction effects for categorical by categorical, categorical by continuous, continuous by continuous variables. Effect coding. Centring. Multicollinearity. Reading and interpreting interaction models in a linear regression. Testing for interactions in R. Reporting and interpreting a linear regression with interactions.
Assessment Elements

  • non-blocking Projects
    Late submissions are not considered (try us). If you are ill at the time of a project submission, present a medical certificate to have the formula adjusted for you. If you miss more than one project, there might be a makeup assignment. When you submit a project in MS Teams, you must click the "Turn in" button to complete the submission. All projects are first posted to the dedicated channel, where they are peer-reviewed, and then submitted in the Assignments section by each contributing student. If you have any questions about a project, sign up for a consultation.
  • non-blocking In-class activity
  • non-blocking Exam
  • non-blocking Short tests
  • non-blocking MOOC completion
  • non-blocking Mid-Term Test
Interim Assessment

  • Interim assessment (4 module)
    0.2 * Exam + 0.1 * In-class activity + 0.15 * Mid-Term Test + 0.05 * MOOC completion + 0.4 * Projects + 0.1 * Short tests
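    As a rough illustration of how the formula above combines the assessment elements, here is a minimal Python sketch. The weights are copied from the formula; the function name, the example scores, and the 10-point scale are assumptions made for illustration only, and any official rounding rule is not stated in this syllabus.

```python
# Weights copied from the interim assessment formula above.
WEIGHTS = {
    "Exam": 0.2,
    "In-class activity": 0.1,
    "Mid-Term Test": 0.15,
    "MOOC completion": 0.05,
    "Projects": 0.4,
    "Short tests": 0.1,
}


def interim_grade(scores):
    """Return the unrounded weighted sum of the assessment element scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores; the 10-point scale is an assumption, not from the syllabus.
example_scores = {
    "Exam": 7,
    "In-class activity": 8,
    "Mid-Term Test": 6,
    "MOOC completion": 10,
    "Projects": 9,
    "Short tests": 7,
}

# 1.4 + 0.8 + 0.9 + 0.5 + 3.6 + 0.7
print(round(interim_grade(example_scores), 2))  # 7.9
```

    Projects carry the largest weight (0.4), so in this hypothetical case they contribute 3.6 of the 7.9 points.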
Bibliography

Recommended Core Bibliography

  • Denis, D. J. (2016). Applied Univariate, Bivariate, and Multivariate Statistics. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1091881
  • Tabachnick, B. G., & Fidell, L. S. (2014). Using Multivariate Statistics: Pearson New International Edition (6th ed.). Harlow, Essex: Pearson. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1418064

Recommended Additional Bibliography

  • Agresti, A., & Finlay, B. (2014). Statistical Methods for the Social Sciences: Pearson New International Edition (4th ed.). Harlow, England: Pearson. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1418314
  • Crawley, M. J. (2013). The R Book (2nd ed.). Chichester, West Sussex: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=531630