
# Data Analysis in Sociology

2019/2020
Instruction in English
4 ECTS credits

#### Author

Tenisheva, Ksenia
Course type: Compulsory course
When: 3rd year, modules 3 and 4

### Course Syllabus

#### Abstract

This course runs over three years. The second year provides intermediate-to-advanced statistical analysis for quantitative research in sociology. In the second year, the course covers two main topics: factor analysis and statistical prediction, including linear regression and structural equation modelling. We also discuss key issues in statistical analysis, such as creating indices and identifying causality from the results of an analysis. The course covers the building blocks of quantitative data analysis, with the goal of training students to be informed consumers and producers of quantitative research. It is also the starting point for students interested in pursuing advanced methods training or planning to use quantitative methods in their own research.

#### Learning Objectives

• Develop the skills needed to solve typical problems in analysing social data in the R software environment

#### Expected Learning Outcomes

• Conduct statistical analyses in RStudio
• Choose methods and techniques appropriate to the types of variables and the aims of the analysis
• Give a meaningful interpretation of statistical results: regression coefficients, tables, plots and diagrams (produced in R)
• Perform data transformations
• Represent the results of statistical analyses graphically
• Create analytical reports describing all the stages of analysis and interpreting its results

#### Course Contents

• Introduction to GLM
Covariance and correlation. Basic concepts and logic of linear regression and GLM.
• Linear regression: OLS. Diagnostics
The OLS estimator for linear regression, interpretation and statistical tests of OLS estimates, fitted values and residuals, R-squared, addressing nonlinearity within the linear regression framework, standardized coefficients, drawing plots, practice in R.
• Linear regression: Interaction effects
Main and multiplicative effects in regression models. Interaction effects, additive effects. Interpreting results. Choosing the best model. Practice in R.
• Exploratory factor analysis
Dimensionality reduction. Manifest and latent variables. Factors, graphical representation of factors. Exploratory factor analysis. Factor scores, factor space, types of rotation. Optimal number of factors. Interpretation of the results. Creating indices based on factor analysis. Practice in R.
• Confirmatory factor analysis
Difference between exploratory and confirmatory factor analyses. Factor structure. Testing your (or somebody else’s) scales. Types of latent variables. Constructing a factor model in the lavaan package. Calculation of degrees of freedom, minimum number of cases. Non-correlated and correlated latent factors. Interpreting results. Model diagnostics. Cronbach’s alpha. Practice in R.
• Introduction to SEM
Structural equation modeling as an extension of confirmatory factor analysis. Exogenous and endogenous variables. Testing causal assumptions. Partial correlation, heterogeneous correlations (polychoric, tetrachoric and polyserial correlations). Practice in R.
• SEM: model specification
Formulating theory-based causal hypotheses. Causal inference. Specification concepts. Mediation and moderation effects. Measurement error: correlated and uncorrelated. Practice in R.
• Path analysis
Concept of “path”. Path analysis: only observed variables. Graphical representation. Identification of a path model. Estimation of a structural equation model. Model fit. Degrees of freedom, number of cases. Meaning of the fit indices. Corrected chi-square measures. Interpreting the results. Practice in R.
• SEM with latent variables
Introducing latent factors in the model. Identification of SEM. Estimation of structural equation model. Model fit. Meaning of the fit indices. Model modification. Interpreting the results. Practice in R.
• Putting it all together
Applying all the methods to real-life research. Combining factor analysis and regression analysis. Using SEM to test theoretical assumptions about causality. Advantages and disadvantages of the methods.
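
As a rough illustration of the regression topics listed above (covariance, correlation, the OLS estimator, fitted values and R-squared), the following minimal sketch computes them for a single predictor. It is written in plain Python only to stay self-contained; the course itself works in R. The variable names and the data are hypothetical and are not taken from the course materials.

```python
# Illustrative sketch only: the course uses R, and the data below are made up.

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance, with n - 1 in the denominator."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / (covariance(x, x) ** 0.5 * covariance(y, y) ** 0.5)

def ols_simple(x, y):
    """OLS estimates (intercept, slope) for y = intercept + slope * x."""
    slope = covariance(x, y) / covariance(x, x)
    intercept = mean(y) - slope * mean(x)
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    fitted = [intercept + slope * a for a in x]
    my = mean(y)
    ss_res = sum((b - f) ** 2 for b, f in zip(y, fitted))  # residual sum of squares
    ss_tot = sum((b - my) ** 2 for b in y)                 # total sum of squares
    return 1 - ss_res / ss_tot

# Hypothetical data: years of education and income (arbitrary units).
years_of_education = [8, 10, 12, 14, 16]
income = [20, 24, 31, 33, 42]

intercept, slope = ols_simple(years_of_education, income)
r = correlation(years_of_education, income)
r2 = r_squared(years_of_education, income, intercept, slope)
print(f"intercept={intercept:.2f}, slope={slope:.2f}, r={r:.3f}, R^2={r2:.3f}")
```

With a single predictor, R-squared equals the squared correlation between the predictor and the outcome, which is a convenient check on the calculation.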

#### Assessment Elements

After each seminar, students are assigned a practical task which should be completed by Friday, 12 p.m.
• Project 1
Three basic features are assessed: correct calculations and correct code (syntax); correct interpretations (students must describe trends properly, assess the significance of the results, and predict values of the dependent variable correctly); and correct graphics, with appropriate plot types and formatting.
• Exam
• DataCamp
• Project 2
Three basic features are assessed: correct calculations and correct code (syntax); correct interpretations (students must describe trends properly, assess the significance of the results, and predict values of the dependent variable correctly); and correct graphics, with appropriate plot types and formatting.
• Project 3
A project dedicated to the topics of causal modeling (SEM). One week will be given to prepare and submit your paper.

#### Interim Assessment

• Interim assessment (module 4)
0.2 * DataCamp + 0.1 * Exam + 0.15 * Practical tasks + 0.15 * Project 3 + 0.2 * Project 1 + 0.2 * Project 2
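
The weighting above can be checked with a short calculation. The sketch below is an illustration, not an official grading tool: the weights come from the formula, while the example scores and the 10-point scale are assumptions, and no rounding rule is applied because the syllabus does not state one.

```python
# Weights taken from the interim assessment formula above.
WEIGHTS = {
    "DataCamp": 0.2,
    "Exam": 0.1,
    "Practical tasks": 0.15,
    "Project 3": 0.15,
    "Project 1": 0.2,
    "Project 2": 0.2,
}

def interim_grade(scores):
    """Weighted sum of the element scores; every element must be present."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

# Hypothetical scores, assuming each element is graded on a 10-point scale.
example_scores = {
    "DataCamp": 8,
    "Exam": 6,
    "Practical tasks": 7,
    "Project 3": 9,
    "Project 1": 8,
    "Project 2": 7,
}

print(f"weights sum to {sum(WEIGHTS.values()):.2f}")
print(f"interim grade: {interim_grade(example_scores):.2f}")
```

The weights sum to 1, so the result stays on the same scale as the individual element scores.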