Basic Tools for Data Analysis

Academic Year: 2024/2025
Instruction in English (ENG)
ECTS credits: 3
Course type: Elective course
When: 1 year, 1 module

Instructor

Course Syllabus

Abstract

This course introduces students to the fundamental tools and techniques used in data analysis, providing a solid foundation for understanding and interpreting data. Through hands-on activities and practical exercises, participants will learn how to collect, clean, analyze, and visualize data using popular software tools such as Excel, R, and Python.

Learning Objectives

  • The class aims to introduce students to the foundations of exploratory data analysis, visualization, and data wrangling, and to the basics of statistical inference.

Expected Learning Outcomes

  • Students are able to create a PowerPoint presentation that follows current design trends.
  • Students can navigate Excel and filter and sort data. They know the syntax of functions, absolute and relative references, and the basic functions SUM, AVERAGE, COUNT, MAX, MIN, SUBTOTAL, IF, and SUMIF.
  • Students are able to use the following Excel features: the AND, OR, NOT, IFS, and nested IF functions; pivot tables and slicers; the VLOOKUP function; data visualization; and merging tables.
  • Students are familiar with R syntax, ways of getting help, and the notions of objects, vectors, and data types in R. They can code basic contingency tables and use the main functions from dplyr.
  • Students are able to use the main dplyr functions, create the main plot types using ggplot2, and calculate the main descriptive statistics.
  • Students are familiar with open-source data such as survey projects (WVS, EVS, ESS, Barometers) and government statistics (Rosstat). Students can use functions from the R package rvest to scrape tables and textual data from the web. They can use join functions to combine data from different tables.
  • Students are familiar with applications of ggplot2 in the context of working with textual data. Students can pre-process and explore a text corpus using basic text statistics and the tidytext package.
  • Students can identify appropriate data for fitting an OLS model. Students can check statistical assumptions for variables to be included in statistical analysis. Students can fit and interpret a basic OLS model in R. Students can present the results of an OLS regression using the stargazer and jtools packages.

Course Contents

  • Introduction to Excel 1
  • Introduction to Excel 2
  • Introduction to R and Exploratory Data Analysis in R 1
  • Data manipulation and visualization
  • Open source data and data scraping
  • Introduction to text mining in R
  • OLS Regression
  • PowerPoint

Assessment Elements

  • non-blocking Data exploration and manipulation tests
    Students take four at-home tests covering the four main topics of the course: 1) Excel, 2) exploratory data analysis in R, 3) text mining and data scraping, 4) OLS regression.
  • non-blocking Exam
    A take-home project in which students work on a problem using Excel and R. Students submit Excel spreadsheets and R scripts for evaluation. The results of the data analysis in Excel and R should be organized in a PowerPoint presentation.

Interim Assessment

  • 2024/2025 1st module
    0.6 * Data exploration and manipulation tests + 0.4 * Exam
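
    As a rough illustration of how the two weights combine, the sketch below computes the interim grade from the two assessment components. The function name `interim_grade`, the example scores, and the 10-point scale are assumptions made for illustration; the syllabus does not state a grading scale or a rounding rule, so none is applied to the grade itself.

```python
def interim_grade(tests_score: float, exam_score: float) -> float:
    """Weighted interim grade: 0.6 * tests + 0.4 * exam (weights from the syllabus)."""
    return 0.6 * tests_score + 0.4 * exam_score


# Hypothetical scores on an assumed 10-point scale.
# round() is used only to tidy the printed floating-point value.
print(round(interim_grade(tests_score=8.0, exam_score=7.0), 2))  # prints 7.6
```

    Because the weights sum to 1, a student with the same score on both components receives that score as the interim grade.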

Bibliography

Recommended Core Bibliography

  • Discovering statistics using R, Field, A., 2012
  • R in action: Data analysis and graphics with R, Kabacoff, R.I., 2015

Recommended Additional Bibliography

  • Quantitative finance: a simulation-based introduction using excel, Davison, M., 2014

Authors

  • Zubarev Nikita Sergeevich