
Basic Tools for Data Analysis

Academic year: 2024/2025
Language of instruction: English
Credits: 3
Status: Elective course
When: 1st year, module 1

Instructor

Course Syllabus

Abstract

This course introduces students to the fundamental tools and techniques used in data analysis, providing a solid foundation for understanding and interpreting data. Through hands-on activities and practical exercises, participants will learn how to collect, clean, analyze, and visualize data using popular software tools such as Excel, R, and Python.

Learning Objectives

  • The class aims to introduce students to the foundations of exploratory data analysis, visualization, data wrangling, and the basics of statistical inference.

Expected Learning Outcomes

  • Students can create a PowerPoint presentation that follows current design trends.
  • Students can navigate Excel and filter and sort data. They know function syntax, absolute and relative references, and the basic functions SUM, AVERAGE, COUNT, MAX, MIN, SUBTOTAL, IF, and SUMIF.
  • Students are able to use the following Excel features: the AND, OR, NOT, IFS, and nested IF functions; pivot tables and slicers; the VLOOKUP function; data visualization; merging tables.
  • Students are familiar with R syntax and ways of getting help, and with the notions of objects, vectors, and data types in R. They can code basic contingency tables. Students can use the main functions from dplyr.
  • Students are able to use the main dplyr functions. Students are capable of creating the main plot types using ggplot2. Students can calculate the main descriptive statistics.
  • Students are familiar with open source data such as survey projects (WVS, EVS, ESS, Barometers) and government statistics (Rosstat). Students can use functions from the R package rvest to scrape tables and textual data from the web. They can use join functions to combine data from different tables.
  • Students are familiar with applications of ggplot2 in the context of working with textual data. Students can pre-process and explore a text corpus using basic text statistics and tidytext package.
  • Students can identify appropriate data for fitting an OLS model. Students can check statistical assumptions for variables to be included in statistical analysis. Students can fit and interpret a basic OLS model in R. Students can present results of an OLS regression using stargazer and jtools packages.
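
To make the OLS outcome above concrete, here is a minimal sketch of a one-predictor least-squares fit. The course itself fits models in R; this version uses plain Python only to show the arithmetic behind the slope and intercept. The function name `fit_simple_ols` and the data are made up for illustration and are not taken from the course materials.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data lying exactly on the line y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_simple_ols(x, y)
print(intercept, slope)
```

The slope is read as the expected change in y for a one-unit increase in x, which is the same interpretation students apply to coefficients reported by R.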

Course Contents

  • Introduction to Excel 1
  • Introduction to Excel 2
  • Introduction to R and Exploratory Data Analysis in R 1
  • Data manipulation and visualization
  • Open source data and data scraping
  • Introduction to text mining in R
  • OLS Regression
  • PowerPoint

Assessment Elements

  • non-blocking Data exploration and manipulation tests
    Students will have 4 at-home tests that cover the 4 main topics of this course: 1) Excel, 2) exploratory data analysis in R, 3) text mining and data scraping, 4) OLS regression.
  • non-blocking Exam
    Take-home project in which students work on a problem using Excel and R. Students submit Excel spreadsheets and R scripts for evaluation. The results of the data analysis in Excel and R should be organized in a PowerPoint presentation.

Interim Assessment

  • 2024/2025 1st module
    0.6 * Data exploration and manipulation tests + 0.4 * Exam
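
As a worked illustration of the weighting above, the sketch below computes the interim grade from the two components. The function name `interim_grade` and the example scores are hypothetical; the syllabus states only the weights, not the grading scale or any rounding rule.

```python
def interim_grade(tests: float, exam: float) -> float:
    """Weighted interim grade: 0.6 * tests + 0.4 * exam."""
    return 0.6 * tests + 0.4 * exam


# Hypothetical component scores, for illustration only
grade = interim_grade(tests=8.0, exam=6.0)
print(round(grade, 2))
```

Because the tests carry the larger weight, a one-point change in the test score moves the interim grade by 0.6 points, against 0.4 points for the exam.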

Bibliography

Recommended Core Bibliography

  • Discovering statistics using R, Field, A., 2012
  • R in action: Data analysis and graphics with R, Kabacoff, R.I., 2015

Recommended Additional Bibliography

  • Quantitative finance: a simulation-based introduction using excel, Davison, M., 2014

Authors

  • ZUBAREV NIKITA SERGEEVICH