
Bachelor's programme "Sociology and Social Informatics"

28 January

Probability Theory and Mathematical Statistics

Academic year: 2021/2022
Language of instruction: English (ENG)
Credits: 6
Status: compulsory course
When taught: 1st year, Modules 3 and 4

Instructor

Course Syllabus

Abstract

This course is an introduction to Probability Theory and Statistics with a focus on practical problems. It is divided into two main parts: Probability Theory (Module 3) and Statistics (Module 4). Topics include combinatorics, conditional probability, random variables, limit laws, statistical point estimation, and hypothesis testing. The main statistical topics are also illustrated using the programming languages R and Python.
Learning Objectives

  • to give the theoretical foundations of probability
  • to build skills in the rigorous computation of probabilities of events
  • to develop students' intuition about conditional probability and Bayesian methods
  • to explain the connection between probability and statistics
  • to study the main concepts and notation of applied statistics
  • to study the mathematical basis of statistical methods
  • to demonstrate the mathematical basis of hypothesis testing
  • to explain the importance of the Central Limit Theorem and to show the key role of the normal distribution in statistics
Expected Learning Outcomes

  • describes a random experiment and its sample space
  • applies tables of the standard normal distribution to compute probabilities and quantiles
  • calculates the sample mean, sample variance, sample proportion, and sample quantiles, and can do this with computer programs
  • constructs confidence intervals for means and proportions and explains their meaning
  • correctly constructs the null and the alternative hypotheses
  • correctly applies the classical formula of probability
  • computes posterior probabilities using Bayes' formula
  • computes the expectation and variance of a random variable and describes the model properties through them
  • distinguishes between discrete and continuous random variables
  • computes the Z-score of a given value and interprets it
  • computes approximate probabilities for a large number of similar events using the CLT
  • correctly chooses a test to check a hypothesis
  • makes a statistical inference using the significance level and the p-value
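
A few of the outcomes above (posterior probabilities via Bayes' formula, the Z-score, and a confidence interval for a mean) can be sketched in Python using only the standard library. This is an illustrative sketch, not course material: the function names and the numbers in the usage lines are hypothetical, and the confidence interval assumes the population standard deviation is known.

```python
from math import sqrt
from statistics import NormalDist, mean


def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) by Bayes' formula.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    P(E) is obtained from the law of total probability.
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


def z_score(x, mu, sigma):
    """Number of standard deviations by which x lies above the mean mu."""
    return (x - mu) / sigma


def mean_confidence_interval(sample, sigma, level=0.95):
    """Two-sided confidence interval for the mean.

    Assumes the population standard deviation sigma is known,
    so the normal quantile is used instead of a Student quantile.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    centre = mean(sample)
    half_width = z * sigma / sqrt(len(sample))
    return centre - half_width, centre + half_width


# Hypothetical usage with made-up numbers.
print(posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05))
print(z_score(130, mu=100, sigma=15))
print(NormalDist().cdf(z_score(130, mu=100, sigma=15)))  # replaces a table lookup
print(mean_confidence_interval([4, 5, 6, 5], sigma=1))
```

`NormalDist().cdf` and `NormalDist().inv_cdf` play the role of the printed table of the standard normal distribution: the first returns a probability for a given Z-score, the second returns a quantile for a given probability.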
Course Contents

  • Combinatorics and Bernoulli trials.
    History of Probability, Probability and Data Analysis, Random experiment, Outcomes, Events, Operations with Events, Statistical definition of probability, Axiomatic definition of probability, Classical formula of probability for equally likely outcomes, Inclusion-exclusion formula, Birthday paradox, Independent and dependent events, Bernoulli trials, Formula for the number of successes, Erdős-Rényi random graphs
  • Conditional probability. Bayes' formula.
    Definition of conditional probability. Illustration via contingency tables, Multiplication formula for probabilities, Chain rule, Independence via conditional probability, Law of Total Probability, Monty Hall Paradox, Bayes’ Theorem, Prior and Posterior probabilities
  • Discrete and continuous random variables.
    Discrete random variables, Probability mass function, Binomial random variable, Newton's binomial formula, Hypergeometric random variable, Poisson random variable, Rare events, Poisson limit theorem, Expectation, Variance, Standard deviation, Mode, Median, St. Petersburg paradox, Continuous random variables, Cumulative distribution function, Density of a continuous random variable, Uniform, Normal, and Exponential random variables, Formulas for expectation and variance, Independence and dependence of random variables, Joint distribution
  • Normal distribution and Central Limit Theorem
    Family of Normal distributions, Standard Normal distribution, Standardization, Z-score, Quantiles of Normal variables, Central Limit Theorem, Applications in insurance, the stock market, etc., Law of Large Numbers
  • Statistical Point estimation.
    Main definitions: population, sample, representative sample, Frequency histogram, Cumulative empirical distribution function, Probability vs Statistics, Properties of Point estimates: unbiasedness and consistency, sample quantiles, sample mean, sample variance, unbiased sample variance, sample proportion, Outliers, Correlation, Statistics in R/Python
  • Interval estimation.
    Point estimates vs Interval estimates, Confidence interval for the mean, One-sided confidence intervals, Confidence intervals for proportions
  • Hypothesis testing.
    Statistical hypothesis, Statistical test, Type 1 and Type 2 errors, Significance level, Rejection region, P-value, Hypotheses for the mean of normally distributed data, Z-test, Connection with the CLT, Connection with confidence intervals, Student's t-distribution, Two-sample t-test, Applications: A/B testing, Model validation problems, Double-blind experiment, Homogeneity hypothesis, Mann-Whitney rank test, Two-sample Kolmogorov test
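
Two of the topics listed above, the birthday paradox and the Z-test, lend themselves to short computations. The following Python sketch uses only the standard library. The function names and example numbers are hypothetical, the birthday computation assumes 365 equally likely birthdays, and the Z-test assumes a known population standard deviation.

```python
from math import sqrt
from statistics import NormalDist


def birthday_collision_probability(people, days=365):
    """Probability that at least two of `people` share a birthday.

    Uses the classical formula: 1 minus the probability that all
    birthdays are distinct, assuming equally likely days.
    """
    p_all_distinct = 1.0
    for k in range(people):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


def two_sided_z_test(sample_mean, mu0, sigma, n):
    """One-sample two-sided Z-test of H0: mean = mu0.

    Assumes a known population standard deviation sigma.
    Returns the Z statistic and the two-sided p-value.
    """
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical usage with made-up numbers.
print(birthday_collision_probability(23))
z, p = two_sided_z_test(sample_mean=103, mu0=100, sigma=15, n=100)
alpha = 0.05  # significance level chosen for this illustration
print(z, p, "reject H0" if p < alpha else "do not reject H0")
```

The decision rule in the last line shows the two equivalent ways of making an inference: comparing the p-value with the significance level, or checking whether the Z statistic falls in the rejection region.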
Assessment Elements

  • non-blocking Test 1 (80 minutes)
    All students must take the Tests and Mini-tests at the time announced in advance by the teacher. Students who miss a Test for a legitimate reason (a medical reason confirmed by the corresponding medical documents) will have an opportunity to take the test at the end of the course. A student who misses a test without a legitimate reason gets 0 for that test. For an online test, all students must keep their cameras turned on and stay clearly visible on camera; otherwise, the professor has the right to give 0. It is forbidden to use internet resources, online calculators, or social networks during the test, or to communicate with anybody; otherwise, the professor has the right to give 0. Results and individual mistakes are discussed at consultations with the teacher.
  • non-blocking Test 2 (80 minutes)
    All students must take the Tests and Mini-tests at the time announced in advance by the teacher. Students who miss a Test for a legitimate reason (a medical reason confirmed by the corresponding medical documents) will have an opportunity to take the test at the end of the course. A student who misses a test without a legitimate reason gets 0 for that test. For an online test, all students must keep their cameras turned on and stay clearly visible on camera; otherwise, the professor has the right to give 0. It is forbidden to use internet resources, online calculators, or social networks during the test, or to communicate with anybody; otherwise, the professor has the right to give 0. Results and individual mistakes are discussed at consultations with the teacher.
  • non-blocking Mini-tests (30 minutes)
    This mark is the average of the Mini-test marks, rounded to the nearest integer. Each mini-test lasts 30 minutes. Mini-tests are given more frequently than the regular tests and are used for more regular monitoring of students' knowledge and understanding. For mini-tests held online, students must switch on their cameras.
  • non-blocking Self-Study (report)
    The goal of this task is independent work with mathematical literature: extracting the main ideas from a given chapter of a mathematical book. After working with the assigned literature, the student presents the material to the teacher (or a teaching assistant), keeping the structure: introduction, main results, conclusion. The "main results" part should contain the most important formulas, ideas, and theorems, clearly explained. The teacher (or teaching assistant) has the right to ask additional questions and/or give additional tasks connected with the subject of the assigned material.
  • non-blocking Activity grade
    The activity grade covers regular homework solutions and their presentation, as well as questions asked during and after lectures. There will also be quizzes on kahoot.com or mentimeter.com; the best students in the quizzes get additional points toward the activity grade. Solving and actively discussing the most difficult tasks (problems marked with an asterisk) also earns additional points. At seminars held online, all students must keep their cameras switched on. If a student refuses to switch on the camera and the microphone during a class, this counts as no participation in the class and is equivalent to 0 for activity.
  • non-blocking Exam – 60 min written examination
    At the exam, it is forbidden to use any notes or notebooks, or any device with internet access. Students who violate these rules get a warning; a second violation results in a mark of 0. If the exam is online, all students must turn on the cameras on their computers and stay clearly visible on camera during the exam; otherwise, the mark will be 0. It is forbidden to use online calculators or social networks during the exam, or to communicate with anybody; otherwise, the professor has the right to give 0. For an online exam, uploading the solutions in .pdf format is highly recommended.
Interim Assessment

  • Interim assessment (Module 4)
    Grading Policy: Activity grade 11%, Test No. 1 18%, Test No. 2 18%, Mini-tests 18%, Self-study (report) 10%, Final Exam 25%.
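
The grading policy above is a weighted sum whose weights add up to 100%. A minimal Python sketch of the arithmetic follows. The dictionary keys, the function name, and the example marks are hypothetical, the marks are assumed to be on a common scale (for example, 0 to 10), and no rounding rule is applied to the result because the policy line does not state one.

```python
# Weights taken from the grading policy; the key names are illustrative.
WEIGHTS = {
    "activity": 0.11,
    "test_1": 0.18,
    "test_2": 0.18,
    "mini_tests": 0.18,
    "self_study": 0.10,
    "final_exam": 0.25,
}


def interim_grade(marks):
    """Weighted sum of the assessment marks (unrounded).

    `marks` maps each key of WEIGHTS to a mark on a common scale.
    """
    missing = WEIGHTS.keys() - marks.keys()
    if missing:
        raise ValueError(f"missing marks for: {sorted(missing)}")
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)


# Hypothetical marks, assuming a 0-10 scale.
example_marks = {
    "activity": 8,
    "test_1": 7,
    "test_2": 6,
    "mini_tests": 9,
    "self_study": 10,
    "final_exam": 5,
}
print(interim_grade(example_marks))
```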
Bibliography

Recommended Core Bibliography

  • A first course in probability, Ross, S., 2010
  • Essentials of statistics for the behavioral sciences, Gravetter, F. J., 2014

Recommended Additional Bibliography

  • Problem Book in Mathematical Statistics: For Students of Social Sciences, Humanities, and Management (in Russian), Makarov, A. A., 2019