
Bachelor's Programme «Sociology and Social Informatics»


Probability Theory and Mathematical Statistics

Academic year
Language of instruction: English
Compulsory course
Delivered in:
Year 1, Modules 3 and 4


Course Syllabus


This course is an introduction to the basic concepts of Probability Theory and Statistics, with an emphasis on practical problems. It is divided into two main parts: Probability Theory (Module 3) and Statistics (Module 4). Topics include combinatorics, conditional probability, random variables, limit laws, statistical point estimation, and hypothesis testing. The main topics are also illustrated and studied using statistical software such as R, Excel, and Mathematica.
Learning Objectives

  • Studying the theoretical foundations of probability theory and mathematical statistics, and developing them in practice by constructing mathematical models and solving statistical problems.
  • Understanding the types of practical problems, including those arising in sociology, that can be solved using statistical methods, and being able to use the knowledge gained to solve them.
  • Ability to work with programs for mathematical calculations.
  • Deepening and expanding the range of knowledge about applied mathematical methods.
  • Mastering modern methods of data analysis, for example, basic skills of data research using statistical packages such as R (S-plus).
Expected Learning Outcomes

  • Is able to describe a random experiment and the set of all its outcomes. Is able to apply the classical formula of probability. Knows how to apply the Bernoulli formula.
  • Is able to describe a random experiment and the set of all its outcomes. Is able to apply the classical formula of probability. Knows how to apply the Bernoulli formula. Knows the law of total probability. Is able to calculate posterior probabilities using Bayes’ formula.
  • Is able to construct random variables describing a given random experiment. Is able to calculate the expected value and variance of these random variables. Knows the main families of discrete and continuous random variables. Calculates any probability for the Normal distribution. Is able to approximate probabilities of a large number of similar events using the CLT.
  • Knows the main approaches to hypothesis testing. Is able to construct the null and the alternative hypotheses. Is able to make a statistical inference using the significance level or the p-value.
  • Knows what a population is and what a sample from a population is. Calculates the sample mean, sample variance, unbiased sample variance, sample proportion, and quantiles. Is able to do this in R.
  • Knows how to construct confidence intervals for means and proportions. Is able to show the connection with the CLT. Knows when and why one should use Student's t-distribution instead of the Normal distribution for a confidence interval.
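
As an illustration of the probability outcomes above, here is a minimal Python sketch (the course itself uses R) of the Bernoulli formula, Bayes’ formula together with the law of total probability, and a CLT-based normal approximation. The function names and all numbers are hypothetical, chosen only for illustration; they are not taken from the course materials.

```python
from math import comb, sqrt
from statistics import NormalDist


def bernoulli_formula(n, k, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bayes_posterior(priors, likelihoods):
    """Posteriors P(H_i | A) from priors P(H_i) and likelihoods P(A | H_i)."""
    # Law of total probability: P(A) = sum_i P(H_i) * P(A | H_i)
    total = sum(p * l for p, l in zip(priors, likelihoods))
    # Bayes' formula: P(H_i | A) = P(H_i) * P(A | H_i) / P(A)
    return [p * l / total for p, l in zip(priors, likelihoods)]


def clt_binomial_at_most(n, k, p):
    """Normal (CLT) approximation of P(X <= k) for X ~ Binomial(n, p)."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    # Adding 0.5 is the usual continuity correction for a discrete variable
    return NormalDist(mean, sd).cdf(k + 0.5)


# Hypothetical inputs, for illustration only
p_two_heads = bernoulli_formula(n=4, k=2, p=0.5)            # 2 heads in 4 fair tosses
posterior = bayes_posterior([0.01, 0.99], [0.95, 0.05])     # two hypotheses H_1, H_2
exact = sum(bernoulli_formula(100, k, 0.5) for k in range(56))  # exact P(X <= 55)
approx = clt_binomial_at_most(100, 55, 0.5)                 # CLT approximation
```

Comparing `exact` with `approx` shows how close the normal approximation is for a moderately large number of trials.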
Course Contents

  • Section 1. «Combinatorics» Topic I. Introduction to Probability. Independence. Bernoulli trials.
    Introduction to Probability: History of Probability • Probability and Data Analysis • Random experiment. Outcomes. Events. • Operations with Events • Statistical definition of probability
    Properties of Probability: Axiomatic definition of probability • Classical formula of probability for equally likely outcomes • Inclusion-exclusion formula • Hypergeometric distribution formula • Birthday paradox • Independent and dependent events • Random walks on graphs (simplest Markov chains). Transition probability matrices.
    Bernoulli trials: Independent experiments • Formula for the number of successes. Proof. • Banach's matchbox problem • Erdős-Rényi random graphs
  • Section 1. «Combinatorics» Topic II. Conditional probability. Bayes’ formula.
    Conditional probability: Definition. Illustration via contingency tables • Multiplication formula for probabilities • Independence via conditional probability • Generalization of the multiplication formula to k events. Chain rule.
    Law of Total Probability: Law of Total Probability for two events. Proof. • Group of jointly exhaustive events • Generalization of the law of total probability • Monty Hall paradox
    Bayes’ Theorem: Bayes’ formula • Prior and posterior probabilities • Example from a bookmaking company • Conditional independence
  • Section 2 «Random Variables» Topic I. Discrete and continuous random variables. Their numerical characteristics.
    Discrete random variables: Definition. Main properties • Probability mass function • Binomial random variable. Newton's binomial formula in mathematical analysis. • Geometric random variable • Hypergeometric random variable • Poisson random variable. Rare events. Poisson limit theorem
    Numerical characteristics of random variables: Expectation. Variance. Standard deviation. Their properties. • K-th moment • Mode. Median. • St. Petersburg paradox
    Continuous Random Variables: Cumulative distribution function • Density of a continuous random variable • Uniform, Gaussian (Normal), Exponential random variables • Formulas for expectation and variance • Independence and dependence of random variables. Joint distribution.
  • Section 2 «Random Variables» Topic II. Normal distribution and Limit laws: CLT, LLN.
    Calculates any probability for the Normal distribution. Is able to approximate probabilities of a large number of similar events using the CLT.
  • Section 3 «Statistics» Topic I. Introduction to statistical analysis. Sample. Statistical population. Point estimation.
    Statistics. Preliminaries: Main definitions: population, sample, representative sample • Frequency histogram • Empirical distribution function • Probability vs Statistics
    Properties of Point estimates: Unbiasedness • Consistency • Quantiles • Sample mean, sample variance, unbiased sample variance, sample proportion • Outliers • Correlation • Introductory seminar for R
  • Section 3 «Statistics» Topic II. Interval estimation.
    Confidence intervals: Point estimates vs Interval estimates • Confidence interval for a mean (variance is known) • Student's t-distribution • Confidence interval for a mean (variance is not known) • One-sided confidence intervals • Confidence intervals for proportions
  • Section 3 «Statistics» Topic III. Hypothesis testing.
    Statistical Hypothesis Testing: Statistical hypothesis • Statistical test • Type 1 and Type 2 errors • Significance level. Rejection region. • P-value
    Hypotheses for a mean of normally distributed data: One-sample Z-test. Connection with the CLT. • Two-sample T-test • Applications: A/B testing, model validation problems, double-blind experiment • Connection with the confidence intervals
    Homogeneity hypothesis: Mann-Whitney rank test • Two-sample Kolmogorov test • Applications: A/B testing, model validation problems, double-blind experiment
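
The statistical topics of Section 3 can be sketched in a few lines. The example below is a minimal Python illustration (the course itself uses R) of point estimates, a confidence interval for a mean with known variance, and a one-sample two-sided Z-test. The sample, the "known" standard deviation `sigma`, the hypothesized mean `mu0`, and the helper names are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, pvariance, quantiles, variance

# Hypothetical sample, used only for illustration
sample = [4.8, 5.1, 5.4, 4.9, 5.6, 5.2, 5.0, 5.3]
n = len(sample)

x_bar = mean(sample)             # sample mean
s2_biased = pvariance(sample)    # sample variance (divides by n)
s2_unbiased = variance(sample)   # unbiased sample variance (divides by n - 1)
q1, q2, q3 = quantiles(sample, n=4)  # quartiles (conventions differ between packages)

std_normal = NormalDist()  # standard Normal distribution N(0, 1)


def z_confidence_interval(x_bar, sigma, n, level=0.95):
    """Two-sided confidence interval for a mean when the variance is known."""
    z = std_normal.inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


def z_test_two_sided(x_bar, mu0, sigma, n):
    """One-sample Z-test of H0: mean == mu0 against H1: mean != mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value


sigma = 0.3   # assumed known population standard deviation
mu0 = 5.0     # hypothesized mean under H0 (assumption)
ci = z_confidence_interval(x_bar, sigma, n)
z, p_value = z_test_two_sided(x_bar, mu0, sigma, n)
reject = p_value < 0.05   # decision at the 5% significance level
```

The connection between the two procedures is visible here: at the 5% significance level the test rejects H0 exactly when `mu0` falls outside the 95% confidence interval `ci`.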
Assessment Elements

  • non-blocking Test 1 (80 minutes)
  • non-blocking Test 2 (80 minutes)
  • non-blocking Small tests (30 minutes)
  • non-blocking Self-Study (report)
  • non-blocking Activity grade
  • non-blocking Final Exam – 60 min written examination
    Conditions for writing the exam: In case of an online exam: notes, copybooks, and any gadgets with internet access are forbidden during the exam. Students who violate these rules will get a warning; if they violate the rules twice, they will get 0. In case of an on-line exam: all students must have their cameras switched on during the exam. It is strongly recommended to submit test solutions in .pdf format.
Interim Assessment

  • Interim assessment (Module 4)
    Grading Policy: Activity grade: 11%, Test 1: 18%, Test 2: 18%, Average of all small tests: 18%, Self-study (report): 10%, Final Exam: 25%.
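
The grading policy above is a weighted sum, which can be sketched as follows. This is only an illustration: the weights come from the policy, while the dictionary keys, the 0 to 10 score scale, and the example scores are assumptions, and no rounding rule is applied because none is stated.

```python
# Weights from the grading policy above; the key names are hypothetical
WEIGHTS = {
    "activity": 0.11,
    "test_1": 0.18,
    "test_2": 0.18,
    "small_tests_average": 0.18,
    "self_study": 0.10,
    "final_exam": 0.25,
}


def interim_grade(scores):
    """Weighted sum of the assessment elements (no rounding applied)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 0-10 scale
example_scores = {
    "activity": 8,
    "test_1": 7,
    "test_2": 9,
    "small_tests_average": 6,
    "self_study": 10,
    "final_exam": 8,
}
grade = interim_grade(example_scores)
```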


Recommended Core Bibliography

  • A first course in probability, Ross, S., 2010
  • Essentials of statistics for the behavioral sciences, Gravetter, F. J., Wallnau, L. B., 2014

Recommended Additional Bibliography

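  • Задачник по математической статистике: для студентов социально-гуманитарных и управленческих специальностей [Problem Book in Mathematical Statistics: for Students of Social Sciences, Humanities and Management], Makarov, A. A., Pashkevich, A. V., 2019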