- To encourage students to apply the methods and concepts they learned in data analysis and research methods to practical business and marketing analytics tasks.
- Students solve analytical problems using data analysis techniques suitable for the task; develop policies and scenarios using several methods of analysis
- Students perform various statistical analyses, use them appropriately, and develop suggestions (recommendations, policies, scenarios) for the task (churn prevention, segmentation, etc.)
- Students work individually and in teams to interpret the results and develop policies and scenarios for the company
- Students collect information on the business context of a given case and evaluate possible solutions to a given task
- Students select data features to be used in segmentation procedures; as a team, students develop analytical pipelines covering necessary tasks and combining individual results into a summary report
- Students can plan the analytical data cycle, from formulating requests for data collection, to data cleaning and dimension reduction, to data analysis and reporting the recommendations
- The trade of business analytics. Jobs in data science for business. What do business analysts do? Responsibilities of business analysts. Domains of applied business analytics.
- Introduction to Data Analysis for Consumer Behavior and Client Analytics. Customer Lifetime Value. Basic concepts of consumer behavior and client analytics. Differences and similarities between typical academic research and business analytics pipelines. Customer acquisition, conversion, churn, segmentation, consumer behavior. Market basket analysis. Association rules. Classification of consumer behavior models. Generic marketing strategies. Types of business models. Statistical methods for client analytics. Lifetime value (CLV, LTV). Net profit. Predicting future margin with current sales data. Predicting customer lifetime value with linear regression in R. Omitted variable problem. Multicollinearity. Model validation. Risk of overfitting: use of statistics (AIC), automatic model selection, out-of-sample validation. Adjusted R-squared.
- Customer Churn. Churn Prevention. How to predict customer churn? How to detect and prevent customer churn? Factors of churn: expectations, performance, disconfirmation (disappointment based on perceived quality), satisfaction, churn intention/switching decisions. The push-pull-mooring paradigm for churn and service switching. Measurement of latent variables: satisfaction and expectation disconfirmation. Models of satisfaction, expectation disconfirmation, performance. Sources of data: experts, logs, surveys. Case: Yandex Music vs. Spotify. Predicting customer churn with logistic regression in R. The meaning of the p-value. Interpretation of logistic regression coefficients. Model selection based on significance vs. theory. Inspecting the results of automatic model selection. In-sample model fit for logistic regression: pseudo-R-squared (interpretation of reasonable, good, and very good fit); accuracy calculation. The rule of “garbage in, garbage out”. Accuracy. Confusion matrix. Finding the optimal threshold: a table of potential payoffs. Composing a payoff matrix. Dealing with overfitting: out-of-sample validation and cross-validation. Splitting the sample in R. Fitting the model on the training subsample and predicting on the test subsample. K-fold methods of cross-validation. Accuracy for out-of-sample vs. cross-validation. Addressing churn using segmentation and advertisement. Naive Bayes in predicting churn. Description of task and data for the project.
- Predicting Customer’s Time to Churn. Predicting time till next purchase with survival analysis. Addressing churn using segmentation and advertisement. Survival function. Censored data problem. Survival analysis models: pros and cons. Applications of survival models in customer analytics. Types of data censoring (left, interval, right, type I, type II, random). Assumptions of survival analysis. Survival curve analysis by Kaplan-Meier. Survival function and cumulative hazard function. Cumulative risk. Hazard rate. Kaplan-Meier estimation with a categorical covariate. Cox proportional hazards (CPH) model for multiple covariates. Assumptions of CPH. Interpretation of coefficients for categorical and continuous predictors. Survival plot. Visualization of CPH estimates. When assumptions are violated: stratified Cox model, models with time-dependent coefficients. Prediction of survival curve for new customers. CPH model interpretation, calculation of customer lifetime value.
- Customer Segmentation and Cohort Analysis. Factors of customer segmentation: demographics, technology, geography, lifestyles, behavior, new/returning contract, time from last purchase, frequency and value of spending, etc. Reducing the complexity of extensive correlated data. Differences between the goals of LTV models and segmentation techniques. Business-related criteria for segmentation: RFM (recency, frequency, monetary) analysis. Analytical techniques for customer segmentation: PCA, cluster analysis (k-means, DBSCAN, agglomerative algorithms). Applications of PCA for exploration in customer analytics. Reducing multicollinearity, building an index, visualizing multidimensional data. Visualizing correlations. Standardizing variances (scaling). Loadings of principal components. Interpretation of principal components. PCA model specification. Kaiser-Guttman criterion. Scree plot. Biplot of variables and components. Further analysis: fitting loadings to linear regression. Clustering algorithms. Distances between data points. Linkage criteria. Dendrogram plot. Applications of cluster analysis for customer analytics in R.
- What-If Analysis. The analytical pipeline: database, model, dashboard, what-if analysis. Use of simulations in business for decision making. Scenarios as ways to construct predictions from data. From scenario, to simulation model, to prediction. What-if analysis vs. Extraction, Transformation and Loading (ETL) approach. Source variables and scenario parameters. Seven stages of what-if analysis: goal analysis, business modeling, data source analysis, multidimensional modeling, simulation modeling, data design and implementation, and validation (if validation fails, repeat stages 4-7). Activity diagram (scenario diagram). Case: productivity of branches. Stating the assumptions required to perform what-if analysis of models. Grouping assumptions into scenarios describing different ways customers may react to the policies. Building what-if models for each policy for each scenario. Comparing and reflecting on the results of scenario models. Reactive programming. Functions in R.
- Consumer Preferences. Introduction to consumer preference theory. Utility analysis. Cardinal utility, ordinal utility. Indifference curves show combinations that give equal utility. Marginal rate of substitution (MRS). Constraints: income, price, time. Uses of choice models in marketing and business analytics. Modeling customers' choice by product features. A/B testing and preference testing. A/A tests and A/B tests. Power analysis. Multiple A/B testing. Backward A/B testing. Bootstrapping and its use in A/B testing. Checklists and common pitfalls in A/B testing. Common metrics and reporting the results. A/B testing vs. qualitative customer research.
- Customer Satisfaction. Customer feedback surveys. Net promoter score (NPS) for measuring loyalty. Promoters, passives and detractors. Customer satisfaction survey (CSAT) for meeting expectations. Post-purchase surveys, product/service development surveys, usability surveys. Expectation disconfirmation theory of post-purchase satisfaction. Key constructs: expectations, perceived performance, disconfirmation of beliefs and satisfaction. Inputs to expectations of value. Measuring the perceived performance: overall quality, interaction, service experience, value for money, social status. Problems of customer satisfaction surveys. Self-selection, overdelivering, expectation adjustment. Combining survey and behavior data.
- Market Basket Analysis. Baskets and itemsets as tools of understanding customers. Subsets and supersets, transactional data. Use cases within and beyond retail. Metrics for market basket analysis: support, confidence, and lift. The apriori algorithm; basket rules. Visualizing metrics and the extracted association rules. From association rules to recommendation systems.
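The three market basket metrics named in the last topic (support, confidence, and lift) can be illustrated with a minimal sketch. The course itself works in R; the Python below is only an illustration, and the `transactions` data and the helper names `support`, `confidence`, and `lift` are hypothetical, not taken from the course materials.

```python
# Toy transactional data: each basket is a set of items (hypothetical example).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(itemset, baskets):
    """Share of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Among baskets containing the antecedent, the share that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence of the rule divided by the baseline support of the consequent."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


# Rule "bread -> milk": how often the pair occurs, and whether bread raises the odds of milk.
print(round(support({"bread", "milk"}, transactions), 4))
print(round(confidence({"bread"}, {"milk"}, transactions), 4))
print(round(lift({"bread"}, {"milk"}, transactions), 4))
```

A lift above 1 suggests the items appear together more often than independence would predict; a lift below 1 suggests the opposite. The apriori algorithm builds on the same support counts to prune candidate itemsets.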
- Online Practice
- Project 1
- Project 2 (ind)
- Project 2 (team). Important: if working together online, keep an online document with your ideas and progress (e.g., in Google Docs or MS Teams) and submit the link to this file along with your project.
- Career Essay
- Interim assessment (module 2): 0.1 * Career Essay + 0.25 * Engagement + 0.15 * Online Practice + 0.2 * Project 1 + 0.1 * Project 2 (ind) + 0.2 * Project 2 (team)
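As a rough illustration of how the interim assessment formula combines the components, the sketch below applies the stated weights to a set of scores. The weights are copied from the formula; the function name `interim_grade`, the example scores, and the 0-10 grading scale are assumptions made for illustration only.

```python
# Weights copied from the interim assessment formula.
WEIGHTS = {
    "Career Essay": 0.10,
    "Engagement": 0.25,
    "Online Practice": 0.15,
    "Project 1": 0.20,
    "Project 2 (ind)": 0.10,
    "Project 2 (team)": 0.20,
}


def interim_grade(scores):
    """Weighted sum of component scores; raises if any component is missing."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"Missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores on an assumed 0-10 scale (the scale is not stated in the syllabus).
example_scores = {
    "Career Essay": 8,
    "Engagement": 10,
    "Online Practice": 6,
    "Project 1": 7,
    "Project 2 (ind)": 9,
    "Project 2 (team)": 8,
}

print(round(interim_grade(example_scores), 2))
```

The weights sum to 1, so the result stays on the same scale as the component scores. Engagement carries the largest single weight, and the two parts of Project 2 together account for 0.3 of the grade.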
- Chapman, C., & Feit, E. M. (2015). R for Marketing Research and Analytics. Cham: Springer. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=964737
- Provost, F., & Fawcett, T. (2013). Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking (1st ed.). Beijing: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=619895
- Struhl, S. M. (2017). Artificial Intelligence Marketing and Predicting Consumer Choice: An Overview of Tools and Techniques. London: Kogan Page. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1494508