Advanced Data Analysis
- The course covers the foundations and popular techniques of categorical data analysis with the goal of training students to be informed producers and consumers of quantitative research.
- Students apply classification techniques in R, propose hypotheses, and choose appropriate methods of categorical data analysis, including supervised classification with a binary outcome and unsupervised classification (clustering) of mixed data types.
- Students create customized R Markdown reports.
- Students create reproducible analysis scripts.
- Students define basic terms and identify the purposes of Bayesian inference vs frequentist inference.
- Students describe known problems with null hypothesis significance testing (NHST) and propose known solutions to them.
- Students inspect missing data patterns and apply various methods of data imputation.
- Students interpret outputs correctly, justify their choice of techniques, and assess the quality of proposed analytical and visualization solutions, including models and data stories.
- Students propose and apply tools for reproducible and ethical data analysis.
- Students scrape simple web tables and texts with R and convert them into standard data formats.
- Coding with Style
- Web Scraping
- Binary Logistic Regression
- Missing Data
- Cluster Analysis
- Data Culture and Data Acumen
- Binary Outcome project. The project includes an optional extra part worth up to 2 additional points (counted only if the main task is complete): apply one or more decision tree methods, compare their performance with that of the logistic regression solution, and conclude which of the two performs better on this task.
- Dimension Reduction project
- Cluster Analysis project
- Coding reflection paper. This is a non-compulsory task for extra points.
- Rmd Customization
- Web scraping
- Bayes reaction paper. This is a non-compulsory task for extra points.
- Final Exam
- Data Imputation. The binary logistic regression project is a separate project and is evaluated separately.
- 2021/2022 1st module
- 2021/2022 2nd module: 0.05 * Bayes reaction paper + 0.25 * Binary Outcome project + 0.2 * Cluster Analysis project + 0.05 * Coding reflection paper + 0.1 * Data Imputation + 0.1 * Dimension Reduction project + 0.1 * Final Exam + 0.05 * Rmd Customization + 0.1 * Web scraping
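As an illustration only, the weighted sum in the grading formula can be checked with a short script. This is a minimal sketch, not an official grade calculator: the weights are copied from the formula, while the function name `final_grade`, the example scores, the common scoring scale, and the choice to count a missing component as 0 are all assumptions made for the example.

```python
# Weights copied from the grading formula; they sum to 1.0.
WEIGHTS = {
    "Bayes reaction paper": 0.05,
    "Binary Outcome project": 0.25,
    "Cluster Analysis project": 0.20,
    "Coding reflection paper": 0.05,
    "Data Imputation": 0.10,
    "Dimension Reduction project": 0.10,
    "Final Exam": 0.10,
    "Rmd Customization": 0.05,
    "Web scraping": 0.10,
}


def final_grade(scores):
    """Return the weighted sum of component scores.

    Assumption: a component missing from `scores` contributes 0.
    """
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())


# Hypothetical student: the same score on every component except the exam.
example_scores = {name: 8.0 for name in WEIGHTS}
example_scores["Final Exam"] = 6.0

print(round(final_grade(example_scores), 2))
```

Because the Binary Outcome project and the Cluster Analysis project together carry 0.45 of the total weight, a change in either of them moves the result far more than a change in any of the 0.05 components.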
- Ledolter, J. (2013). Data Mining and Business Analytics with R. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=587979
- Munzert, S. (2014). Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining. Chichester, West Sussex, United Kingdom: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=878670
- Upton, G. J. G. (2016). Categorical Data Analysis by Example. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1402878
- Wickham, H., & Grolemund, G. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (First edition). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1440131
- Little, R. J. A., & Rubin, D. B. (2002). Statistical Analysis with Missing Data (Vol. Second edition). Hoboken: Wiley-Interscience. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=838162
- McElreath, R. (2016). Statistical Rethinking : A Bayesian Course with Examples in R and Stan. Boca Raton: Chapman and Hall/CRC. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=nlebk&AN=1338291
- Mood, C. (2010). Logistic Regression: Why We Cannot Do What We Think We Can Do, and What We Can Do About It. European Sociological Review, 26(1), 67–82. https://doi.org/10.1093/esr/jcp006
- vanden Broucke, S., & Baesens, B. (2018). Practical Web Scraping for Data Science: Best Practices and Examples with Python. Apress.