Data Analysis in Sociology
- The course covers the foundations and popular techniques of quantitative data analysis with the goal of training students to be informed producers and consumers of quantitative research.
- Two specific goals of this course are to systematize the principles of data analysis for all standard problems and to alleviate long-standing anxieties about any part of the data analysis cycle.
- Students can apply a theoretical framework to define hypotheses and explain the results of a study; they can apply appropriate visualization to communicate the results.
- Students can generalize and analyze the data they have, assess it critically, express their own opinions, and give their interpretation of the best possible decision.
- Students can set research goals, propose a research plan based on the results of previous research and social theory, carry out data analysis and report the results.
- Best Practices in Data Wrangling: Building data acumen: making meaningful, correct and useful judgments about data. Privacy and ethical concerns in data analysis and research. Data culture areas: data life-cycle, data curation, understanding causality, understanding conditional and joint probabilities, false negatives and false positives, critical assessment of popular practices and further use of R functionalities to make sense of the data. The data life-cycle: generation, collection, processing, management, analysis, visualization, interpretation, and delivery.
- Data Management: Getting and cleaning data in R. Importing data from various formats. Applying standard operations in an industrial setting. Good practices in recoding, rescaling, reordering, discretizing, and renaming. Packages for quick calculations and interactive visualizations. Data curation. Providing reproducible results with documented code and simulations. Delivering results in applications. Data simulation for hypothesis testing. Transforming data values to simplify the structure. Research questions and data types.
- Communicating Data Analysis Results: Learning from data. Effective communication to decision-makers. Common best practices in science communication. Understanding the message and audience. Choosing the best visualization. Visualizing complex data: distributions, change over time, correlations. Common pitfalls in data visualization. Customizing graphs and reports in R.
- Written Exam: The exam consists of four problems involving the methods covered in this course.
- Test 1: If a student has a valid reason to miss the test, they should inform the instructor before the test. Documents confirming the reason for the absence must be presented no later than two weeks after the test; otherwise, they will not be considered.
- Test 2: If a student has a valid reason to miss the test, they should inform the instructor before the test. Documents confirming the reason for the absence must be presented no later than two weeks after the test; otherwise, they will not be considered.
- Practice engagement
- Homework tasks: Individual practice files should be submitted as knitted R Markdown files (HTML) via the MS Teams Assignments section (don't forget to click the 'Turn In' button). If a student fails to knit their own script, the mark for the submission is halved.
- Online projects
- Interim assessment (module 3): 0.2 * Homework tasks + 0.1 * Online projects + 0.3 * Practice engagement + 0.1 * Test 1 + 0.1 * Test 2 + 0.2 * Written Exam
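
As an illustration of the interim assessment formula above, the weighted sum can be sketched in a few lines of Python. The weights are copied from the formula. The function name `interim_grade`, the dictionary keys, the example marks, and the 0 to 10 marking scale are assumptions made for this sketch only and are not part of the syllabus. The course itself works in R; Python is used here purely for illustration.

```python
# Weights copied from the interim assessment formula; they sum to 1.0.
WEIGHTS = {
    "homework_tasks": 0.2,
    "online_projects": 0.1,
    "practice_engagement": 0.3,
    "test_1": 0.1,
    "test_2": 0.1,
    "written_exam": 0.2,
}


def interim_grade(marks):
    """Return the weighted sum of the component marks.

    `marks` maps each component name in WEIGHTS to a numeric mark.
    (Hypothetical helper; the name and interface are illustrative.)
    """
    missing = set(WEIGHTS) - set(marks)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)


# Hypothetical marks on an assumed 0-10 scale.
example_marks = {
    "homework_tasks": 8,
    "online_projects": 7,
    "practice_engagement": 9,
    "test_1": 6,
    "test_2": 7,
    "written_exam": 8,
}

# 1.6 + 0.7 + 2.7 + 0.6 + 0.7 + 1.6 = 7.9
print(round(interim_grade(example_marks), 2))
```

Because practice engagement carries the largest weight (0.3), a one-point change in that component moves the interim grade three times as much as a one-point change in either test.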
- Inter-university Consortium for Political and Social Research. (2012). Guide to Social Science Data Preparation and Archiving: Best Practice Throughout the Data Life Cycle. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.AA22F59E
- Knaflic, C. N. (2015). Storytelling with Data: A Data Visualization Guide for Business Professionals. Hoboken, New Jersey: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1079665
- Wickham, H., & Grolemund, G. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data (1st ed.). Sebastopol, CA: O’Reilly Media. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1440131
- Yau, N. (2013). Data Points: Visualization That Means Something. New York: Wiley. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=566405