Final Project Assignment
Overview of the course
Instructor
Course Details
- Mon/Wed
- January 14th - April 29th
- 5:15-6:44 PM
- PCPE 202
- Slack
Course Description and Objectives
Global development increasingly relies on data and quantitative evidence to inform policies. The growth of new data sources and advances in computational tools have expanded what researchers can study and how well they can study it, creating the possibility of better-informed responses to long-standing challenges such as poverty, inequality, and governance.
However, more data alone does not guarantee better policy. This course examines how contemporary social science research uses data to make claims about development and governance, and how to evaluate those claims critically. Students will engage with recent research and gain hands-on experience analyzing real-world data using common research designs and computational tools, with an emphasis on transparency, inference, and reproducibility.
The course will be organized around several timely substantive topics in global development. As we explore these topics, students will gain a deeper understanding of the challenges that shape governance and development. At the same time, students will be introduced to data analysis methods, inferential techniques, and computational tools that are useful across a wide range of applications. Specifically, students will deepen their understanding of basic statistical methods common to the social sciences, learn how these methods can be used to make inferences about population characteristics and causal relationships, and prepare documents that contain reproducible data analysis workflows.
At the end of the course you should be able to:
- Evaluate the quality of evidence in the development field
- Think clearly about how data can be used to learn about development and governance challenges
- Use tools for data analysis such as R, RStudio, Quarto, and GitHub
- Produce professional-quality documents that summarize original research
This class is designed as a follow-up to PSCI 1102. Students who have not taken PSCI 1102 and PSCI 1800 (or an equivalent) should contact the instructor before enrolling.
Course Materials
This course relies primarily on free, open-source materials. However, the purchase of one textbook will be required. All other reading materials will be uploaded to this website or circulated on Slack at least one week before the meeting. Course site and instructional materials draw on content originally developed for PSCI 3200 by Jeremy Springman.
Books
Students are required to purchase a copy of Data Analysis for Social Science: A Friendly and Practical Introduction (DSS). We will use DSS as a jumping-off point for this course. While students are expected to already be familiar with many of the tools and concepts covered by this book, it will serve as a method to review core concepts and orient discussion about how to expand on these skills.
I am asking students to purchase a copy of DSS because it comes with access to additional, helpful resources. If you are unable to purchase this book, you must let me know. To save money, consider e-renting the textbook.
Computing
In this course, we will be using R for data cleaning, analysis, and visualization. R is a free, open-source statistical computing environment available on all major operating systems. We will also be using RStudio, a free, widely used graphical interface for R. For document preparation, we will be using Quarto, a free, open-source scientific and technical publishing system that is compatible with both R/RStudio and Python, as well as several other languages. Both Quarto and RStudio are supported by a company called Posit. For version control, we will be using GitHub.
Students are expected to already have R and RStudio installed on the personal computer that they will be using for class. We will cover the installation of all other required computing tools during the course.
Several course requirements involve writing and submitting R code. Your code must be appropriately commented and reproducible. To ensure your code meets the course standards, please follow this style guide from Hadley Wickham (Chief Scientist at Posit and developer of ggplot2 and tidyverse).
Course Structure
Substantively, the course is divided into several parts. Part 1 presents a general introduction to the course and a brief introduction to correlation, causality and statistical inference. The remaining parts are structured around key substantive topics in development. For each topic, we will begin with a brief overview of the literature. Then, we will focus on one particular research question and use it as a guide to learn how to implement some of the most common research designs in the social sciences. Throughout, you will gain hands-on experience with diverse kinds of data through in-class workshops and data assignments. Weekly reading assignments will be divided between contemporary research on substantive topics in global development and textbook chapters focused on social science research methods. You are expected to attend class and be prepared to engage in discussion about the assigned readings.
Grading
Performance in this class will be evaluated according to performance on the following course requirements:
| Requirement | Percent of Final Grade |
|---|---|
| Quizzes (4) | 10% |
| Workshops (4) | 15% |
| Data Assignments (3) | 35% |
| Final Project (1) | 40% |
Requirement Descriptions
There are four graded requirements for this course.
Quizzes
- You are expected to attend each course meeting. At four randomly selected meetings, there will be a brief quiz designed to test whether students did the readings. Students whose pre-approved absence falls on a quiz date will be required to take the quiz remotely within 24 hours of its administration. Students will be permitted one pre-approved absence for the semester.
- Grade: 10%
Workshops
- We will have four workshops throughout the semester. These will be interactive, hands-on workshops allowing students to gain familiarity with new statistical methods or computational tools. These workshops will cover important tools, such as Quarto and GitHub, and data analysis tasks, including cleaning, visualization, and modeling. After each workshop, you will be required to submit a product (ex. the link to a git repo, a Quarto doc, an R script, etc.) demonstrating completion of the workshop.
- Grade: 15%
Data Assignments
- There will be three data assignments throughout the semester. These assignments are designed to give you an opportunity to apply tools and methods discussed in readings, lectures, and workshops to data from the real world. These are individual assignments. While I encourage you to collaborate with your colleagues as you think through the tasks, you will be required to submit your own code and write-up.
- Grade: 35%
| Data Assignment | Due Date |
|---|---|
| Assignment 1 | Mar 4th 11:59pm ET |
| Assignment 2 | Mar 18th 11:59pm ET |
| Assignment 3 | Apr 15th 11:59pm ET |
Final Project
- The final project is a data analysis project that will use data of your choosing. The only stipulation is that this data must be relevant to one of the global development topics covered in this course. The assignment will require you to formulate a research question, find data that can help you answer that question, apply the tools and methods from this course to the data you have selected to answer your research question, and present those results for public consumption.
- The goal will be to produce a professional project that can showcase the skills that you have gained to potential employers. Your final submission will be a publicly available webpage that contains: (1) a brief introduction to your research question and data; (2) a discussion of your research design, its assumptions, and threats to inference; (3) a visualization that describes your data; (4) a presentation of the results from a regression model (as a table or graph) and discussion of its implications for your research question; and (5) a discussion of the implications of your findings for development policy or practice, including the limitations of your analysis and suggestions for future research. In addition to the public-facing webpage, you must share access to a GitHub repo that contains the code to reproduce the project output.
- Grade: 40%
- Due: May 10th 11:59pm ET
Course Policies
Please review these course policies carefully. Any questions or concerns about these policies should be raised during the first week of class.
Late Submissions and Regrading
Late submission of assignments will incur a penalty of 2 points for every day late, except in documented cases of serious illness or family emergency. If you feel there has been an error in the grading of one of your assignments, you may request in writing a regrade of the assignment. First, I will request a detailed write-up of your dispute. Second, I will regrade the entire assignment, not just the part you are disputing. Therefore, your regrade might increase or decrease the overall grade on the assignment.
Use of AI Tools
You are welcome to use generative AI tools, such as ChatGPT, to assist you with your work in this course. There is mounting evidence from rigorous research that these tools increase human productivity. I believe that their use will continue to proliferate, so it is important to gain experience integrating them into professional tasks. However, AI tools frequently make errors and ‘hallucinate’ information about things that do not exist (journal articles, R functions, etc.). It is your responsibility to verify the information provided by such tools. Most importantly, you are required to disclose your use of AI tools for assignments in the form of footnotes or citations. The use of AI tools will not be counted against you. On the contrary, I want to adopt and share your clever or innovative applications of AI tools.
Electronic Devices
Laptops will be required in class. All other electronic devices should be silenced and hidden. If there is an emergency situation and your phone must be visible, please inform me at the beginning of class.
Controversial Topics and Statements
This course may deal with subject matter that is difficult or controversial. It is crucial to approach these topics with sensitivity and openness. Students are required to treat one another with respect, even in cases of disagreement. At times, research findings may be in tension with your normative commitments. I urge you to engage earnestly and critically with any evidence that challenges your prior beliefs.
Academic Honesty
Students are expected to follow the University of Pennsylvania’s Code of Academic Integrity. Suspected violations will be referred to university administration for disciplinary action. If you have any doubts or questions about what constitutes academic misconduct, please do not hesitate to contact me.
Mental Health
Your mental health is important to me. Struggles with mental health, as well as serious mental illnesses, are common across students, faculty, and staff. Please feel free to reach out to me about issues you are having within or outside of this course. I emphatically encourage anyone who thinks they may benefit to utilize the university resources listed below:
Please note that this list is not comprehensive. If there are services that you think should be added to this list, please let me know.
Accessibility
Accessibility is a shared value and a shared responsibility at Penn. The Weingarten Center partners with other departments throughout campus to coordinate and improve the accessibility of buildings and grounds, transportation, communication, and digital infrastructure. Students that require academic accommodations should contact the Weingarten Center. Academic accommodations are determined on an individualized basis through an interactive process that involves student self-disclosure, documentation of disability, and an initial meeting with a Disability Specialist. Accommodations do not alter fundamental requirements of the course and are not retroactive. Students should request accommodations as early as possible, since they may take time to implement. Students can notify the Weingarten Center at any time during the semester if adjustments to their communicated accommodation plan are needed.
Agenda
- Introductions
- Course Description and Objectives
- Requirements
- Policies
- Schedule
- course website
- survey
Introductions
Carolina
Background:
- PhD from NYU, Postdoc here!
- Comparativist
- Political Methodologist
Interests:
- Violence and inequality in Latin America
- Epistemology of causal research
- Movies
You
Please tell us:
- Name, Year, Major
- One thing you’re interested in
- One thing you’d like to get from the course
Course Description and Objectives
Course Description
- Blending subject-matter, research methods, and computational tools
- Focusing on the type of work that goes on with and within development agencies
- Almost no math
Course Description
- Follow-up to PSCI 1102
- 1102 covered big academic debates that we won’t (ex. institutions vs geography)
- Focus on applied research with development agencies/industry
- What is the state of the art?
- “Big ideas” of political science as context
- MUST have a good understanding of the basics of research methods
Course Description
- Substantive focus areas
- Democracy and Autocracy
- Migration
- Gender
- Poverty and Inequality
- Crime and Conflict
- Foreign Aid
- Climate change and adaptation
Course Description
- Methods:
- Deepen understanding of ‘workhorse’ statistical methods and research designs
- How these methods can be used to make inferences about causal relationships
- Tools
- Introduce the computational tools that are needed to implement these methods
- Software necessary to prepare professional documents and reproducible data analysis workflows
Course Objectives
At the end of the course you should be able to:
- Have a good overview of the field and be capable of evaluating the quality of evidence
- Think clearly about how data can be used to learn about development and governance challenges
- Use tools for data analysis such as R, RStudio, Quarto, and GitHub
- Produce professional-quality documents that summarize original research
Requirements and Policies
Prerequisites
- Substance
- Big academic debates in development research (PSCI 1102)
- Methods and Tools (PSCI 1800)
- Basic familiarity with R and RStudio
- Basic knowledge of statistics/econometrics/data science
Textbook
- Data Analysis for Social Science: A Friendly and Practical Introduction (DSS)
- We will use DSS as a jumping-off point for this course
- Some of this material will be review
Grading
Performance in this class will be evaluated according to performance on the following course requirements:
| Requirement | Percent of Final Grade |
|---|---|
| Quizzes (4) | 10% |
| Workshops (4) | 15% |
| Data Assignments (3) | 35% |
| Final Project (1) | 40% |
Quizzes
- On 4 randomly selected meetings, there will be a brief quiz
- If you paid any attention or did readings, you should get full credit
- One pre-approved absence allowed
Workshops
- Four interactive, hands-on workshops working with diverse types of data, covering different statistical methods, or using new computational tools.
- Tools: R and RStudio, Quarto, GitHub
- Data cleaning and processing
- Methods: Randomized and quasi-experiments, text analysis.
- You will be required to submit a product demonstrating completion of the workshop
Data Assignments
- Three data assignments designed to make progress towards your final project
- You will be required to submit your own code and write-up
| Assignment | Due Date |
|---|---|
| Assignment 1 | Mar 4th |
| Assignment 2 | Mar 18th |
| Assignment 3 | Apr 15th |
Final Project
- Data analysis project with data of your choosing
- Formulate a research question
- Find data that can help you answer that question
- Apply the tools and methods from this course
- Write up your analysis
- Produce a webpage to present your results for public consumption
- Due: May 10th
Your final submission will be a publicly available webpage that contains: (1) a brief introduction to your research question and data; (2) a discussion of your research design, its assumptions, and threats to inference; (3) a visualization that describes your data; (4) a presentation of the results from a regression model (as a table or graph) and discussion of its implications for your research question; and (5) a discussion of the implications of your findings for development policy or practice, including the limitations of your analysis and suggestions for future research.
Policies
Late Submissions and Regrading
- Late submission of assignments
- penalty of 2 points for every day late
- except in documented cases of serious illness or family emergency
- Regrade request
- detailed write-up of your dispute
- Regrade of the entire assignment (might increase or decrease)
Use of AI Tools
- You are welcome to use generative AI tools (if you must) but beware!
- Do not let it come at the peril of your understanding of the material
- AI tools frequently make errors and ‘hallucinate’ (journal articles, R functions, etc.)
- It is your responsibility to verify the information provided
- You must disclose your use of AI tools for assignments in the form of footnotes or citations
Electronic Devices
- Laptops will be required in class
- All other electronic devices should be silenced and hidden
Controversial Topics and Statements
- Students are required to treat one another with respect
- Diverse perspectives, experiences, and backgrounds are essential for effective development research and practice
- Contact me directly if you feel we’re not achieving an inclusive environment
Academic Honesty
- Students are expected to follow the University of Pennsylvania’s Code of Academic Integrity
- Suspected violations will be referred to university administration for disciplinary action.
Schedule
Assignment
Your first assignment for the final project is sketching a research question that you’d like to investigate and identifying data that could be used to answer that question.
Send me a Quarto html file that:
- Briefly describes your idea for a research question
- This should be at least 3-4 sentences describing some relationship in the world that you want to investigate. This should involve at least two things in the world that can be measured with existing quantitative data.
- You are welcome to submit more than 1 idea.
- Proposes a dataset and measures that will help you answer it
- This should include a specific, existing dataset that you can access
- This should also include mention of the specific variables within that dataset that will be used to answer the research question
This assignment should be submitted via Slack by 11:59pm ET on March 4th.
After you submit this assignment, I will provide feedback on the viability of the questions, the suitability of the data, and the extent to which your general idea will meet my expectations for the final project.
Types of data
There are many types of data in the world. Below is a brief discussion of the most common sources of data in the social sciences.
- Election returns
- There are various compilations of election data from around the world, such as the Constituency-Level Elections Archive (CLEA)
- Returns for specific elections are often available from a country’s electoral commission website
- Replication data
- Any published research from the last 5-10 years should make the data and analysis files publicly available. You can almost always find where these replication materials are hosted on the article’s webpage at whichever journal published the article. Oftentimes, these data are hosted on Harvard’s Dataverse.
- Survey data
- Survey data is used extremely heavily in the social sciences. Most prominent in political science are the various ‘barometer’ surveys (Afrobarometer, Latinobarometer, etc.).
- Administrative data
- Data on government (or organization) programs or
- Expert-coded data
- Data where experts code the characteristics of countries or political entities (such as parties)
Popular Public Datasets
There are many publicly available datasets. This Dataset of datasets may be useful in identifying datasets that are relevant for your research question. Below, I list several of the most widely used, high-quality datasets used by development scholars:
- Varieties of Democracy
- World Bank Development/Governance Indicators
- Armed Conflict Location & Event Data Project (ACLED)
- AidData
- Demographic and Health Surveys
- International Organization for Migration
Overview
Your second assignment for the final project is specifying a research plan that will guide your investigation of the research question you have identified. This assignment is designed to get you thinking more clearly about a specific hypothesis related to your research question and how you can produce and present evidence for or against it.
Like a Pre-Analysis Plan, you will need to state testable hypotheses, describe your measurement of the variables necessary to test the hypotheses, and specify a statistical model to conduct the test. Unlike a Pre-Analysis Plan, this is just a starting point.
This project should be submitted via Slack by 11:59pm ET on March 18th. Your submission must include:
- An html file presenting your written design and figures. It is not necessary to show printed code.
- A `.qmd` file that you used to generate the html file
- All code should be thoroughly commented to explain the choices you are making and the techniques you are using.
Requirements
This assignment has four components. The components are presented below, along with their importance for the grading of the overall assignment.
- Describe your research question and provide some background on why you find it interesting or important. Be sure to incorporate the feedback provided on the first assignment. Include references to at least two pieces of existing research that illustrate what has already been discovered about the relationship you are investigating. This could be articles published in academic journals, policy reports and white papers, articles published by think tanks and other credible organizations (ex. Brookings, the United Nations High Commissioner for Refugees, etc.), or pieces of data journalism. A full-credit response will probably be around 150-200 words. (40%)
- State at least one testable hypothesis. Distill your research question into a statement about a specific relationship you expect to see in the world. In most cases, this hypothesis will describe a causal relationship between two variables (i.e. changes in x cause changes in y). Make an argument for why you expect to see this relationship. This should be based on related findings from existing research and/or your own theoretical/logical reasoning. A full-credit response will probably be around 150-200 words. (25%)
- Briefly discuss the specific variables you will use to test your hypothesis and the dataset they are drawn from. Be sure to mention the source of the data (the organization or individuals that produced it), the unit of analysis, and the sample. Create a `ggplot` to visualize the relationship between the variables being used to test your hypothesis. The best method of visualizing this relationship will depend on the scale of your variables. Good methods to consider are scatter plots (for continuous variables), grouped bar charts (for ordinal or binary variables), and line graphs (for continuous variables with time-series). (25%)
- Specify the main regression model you will use to test your hypothesis. This regression model should provide a preliminary test for or against the validity of your hypothesis. Use markdown to render the equation neatly to the html file. (10%)
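As an illustration of the last component, a bivariate regression with one outcome and one explanatory variable could be written as shown here. The symbols are placeholders rather than part of the assignment: $y_i$ is the outcome for unit $i$, $x_i$ is the explanatory variable, and $\varepsilon_i$ is the error term. In a Quarto document, wrapping the expression in double dollar signs renders it as a displayed equation.

```latex
$$
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i
$$
```

Under this notation, the hypothesis test focuses on $\beta_1$, the expected change in $y_i$ associated with a one-unit change in $x_i$.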
A hypothesis should look like a prediction. Often, a hypothesis states that you expect a decrease or increase in one variable to cause a decrease or increase in another variable. A hypothesis must also be falsifiable. In other words, it must be possible to prove that the hypothesis is false by showing that the relationship you are predicting to see in your data is not actually present.
In the next assignment, you will be asked to discuss the assumptions that your research design is making. For example, you will need to describe the assumptions that are necessary to interpret your regression model as evidence for your hypothesis. You will also be asked to propose additional tests that can help justify these assumptions, using variables that are available in your dataset.
Overview
Your third assignment for the final project is refining the research plan that will guide your investigation of your research question. This assignment is designed to get you thinking more clearly about the limitations of your research design and how you can strengthen your inferences.
This assignment will help you plan the remaining components of your final project analysis. This should look like an expanded, cleaned-up version of your previous assignment.
This project should be submitted via Slack by 11:59pm ET on April 15th. Your submission must include:
- An html file presenting your written design and figures. It is not necessary to show printed code.
- A `.qmd` file that you used to generate the html file
- All code should be thoroughly commented to explain the choices you are making and the techniques you are using.
Requirements
This assignment has five components. The components are presented below, along with their importance for the grading of the overall assignment.
- Introduction: Create an introduction section that discusses your research question. Incorporate previous feedback into your statement of the research question. Expand your discussion of previous work by other scholars and how your analysis will build on earlier contributions. The best introductions will concisely explain the state of scholarly knowledge on the topic and the gap in this knowledge that your analysis aims to address. Your introduction should probably be around 400-600 words. (10%)
- Hypothesis: Incorporate previous feedback into the statement of your hypothesis or hypotheses. Expand your discussion of the logic behind the relationship that your hypothesis expects. (10%)
- Data: Describe the dataset(s) you are using to conduct your analysis. Discuss the specific variables you will use. Create a table or figures that communicate the mean and the range of your key variables. Add an informative caption to the `ggplot` figure that you created in the previous assignment. For guidance, see the captions in the academic papers we have been reading this semester. Interpret the plot; what does it tell us about your hypothesis? (30%)
- Research Design:
  - Specify the main regression model you will use to test your hypothesis. Include any covariates that you will use to control for potential confounders, and justify your decision to include these covariates. If your hypothesis states a causal relationship, discuss the threats to interpreting the coefficient on your primary independent variable as an estimate of a causal effect on the outcome. Describe potential unobserved confounders. (10%)
  - Identify one empirical extension that you will conduct that will add credibility to your inference. Think back to the examples I provided regarding the effect of moving to a new city for college on student civic engagement. For example, students who already lived in Addis Ababa are likely to be different from students who moved to Addis in many ways that may affect their engagement. One extension I proposed was to restrict my analysis to only compare students who moved from an urban area with students already living in Addis. If I were still to see a negative relationship between moving and engagement among these students, this would allow me to rule out differences between students from urban vs rural homes as a confounder. Your empirical extension should allow you to rule out at least one potential confounder. (30%)
- Clean up the document. If you haven’t, incorporate a Bibliography using a `references.bib` file and Quarto citations. Remove the printed warnings and code from your document. Create clear sections for each stage of the analysis. (20%)
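The captioned figure described in the Data component might look something like the sketch below. It assumes the ggplot2 package is installed. The data frame `df` and its columns `aid_per_capita` and `growth` are simulated placeholders, not a real dataset, so the labels and caption text are illustrative only.

```r
library(ggplot2)

# Simulated placeholder data; replace with the variables from your own dataset
set.seed(123)
df <- data.frame(aid_per_capita = runif(100, min = 0, max = 200))
df$growth <- 1 + 0.01 * df$aid_per_capita + rnorm(100)

# Scatter plot with a fitted line, axis labels, and an informative caption
p <- ggplot(df, aes(x = aid_per_capita, y = growth)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    x = "Aid per capita (simulated)",
    y = "GDP growth, % (simulated)",
    caption = "Note: Each point is one simulated unit. The line is a linear fit with a 95% confidence band."
  )

p
```

A useful caption states the unit of observation, the data source, and what any fitted lines or bands represent, so that the figure can be read without the surrounding text.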
Overview
The final project is the culmination of this course. It should reflect the substantive knowledge and technical skills that you developed and the work from the three touch-point assignments that you submitted and received feedback on. The objective of this assignment is to give you experience generating a viable research question based on a topic that you are interested in, refining that broad research question into a testable hypothesis, identifying data that can be used to provide evidence for or against your hypothesis, and communicating the results of your analysis to others.
Understanding this process can not only help you answer research questions in the future, but can also help you assess the strength of evidence presented by others. As data becomes an increasingly important part of our lives, these skills will be useful in your career, especially if you conduct your own research or incorporate research by others into your decision-making.
This project should be submitted via Slack by 11:59pm ET on May 10th. The grade on your submission will constitute 40% of your final grade for this course. Your submission must include:
- A URL linking me to a clean, professional webpage on your personal website that presents your research to potential employers
- The `.qmd` file that you used to generate the html file
- All code should be thoroughly commented to explain the choices you are making and the techniques you are using.
Importantly, the grade for this assignment will not be affected by whether your analysis uncovers a statistically significant relationship. If you provide a clear, plausible argument for why you expect to find a relationship between your dependent and independent variables, a null finding provides interesting information about the world! All good (and honest) researchers find that their theoretical expectations were wrong just as often as they find that they were right.
Structure
Your final project submission should be organized into the following sections. This format roughly follows the organization of an academic research paper or a policy report.
- Introduction to the Research Question (~600 words)
In the introduction, you should introduce your research question to readers. Explain why you find it interesting or important. Describe existing research that is related to this topic and include references to relevant studies. This could be articles published in academic journals, policy reports and white papers, articles published by think tanks and other credible organizations (ex. Brookings, the United Nations High Commissioner for Refugees, etc.), or pieces of data journalism. For most topics, you should be able to identify and cite at least 3-4 relevant studies.
Briefly discuss what has already been discovered about the relationship you are investigating and how these findings shaped your research question. The best introductions will concisely explain the state of scholarly knowledge on the topic and the gap in this knowledge that your analysis aims to fill.
- Theory and Hypotheses (~600 words)
In the theory and hypothesis section, you should refine your broad research question into a specific, testable hypothesis. State at least one testable hypothesis. Distill your research question into a statement about a specific relationship you expect to see in the world. In most cases, this hypothesis will describe a causal relationship between two variables (i.e. changes in x cause changes in y). Make an argument for why you expect to see this relationship. This should be based on related findings from existing research (cited in the introduction) and your own theoretical/logical reasoning. Readers should have a clear understanding of why you expect to see this relationship in the world. They might not agree, but they should understand your reasoning.
A hypothesis should look like a prediction. Often, a hypothesis states that you expect a decrease or increase in one variable to cause a decrease or increase in another variable. A hypothesis must also be falsifiable. In other words, it must be possible to prove that the hypothesis is false by showing that the relationship you are predicting to see in your data is not actually present.
- Research Design (~600 words; 1 table)
Describe the dataset(s) you are using to conduct your analysis. Be sure to mention the source of the data (the organization or individuals that produced it), the unit of analysis, and the sample. Are there any limitations that readers should be aware of?
Discuss the specific variables you will use to test your hypothesis. Explain how these variables map onto the theoretical concepts that underpin your hypothesis. Create a table that communicates the mean, range, and standard deviation of these variables. You can use packages like gt or flextable.
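As a minimal sketch of such a table, the example below computes the mean, standard deviation, and range for a few variables and passes the result to gt. It uses R's built-in mtcars data and three arbitrarily chosen variables as stand-ins; these are assumptions for illustration, so substitute your own data frame and variable names.

```r
library(gt)

# Illustrative stand-ins: replace mtcars and these names with your own data
vars <- c("mpg", "wt", "hp")

# One row per variable: mean, standard deviation, and range (min, max)
summary_tbl <- data.frame(
  variable = vars,
  mean = sapply(vars, function(v) mean(mtcars[[v]], na.rm = TRUE)),
  sd   = sapply(vars, function(v) sd(mtcars[[v]], na.rm = TRUE)),
  min  = sapply(vars, function(v) min(mtcars[[v]], na.rm = TRUE)),
  max  = sapply(vars, function(v) max(mtcars[[v]], na.rm = TRUE)),
  row.names = NULL
)

# Format the data frame as a display table
summary_gt <- gt(summary_tbl) |>
  fmt_number(columns = c(mean, sd, min, max), decimals = 2) |>
  cols_label(
    variable = "Variable",
    mean = "Mean",
    sd = "Std. dev.",
    min = "Min.",
    max = "Max."
  ) |>
  tab_header(title = "Summary statistics")

summary_gt
```

In your own table, consider replacing the raw column names in the first column with readable variable labels.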
Specify the main regression model you will use to test your hypothesis. Include any covariates that you will use to control for potential confounders, and justify your decision to include these covariates. If your hypothesis states a causal relationship, discuss the threats to interpreting the coefficient on your primary independent variable as an estimate of a causal effect on the outcome. Describe potential unobserved confounders.
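For illustration, a main model with one primary independent variable and two covariates might be specified as follows. This is only a sketch: mtcars stands in for your dataset, and the choice of outcome (mpg), independent variable (wt), and covariates (hp, cyl) is an assumption made for the example rather than a recommended design.

```r
# Illustrative only: mtcars stands in for your dataset
# Outcome: mpg; primary independent variable: wt; covariates: hp and cyl
main_model <- lm(mpg ~ wt + hp + factor(cyl), data = mtcars)

# Full regression output
summary(main_model)

# Coefficient on the primary independent variable
coef(main_model)["wt"]
```

Each covariate in the formula should correspond to a confounder you have named and justified in the text.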
Every empirical test has shortcomings that prevent us from being fully confident in the interpretation. What are the limitations of your test? Identify one empirical extension that you will conduct that will add credibility to the inference you are trying to make with your hypothesis test. Think back to the examples I provided regarding the effect of moving to a new city for college on student civic engagement. For example, students who already lived in Addis Ababa are likely to be different from students who moved to Addis in many ways that may affect their engagement. One extension I proposed was to restrict my analysis to only compare students who moved from an urban area with students already living in Addis. If I were still to see a negative relationship between moving and engagement among these students, this would allow me to rule out differences between students from urban vs. rural homes as a confounder.
Your empirical extension should allow you to rule out at least one potential confounder. Make sure to clarify what purpose it will serve. Be specific about how it makes us more confident in the ability of your test to answer the research question.
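The sample-restriction extension from the Addis Ababa example could be sketched roughly as below. The data are simulated placeholders with no real effect built in, and the column names (moved, urban_origin, engagement) are hypothetical; the point is only to show the mechanics of re-estimating the same model on a restricted sample.

```r
# Simulated placeholder data; all column names are hypothetical
set.seed(1)
n <- 200
students <- data.frame(moved = rbinom(n, 1, 0.5))

# Students already living in Addis are urban by definition;
# movers come from a mix of urban and rural homes
students$urban_origin <- ifelse(students$moved == 1, rbinom(n, 1, 0.5), 1)

# Placeholder outcome, unrelated to moving by construction
students$engagement <- rnorm(n)

# Main test: all students
full_model <- lm(engagement ~ moved, data = students)

# Extension: keep only students from urban homes, so urban vs. rural
# origin cannot explain any difference between movers and non-movers
urban_only <- subset(students, urban_origin == 1)
extension_model <- lm(engagement ~ moved, data = urban_only)

coef(full_model)["moved"]
coef(extension_model)["moved"]
```

Comparing the two coefficients shows whether the relationship survives once the confounder is held fixed by design.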
- Findings (~300 words; 1 figure; at least 1 regression table)
Create at least one ggplot to visualize the relationship between the variables being used to test your hypothesis. The best method of visualizing this relationship will depend on the scale of your variables. Good methods to consider are scatter plots (for continuous variables), grouped bar charts (for ordinal or binary variables), and line graphs (for continuous variables measured over time). Your figure should have an informative caption that allows readers to understand what is being shown. For guidance, see the captions in the academic papers we have been reading this semester. Interpret the plot: does it tell us anything about the veracity of your hypothesis?
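As one possible starting point for two continuous variables, the sketch below draws a scatter plot with a linear fit and a caption. The mtcars data, the variables, and the wording of the labels are stand-ins chosen for illustration, not part of the assignment.

```r
library(ggplot2)

# Illustrative stand-ins: replace mtcars, wt, and mpg with your own data
scatter_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(alpha = 0.7) +
  # Linear fit with its default 95% confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    x = "Weight (1,000 lbs)",
    y = "Fuel efficiency (miles per gallon)",
    caption = paste(
      "Note: Each point is one car model in R's built-in mtcars data.",
      "The line shows a linear fit with a 95% confidence band."
    )
  ) +
  theme_minimal()

scatter_plot
```

In a Quarto document, the `fig-cap` chunk option is another way to attach a caption to the figure.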
Use the modelsummary package to present the results of the regression model testing your hypothesis. Be sure to interpret the magnitudes of the coefficients on the main variables in your regression model. Substantively, is this a big effect or a small effect? Once you have discussed the magnitude of the coefficient(s) on the key independent variable(s), tell us whether the results are statistically significant. If the coefficient on your independent variable is substantively large but not statistically significant, discuss why this might be the case. Do you think it is related to the statistical power of your hypothesis test?
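A regression table along these lines could be produced with a call like the one below. It is a sketch under assumptions: mtcars and the two model specifications are illustrative stand-ins, and the labels passed to `coef_rename` and `title` are placeholders.

```r
library(modelsummary)

# Illustrative stand-ins: replace with the models that test your hypothesis
models <- list(
  "Bivariate" = lm(mpg ~ wt, data = mtcars),
  "With covariates" = lm(mpg ~ wt + hp + factor(cyl), data = mtcars)
)

# Side-by-side regression table with significance stars
modelsummary(
  models,
  stars = TRUE,
  coef_rename = c("wt" = "Weight (1,000 lbs)", "hp" = "Horsepower"),
  gof_map = c("nobs", "r.squared"),
  title = "Regression of fuel efficiency on weight"
)
```

Presenting the bivariate and covariate-adjusted models side by side lets readers see how much the main coefficient changes once controls are added.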
- Empirical Extension (~300 words; at least 1 regression table)
Repeat the steps above for your empirical extension.
- Discussion and Policy Implications (~300 words)
Discuss what we learned from your final project. How conclusive are your results? Is this strong evidence for/against your hypothesis? If so, are there any implications for policy? If your analysis does not provide strong evidence for/against the hypothesis, what future research could provide stronger evidence?
Formatting Requirements
- Create clear sections for each stage of the analysis described above
- Remove the printed warnings and code from your document
- Incorporate a bibliography using a references.bib file and Quarto citations
- Post the resulting .html document to your webpage
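One way to keep code and warnings out of the rendered document is to set chunk defaults in a setup chunk near the top of the .qmd file. The sketch below assumes the document is rendered with the knitr engine; setting the equivalent `execute` options in the YAML header is an alternative.

```r
# Setup chunk: hide code, warnings, and messages in every later chunk
# (assumes the knitr engine is rendering the document)
knitr::opts_chunk$set(
  echo = FALSE,
  warning = FALSE,
  message = FALSE
)
```

For the bibliography, the YAML header can point to the file with `bibliography: references.bib`, and sources are then cited in the text as `[@citekey]`, where `citekey` is a placeholder for an entry key in that file.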