R Studio Project Data Analysis

Posted Under: R

This is a research project that requires data from online sources and uses RStudio. The online sources are provided in the PDF. The highlighted portion is my initial idea for this project, but I am willing to change it if it seems incorrect or unreasonable. I do not need help with the memo portion of the assignment and can do that on my own; I only need help with the script and with deciding what data to use. Thank you.
For your independent analysis project, you will answer a question using data and regression tools learned this quarter. You will produce a very brief, high-level policy memo detailing your findings, and provide between 2-5 additional items (at least one figure and one table + 3 optional items) to present your findings in more detail. In all cases, you must do cross-sectional analysis (i.e., you should not try to do time series or panel analysis, as we have not covered these techniques). Part of your job will be to figure out how to best address your topic using a cross-sectional framework. You are not required to do any outside research, but you may.

Management & Trade: World Bank Business Enabling Environment
https://www.worldbank.org/en/programs/business-enabling-environment
https://datatopics.worldbank.org/world-development-indicators/

Imagine that you are a public servant in a fictional country that is democratic and has an economy very similar to that of Mexico. In fact, you have just been appointed as the new Head of the National Institute for Entrepreneurship, which was modeled on INADEM - an institute created by the Mexican Government to promote entrepreneurship. Using data from the World Bank Business Data and World Development Indicators, what do you think would be the best strategy to foster small and medium enterprises in your country for entrepreneurship-driven economic growth? You can focus this analysis in any way you like within the bounds of the project guidelines and can make any necessary assumptions about your country (but be clear about them).

2 Guidelines

You are responsible for 2 deliverables (both of which will be submitted electronically via Canvas):

• MEMO: A 2000-word maximum policy memo presenting your research and conclusions. You will find it hard to summarize everything you have done in this amount of space. But you must concisely describe how you used data to answer the question at hand, your main findings, and the fundamental limitations to your analysis in clear language for a reader that may not understand regression analysis. (Any references you include will not count against your page total.)
– SUPPORTING INFORMATION: Appended to the end of your memo you will include 2 to 5 additional items (Figures or Tables) that help present your research. You must have at least one table that presents your main analysis results and one figure (your choice!); the rest are optional and for you to determine. These figures and tables should include captions that let them stand alone. These items should be clearly labeled and you should reference them from your memo.
• SCRIPT: A .R script that replicates all of the analysis for your memo and supporting information. Please comment your script so that we can easily navigate your code (e.g., # Generate Figure 1: Interaction Effects).

Your analysis should have the following general structure:

1. Motivation and Theoretical Underpinnings: You should begin by presenting the motivating question, or set of questions, and a clear explanation of the theory guiding your analysis. WHY are you doing what you are doing? Are there intellectual schools of thought that guide your intuition? What hypothesis (or hypotheses) are you testing and what do you expect to find?

To test whether small and medium enterprises will spur economic growth, I want to gather data on different business sectors to see which sector drives the most growth. My assumption is that in countries similar to Mexico, the most economic growth will come from international trade and agriculture-related sectors. The reasoning behind my intuition is that many countries want to import produce at a lower price than is available domestically. My hypothesis is that there will be a strong positive correlation between economic growth and agricultural sectors.

2. Data Selection: You must explain how you use the data to test your hypothesis and answer the question at hand. What are the data you are using and why can they help you answer the question of interest? What is (are) the dependent (outcome) variable(s)? What is (are) the independent variable(s) of interest? This should include a discussion of case selection in light of your theory: explain why you are using the subset of data you are using (both in terms of observations and variables). You should also include a concise description of any data manipulations/variables you have generated (how and why). Anyone who reads your paper and looks at your do file should be able to easily replicate your analysis.

3. Methodology / Explanation of Model(s): You should present your model(s) with clear justifications for your variable selection and the functional form of your variables, including any interaction terms. What are you controlling for, and why?

4. Regression Analysis and Results: Your main analysis should be a series of regressions testing the impact of your independent variable(s) of interest on your outcome variable. All models should be reported in a clearly labeled regression table in your supporting information. Explain the progression of your analysis clearly (e.g., adding other variables, testing interactions, etc.). Use graphics and simulations where appropriate. Discuss which model(s) have the strongest statistical and practical significance. Interpret the meaning of your coefficients in a useful manner and discuss the goodness of fit of your model.

5. Threats to Validity, Regression Diagnostics: Your analysis should include discussion of potential violations of the Gauss-Markov Assumptions. If you exclude variables because of high multicollinearity, please explain why, and present the appropriate diagnostics. You should discuss potential problems with the Zero Conditional Mean and Homoskedasticity assumptions. If such problems exist, discuss the implications for your analysis. Deal with these problems as you are able; if you are unable to address them sufficiently, discuss the impact on your ability to estimate regression parameters and conduct hypothesis testing.

6. Discussion and Conclusion: You should conclude with a thoughtful summary of your results, and a clear set of policy-relevant conclusions. You should also discuss the limits of your analysis, including problems with the data (e.g., selection bias and measurement error). How would you improve this research design? What would be the next steps in your research?

A final note: You will do much more analysis than you can present in the memo and graphics. A huge part of the work here will be in compressing what you have done into your findings. You will need to spend time on the writing and presentation, so make sure to leave yourself time to do that. We recommend spending a few days getting to know your data, reading, and planning your analysis. Then try to consolidate your analysis to a few days, and spend Week 10 writing, editing, and putting together your final deliverables.

3 Grading Rubric

The project is out of 60 points total:

25pts: Written Memo
• Have you clearly articulated the motivation for the analysis, your theory, and how you used the data to test your hypotheses? What outcomes are you examining, and what is/are the main independent variables of interest? Are there other observable implications of your theory/hypothesis? How did you examine those? (6pts)
• What are alternative explanations for what you find and how did you account for them? What kinds of controls did you use and why? How did you balance the various goals of model building (thoroughness v. simplicity, etc.)? Have you interpreted your models clearly and correctly? Does your analysis progression make sense? You should be telling a story here. (6pts)
• Have you addressed potential issues with data (outliers, measurement error) and violations of the Gauss-Markov assumptions, and dealt with them as you are able? (5pts)
• Have you summarized your findings with appropriate policy-relevant conclusions and discussed limitations to your analysis? (5pts)
• Presentation: Is your memo clearly written, easy to follow, and compelling? Have you eliminated spelling and grammatical errors? (3pts)

25pts: Supporting Information
• Motivating figure - you should have one figure that motivates your analysis clearly by providing a visual display of the question. Remember that the caption of your figure should explain to the reader what is meaningful. (5pts)
• It is said that a picture is worth a thousand words. Have you used additional graphics/figures to tell your story (motivation, illuminating results, diagnosing problems, etc.)? Remember, you can have up to 4 items including your regression table. (5pts)
• Have you included a clearly labeled regression table with all of your results? Have you summarized the table clearly with a caption a few sentences long? Remember that variable names are often meaningless to a reader, and you can often summarize grouping variables and controls for clarity. (10pts)

Remember that presentation matters for each of these items: Are your graphics nice to look at; is your table easy to read? Would we put it on the wall here at GPS?

10pts: .R script
• Does your script run without error? (5pts)
• Does your script run and replicate everything in your report? Have you commented it so we can navigate it easily? We will take a more in-depth look here. (5pts)
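
For anyone picking this up, a rough sketch of how the script for sections 4 and 5 of the assignment might be organized is given here. It is only an illustration under stated assumptions: the data are simulated rather than downloaded from the World Bank, and every variable name (gdp_growth, agri_share, trade_open, log_gdp_pc) and every coefficient is hypothetical. It uses base R only, so the VIF and the studentized (Koenker) Breusch-Pagan test are computed by hand instead of through an add-on package.

```r
# Sketch only: simulated cross-sectional data, one row per hypothetical country.
# All variable names and numbers below are made up for illustration and would
# be replaced by indicators chosen from the World Bank sources.
set.seed(42)
n <- 120
dat <- data.frame(
  agri_share = runif(n, 0, 40),    # hypothetical: agriculture share (%)
  trade_open = runif(n, 20, 150),  # hypothetical: trade openness (%)
  log_gdp_pc = rnorm(n, 9, 1)      # hypothetical: log GDP per capita
)
dat$gdp_growth <- 2 + 0.03 * dat$agri_share + 0.01 * dat$trade_open -
  0.2 * dat$log_gdp_pc + rnorm(n, sd = 1.5)

# Generate Table 1: progression of cross-sectional OLS models
m1 <- lm(gdp_growth ~ agri_share, data = dat)                            # bivariate
m2 <- lm(gdp_growth ~ agri_share + trade_open + log_gdp_pc, data = dat)  # add controls
m3 <- lm(gdp_growth ~ agri_share * trade_open + log_gdp_pc, data = dat)  # add interaction

# Goodness of fit: adjusted R-squared for each model
fit <- sapply(list(m1 = m1, m2 = m2, m3 = m3),
              function(m) summary(m)$adj.r.squared)

# Diagnostic 1: variance inflation factors, VIF_j = 1 / (1 - R_j^2),
# where R_j^2 comes from regressing regressor j on the other regressors
vif_manual <- function(model) {
  X <- model.matrix(model)[, -1, drop = FALSE]  # drop the intercept column
  sapply(colnames(X), function(v) {
    others <- X[, colnames(X) != v, drop = FALSE]
    1 / (1 - summary(lm(X[, v] ~ others))$r.squared)
  })
}

# Diagnostic 2: studentized Breusch-Pagan test for heteroskedasticity.
# Regress squared residuals on the regressors; n * R^2 is compared
# with a chi-squared distribution (df = number of regressors).
bp_manual <- function(model) {
  X <- model.matrix(model)[, -1, drop = FALSE]
  aux <- lm(residuals(model)^2 ~ X)
  stat <- nrow(X) * summary(aux)$r.squared
  c(statistic = stat, df = ncol(X),
    p.value = pchisq(stat, ncol(X), lower.tail = FALSE))
}

vif_m2 <- vif_manual(m2)
bp_m2  <- bp_manual(m2)

print(summary(m2)$coefficients)
print(fit)
print(vif_m2)
print(bp_m2)
```

With real data, the simulated data frame would be replaced by one row per country for a single year (to stay cross-sectional), and a motivating figure could be drawn from the bivariate model with plot() followed by abline(m1). A large VIF or a small Breusch-Pagan p-value would be the kind of evidence section 5 asks you to report and discuss.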
Explanations and Answers 0

No answers posted
