Bio Stats: Applied Regression
• Write your solutions under each of the questions.
• Some problems only apply to PHP 2511 students, so read the HW carefully.
• The assignment was written to mimic what you would have been asked to do in-person. It is intended to take <1 class period to complete.
• All interpretations must be in context to the original problem including units.
• All R output is in this exam.
PHP 1511/2511 HW 2
Due: May 15, 2022 by midnight
Name:
Total Score (1511): /29
Total Score (2511): /37
The first 3 questions refer to the following scenario.
A researcher is interested in how variables such as GRE (Graduate Record Exam) scores, GPA (grade point average), and prestige of the undergraduate institution (rank 1-4, with higher = less prestigious) affect admission into graduate school. The response variable, admit/don't admit, is a binary variable. The following logistic regression model was fit:
Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.98998    1.13995   -3.50  0.00047 ***
## gre          0.00226    0.00109    2.07  0.03847 *
## gpa          0.80404    0.33182    2.42  0.01539 *
## rank2       -0.67544    0.31649   -2.13  0.03283 *
## rank3       -1.34020    0.34531   -3.88  0.00010 ***
## rank4       -1.55146    0.41783   -3.71  0.00020 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
##     Null deviance: 499.98  on 399  degrees of freedom
## Residual deviance: 458.52  on 394  degrees of freedom
1. (6 pts) The table above shows the regression parameters from a logistic regression model. Note, they are not exponentiated. Interpret the significant effect in the context of the problem.
2. (1 pt.) Which of the following statements is false about the model results:
a. There is a significant effect of school rank on the odds of graduate school admission.
b. Rank is treated like a trend in the model.
c. The reference category is the highest rank schools.
d. GRE and GPA are independently associated with graduate school admission.
3. (6 pts, 2511 only) Identify the following parts of the GLM corresponding to the model above: systematic component, random component and link function.
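The coefficients in the output for questions 1-3 are on the log-odds scale. As a minimal sketch (in Python rather than R, and only as an illustration, not part of the original assignment), exponentiating each estimate converts it to an odds ratio. The dictionary name `log_odds` is a hypothetical label; the values are copied from the table above.

```python
import math

# Log-odds estimates copied from the R output above (not exponentiated).
# The variable names here are illustrative assumptions, not from the assignment.
log_odds = {
    "gre": 0.00226,
    "gpa": 0.80404,
    "rank2": -0.67544,
    "rank3": -1.34020,
    "rank4": -1.55146,
}

# Exponentiating a log-odds coefficient gives the corresponding odds ratio.
odds_ratios = {term: math.exp(beta) for term, beta in log_odds.items()}

for term, ratio in odds_ratios.items():
    print(f"{term}: odds ratio = {ratio:.3f}")
```

An odds ratio above 1 corresponds to higher odds of admission per one-unit increase (or relative to the reference category for the rank terms), and a ratio below 1 corresponds to lower odds.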
4. Researchers were interested in identifying risk factors associated with giving birth to a low-birth-weight baby (weighing less than 2500 grams). Data was collected on 189 women, 59 of whom had low-birth-weight babies and 130 of whom had normal-birth-weight babies. Two variables thought to be of importance were self-reported race (white, black, other) and the smoking habits of the mother (1=smoker, 0=non-smoker).
Note: smoke1=1 if smoker, 0 if non-smoker; raceblack=1 if self-reported race=black, 0 otherwise; raceother=1 if self-reported race=other, 0 otherwise.
Consider the following logistic regression model output of the exponentiated coefficients from the regression model:
##          term  estimate      p.value   conf.low  conf.high
## 1 (Intercept) 0.1000000 1.124386e-05 0.03001716  0.2479494
## 2      smoke1 5.7575752 3.429174e-03 1.93948909 21.3664428
## 3   raceblack 4.5454541 4.412016e-02 1.03972939 21.3006300
## 4   raceother 5.7142851 3.371066e-03 1.94185108 21.0892350
(2 pts) Interpret the effect of smoking on the outcome in the context of the problem.
5. (2 pts) Referring to question 4, what variables are significant risk factors of giving birth to a low birthweight baby?
6. (2 pts) An exercise physiologist wishes to explore the association between group cohesion and exercise performance and whether this differs by biological sex. The following model is fit:
How would you test whether Sex was a moderator of the association between Group Cohesion and Exercise Performance?
7. (2 pts) Referring to question 6: An exercise physiologist wishes to explore the association between group cohesion (X) and exercise performance (continuous variable Y) and whether this differs by biological sex.
Suppose a=13.25, b=2.14, c=-1.65, d=3.29. Sex=1 for women and 0 for men.
What is the mean difference in exercise performance for a one-unit difference in group cohesion for women?
8. A university conducts a study to investigate whether the relationship between student attendance at lectures (X) and their subsequent test scores (Y) is moderated by whether or not the lecturer has a teaching qualification (M).
a. (1 pt.) Which is the moderator variable?
b. (2 pts, 2511 only) Write out the model you would be fitting in order to test the moderation hypothesis. You can assume Y is a continuous variable.
9. (3 pts) In a multivariable linear regression model, list and briefly explain the possible methods that could be used to assess whether a predictor X has a significant effect on an outcome Y controlling for confounders?
10. (2 pts) An international company is worried that employees in a certain job at its headquarters in Country A are not being given raises at the same rate as employees in the same job at its headquarters in Country B. Using a random sample of employees from each country, a regression model is fit with: Y = employee salary; X1 = length of time the employee has worked for the company; X2 = 1 if the employee is in Country A, and 0 if the employee is in Country B.
What type of test would compare whether it is necessary to include X2 in a model predicting Y from X1?
11. Suppose you have the following scatterplot of Y vs X that looks like a negatively sloped line (with points clustered tightly around the line).
a. (1 pt.) Which choice is most likely to be the approximate value of R²?
a. −99.5%
b. 2.0%
c. 50.0%
d. 99.5%
b. (2 pts) What is the interpretation of R² from part a?
12. (1 pt.) Consider a GLM predicting the outcome Y (binary indicator of smoking cessation status) from X1=concerns about gaining weight, X2=gender and X3=confidence in quitting. What is the systematic component of the model?
a. Y
b. b1X1 +b2X2 + b3X3
c. logit(p/1-p)
13. (1 pt.) Which choice is not an appropriate description of ŷ in a regression equation?
a. Estimated response
b. Predicted response
c. Fitted Response
d. Observed response
14. (2 pts) In a linear regression model, a 95% confidence interval for β1 was given as: (-5.65, 2.61). What would a test for H0 (null hypothesis): β1 = 0 vs Ha (alternative hypothesis): β1 ≠ 0 conclude?
15. (1 pt.) The economic structure of Major League Baseball allows some teams to make substantially more money than others, which in turn allows some teams to spend much more on player salaries. These teams might therefore be expected to have better players and win more games on the field as a result. Suppose that after collecting data on team payroll (in millions of dollars) and season win total for 2021, we find a regression equation of
Wins = 71.87 + 0.101 × Payroll - 0.060 × League, where League is an indicator variable that equals 0 if the team plays in the National League or 1 if the team plays in the American League.
If Teams A and B both play in the same league, and Team A’s payroll is $1 million higher than Team B’s, then we would expect Team A to win, on average,
a. 0.101 games more than Team B.
b. 71.87 games more than Team B.
c. 0.060 games more than Team B.
d. 0.060 games fewer than Team B.