EBook Problems GLM Regress

From Socr

Revision as of 01:29, 8 January 2009 by JayZzz

EBook Problems Set - Regression

Problem 1

Use the information from the Heights of Fathers and Sons to write the linear model that best predicts the height of the son from the height of the father.

  • Choose one answer.
(a) Son's height = 35 + 0.5*Father's height
(b) Son's height = 1.00 + 1.00*Father's height
(c) The model cannot be determined without the actual data
(d) Son's height = 0.5 + 35*Father's height


Problem 2

A congressional report investigates the relationship between income of parents and educational attainment of their daughters. Data are from a sample of families with daughters age 18-24. Average parental income is $29,300, average educational attainment of the daughters is 13.1 years of schooling completed, and the correlation is 0.37.

The regression line for predicting daughter’s education from parental income is reported as: Predicted education = 0.000617*(income) + 8.1

Is the following statement true or false? "The above line is the regression line to predict education from income."

(a) True.
(b) False.
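
One check that may help with this kind of question: a least-squares regression line always passes through the point of averages. The sketch below evaluates the reported line at the average income and prints it next to the average education. The helper name `predicted_at_mean` is made up for illustration; the numbers are the ones quoted in the problem.

```python
def predicted_at_mean(slope, intercept, mean_x):
    """Value that the line intercept + slope * x gives at the average x."""
    return intercept + slope * mean_x

# Figures quoted in the problem statement.
mean_income = 29300
mean_education = 13.1
slope, intercept = 0.000617, 8.1

at_mean = predicted_at_mean(slope, intercept, mean_income)
print(f"reported line at average income: {at_mean:.2f} years")
print(f"average education:               {mean_education:.2f} years")
```

If the two printed values disagree by more than rounding error, the reported line cannot be the least-squares line for these data.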


Problem 3

In the early 1900s, when Francis Galton and Karl Pearson measured 1078 pairs of fathers and their grown-up sons, they calculated that the mean height for fathers was about 68 inches with a standard deviation of 3 inches. For their sons, the mean height was 69 inches with a standard deviation of 3 inches. (The actual numbers are slightly smaller, but we will work with these values to keep the calculations simple.) The correlation coefficient was 0.50. Use this information to calculate the slope of the linear model that predicts the height of the son from the height of the father.

  • Choose one answer.
(a) 0.50
(b) The slope cannot be determined without the actual data
(c) 35.00
(d) 3/3 = 1.00
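
When only summary statistics are available, the least-squares line can be written from them: slope = r * SD_y / SD_x, and intercept = mean_y - slope * mean_x. The sketch below applies those two standard formulas to the values in this problem. The function name `line_from_summary` is an illustrative assumption, not something defined by the problem set.

```python
def line_from_summary(mean_x, sd_x, mean_y, sd_y, r):
    """Least-squares slope and intercept from means, SDs and correlation."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# x = father's height, y = son's height (values from the problem).
slope, intercept = line_from_summary(mean_x=68, sd_x=3, mean_y=69, sd_y=3, r=0.50)
print(f"son's height = {intercept:.2f} + {slope:.2f} * father's height")
```

The same function also covers Problem 1, since the intercept follows from the slope and the two means.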


Problem 4

Suppose that wildlife researchers monitor the local alligator population by taking aerial photographs on a regular schedule. They determine that the best-fitting linear model to predict weight in pounds from the length of the gators in inches is:

Weight = -393 + 5.9*Length with r² = 0.836.

Which of the following statements is true?

  • Choose one answer.
(a) A gator that is about 10 inches above average in length is about 59 pounds above the average weight of these gators.
(b) The correlation between a gator's length and weight is 0.836.
(c) The correlation between a gator's length and weight cannot be determined without the actual data.
(d) The correlation between a gator's length and weight is about -0.914.
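
Two facts about simple linear regression are useful for weighing these options: the correlation has the same sign as the slope and a magnitude equal to the square root of r-squared, and a difference of d units in x changes the prediction by slope * d. The sketch below works both out for the fitted model; the variable names are illustrative.

```python
import math

slope = 5.9        # pounds per inch, from the fitted model
r_squared = 0.836

# r takes the sign of the slope; its size is the square root of r-squared.
r = math.copysign(math.sqrt(r_squared), slope)

# A gator 10 inches longer than another is predicted to weigh slope * 10 more.
extra_length = 10
extra_weight = slope * extra_length

print(f"r = {r:.3f}")
print(f"predicted extra weight for {extra_length} extra inches: {extra_weight:.1f} lb")
```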


Problem 5

Which of the following is NOT a property of the LSR Line?

  • Choose one answer.
(a) The sum of the distances between each point and the LSR Line is minimized.
(b) The point (average x value, average y value) lies on the LSR Line.
(c) The sum of squared residuals is minimized.
(d) The sum of the residuals = 0.
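
The properties that do hold for a least-squares line can be checked numerically. The sketch below fits a line to a small made-up data set (the five points are invented purely for illustration) and then checks that the residuals sum to zero, that the line passes through the point of averages, and that nudging the slope increases the sum of squared residuals.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sse(xs, ys, slope, intercept):
    """Sum of squared residuals for the line intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

sum_res = sum(y - (intercept + slope * x) for x, y in zip(xs, ys))
best = sse(xs, ys, slope, intercept)
nudged = sse(xs, ys, slope + 0.1, intercept)

print(f"sum of residuals: {sum_res:.10f}")
print(f"line at mean x:   {intercept + slope * mean_x:.4f} (mean y = {mean_y:.4f})")
print(f"SSE at fit: {best:.4f}, SSE with nudged slope: {nudged:.4f}")
```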


Problem 6

Suppose that the linear model that predicts fat content in grams from the protein content of selected items from the Burger Queen menu is: Fat = 6.83 + 0.97*Protein. We learn that there are actually 20 grams of fat in the Chucking burger, which has 20 grams of protein. Which of the following statements is true?

  • Choose one answer.
(a) The linear model underestimates the actual fat content and produces a residual of -6.23
(b) The linear model overestimates the fat content and produces a residual of -6.23
(c) The linear model underestimates the fat content and produces a residual of -6.23
(d) The linear model overestimates the fat content and produces a residual of 6.23
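
A residual is conventionally defined as observed value minus predicted value, so its sign shows whether the model over- or underestimates. The sketch below applies that definition to the burger in this problem; the function name `residual` is illustrative.

```python
def residual(observed, predicted):
    """Residual = observed - predicted (negative when the model predicts too high)."""
    return observed - predicted

protein = 20
observed_fat = 20
predicted_fat = 6.83 + 0.97 * protein

res = residual(observed_fat, predicted_fat)
print(f"predicted fat: {predicted_fat:.2f} g, residual: {res:.2f} g")
```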


Problem 7

Which statement describes the principle of "least squares" that we use in determining the best fit line?

  • Choose one answer.
(a) The best fit line minimizes the distances between the observed values and the predicted values.
(b) The best fit line minimizes the sum of the squared residuals.
(c) The best fit line minimizes the sum of the residuals.
(d) The best fit line minimizes the sum of the distances between the actual values and the predicted values.


Problem 8

A statistician wants to predict Z from Y. He finds that r-squared is 5%. Which one of the following conclusions is correct?

  • Choose one answer.
(a) The coefficient of correlation between Y and Z is 0.05
(b) Y explains 5% of the variance in Z
(c) Y is a good predictor of Z
(d) Z is a good predictor of Y


Problem 9

In a simple linear regression model, the variable that is being predicted is called which of the following?

  • Choose at least one answer.
(a) response variable
(b) X variable
(c) Y variable
(d) dependent variable
(e) independent variable


Problem 10

An ice cream truck owner collects data on the number of sales made each day and the average temperature that day. He computes a regression line for predicting the number of sales based on how far the daily temperature is from freezing (32 degrees Fahrenheit) and finds sales = 0.22 + 1.8*(degrees over 32 Fahrenheit). Identify the y-intercept.

  • Choose one answer.
(a) We can't tell from the information given
(b) 32
(c) 0.22
(d) 1.8


Problem 11

Find the regression equation for predicting final score from midterm score, based on the following information:

Average midterm score = 70, SD = 10; average final score = 55, SD = 20; r = 0.60

  • Choose one answer.
(a) Predicted final score = 1.2*(midterm score) - 29
(b) Predicted final score = 1.2*(midterm score) - 34
(c) Predicted final score = 0.3*(midterm score) - 34
(d) Predicted final score = 0.3*(midterm score) - 29


Problem 12

The scores of midterm and final exams for a random sample of Stats 10 students can be summarized as follows:

Mean of midterm score = 36.92, SD of midterm score = 37.79; mean of final score = 24.71, SD of final score = 25.21; r = 0.978

Predict the final score for a student who got a midterm score of 35.

  • Choose one answer.
(a) 23.44
(b) 0.62
(c) 25.21
(d) 35
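
A prediction from summary statistics can also be made in standard units: convert x to a z-score, multiply by r, and convert back to the units of y. The sketch below uses that form with the values from this problem; `predict_from_summary` is a hypothetical helper name. Rounding the slope or intercept before predicting can shift the last digit, so a hand calculation may differ slightly from the printed value.

```python
def predict_from_summary(x, mean_x, sd_x, mean_y, sd_y, r):
    """Regression prediction via standard units: z_y = r * z_x."""
    z_x = (x - mean_x) / sd_x
    return mean_y + r * z_x * sd_y

predicted_final = predict_from_summary(
    x=35, mean_x=36.92, sd_x=37.79, mean_y=24.71, sd_y=25.21, r=0.978
)
print(f"predicted final score: {predicted_final:.2f}")
```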


Problem 13

A popsicle truck owner collects data to predict the number of sales made each day (Y) from the average temperature of that day in degrees Fahrenheit (X). He finds that the regression line is "Predicted Y = 0.26 + 1.8 X". What does the 1.8 mean?

  • Choose at least one answer.
(a) The correlation between the temperature and the sale of the popsicles is 1.8.
(b) The intercept of the regression line would be 1.8.
(c) On average, the truck owner sells 1.8 more popsicles on days that are 1 degree warmer.
(d) There is a positive correlation between the rise in temperature and the sale of popsicles.


Problem 14

If the correlation between two data sets was found to be approximately zero, which of the conclusions can be made about the scatterplot?

  • Choose at least one answer.
(a) The scatterplot could be a horizontal line
(b) The slope of the regression line will be positive but the points will not be so close to the line
(c) The scatterplot could be non-linear
(d) The slope of the regression line will be negative
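
A correlation near zero only rules out a linear trend, not every kind of relationship. The sketch below computes the correlation coefficient for a small made-up data set in which y is exactly x squared (the points are invented for illustration), a perfectly non-linear pattern.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: y depends on x exactly, but not linearly.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = correlation(xs, ys)
print(f"r = {r:.3f}")
```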


Problem 15

In a study on the effect of nicotine ingestion by expectant mothers on birth weight, birth weight is which of the following in a simple linear regression model?

  • Choose at least one answer.
(a) dependent variable
(b) independent variable
(c) explanatory variable
(d) response variable


Problem 16

We find a regression equation to predict the health of the elderly from the number of hours they exercise per week. If the residual plot has the shape of the St. Louis Arch, would it be correct to go ahead and use the regression equation to make predictions about the health of the elderly?

  • Choose one answer.
(a) We would need more information
(b) Yes, because all that matters is the R-squared and if that is sort of high, we should use the fitted line
(c) Yes, because patterns in residuals mean that a linear model is appropriate for the data
(d) No, because this means that the relation between health and exercise of the elderly is non-linear, so a linear model should not be fitted
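
To see what an arch-shaped residual plot looks like numerically, the sketch below fits a straight line to made-up data that follow a curve (a linear trend minus a quadratic term, chosen only for illustration) and lists the residuals in order of x.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up curved data: a linear trend with a quadratic bend.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [2 * x - 0.5 * (x - 3) ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

for x, res in zip(xs, residuals):
    print(f"x = {x}: residual = {res:+.2f}")
```

The residuals are negative at both ends and positive in the middle, which is the systematic pattern a straight line leaves behind when the underlying relationship is curved.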


Problem 17

Which of the following is NOT a property of the Least Squares Regression Line?

  • Choose one answer.
(a) The sum of the distances between each point and the LSR Line is minimized.
(b) The sum of squared residuals is minimized
(c) The average x value and the average y value lie on the LSR Line
(d) The sum of the residuals = 0




