Miami University | College of Arts and Science | Department of Statistics | Spring 2021-22
Course Title: Statistics | Subject and Course Number: STA 261
Lab 3: Analyzing the Association between Variables
Due date: Thursday, 02/17/2022
Total: 50 points
Your Section:
Table number:
Names of students: Write down the names of the students at your table who contributed equally to this lab:
Learning Objectives: Lab 3 will
Help you to determine if there is an association between two categorical variables
Develop your skills to interpret the estimated regression parameters and the coefficient of determination
Help you to understand residuals in regression analysis
Prepare you for the test
Help you solve some of the homework problems
Assist you in exploring the resources available in the class (such as peers, instructor, graduate assistants, etc.)
Submission guidelines:
Make a copy under the File menu and share it with your group mates (give them editing access). To share, click the Share button in the top right, add their email addresses, then click Done
Please write all answers in another color (red is preferred)
Only one person per table needs to submit a pdf document of the lab (File → Download → PDF Document, then submit through Canvas)
Question 1: Vehicular Accidents The data were collected in 2019 by the National Highway Traffic Safety Administration. The Vehicular Accidents 2019 dataset can be found in our StatCrunch group STA 261 Spring 2022. This dataset compiles all of the accidents in 2019 and includes various variables that describe the details of each crash. We are interested in determining which variables, if any, have an association with the type of injuries sustained in the accident. (30 points)
Variables Used:
Max_Injury = A description of the maximum injury sustained in the crash
No injuries implies no persons were injured
Possible Injuries implies at least one person involved may have suffered an injury
Minor Injuries implies at least one person suffered minor injuries
Serious Injuries implies at least one person was severely injured in the crash
Fatalities implies at least one person died as a result of the crash
Alcohol_Involved = A Yes/No indicating if alcohol was involved in the crash.
AM_PM = indicates the time of the accident; AM or PM
Speeding_involved = A Yes/No indicating if a vehicle involved in the crash was speeding
Accident_Type = indicates if the accident was a single car or multi-car accident
Number_Injured = Number of persons involved in the crash that were injured in some way.
Part I: What is an observational unit within this dataset?
Answer:
Part II: Construct a contingency table that shows the conditional proportions of type of injuries, given whether alcohol was involved. Interpret.
(5 points total: 1 point for explanatory/response variable, 2 points for creating a table, 2 points for interpretation)
What is the explanatory variable? What is the dependent (or response) variable?
Answer:
Paste your contingency table here:
Interpretation:
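For reference, the idea behind the conditional proportions in Part II can be sketched in a few lines of Python. This is only an illustration, not the StatCrunch procedure used in the lab: the ten records below are made up and do not come from the Vehicular Accidents 2019 dataset, and the function name `conditional_proportions` is hypothetical. Each cell count is divided by the total for its row, so the proportions within each row add up to 1.

```python
from collections import Counter

# Made-up toy records (NOT from the Vehicular Accidents 2019 dataset).
# Each tuple is (Alcohol_Involved, Max_Injury) for one crash.
crashes = [
    ("Yes", "Fatalities"),
    ("Yes", "Minor Injuries"),
    ("Yes", "Fatalities"),
    ("Yes", "No injuries"),
    ("No", "No injuries"),
    ("No", "No injuries"),
    ("No", "Minor Injuries"),
    ("No", "Possible Injuries"),
    ("No", "No injuries"),
    ("No", "Serious Injuries"),
]


def conditional_proportions(pairs):
    """Return {condition level: {outcome level: proportion}}.

    Every cell count is divided by the total count of its condition
    level, so the proportions for each condition level sum to 1.
    """
    cell_counts = Counter(pairs)
    row_totals = Counter(condition for condition, _ in pairs)
    table = {}
    for (condition, outcome), count in cell_counts.items():
        table.setdefault(condition, {})[outcome] = count / row_totals[condition]
    return table


table = conditional_proportions(crashes)
for condition, row in table.items():
    # Print one row of the table per level of the conditioning variable.
    print(condition, {outcome: round(p, 2) for outcome, p in row.items()})
```

Comparing the rows of such a table (for example, the proportion of each injury type in the "Yes" row against the "No" row) is what the interpretation should focus on; if the rows look about the same, the data show little evidence of an association.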
Part III: Is there a relationship between the time of the accident (AM or PM) and the type of injuries in vehicle accidents? Make a comparative bar graph that compares the two distributions. Interpret.
(5 points total: 1 point for explanatory/response variable, 2 points for creating a plot, 2 points for interpretation)
What is the explanatory variable? What is the dependent (or response) variable?
Answer:
Paste your comparative bar graph here:
Interpretation:
Part IV: Is there a relationship between whether speeding was involved in the accident (Speeding_involved) and the type of injuries in vehicle accidents? Make a segmented bar graph that compares the two distributions. Interpret.
(5 points total: 1 point for explanatory/response variable, 2 points for creating a plot, 2 points for interpretation)
What is the explanatory variable? What is the dependent (or response) variable?
Answer:
Paste your segmented bar graph here:
Interpretation:
Part V: Is there a relationship between whether it was a single-car or multi-car accident (Accident_Type) and the type of injuries in vehicle accidents? Make a contingency table OR a comparative bar graph OR a segmented bar graph that compares the two distributions. Interpret.
(5 points total: 1 point for explanatory/response variable, 2 points for creating a table or plot, 2 points for interpretation)
What is the explanatory variable? What is the dependent (or response) variable?
Answer:
Paste your plot here:
Interpretation:
Part VI: Does the maximum speed of the accident differ when alcohol is involved? Create a comparative chart and describe the distributions. (Hint: don’t forget your SOCS!) (5 points)
Comparative Plots:
Interpretation:
Part VII: Does the maximum speed differ based on the time of the accident (AM vs. PM)? Create a comparative chart and describe the distributions. (Hint: don’t forget your SOCS!) (5 points)
Comparative Plots:
Interpretation:
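The numeric side of the SOCS descriptions in Parts VI and VII (center, spread, and outliers) can be sketched in Python as follows. This is only an illustration under stated assumptions: the values are made up and are not taken from the lab dataset, the helper name `summarize` is hypothetical, and outliers are flagged with the common 1.5 × IQR rule. Quartile conventions differ between software packages, so StatCrunch may report slightly different quartiles. Shape is not computed here; it is best judged from the plot itself.

```python
import statistics

# Made-up toy values (NOT from the lab dataset), split by a Yes/No variable.
groups = {
    "Yes": [35, 40, 45, 50, 55, 60, 120],
    "No": [25, 30, 35, 40, 45, 50, 55],
}


def summarize(values):
    """Return the center, spread, and 1.5*IQR outliers for one group."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"median": median, "q1": q1, "q3": q3, "iqr": iqr, "outliers": outliers}


summaries = {name: summarize(values) for name, values in groups.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Putting the two summaries side by side makes it easier to state which group has the higher center, which is more spread out, and whether either group has unusual values.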
Question 2: Fire Damage A fire insurance company is interested in investigating the effect of the distance between the burning house and the nearest fire station (in miles) on the amount of fire damage (in thousands of dollars) in major residential fires. A sample of 15 recent fires in a suburb is selected. The Fire dataset can be found in our StatCrunch group STA 261 Spring 2022. (20 points)
Source: McClave, J.