We are revisiting the Texas Public Use Inpatient Database for our final project. This time we are working
with a combined 2014 file created for this assignment. The combined file contains inpatient records for Houston
County for all four quarters of 2014. The SAS file is available in Canvas for download. I saved you the
trouble of importing each file and then merging them.
I suggest you save time by reusing SAS code from previous modules and editing it as necessary for this
project.
The final report for this assignment is a two-page executive summary, and I will review templates during
our last live session. The summary will have more narrative than previous reports since there is more
information to report.
We have been asked to show whether inpatients diagnosed with ischemic heart disease in Houston
County during 2014 were more likely to be discharged to another type of medical facility or to home care
than to have any other discharge status. We want to report descriptive statistics and odds ratios (ORs)
from multiple logistic regression for this association, along with any interesting covariates and potential
confounders. The next section details the components of the assignment.
1. Familiarize yourself with the data and its contents by running a PROC CONTENTS and reviewing
the data dictionaries in Canvas
2. Create variables for the main exposure (ischemic heart disease) and the main outcome (patient
discharge status)
a. Detail the ICD-9-CM codes used to define ischemic heart disease in an Appendix (this
can be a third page)
3. Identify potential confounders from the variables available and clean those variables as needed
4. Be mindful of missing data and treat it appropriately
5. Summarize all your study variables in a table that includes p-values (refer to previous modules
for suggested table formats)
a. You may use PROC TABULATE or Tableau to generate this table
b. The table needs a descriptive title and footnotes as needed
6. Create two graphs illustrating interesting differences in your descriptive statistics by quarter
a. For example, you might show the percent of ischemic heart disease by patient discharge
status for each quarter
b. Correctly label all parts of the graphs
7. Describe the data you are presenting and the methods of your analysis in narrative form
a. Background and Introduction
i. What is your purpose?
ii. Provide a brief background on ischemic heart disease and associated patient
discharge status from the literature
1. Cite two peer-reviewed sources and list them at the end of your report
under a “References” section (this can be a third page)
b. How did you define your variables?
c. What does your population look like?
d. How did you assess potential confounding and how did you determine which variables
to include in your logistic regression model?
e. What are your findings?
i. Present the OR and 95% CI between the main exposure and outcome in a way
that stands out to the reader and describe the meaning in the narrative
ii. Include any recommendations for future analyses
f. You must reference each table and graph in the narrative
8. Do not use raw variable names on any part of the project!
9. Overall visual appeal will be graded
a. Be mindful of layout and how text wraps around visuals
10. Deliverables for submission:
a. Two-page executive summary
b. Appendix and References
c. Two peer-reviewed articles
d. SAS Log, Program, and Output files, each saved using the lastname_type of SAS file naming convention
e. If applicable: Tableau workbook saved using the lastname_workbook naming convention
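
As a starting point, steps 1, 2, 4, 5, 6, and 7 above might be sketched in SAS roughly as follows. This is a minimal outline under stated assumptions, not a finished program. The libref, folder, and dataset name are hypothetical. The variable names (princ_diag_code, pat_status, sex_code, pat_age, discharge) are assumed and should be checked against the PROC CONTENTS output and the data dictionaries. The discharge status codes in the outcodes macro variable are placeholders only. The sketch also assumes diagnosis codes are stored as character values without a decimal point, and it checks only the principal diagnosis, which is a simplification. ICD-9-CM categories 410 through 414 are the standard range for ischemic heart disease.

```sas
/* Sketch only: libref, dataset name, variable names, and outcome codes are
   assumptions. Confirm each against PROC CONTENTS and the data dictionaries. */
libname proj "C:\final_project";        /* hypothetical folder holding the SAS file */

%let infile   = proj.pudf_2014;         /* hypothetical name of the combined file   */
%let outcodes = '03' '06';              /* placeholder status codes; choose your own */

/* Step 1: review variable names, types, and labels */
proc contents data=&infile varnum;
run;

/* Step 2: derive the main exposure and the main outcome */
data work.analysis;
    set &infile;

    /* Exposure: 1 if the principal diagnosis falls in ICD-9-CM 410-414 */
    if missing(princ_diag_code) then ihd = .;
    else ihd = (substr(princ_diag_code, 1, 3) in ('410' '411' '412' '413' '414'));

    /* Outcome: 1 if discharged to another facility or home care, 0 otherwise */
    if missing(pat_status) then dc_outcome = .;
    else dc_outcome = (pat_status in (&outcodes));

    label ihd        = "Ischemic heart disease (ICD-9-CM 410-414)"
          dc_outcome = "Discharged to another medical facility or home care";
run;

/* Step 4: count missing values in the study variables */
proc freq data=work.analysis;
    tables ihd dc_outcome sex_code pat_age / missing;
run;

/* Step 5: crosstabs with chi-square p-values to feed the summary table */
proc freq data=work.analysis;
    tables (ihd sex_code pat_age) * dc_outcome / chisq;
run;

/* Step 6: proportion with ischemic heart disease by quarter and outcome */
proc sgplot data=work.analysis;
    vbar discharge / response=ihd stat=mean
                     group=dc_outcome groupdisplay=cluster;
    xaxis label="Discharge quarter, 2014";
    yaxis label="Proportion with ischemic heart disease";
run;

/* Step 7: adjusted odds ratio for the exposure-outcome association */
proc logistic data=work.analysis;
    class sex_code pat_age / param=ref;
    model dc_outcome(event='1') = ihd sex_code pat_age;
run;
```

PROC LOGISTIC prints odds ratio estimates with 95% Wald confidence limits by default, so the row for ihd gives the adjusted OR requested in item 7e. The covariates shown are illustrative only. Which variables belong in the model depends on your own assessment of confounding.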