PART C
The dataset cvriskf.dta contains data extracted from a study of the extent to which known risk factors for cardiovascular disease in early adulthood are associated with measurements of body size or adiposity (“fatness”) taken in childhood. A cohort of children aged between 7 and 15 years was recruited across several Australian states, and a range of physical measurements was recorded (questionnaire responses and biomarkers from blood tests were also collected, but we will not consider these here). You are asked to consider the following outcome measure:
sbp_ad Systolic blood pressure at adult follow-up
If you perform some quick exploratory analysis you will see that the blood pressure measure is reasonably well behaved: please analyse this variable on its original scale.
You are asked to consider the following exposure measures, which are all of potential interest because they relate to body fat:
bmiz_ch Child’s body mass index (BMI), in “z-score” or “SD-score” units*
wc_ch Child’s waist circumference (cm)
log2tricep_ch Skinfold thickness of the triceps (mm) on a logarithmic (base 2) scale
*The details of this calculation should not concern you but briefly this is the BMI expressed in units of standard deviations relative to the mean value for the child’s age and sex. BMI is weight (kg) divided by the square of height (m), and is a widely used measure of body size, often assumed to represent “fatness” although it also reflects other aspects of body size as well.
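Because log2tricep_ch is recorded on a base 2 logarithmic scale, a difference of one unit in this variable corresponds to a doubling of the triceps skinfold thickness in mm. The short Python sketch below illustrates this; the thickness values (8 mm and 16 mm) and the helper name log2_units are made up for illustration and are not taken from cvriskf.dta.

```python
import math


def log2_units(thickness_mm: float) -> float:
    """Convert a skinfold thickness in mm to the base 2 log scale."""
    return math.log2(thickness_mm)


# Hypothetical thicknesses, chosen only to show the doubling property.
thin = log2_units(8.0)
thick = log2_units(16.0)

# One unit on the log2 scale is a two-fold ratio on the mm scale.
difference = thick - thin
fold_ratio = 2 ** difference

print(f"difference on log2 scale: {difference}")
print(f"ratio on mm scale: {fold_ratio}")
```

A regression coefficient for log2tricep_ch can therefore be read as the difference in mean adult systolic blood pressure associated with a doubling of childhood triceps skinfold thickness.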
The other variables (age at recruitment (age_ch), age at follow up (age_ad), sex (sex), a measure of socioeconomic advantage (ses_adv_ad), smoking category (smk_ad)) are all to be considered as potential covariates in multiple regression analysis, as described in the questions below.
[Note that the usual caveats apply: these data have been sampled and modified from an original study and no substantive conclusions should be drawn from these analyses.]
The overall aim of your analysis is to examine the evidence for association between the endpoint and the three exposure measures using regression methods, following the analysis outline below. The essential purpose is an “explanatory” or “epidemiological” one: the researchers were interested in finding evidence to support the existence of causal pathways between having excess weight in childhood and early signs of cardiovascular risk in adulthood.
Perform the following steps:
Question 1
(i) Use a multiple linear regression model to obtain estimates of association between the endpoint and all three exposure measures simultaneously. (For Question 1 ignore all of the other covariates.) Would you recommend omitting any of the exposure measures from a joint model for this outcome? Are there any issues with the regression assumptions and are there any collinearity problems with this model?
(ii) Provide an explicit “clinical” interpretation of the estimated regression coefficient (and related confidence interval) for the waist circumference effect.
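For part (ii), note that a linear regression coefficient for wc_ch is a difference in mean outcome per 1 cm of waist circumference, holding the other exposures in the model fixed. A clinical interpretation is often easier to read for a larger difference such as 10 cm, and both the estimate and its confidence limits scale by the same factor. The sketch below shows that rescaling in Python. The function name rescale_effect is hypothetical, the numbers are placeholders rather than estimates from cvriskf.dta, and the mmHg unit and 95% level are assumptions, since the brief does not state them.

```python
def rescale_effect(beta, ci_low, ci_high, units):
    """Scale a linear-regression coefficient and its CI to a `units`-sized difference."""
    est = beta * units
    a, b = ci_low * units, ci_high * units
    # A negative multiplier would swap the interval ends, so order them.
    return est, min(a, b), max(a, b)


# Placeholder values only, NOT results from cvriskf.dta.
beta, lo, hi = 0.5, 0.25, 0.75

est10, lo10, hi10 = rescale_effect(beta, lo, hi, units=10)
print(
    f"Per 10 cm larger childhood waist circumference: "
    f"{est10:.1f} mmHg higher mean adult SBP "
    f"(95% CI {lo10:.1f} to {hi10:.1f}), adjusted for the other exposures"
)
```

The same wording applies with the actual estimates substituted: state the size of the difference in waist circumference, the corresponding difference in mean systolic blood pressure, the interval, and what the estimate is adjusted for.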
Question 2
Your collaborator is concerned about potential confounding effects. Examine whether the results obtained in Question 1 are affected by adjustment for various covariates. Do this in three stages:
(i) Look at the effect of sex and age at recruitment (in childhood). Does adjustment for these variables affect the associations found in Question 1? If so, can you explain the major changes that you observe? (This should be in general statistical terms, without needing to be an expert in the subject matter.)
(ii) In addition, now also look at the effect of age at follow-up. When you put this in a model along with childhood age, what problem might you expect to arise? [Hint: consider the difference between these two ages]. Does this problem occur? What do you recommend?
(iii) Finally, examine the effect of adjusting for socio-economic status and smoking. Although these adjustments for measures taken at the adult follow-up can be done statistically, can you give a reason why it might not be sensible to use them? [Hint: can a characteristic measured in adulthood truly confound the effect of a childhood exposure?]
[N.B. For all of the analyses presented above, you may ignore the possible existence of interaction effects, to keep things a little simpler. In reality, some of these interactions may be of interest, e.g. sex with the fatness measures.]
(iv) Perform a quick validation of the model you decide to keep (present only 3 plots to keep your report short).
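The hint in part (ii) of Question 2 can be illustrated numerically. If every participant is followed up after roughly the same interval, age_ad is close to age_ch plus a constant, so the two ages are nearly collinear. The Python sketch below uses entirely made-up ages (childhood ages 7 to 15 crossed with follow-up gaps of 19.5, 20 and 20.5 years, an assumed interval that the brief does not give) and computes the variance inflation factor for the simple case where these two ages are the only predictors, where VIF = 1 / (1 - r^2). In a model with further covariates the VIF would instead use the R^2 from regressing each age on all other predictors.

```python
import math
from itertools import product


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up design, NOT the study data: each childhood age paired with
# each hypothetical follow-up gap (in years).
pairs = list(product(range(7, 16), [19.5, 20.0, 20.5]))
age_ch = [float(age) for age, _ in pairs]
age_ad = [age + gap for age, gap in pairs]

r = pearson_r(age_ch, age_ad)
vif = 1.0 / (1.0 - r ** 2)  # two-predictor VIF

print(f"correlation between ages: {r:.3f}")
print(f"variance inflation factor: {vif:.1f}")
```

With these illustrative values the correlation is close to 1 and the variance inflation factor is far above commonly quoted thresholds, which is the pattern to check for in the real data before deciding whether to keep both ages, or one age together with the follow-up interval.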
Question 3
Conclude with a general summary of your analysis. This should take the form of a single paragraph that summarises this analysis and attempts to interpret the associations with the exposure measures.