29 Matching Annotations
  1. Oct 2025
    1. # Low Food Security (reference + interaction)

      This is what I was looking for in the earlier primary models section regarding stratum-specific estimates. Why break it up this way?

      The approach you use here is called a linear combination of coefficients (or LINCOM) method. It's an appropriate method and properly implemented here.

      For future reference, though, you could simply re-fit your initial Cox models with the reference level of the modifier changed in order to obtain stratum-specific estimates, because the exposure's beta coefficient is the stratum-specific estimate for the given reference level of the modifier (see the sketch below).
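
      A minimal sketch of that releveling approach, assuming a survey design `des`, a 0/1 exposure `depression`, and a modifier `food_security` (all names hypothetical):

      ```r
      library(survey)

      # With "Full" as the modifier's reference level, the depression
      # coefficient is the log-HR within the food-secure stratum.
      des    <- update(des, food_security = relevel(factor(food_security), ref = "Full"))
      m_full <- svycoxph(Surv(stime, event) ~ depression * food_security + age + sex,
                         design = des)

      # Re-fit with "Low" as the reference to read off the low-food-security stratum.
      des   <- update(des, food_security = relevel(factor(food_security), ref = "Low"))
      m_low <- svycoxph(Surv(stime, event) ~ depression * food_security + age + sex,
                        design = des)
      exp(coef(m_low)["depression"])  # stratum-specific HR
      ```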

    2. plot

      I highly encourage you to explore ggplot() as your primary visualization tool, given its flexibility in generating high-quality graphics. The base plot() function here can work, but the graphics produced in your initial markdown HTML were difficult to parse visually; see the sketch below.
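
      For instance, the components of a svykm() fit can be pulled into a data frame and drawn with geom_step(); a rough sketch, assuming `skm` is a svykmlist (object and axis names hypothetical):

      ```r
      library(ggplot2)

      # skm: e.g. svykm(Surv(stime, event) ~ group, design = des)
      km_df <- do.call(rbind, lapply(names(skm), function(g) {
        data.frame(group = g, time = skm[[g]]$time, surv = skm[[g]]$surv)
      }))

      ggplot(km_df, aes(time, surv, colour = group)) +
        geom_step() +
        labs(x = "Follow-up (months)", y = "Survival probability") +
        theme_minimal()
      ```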

    3. # Per-level curves (robust approach): fit separately then overlay

      I don't fully understand the rationale for this approach and why you consider it "robust." This seems needlessly complicated.

      The svykm() function can accommodate a single categorical predictor like your depression and food security indicators.

      Indeed, if you wanted, you could generate a four-level categorical predictor from the interaction between depression and food insecurity, as sketched below.
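
      A sketch of that single-predictor route (variable names assumed):

      ```r
      library(survey)

      # Four-level factor crossing depression with food insecurity
      des <- update(des, depr_fs = interaction(depression, food_insecure, drop = TRUE))

      # One call then draws all four weighted KM curves together
      skm <- svykm(Surv(stime, event) ~ depr_fs, design = des)
      plot(skm, pars = list(lty = 1:4))
      ```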

    4. # Convenience: extract tidy HR table

      For models 3a and 3b I cannot identify how you generate stratum-specific estimates of your main effect as reported in Table 2 of your main submission.

      This makes me wonder whether you accurately represent the quantities reported in Table 2...

    5. designs for each model (complete cases on what they actually use)

      The potentially problematic result of this practice is that each of your models will use a different sample size...
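
      If a common analytic sample is intended, one fix is to restrict a single design to complete cases on the union of all model covariates; a sketch, with `vars_all` and the object names assumed:

      ```r
      # Every covariate used across the models (names hypothetical)
      vars_all <- c("depression", "food_security", "age", "sex", "pir", "smoking")

      keep   <- complete.cases(analytic_df[, vars_all])
      des_cc <- subset(full_design, keep)  # every model now shares the same N
      ```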

    6. Table 1A (UNWEIGHTED counts)

      This code section seems unnecessarily complicated given the tools available in the tableone, gtsummary, or course-specific svyTable1 packages, whose use would be less prone to user error; see the sketch below.
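
      For example, with tableone (variable names assumed):

      ```r
      library(tableone)

      # Unweighted counts/percentages for Table 1A
      tab1a <- CreateTableOne(vars   = c("age", "sex", "food_security"),
                              strata = "depression", data = analytic_df)
      print(tab1a, showAllLevels = TRUE)

      # Weighted analogue from the same variable list, if needed
      tab1w <- svyCreateTableOne(vars   = c("age", "sex", "food_security"),
                                 strata = "depression", data = des)
      ```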

    7. This Rmd assumes you already have a curated CSV.

      I was unable to re-run this file because of this assumption.

      In practice, your file size is not so large that re-running the import and data-preparation steps would be time-prohibitive. For the sake of file management, if you wish to create two separate files (one for data importing/cleaning and one for analysis), that is workable. But iteratively saving and then relying on intermediate data sets complicates reproducibility; see the sketch below.
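
      For the NHANES survey components, the import step is easily scripted, e.g. with the nhanesA package (the cycle and table names here are illustrative):

      ```r
      library(nhanesA)

      demo <- nhanes("DEMO_J")  # 2017-2018 demographics, pulled from the CDC site
      dpq  <- nhanes("DPQ_J")   # PHQ-9 depression screener for the same cycle
      ```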

    8. # Optional: adjust/add your local path here # "/absolute/path/to/NHANES_Mortality_Analytic_N3557.csv"

      Again, importing data directly from the CDC FTP will obviate the need to set relative/absolute directory pathways.

    9. mortality_file_name

      You are HIGHLY encouraged to import mortality data directly from the CDC FTP web portal to facilitate reproducibility.

      For example, if you were to submit this file as supplementary material for publication, your code would not be reproducible because mortality files are imported from a local directory.
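
      A sketch of that direct import, patterned on the NCHS-distributed R program for the public-use linked mortality files (verify the URL, file name, and fixed-width layout against the current NCHS documentation before use):

      ```r
      library(readr)

      url <- paste0("https://ftp.cdc.gov/pub/Health_Statistics/NCHS/",
                    "datalinkage/linked_mortality/",
                    "NHANES_2017_2018_MORT_2019_PUBLIC.dat")

      mort <- read_fwf(
        url,
        fwf_cols(publicid     = c(1, 14),   # SEQN, padded per the NCHS layout
                 eligstat     = c(15, 15),
                 mortstat     = c(16, 16),
                 ucod_leading = c(17, 19),
                 diabetes     = c(20, 20),
                 hyperten     = c(21, 21),
                 permth_int   = c(43, 45),
                 permth_exm   = c(46, 48)),
        na = c("", ".")
      )
      ```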

    1. filter(!(event == 1 & ucod_leading %in% c(4, 10))) %>%

      Problematic subsetting? The preferred approach would be to recategorize these deaths as censored and perform a cause-specific Cox analysis, as sketched below.
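
      A sketch of that recoding (variable names taken from your filter call; everything else assumed), keeping the rows but treating competing-cause deaths as censored:

      ```r
      library(dplyr)

      # Cause-specific event indicator: deaths from leading causes 4 and 10
      # are censored at their death time rather than dropped from the sample.
      analytic_df <- analytic_df %>%
        mutate(event_cs = if_else(event == 1 & ucod_leading %in% c(4, 10),
                                  0L, as.integer(event)))
      ```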

    2. Spline sensitivity (MI-pooled) + non-linearity test + collinearity check

      I have validated your code here using two different methods, so good job.

      However, your code is difficult to parse and unnecessarily complicated. Compared to your SAP, where you relied on the predict() function to generate estimates, here you employ a manual linear combination of coefficients (LINCOM) method. LINCOM has the advantage that you do not need to specify a reference grid to obtain final estimates, but you don't take full advantage of that feature.

      Included in my code review is a sample code file that takes your code and reduces it to the necessary components, which might prove easier to read; a sketch follows.
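
      A compact version of that LINCOM step (a sketch: `beta_pooled` and `vcov_pooled` stand in for the MI-pooled coefficient vector and covariance matrix, and the spline basis is assumed to come from splines::ns()):

      ```r
      library(splines)

      # Contrast on the spline scale: HR for exposure 75 vs 50 (values hypothetical).
      # No covariate reference grid is needed because there are no interaction terms.
      B <- ns(analytic_df$exposure, df = 3)        # basis used in the model
      L <- predict(B, 75) - predict(B, 50)         # 1-row contrast matrix

      idx    <- grep("^ns\\(", names(beta_pooled)) # spline coefficients
      log_hr <- drop(L %*% beta_pooled[idx])
      se     <- sqrt(drop(L %*% vcov_pooled[idx, idx] %*% t(L)))

      exp(log_hr + c(est = 0, lo = -1.96, hi = 1.96) * se)  # HR and 95% CI
      ```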

    3. imp, 1

      Given the descriptive nature of the KM plot, it would be preferable to implement the analysis on the un-imputed analytic data set.

      However, here it doesn't matter assuming the data were subset to complete case observations on the exposure and outcome already.

    1. Diagnostics

      Where does this come from? It was also not mentioned in the methods section or in the results/supplementary materials, so I assume you don't intend to do anything with it?

    2. Spline sensitivity (MI-pooled) + non-linearity test + collinearity check

      I have validated your code using two separate methods and arrived at the same results; good work.

      However, your method here is unnecessarily complex. Compared to your SAP code, this time you estimate the HR using a linear combination of coefficients (LINCOM) approach, whereas you used the predict() function last time.

      When employing this manual LINCOM approach, the reference grid of the adjustment covariates is irrelevant because you do not have any multiplicative interaction terms. In my code review I am including a simpler approach, which might be easier for external readers to parse.
