29 Matching Annotations
  1. Oct 2025
    1. # Low Food Security (reference + interaction)

      This is what I was looking for in the earlier primary models section regarding stratum-specific estimates. Why break it up this way?

      The approach you use here is called a linear combination of coefficients (or LINCOM) method. It's an appropriate method and properly implemented here.

      For future reference, though, you could simply re-fit your initial Cox models with the reference level of the modifier changed in order to obtain stratum-specific estimates, because the exposure's beta coefficient is the stratum-specific estimate for the given reference level of the modifier (see the sketch below).
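
      A minimal sketch of that releveling approach, assuming a survey design `des`, a 0/1 exposure `depression`, and a modifier `food_security` (all names hypothetical):

      ```r
      library(survey)

      # With "Full" as the modifier's reference level, the depression
      # coefficient is the log-HR within the food-secure stratum.
      des    <- update(des, food_security = relevel(factor(food_security), ref = "Full"))
      m_full <- svycoxph(Surv(stime, event) ~ depression * food_security + age + sex,
                         design = des)

      # Re-fit with "Low" as the reference to read off the low-food-security stratum.
      des   <- update(des, food_security = relevel(factor(food_security), ref = "Low"))
      m_low <- svycoxph(Surv(stime, event) ~ depression * food_security + age + sex,
                        design = des)
      exp(coef(m_low)["depression"])  # stratum-specific HR
      ```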

    2. plot

      I highly encourage you to explore ggplot() as your primary visualization tool, given its flexibility in generating high-quality graphics. The base plot() function here can work, but the graphics produced in your initial markdown HTML were difficult to parse visually; see the sketch below.
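
      For instance, the components of a svykm() fit can be pulled into a data frame and drawn with geom_step(); a rough sketch, assuming `skm` is a svykmlist (object and axis names hypothetical):

      ```r
      library(ggplot2)

      # skm: e.g. svykm(Surv(stime, event) ~ group, design = des)
      km_df <- do.call(rbind, lapply(names(skm), function(g) {
        data.frame(group = g, time = skm[[g]]$time, surv = skm[[g]]$surv)
      }))

      ggplot(km_df, aes(time, surv, colour = group)) +
        geom_step() +
        labs(x = "Follow-up (months)", y = "Survival probability") +
        theme_minimal()
      ```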

    3. # Per-level curves (robust approach): fit separately then overlay

      I don't fully understand the rationale for this approach and why you consider it "robust." This seems needlessly complicated.

      The svykm() function can accommodate a single categorical predictor like your depression and food security indicators.

      Indeed, if you wanted, you could generate a four-level categorical predictor from the interaction between depression and food insecurity, as sketched below.
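
      A sketch of that single-predictor route (variable names assumed):

      ```r
      library(survey)

      # Four-level factor crossing depression with food insecurity
      des <- update(des, depr_fs = interaction(depression, food_insecure, drop = TRUE))

      # One call then draws all four weighted KM curves together
      skm <- svykm(Surv(stime, event) ~ depr_fs, design = des)
      plot(skm, pars = list(lty = 1:4))
      ```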

    4. # Convenience: extract tidy HR table

      For models 3a and 3b I cannot identify how you generate stratum-specific estimates of your main effect as reported in Table 2 of your main submission.

      This makes me wonder whether you accurately represent the quantities reported in Table 2...

    5. designs for each model (complete cases on what they actually use)

      The potentially problematic result of this practice is that each of your models will use a different sample size...
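
      If a common analytic sample is intended, one fix is to restrict a single design to complete cases on the union of all model covariates; a sketch, with `vars_all` and the object names assumed:

      ```r
      # Every covariate used across the models (names hypothetical)
      vars_all <- c("depression", "food_security", "age", "sex", "pir", "smoking")

      keep   <- complete.cases(analytic_df[, vars_all])
      des_cc <- subset(full_design, keep)  # every model now shares the same N
      ```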

    6. Table 1A (UNWEIGHTED counts)

      This code section seems unnecessarily complicated given the tools available in the tableone, gtsummary, or course-specific svyTable1 packages, whose use would be less prone to user error; see the sketch below.
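
      For example, with tableone (variable names assumed):

      ```r
      library(tableone)

      # Unweighted counts/percentages for Table 1A
      tab1a <- CreateTableOne(vars   = c("age", "sex", "food_security"),
                              strata = "depression", data = analytic_df)
      print(tab1a, showAllLevels = TRUE)

      # Weighted analogue from the same variable list, if needed
      tab1w <- svyCreateTableOne(vars   = c("age", "sex", "food_security"),
                                 strata = "depression", data = des)
      ```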

    7. This Rmd assumes you already have a curated CSV.

      I was unable to re-run this file because of this assumption.

      In practice, your file size is not so large that re-running the import and data-preparation steps would be time-prohibitive. For the sake of file management, if you wish to create two separate files (one for data importing/cleaning and one for analysis), that is workable. But iteratively saving and then relying on intermediate data sets complicates reproducibility; see the sketch below.
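
      For the NHANES survey components, the import step is easily scripted, e.g. with the nhanesA package (the cycle and table names here are illustrative):

      ```r
      library(nhanesA)

      demo <- nhanes("DEMO_J")  # 2017-2018 demographics, pulled from the CDC site
      dpq  <- nhanes("DPQ_J")   # PHQ-9 depression screener for the same cycle
      ```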

    8. # Optional: adjust/add your local path here # "/absolute/path/to/NHANES_Mortality_Analytic_N3557.csv"

      Again, importing data directly from the CDC FTP will obviate the need to set relative/absolute directory pathways.

    9. mortality_file_name

      You are HIGHLY encouraged to import mortality data directly from the CDC FTP web portal to facilitate reproducibility.

      For example, if you were to submit this file as supplementary material for publication, your code would not be reproducible because mortality files are imported from a local directory.
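
      A sketch of that direct import, patterned on the NCHS-distributed R program for the public-use linked mortality files (verify the URL, file name, and fixed-width layout against the current NCHS documentation before use):

      ```r
      library(readr)

      url <- paste0("https://ftp.cdc.gov/pub/Health_Statistics/NCHS/",
                    "datalinkage/linked_mortality/",
                    "NHANES_2017_2018_MORT_2019_PUBLIC.dat")

      mort <- read_fwf(
        url,
        fwf_cols(publicid     = c(1, 14),   # SEQN, padded per the NCHS layout
                 eligstat     = c(15, 15),
                 mortstat     = c(16, 16),
                 ucod_leading = c(17, 19),
                 diabetes     = c(20, 20),
                 hyperten     = c(21, 21),
                 permth_int   = c(43, 45),
                 permth_exm   = c(46, 48)),
        na = c("", ".")
      )
      ```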

    1. filter(!(event == 1 & ucod_leading %in% c(4, 10))) %>%

      Problematic subsetting? The preferred approach would be to recategorize these deaths as censored and perform a cause-specific Cox analysis, as sketched below.
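
      A sketch of that recoding (variable names taken from your filter call; everything else assumed), keeping the rows but treating competing-cause deaths as censored:

      ```r
      library(dplyr)

      # Cause-specific event indicator: deaths from leading causes 4 and 10
      # are censored at their death time rather than dropped from the sample.
      analytic_df <- analytic_df %>%
        mutate(event_cs = if_else(event == 1 & ucod_leading %in% c(4, 10),
                                  0L, as.integer(event)))
      ```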

    2. Spline sensitivity (MI-pooled) + non-linearity test + collinearity check

      I have validated your code here using two different methods, so good job.

      However, your code is difficult to parse and unnecessarily complicated. Compared to your SAP, where you relied on the predict() function to generate estimates, here you employ a manual linear combination of coefficients (LINCOM) method. LINCOM has the advantage that you do not need to specify a reference grid to obtain final estimates, but you don't take full advantage of that feature.

      Included in my code review is a sample code file that takes your code and reduces it to the necessary components, which might prove easier to read; a sketch follows.
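
      A compact version of that LINCOM step (a sketch: `beta_pooled` and `vcov_pooled` stand in for the MI-pooled coefficient vector and covariance matrix, and the spline basis is assumed to come from splines::ns()):

      ```r
      library(splines)

      # Contrast on the spline scale: HR for exposure 75 vs 50 (values hypothetical).
      # No covariate reference grid is needed because there are no interaction terms.
      B <- ns(analytic_df$exposure, df = 3)        # basis used in the model
      L <- predict(B, 75) - predict(B, 50)         # 1-row contrast matrix

      idx    <- grep("^ns\\(", names(beta_pooled)) # spline coefficients
      log_hr <- drop(L %*% beta_pooled[idx])
      se     <- sqrt(drop(L %*% vcov_pooled[idx, idx] %*% t(L)))

      exp(log_hr + c(est = 0, lo = -1.96, hi = 1.96) * se)  # HR and 95% CI
      ```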

    3. imp, 1

      Given the descriptive nature of the KM plot, it would be preferable to implement the analysis on the un-imputed analytic data set.

      However, here it doesn't matter assuming the data were subset to complete case observations on the exposure and outcome already.

    1. Diagnostics

      Where does this come from? It was also not mentioned in the methods section or in the results/supplementary materials, so I assume you don't intend to do anything with it?

    2. Spline sensitivity (MI-pooled) + non-linearity test + collinearity check

      I have validated your code using two separate methods and arrived at the same results; good work.

      However, your method here is unnecessarily complex. Compared to your SAP code, this time you estimate the HR using a linear combination of coefficients (LINCOM) approach, whereas you used the predict() function last time.

      When employing this manual LINCOM approach, the reference grid of the adjustment covariates is irrelevant because you do not have any multiplicative interaction terms. In my code review I am including a simpler approach, which might be easier for external readers to parse.
