68 Matching Annotations
  1. Last 7 days
  2. Jun 2024
    1. rmsb: Bayesian Regression Modeling Strategies Package, Focusing on Semiparametric Univariate and Longitudinal Models

      Maybe subdivide 6.3 into 6.3.1 rms, 6.3.2 rmsb, and possibly 6.3.3 for the global options, etc.? Or include those in 6.3.1?

      Main point is that the TOC for section 6 should show a subsection for rmsb explicitly!

  3. May 2024
    1. Models are usually the best descriptive statistics

      Still working on this comment! I think I understand the intent, but there must be better choices for "models" and "descriptive statistics" ...

      To me, "descriptive statistics" means the output of the indispensable (IMHO) Hmisc::describe function, e.g. Section 8.2?? Or R Workflow Chp 9?

      But I think you are using it for (not recommended) exploratory graphs looking at associations between outcome and various covariates.

      Suggest something like this: (graphical) visualizations of a (well-fitted)(validated) model provide the (powerful) best (way)(tool) to (learn)(understand) the (underlying) data ....

    2. Steyerberg, 2009

      Here's a later edition: Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating (Statistics for Biology and Health), 2nd ed., 2019, by Ewout W. Steyerberg.

      link to Amazon: https://www.google.com/url?sa=t&source=web&rct=j&opi=89978449&url=https://www.amazon.com/Clinical-Prediction-Models-Development-Validation/dp/3030163989&ved=2ahUKEwjSxoHgq5OGAxW3IUQIHWFdBtAQFnoECBsQAQ&usg=AOvVaw1rQeWGmpD-0CWnqdfmplNj

    3. as will

      This is a long sentence. Suggest replacing "as will auxiliary" with "as well as",

      dropping the word "auxiliary" since most of the topics listed are too important to be called auxiliary.

    4. Study questions

      this link is broken as of 5-05-2024

    1. R-squared achieved in predicting each variable:

      maybe offer option to show this with vars ordered by the R^2?
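
      A minimal sketch of what such an ordering could look like. The variable names and R^2 values are made up for illustration, and `r2` is a hypothetical named vector standing in for whatever the fitted object actually reports:

      ```r
      # Hypothetical R^2 values, one per variable (made up for illustration).
      r2 <- c(age = 0.12, sbp = 0.45, chol = 0.30)

      # Order the variables from best-predicted to worst-predicted.
      r2_sorted <- sort(r2, decreasing = TRUE)
      print(r2_sorted)
      ```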

    2. 42

      why is the number 42 shown separately from the lowest interval and the second lowest?

      also, wondering how to see the color associated with each interval? True, I can hover over various colors and figure out that yellow represents the highest interval ...

    3. form

      typo: "form" should be "from" (or maybe "using")

    1. g <- function(fit)

      see note below where g is the Gini index

      maybe not use g here!

    2. g

      here, g is the Gini index, not to be confused with the label g that tells the code what measures to use in the bootstrap

    3. mice

      nice not mice!!

    4. wo design matrices

      X_1, X_2 not X2

    1. Use the entire sample in model development

      unless sample size is > 20,000

    2. includes

      p here is the number of parameters, not the probability in the above formulas

    3. Riley, Snell,

      The references show Riley, et al Part II twice. Need to add the reference for Part I here:

      Minimum sample size for developing a multivariable prediction model: Part I - Continuous outcomes. Richard D Riley, Kym I E Snell, Joie Ensor, Danielle L Burke, Frank E Harrell Jr, Karel G M Moons, Gary S Collins.

      PMID: 30347470 DOI: 10.1002/sim.7993
      
    4. Does not at all affect predictions on model construction sample

      reword ?? "does not at all affect"???

    5. She

      I cringe every time I see "she" ... it seems as if "she" is always doing something wrong! Suggest "they" (with or without changing investigator to investigators) ...

    6. widering

      widering? maybe "widening" or just "wider"? or even "increasing the width of"

    7. in

      "only" in rare ...

    8. Grambsch & O’Brien

      1991 The effects of transformations and preliminary tests for non-linearity in regression. Stat Med, 10, 697–709.

    1. one gives the best estimate of the probability of the outcome to the decision maker and let her incorporate her own unspoken utilities in making an optimum decision for her.

      suggest replacing "her" with "them"

    1. f, height, abdomen

      Add a plot title? E.g. "Predicted Fat Fraction based on Height and Abdominal Circumference given Age = 43"?

    1. Methods for checking fit:

      Confused ... you list 5 methods? Maybe number them 1 to 5 ...

      1. Fit simple

      2. Scatterplot of ...

      3. Stratify ...

      4. Separately for ...

      5. Fit flexible nonlinear ...

    2. Fit flexible nonlinea

      Ah now Method 5!

    3. Separately for levels

      Method 4

    4. Stratify the sample b

      Method 3 ?

    5. Scatterplot of Y vs.

      this is method 2?

    6. Stone and Koo Stone & Koo

      Just one reference? So maybe just

      "Stone and Koo (1985)"

      or

      "Stone & Koo (1985)"

    1. PART II

      You've referenced Part II twice. I suspect the first reference in this pair should be to Part I : https://doi.org/10.1002/sim.7993

  4. Apr 2024
  5. Jun 2023
  6. May 2023
    1. transformed

      discussion in class very helpful for use of cancore option in transcan

    2. truncate WBC to 40000

      Question about "truncate": how would that variable go in the formula? You said the prediction would be flat for wbc > 40000?
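
      A minimal sketch of one way truncation could be written, assuming base R's `pmin` and made-up WBC values; `y` and `wbc` are placeholder names, not taken from the notes. Capping the variable means every value above 40000 enters the model as 40000, which would explain why the prediction is flat beyond that point:

      ```r
      # Made-up white blood cell counts, for illustration only.
      wbc <- c(5000, 12000, 40000, 65000, 120000)

      # Cap at 40000: values above the cap are replaced by the cap itself.
      wbc_trunc <- pmin(wbc, 40000)
      print(wbc_trunc)

      # The cap can also be written inline in a model formula
      # (y is a placeholder outcome name).
      f <- y ~ pmin(wbc, 40000)
      print(all.vars(f))
      ```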

    3. Limiting Sample Size

      someone in class suggested defining effective sample size in the glossary!

    4. stepwise analysis

      and m is the sample size (number of observations)?

    1. Logistic regression

      Just to clarify, some ML 'packages' (e.g. AzureML, GoogleML) consider logistic regression to be one of the ML algorithms. FWIW, I've seen (high school) science fair projects use both of those and find that logistic is "the winner" ...

    2. over-simplified

      If the cutoff for dichotomizing one variable is unstable, we shouldn't be surprised that a tree built from cutoffs dichotomizing multiple variables is unstable?

    3. cutpoints Interestingly,

      insert period after cutpoints

    4. seems

      only appears

    5. effect present,

      effect is present

    6. Not forget

      or better: Do Not Forget ...

    1. if individual completed dataset analyses are not done (not recommended

      What is not recommended? Analyses on individual completed datasets?

    2. non-mono-ton-ical-ly

      nonmonotonically

    3. look look

      duplicated look

    1. wrong

      biased

    2. that the difference between the most different treatments is badly biased

      trying to understand this: maybe "the difference between the treatments with the greatest difference is badly biased"

    3. disease or no disease when severity of disease is on a continuum

      just for readability:

      "disease" or "no disease" when severity of disease is actually on a continuum

  7. Apr 2023
    1. Best way to make model fit data well is to discard much of the data

      Suggested edit to emphasize you don't really suggest this approach?!

      "Best way" to make model fit data well is to discard much of the data!!

  8. Mar 2023
  9. Aug 2022
    1. idea is assume

      the idea is to assume

    2. any needed SAP

      question: after the plan, or after the pivotal analysis, has been completed?

    3. Above the tendency

      "Avoid" the tendency (or "urge"!)