696 Matching Annotations
  1. Last 7 days
  2. Jan 2023

      test comment

    1. some must use IDEA protocol, but most can use a single round of elicitation. What they all have in common is that they mathematically aggregate judgments about the probability of some event or subjective degrees of belief, into a single value.

      Will this work for continuous outcomes like the Unjournal is currently asking for?

    1. Interestingly, this also means that the prior for σ is now dependent on the prior for the slope, because

      come back to this; we might be able to put this explicitly into the model

    2. This means that the estimate for sigma is the square root of 1 minus the variance of the slope estimate (0.75²). I

      Could/should we make this explicitly part of the model, i.e., constrain this?
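
      A sketch of where this relationship comes from, assuming both height and weight are standardized (mean 0, variance 1) and the residual is independent of the predictor; $\beta$ and $\sigma$ here denote the slope and the residual SD:

      $$
      \begin{aligned}
      y &= \beta x + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2) \\
      \operatorname{Var}(y) &= \beta^2 \operatorname{Var}(x) + \sigma^2 \\
      1 &= \beta^2 + \sigma^2 \\
      \sigma &= \sqrt{1 - \beta^2}
      \end{aligned}
      $$

      With a slope of 0.75 this gives $\sigma = \sqrt{1 - 0.5625} \approx 0.66$, so under these assumptions $\sigma$ is not a free parameter once $\beta$ is fixed.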

    3. prior(normal(0, 0.5), class = "b", lb = -1, ub = 1)

      seems, with brms, you can set lb and ub on classes but not on individual parameters

    4. add_predicted_draws(model_height_weight) %>%

      here we draw 'predicted entries'

    5. add_epred_draws(model_height_weight) %>%

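      draws of the expected value of the outcome (the regression line, so only the uncertainty in intercept and slope), without the residual noise that add_predicted_draws includes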

  3. Dec 2022
    1. CEARCH discovering a Cause X every three years and significantly increasing support for it.

      This seems like 'assuming the result' ... why every 3 years?

    1. sample_prior = TRUE,

      Does the 'prior predictive simulation' stuff here too

    2. The output shows us that we need to set two priors, one for the Intercept and one for sigma. brms also already determined a default prior for each, but we’ll ignore that for now.

      It's not clear to me what get_prior is doing here, or what its logic is. It would seem to be using the data to suggest priors, which McElreath seems to be against (but the 'empirical bayes' people seem to like)

      Of course, it does at least remind you what objects you need to set priors over

    3. The prior for the slope is a lot easier now. We can simply specify a normal distribution with a mean of 0 and a standard deviation equal to the size of the effect we deem likely, together with a lower bound of 0 and upper bound of 1.

      Update: I was wrong on the below, the SD is not 1 here, because it's the SD for the residual term in the linear model, not the SD for the raw outcome variable.

      Previous comment:...

      I’m ‘worried’ that if you give it data you know has sigma=1, but you allow it to choose any combination of beta and sigma, you may be getting it to give a weird posterior for both parameters, in a way you know can’t make sense, in order to find the most likely parameters for the weird geocentric model you imposed.

      on the other hand I would have thought that it would tend to converge to a sigma=1 anyways as the most likely, as that is ‘allowed’ by your model

      my take is that the cauchy prior you impose in that part is heliocentric; well, let me expand on this. I think you know that the true std deviation of the ‘standardized heights from this population’ is 1. What you don’t know is whether it is indeed normal (i.e., whether family = gaussian is right here). Thus it might be finding ‘a sigma far from 1 is likely’ under this model, because that makes your ‘skewed’ or ‘fat tailed’ data seem more likely under the normal prior. A better approach might be to allow a different distribution with some sort of ‘skew’ parameter, but imposing that the sd must be 1.

    4. Apparently our prior was still very uninformed because the posterior shows we can be a confident in a much narrower range of slopes!

      so here the priors mattered!

    5. I increased the adapt_delta, as suggested in the documentation, from .8 to .9.

      what does this mean?

    6. he Rhat values did not show this was problematic but

      what are rHat values and where do we see them?

    7. egression model into a simple correlation analysis. That way we can specify a prior on what we think the correlation should be

      to me, in this case, with physically interpretable data, it sounds more difficult to consider correlations. The 'small medium large' thing is from psychometrics, I believe

    1. Please note, not all rigor criteria are appropriate for all manuscripts.

      SciScore seems to have failed to be meaningful here

    2. ScreenIT Sep 27, 2021 SciScore for 10.1101/2021.09.22.461342: (What is this?)Please note, not all rigor criteria are appropriate for all manuscripts.

      Can we use any tools like this? E.g., Statcheck.io (for APA/Psych papers)

      somewhat important

    3. Is the study design appropriate and are the methods used valid? Yes

      as noted before, this yes/no tickboxing is generally not optimal for our case. These things are on a spectrum.

    4. Some details of the methods are lacking. For example, the MUpro provides two methods, it is necessary to specify which method was used in the analysis. The confidence score of each prediction should also be provided. Besides, some results from I-Mutant and MUpro were conflicting, the authors may want to discuss the discrepancy.

      again, the markdown numbering is failing here

    5. Discussion, revision and decision Discussion and Revision Author response We would like to thank the reviewers for their valuable comments. Below we provide pointwise response and the changes made in the revised manuscript. To Dr. Jyotsnamayee Sabat

      Nice, but

      1. I'd like to be able to see this full screen
      2. A heading/table of contents would be very helpful here

      fairly important

    6. PeerRef Dec 15, 2021 Discussion, revision and decision

      I would hope we could replace 'decision' with 'ratings and predictions' or something ... and make those ratings prominent


    7. Author response

      The 'order by recency' is good but sometimes limiting. I think readers would probably prefer to see the 'major comments and discussion' first, before the specific detailed small comments and clarification questions.


    8. Nov 26, 2021 Peer review report Reviewer: Hurng-Yi Wang Institution: Institute of Ecology and Evolutionary Biology, National Taiwan University email: hurngyi@gmail.com Section 1 – Serious concerns Do you have any serious concerns about the manuscript such as fraud, plagiarism, unethical or unsafe practices? No Have authors’ provided the necessary ethics approval (from authors’ institution or an ethics committee)? not applicable Section 2 – Language quality How would you rate the English language quality? Medium quality Section 3 – validity and reproducibility Does the work cite relevant and sufficient literature? No Is the study design appropriate and are the methods used valid? No Are the methods documented and analysis provided so that the study can be replicated? Yes Is the source data that underlies the result available so that the study can be …

      Nice. Is there a way we could put this at the top, or make a quick link to it?

      Ideally, this would have the ratings/rankings/predictions show up first on the page, as some sort of table (and also metadata, if we dare to dream).


    9. Read the original source

      This is a bit misleading here. The 'original source' is basically the same stream of text

    10. I agree to change to Verified manuscript.

      what does this mean?

    11. and are shown below.

      these are not shown below. Are graphics possible here? Obviously a direct hyperlink to the revised section of the paper would be convenient here

    12. We would like to thank the reviewers for their valuable comments. Below we provide pointwise response and the changes made in the revised manuscript.

      @gavin @annabell -- this might read better if each comment quickly linked to the section of the hosted paper and/or the comments were inserted in that part of the hosted paper with hypothes.is

    13. Pt-12:

      what do the prefixes like PT-12 mean here? I guess it's the reviewer number?

    14. The “Analysis of the Mutational Profile of Indian Isolates” should be moved to Materials and Methods.

      The markdown numbering failed here!

    15. Read the full article

      I clicked this link, and it is not coming up, or it's very slow

    16. Article activity feed Version 2 published on bioRxiv

      having trouble interpreting this. The linked version was published on bioRxiv after the PeerRef? So which version was evaluated?

      OK, I guess the post-PeerRef version is published above ... so this is going from 'newest to oldest'. Maybe there's a way to make that clearer to someone visiting the page for the first time

    17. AgarwalNita Parekh

      why a 'full stop' (period) here after authors' names?

    18. Abstract

      abstract of which version?

    19. In this study we carried out the early distribution of clades and subclades state-wise based on shared mutations in Indian SARS-CoV-2 isolates collected (27 th Jan – 27 th May 2020). Phylogenetic analysis of these isolates indicates multiple independent sources of introduction of the virus in the country, while principal component analysis revealed some state-specific clusters. It is observed that clade 20A defining mutations C241T (ORF1ab: 5’ UTR), C3037T (ORF1ab: F924F), C14408T (ORF1ab: P4715L), and A23403G (S: D614G) are predominant in Indian isolates during this period. Higher number of coronavirus cases were observed in certain states, viz ., Delhi, Tamil Nadu, and Telangana. Genetic analysis of isolates from these states revealed a cluster with shared mutations, C6312A (ORF1ab: T2016K), C13730T (ORF1ab: A4489V), C23929T, and C28311T (N: P13L). Analysis of region-specific shared mutations carried out to understand the large number of deaths in Gujarat and Maharashtra identified shared mutations defining subclade, I/GJ-20A (C18877T, C22444T, G25563T (ORF3a: H57Q), C26735T, C28854T (N: S194L), C2836T) in Gujarat and two sets of co-occurring mutations C313T, C5700A (ORF1ab: A1812D) and A29827T, G29830T in Maharashtra. From the genetic analysis of mutation spectra of Indian isolates, the insights gained in its transmission, geographic distribution, containment, and impact are discussed.

      I really don't like this font; I find it very hard to read, but that's probably a taste thing. Still, I'd like it if we could use a font that 'looks more like a journal'.

    20. Pt-13: I want to know how the representative sequences were selected for different states. Is it based on no. of sequences submitted or positivity rate of a particular region? All the Indian isolates available in GISAID for the period 27th Jan – 27th May 2020 were download and considered for analysis. NO state-wise selection was done.

      these authors seem to have used quotation the opposite way from how I would have. I would have done

      > reviewer's comment here

      My response here (unquoted)

    21. Demographic Analysis of Mutations in Indian SARS-CoV-2 Isolates

      would be nice to have keywords up top

    22. Demographic Analysis of Mutations in Indian SARS-CoV-2 Isolates

      Commenting on the format here

    1. What cause area(s) is/are you interested in working in if there was a role or project that was a good fit? (select all that apply)

      In the view I'm seeing here, the list is very vertically long. Maybe a way to have fewer spaces or 2 columns for less scrolling?

      If you are trying in general to learn from this rather than about specific people, you might have the survey tool randomise the list order

    2. What type of role(s) would you be interested in working in? (select all that apply)

      where does this list come from? 'Research' is rather vague

    3. What obstacles are holding you back from changing roles or co-founding a new project? (select all that apply)

      What is the purpose of this question? It seems like you are suggesting things they might not have thought of here.

    1. Add the SurveyMonkey account’s OAuth token to your .Rprofile file. To open and edit that file, run usethis::edit_r_profile(),

      For me this opened up some other profile. Maybe because I'm working in RStudio with Quarto?

      When I just opened the .Rprofile file listed at the root of my repo and where the .Rproj is stored, it worked

    1. desire to “pay it forward” for other donors by supporting the matching fund after receiving matching funds. This possibility may be explored in future research. About a third of donors were willing to support the matching fund with some or all of their donation. This provided enough matching funds to cover the matching funds received by donors, making the micro-matching system self-sustaining. Despite a long history of altruism, including centuries of organized philanthropy, humans have only recently attempted to systematically measure the cost-effectiveness of altruistic endeavors with the goal of doing as much good as possible [10,11]. The effective altruism movement is growing and has been notably successful in securing large commitments from relatively few people [32,33]. Effective altruism’s potential for more widespread adoption is unknown. The seven studies and proof-of-concept demonstration presented here are cause for optimism, grounded in a more detailed understanding of altruistic motivation. Today, relatively few donors prioritize effectiveness. But our results suggest that effective giving can be a satisfying complement to giving based on personal feelings, adding a “competence glow” [27] to the proverbial “warm glow” of giving. Some donors are willing to incentivize bundle donations in others, promoting a chain of giving that is both personally meaningful and effective. The stakes are high, as ordinary people have the power to do enormous good. The limited proof-of-concept demonstration reported on here raised funds sufficient to provide 100,700 deworming treatments and 17,500 malaria nets, among other benefits. (See Supplementary Materials.) A better understanding of moral motivation, and how to channel it, could dramatically increase the impact of human altruism. Methods: All reported studies, including the final proof of concept, were pre-registered, except for Study 7 (which was a pre-test for the proof of concept). For more detailed descriptions of the methods and results, please refer to our Supplementary Materials available at https://osf.io/zu6j8/?view_only=28050663bd6b4b5cae0a73ad8068bc34. Across Studies 1-7w

      I see the code and data here, but I can't find the study materials

  4. Nov 2022
    1. add_predicted_draws

      not sure I get the syntax here. Why is this called add_predicted_draws?

    2. howed us the posterior distributions of the two parameters

      I think you plotted the 'marginal posteriors' for each (for each case, averaging over the posterior for the other). Technically, there is a joint posterior distribution, which you could plot as in those heatmap plots in Kurz.

    3. Apparently our posterior estimate for the Intercept is 154.63

      They call it an 'estimate' in the code but that seems like terminology McElreath would disagree with. That's a point summary of the posterior (by default in brms the posterior mean, I believe, rather than the maximum a-posteriori value) ... but the estimate is the distribution (actually, the joint distribution of the parameters)

    4. Here we see that the posterior distribution

      would be interesting to plot this for 2 different posteriors

    5. Notice that we sample from the prior so we can not only visualize our posterior later, but also the priors we have just defined.

      not sure what this means. Also, what does 'run the model' mean? Calculate a posterior? With which approach?

    6. whether the chains look good.

      what 'chains'? And what does 'look good' mean?

    7. So, our priors result in a normal distribution of heights

      how do you see it's normal?

    8. model_height_prior <- brm(

      repeated code after normalization. Maybe save as 2 separate versions to compare?

    9. - brms’ default and my own

      these both seem to allow negative values for sigma. These don't seem right -- aren't you supposed to do something that implies a strictly positive distribution, like letting the log of sigma be normally distributed?

      (I think they are only positive in the plot because you cut off the x axis)

      Maybe the brms procedure below fixes this in a mechanical way because it sees 'class = "sigma"' ... but I'm not sure how
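
      A base-R sketch of what a lower bound of zero would do to a normal prior over sigma. That brms applies such a bound to class "sigma" is an assumption here (the `lb` column of `get_prior()` or the generated Stan code would confirm it), and the prior scale `s = 10` is hypothetical:

      ```r
      # Assumption: the sampler only ever sees the part of normal(0, s) above zero.
      set.seed(1)
      s <- 10                                # hypothetical prior scale for sigma
      raw <- rnorm(1e5, mean = 0, sd = s)    # unbounded normal draws, about half negative
      truncated <- raw[raw > 0]              # keep only draws that respect a bound at 0

      # A normal(0, s) truncated at zero is a half-normal with mean s * sqrt(2 / pi).
      c(simulated = mean(truncated), theoretical = s * sqrt(2 / pi))
      ```

      If the bound is applied, the negative half of the plotted prior never enters the model, so a normal prior over sigma would act as a half-normal rather than allowing negative standard deviations.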

    10. sample_prior = "only",

      what is this doing?

    11. file = "models/model_height_prior.rds"

      what is this saving?

    12. family = gaussian,

      what does family = gaussian do here, over and above the specified priors?

    13. we can simulate what we believe the data to be

      I wouldn't say 'believe the data to be'. We have the data (at least the sample). We are simulating what we believe the population this is drawn from looks like

    14. The sigma prior

      'prior over sigma' -- 'sigma prior' makes it sound like it's a sigma distribution (if that's a thing) rather than a distribution over sigma

    15. We’re not super uncertain about people’s heights, though, so let’s use a normal distribution.

      uncertainty could be expressed in terms of a greater sigma (std deviation) also. So this isn't exactly uncertainty, but something about the shape of the distribution, the amount of tail-uncertainty for a given level of uncertainty closer to the mean

    16. But this is the default prior. brms determines this automatic prior by peeking at the data, which is not what we want to do. Instead, we should create our own.

      but what is its justification for doing so? The people who designed it must have had something in mind.

    17. parameter refers to, well, sigma

      the standard deviation of heights around the mean

    18. we should start with defining our beliefs, rather than immediately jumping into running an analysis.

      Slight quibble: It doesn't have to be 'our own beliefs' but just 'what you think a reasonable starting belief would be', or 'what you think others would accept'.

      This relates to the discussion of 'epistemological assumption', I believe.

      It could also be a belief (distribution) based on prior data and evidence

    19. a null effect is

      I would say 'a small or zero effect'

    1. he time we can expect to wait between events is a decaying exponential.

      $$P(T > t) = e^{-(\text{events}/\text{time})\,t}$$

    2. This is what “5 expected events” means! The most likely number of meteors is 5, the rate parameter of the distribution.

      I think that's the mode -- does it coincide with the mean here?
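
      A short derivation for this question, writing $\lambda$ for the rate (events per unit time, 5 in the quoted example) and $X$ for the number of events in one unit of time:

      $$
      P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad \frac{P(X = k)}{P(X = k-1)} = \frac{\lambda}{k}, \qquad \mathbb{E}[X] = \lambda
      $$

      The ratio exceeds 1 while $k < \lambda$ and equals 1 at $k = \lambda$, so for an integer rate such as $\lambda = 5$ the counts 4 and 5 are equally likely and jointly the mode, while the mean is exactly 5. For non-integer $\lambda$ the mode is $\lfloor \lambda \rfloor$, which is below the mean.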

    1. Include a field to add data from the “How did you hear about EA UNI NAME” question

      what does this mean?

    2. Attrition rate = % of fellows who do not complete the fellowship, assuming that completion of a fellowship is defined as attending at least 6 of 8 meetings

      I know they track this for the virtual fellowships

    3. The form will also include some formal definition for each category. What should this be?

      maybe consult EA survey on this?

    4. Completed Fellowship, highly engaged

      may be challenging for them to classify this?

    5. This form will consist of a list of all Fellows who filled out the “How you heard about us” question. Each organizer will be prompted to label each fellow as one of the following:

      super important!

    6. Post-Fellowship Engagement Form:

      make the pre/post distinction clearer in the intro

    7. using email automation

      are we automating emails to faculty? That seems possibly problematic

    8. It may not be worth tracking at all.

      Why not?

    9. maybe) Other variables which we might be interested in (Group age, # of organizers, existing group size, etc.).

      This seems important -- identifying 'outcomes' and tracking them

    10. Fellowship application, and regularly track fellowship attendance for every Fellow.

      Can we clearly define or link which 'fellowships' we mean here? Can people be in these groups without doing the fellowship?

    11. to be sent 8/30

      Starting in 2023?

    12. our base

      The database will be an Airtable, I guess?

    13. Participating groups will fill

      Who in the group will have this responsibility?

    14. outreach data

      Define 'outreach'/'outreach data'?

    1. mtcars) integrate(function(x) sin(x) ^ 2, 0, pi)

      Numerical integration?
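
      Base R's `integrate()` does adaptive numerical quadrature. A quick check against the closed form $\int_0^\pi \sin^2 x \, dx = \pi/2$ (the variable names are just for illustration):

      ```r
      # Numerically integrate sin(x)^2 over [0, pi] with base R's integrate().
      f <- function(x) sin(x)^2
      res <- integrate(f, lower = 0, upper = pi)

      res$value        # numerical estimate of the integral
      res$abs.error    # integrate()'s own estimate of the absolute error

      # Compare with the analytic answer pi / 2.
      isTRUE(all.equal(res$value, pi / 2))
      ```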

  5. Oct 2022
    1. Apply the quadratic approximation to the globe tossing data with rethinking::map().

      Here they actually use the Rethinking package instead of brms. Why?

  6. Sep 2022
    1. Why hasn’t such a movement for something like Guided Consumption happened already? Because Guiding Companies, by definition, generate profit for charities instead of traditional investors, a major issue they face is that Guiding Companies cannot access the same investment pool of private equity and angel investors for seed money. One solution to this would be to seek financing from philanthropists, particularly those who are looking to spend their money to advance the same cause area as the Guiding Company. However, the question remains: if Guided Consumption is a more effective means of funding charities than direct donation, why has this not been more fully explored already?   I suspect that the reason stems from a deep-seated psychological separation between the way that people think about the business world, essentially a rather competitive, dog-eat-dog mindset and the kinder, more magnanimous mindset involved in charity work. The notion also seems to violate intuitions about sacrifice involved in charitable contributions, although these intuitions do not hold with the deliberate substitution of traditional stakeholders for charities. I would also note that some further red-teaming can be found in the comments of the longer paper.

      These are good points.

    2. But even if Guiding Companies engage in activities that consumers take issue with regarding traditional firms, such as competitive (i.e., princely) compensation for CEOs, it is not clear why this would cause a consumer to choose a company that enriches shareholder over a company that helps fight global poverty.

      But the pressure not to do this might make the GCs less efficient and thus more expensive

    3. What if selfish motivations make for the best founders/investors/etc.? The efforts of philanthropic investors are cap

      I think this is a big issue and you are not taking it seriously enough. Without profit motives, it may be hard for these companies to stay efficient and, well, profitable. Who is 'holding the CEOs' feet to the fire'? At least the conventional wisdom is that altruistically motivated leaders are less hard-headed, less efficient, etc.

    4. the public

      I feel like this already exists enough with Newman's Own etc. I think we should try to focus on GH&D here and maybe some Global Catastrophic Risk prevention public goods. Animal causes: maybe, but only to some demos/products (like vegetarian stuff).

    5. Which Market Sectors?

      I also suggest market sectors where there is some reluctance/repugnance to buying the product or service. The charity aspect will allow some moral licensing. E.g., I forget which charity allowed people to donate in exchange for cutting-in-line at some festival.

    6. low-differentiation sectors, it may be easier to construct a “no-brainer” where a consumer is genuinely ambivalent as to two product

      But are there substantial profits to be had by newcomers in such sectors? The profit margins may be low for such commonplace undifferentiated sectors.

    7. Another approach is to capitalize on virtue-signaling, perhaps through products that could enable a consumer to conspicuously show that they bought through a Guiding Company.

      I strongly agree with this. More conspicuous consumption.

    8. A movement that enables everyday people to help charities without sacrificing anything personally should be much easier than one that demands people give significant things up or even mildly inconveniences people.

      But can we really quantify the benefit?

      1. Charities already hold shares of companies

      2. People already do consider the owners of companies (usually through a political lens ... e.g., "Home Depot owner supports right wing causes so people boycott" or some such)

      3. How much more money will shopping at a "Guided Consumption owned company" actually lead to going to the charities?

      4. Will people (over)compensate for this by reducing donations elsewhere?

      5. If the big companies are differentiated in some way (like 'monopolistic competition' suggests), there could be a substantial cost to consumers (and to efficiency) of choosing the 'charity supporting brand'

    9. I am optimistic about the prospects for a movement developing because of what it allows for consumers: they get the same product, at the same price, but profits benefit charities rather than shareholders.

      I think you said this already

    10. to be the most powerful, would likely require a social movement

      why does it 'need a social movement'? That doesn't seem clear to me. It seems like it would benefit from one... but.

    11. although a Guiding Company would likely enjoy a degree of advantage correspondent with a Guiding Company being able to communicate this feature with its customer base.

      not sure why this is an 'although'?

    12. the identity of the entities that benefit from your purchase, often, owners in some form.

      Not entirely true. A lot of companies (e.g., Big Y) advertise themselves as 'American owned'

    13. . This is because charities are more popular than normal investors and

      that's not exactly what the study says, but it's close

    14. would have a competitive advantage

      "Would have" seems too strong. There are reasons to imagine an advantage and other reasons to imagine a disadvantage. I think EA forum prefers 'humble' epistemic statements

  7. Aug 2022
    1. + (1|reader)

      Richard: 2 reasons 1. I get this pooling/regularization effect 2. "I don't really care about reader" so ???

      If reader were orthogonal to everything else I might still put it in because of the unbiasedness 'in a low dimensional setting' (DR: sort of thought it goes the opposite way)

      If I do an idealized RCT with things I changed in exactly the same way I would not get overfitting. I might get error, but not overfitting.

    2. Thinking by analogy to a Bayesian approach, what does it mean that we assume the intercept is a “random deviations drawn from a distribution”? Isn’t that what we always assume, for each parameter in a Bayesian model … so then, what would it mean for a Bayesian model to have a fixed (vs random) coefficient?

      With Bayesian mixed models you are putting priors on every coefficient, true.

      But also you have an (?additional) random effect .. somewhat more structure.

      Also in LMER stuff we never update to a posterior

    3. Why wouldn’t we want all our parameters to be random effects? Why include any fixed effects … considering general ideas of overfitting and effects as draws from larger distributions?
      1. analogy to existing examples of fields of wheat
      2. or build a nested model and look for sensitivity
    4. How distinct is this from the ‘regularization with cross-validation’ that we see in Machine learning approaches? E.g., I could do a ridge model where I allow only the coefficient on reader to be regularized; this also leads to the same sort of ‘shrinkage’ … so what’s the difference?

      Richard: The L1/L2 E-net approach does something mechanical ... also it can handle a lot of stuff high dimensional, quick and dirty

      RE requires more thinking and more structure

      How to do this "Does this line up with the canonical problems involving fields etc"
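
      One way to see the connection, as a sketch for the simplest case only: a random intercept $u_j$ per reader, $n_j$ observations for reader $j$ with mean $\bar y_j$, residual variance $\sigma^2$, random-effect variance $\tau^2$, and the grand mean $\mu$ treated as known. These symbols are introduced here for illustration:

      $$
      \begin{aligned}
      \text{ridge:}\quad \hat u_j &= \arg\min_{u_j} \sum_{i=1}^{n_j} (y_{ij} - \mu - u_j)^2 + \lambda u_j^2 = \frac{n_j}{n_j + \lambda}\,(\bar y_j - \mu) \\
      \text{random intercept:}\quad \hat u_j &= \frac{\tau^2}{\tau^2 + \sigma^2 / n_j}\,(\bar y_j - \mu) = \frac{n_j}{n_j + \sigma^2/\tau^2}\,(\bar y_j - \mu)
      \end{aligned}
      $$

      The two shrinkage factors coincide when $\lambda = \sigma^2/\tau^2$. In this simple case the difference lies in how the penalty is chosen: cross-validation picks $\lambda$ for predictive performance, while the mixed model estimates $\sigma^2$ and $\tau^2$ from the data under a distributional assumption for the $u_j$.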

    1. ## Correlation of Fixed Effects:

      not sure how to interpret this

    2. Groups Name Variance Std.Dev. Corr ## Chick (Intercept) 103.61 10.179 ## Time 10.01 3.165 -0.99 ## Residual 163.36 12.781 ## Number of obs: 578, groups: Chick, 50

      note the coefficients are not reported, just the dispersion

    1. prior_covariance

      what is the prior for the slopes?

    2. higher confidence region.

      higher confidence in what sense?

    3. “randomly varying” or “random effects”.

      but isn't this what Bayesians assume of every parameter?

    4. This model assumes that each participant’s individual intercept and slope parameters are deviations from this average, and these random deviations drawn from a distribution of possible intercept and slope parameters.

      presumably normally distributed ... or at least with more mass in the center

    5. It’s the fixed effects estimate, the center of gravity in the last plot.

      this term is confusing for econometricians

    1. and can give rise to subtle biases that require considerable sophistication to avoid.)

      I'm not sure the link refers to the same sort of 'random effects' technique, so the bias discussed there may not apply

    1. I’ll introduce ranks in a minute. For now, notice that the correlation coefficient of the linear model is identical to a “real” Pearson correlation, but p-values are an approximation which is is appropriate for samples greater than N=10 and almost perfect when N > 20.

      this paragraph needs clarification. the coefficient on which linear model?

    2. correlation coefficient of the linear model

      what is the 'correlation coefficient of the linear model'? It's a transformation of the slope
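
      A small check of that transformation, using the built-in `mtcars` data purely as an example (the choice of `wt` and `mpg` is arbitrary): the Pearson correlation is the slope rescaled by the ratio of standard deviations, and equals the slope itself when both variables are standardized.

      ```r
      x <- mtcars$wt
      y <- mtcars$mpg

      # Slope of the ordinary linear model y ~ x.
      b <- coef(lm(y ~ x))[["x"]]

      # Rescale the slope: r = b * sd(x) / sd(y).
      r_from_slope <- b * sd(x) / sd(y)

      # Slope after standardizing both variables.
      zx <- as.numeric(scale(x))
      zy <- as.numeric(scale(y))
      b_std <- coef(lm(zy ~ zx))[["zx"]]

      # All three numbers should agree up to floating-point error.
      c(r_from_slope = r_from_slope, b_std = b_std, pearson = cor(x, y))
      ```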

    3. t-tests, lm, etc., is simply to find the numbers that best predict $y$.

      I don't think t-tests estimate slopes or predict anything

  8. Jul 2022
    1. under the assumption that $\mathcal{H}_1$ is true, the associated credible interval for the test-relevant parameter provides a range that contains 95% of the posterior mass.

      I don't get the 'under the assumption that H1 is true' in this sentence. Isn't this true of the credible interval in any case?

    2. he Bayes factor (e.g., Etz and Wagenmakers 2017; Haldane 1932; Jeffreys 1939; Kass et al. 1995; Wrinch and Jeffreys 1921) reflects the relative predictive adequacy of two competing models or hypotheses, say $\mathcal{H}_0$ (which postulates the absence of the test-relevant parameter) and $\mathcal{H}_1$ (which postulates the presence of the test-relevant parameter).

      Bayes Factor is critiqued (datacolada?) because the 'presence of the parameter' involves an arbitrary choice of distribution of what values the parameter would have 'if it were present'.

      And sometimes the H0 is deemed more likely even when the observed parameter 'estimate' falls in this range.

    3. a $[100 \times (1-\alpha)]$% confidence interval contains only those parameter values that would not be rejected if they were subjected to a null-hypothesis test with level $\alpha$.

      With the null hypothesis equal to the point estimate, I think, not a 0 null hypothesis
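
      A sketch of the duality for a normal-theory (Wald) interval, with $\hat\theta$ the point estimate, $\mathrm{SE}$ its standard error and $z_{1-\alpha/2}$ the normal quantile; this notation is an assumption, not taken from the quoted paper:

      $$
      \mathrm{CI}_{1-\alpha} = \left\{ \theta_0 : \frac{|\hat\theta - \theta_0|}{\mathrm{SE}} \le z_{1-\alpha/2} \right\} = \left[ \hat\theta - z_{1-\alpha/2}\,\mathrm{SE},\; \hat\theta + z_{1-\alpha/2}\,\mathrm{SE} \right]
      $$

      On this reading each candidate value $\theta_0$ is tested in turn as the null. The point estimate sits at the center of the non-rejected set, and 0 lies inside the interval exactly when the usual test of $\theta = 0$ does not reject at level $\alpha$.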

    1. They found that only 70% of their large (20:1) samples produced correct solutions, leading them to conclude that a 20:1 participant to item ratio produces error rates well above the field standard alpha = .05 level.

      really confused here ... what is the 'gold standard' ... how do they know what is a 'correct solution'? Also, how does this fit into a NHST framework?

    2. f course, the participant to item ratio is not a good benchmark for the appropriate sample size, so this is not enough to demonstrate that the sample size is insufficient. They did find support that this is not enough by sampling data of various sample sizes from a large data s


    3. thumb only involve

      'typically only' ... but they could be better

    4. participants to item ratios of 10:1

      you haven't yet defined this concept

    5. Costello & Osborne (2005).

      Link goes to a different reference. Also (small point), the names should be within the parentheses

    7. the number of underlying factors and which items load on which factor

      Would be good to link or define what terms like 'factors' and 'load' mean here

  9. Jun 2022
    1. In other words, the goal is to explore the data and reduce the number of variables.

      That's not 'in other words', it's different. "Reduce the number of variables" can be done in many ways and for different reasons. Latent factors are (I think) something with a specific meaning in psychology and this sort of structural analysis in general.

    2. is to study latent factors that underlie responses to a larger number of items

      But what are 'factors'?

    1. The module also comes with a reviewer reputation system based on the assessment of reviews themselves, both by the community of users and by other peer reviewers. This allows a sophisticated scaling of the importance of each review on the overall assessment of a research work, based on the reputation of the reviewer.

      This seems promising!

    2. By transparent we mean that the identity of the reviewers is disclosed to the authors and to the public

      Not sure this is good. I worry about flattery and avoiding public criticism.

    3. Digital research works hosted in these repositories can thus be evaluated by an unlimited number of peers that offer not only a qualitative assessment in the form of text, but also quantitative measures that are used to build the work’s reputation.

      but who will finance and coordinate this?

    4. One important element still missing from open access repositories, however, is a quantitative assessment of the hosted research items that will facilitate the process of selecting the most relevant and distinguished content.

      What we've been saying

  10. May 2022
    1. Einstein scooped Hilbert by a few days at most in producing general relativity. In that sense

      interesting, I didn't know about this

    2. EA focuses on two kinds of moral issue. The first is effective action in the here and now — maximising the bang for your charitable buck. The second is the very long run: controlling artificial general intelligence (AGI), or colonizing other planets so that humanity doesn’t keep all its eggs in one basket.

      good summary

    3. this issue in acute form

      wait, which issue? He realized that achieving his moral objectives wouldn't make him feel happy. Did that change what he felt he should do or his sense of moral obligation?

    4. duty

      not sure it's always framed as a 'duty'

    5. (You barge past me, about my lawful business, on your mission of mercy. “Out of the way! Your utility has already been included in my decision calculus!” Oh yeah, pal? Can I see your working?)

      good analogy

    6. Another reason is just that other people’s concerns, right or wrong, deserve listening to.

      Is this related to the 'moral uncertainty' and 'moral hedging' ... or is this a fairness/justice argument?

    7. Utilitarianism is an outgrowth of Christianity.1

      This is a really big claim to make here ... needs more support. It kind of goes against religion in that it sets no 'thou shalt not's ... at least the act utilitarianism

    8. Faced with questions of the infinite future, it swiftly devolves into fun with maths.

      good point

    9. What will motivate you if you don’t change the world? Can you be satisfied doing a little? Cultivating the mental independence to work without appreciation, and the willpower to keep pedalling your bike, might be a valuable investment. Ambition is like an oxidizer. It gets things going, and can create loud explosions, but without fuel to consume, it burns out. It also helps to know what you actually want. “To maximize the welfare of all future generations” may not be the true answer.

      I'm not 100% sure what you are saying/suggesting here. Maybe this ends less strongly than it began? What is the 'fuel to consume' you are getting at here? What should it be?

    10. Just by arithmetic, only few will succeed.

      but if each has an independent probability of succeeding, each may still have a large impact in expected value.

    11. Here’s a more general claim: the more local the issue, the less substitutable people are. Many people are working on the great needs of the world.

      This is possibly true, in some cases, for research work but probably not true for donations. If you donate $4000, lots more children get malaria nets or pills, fewer get severely ill, and on average one fewer child dies ... relative to your not having made that donation.

    12. net contribution

      what do you mean by 'net contribution'? There's a lot of discussion on the donations side of EA about making a counterfactual impact. They focus on the marginal difference you make in the world relative to not having done/donated this. If, absent your donation to Malaria Consortium, just as many people would have gotten ill from malaria (because someone else would have stepped in), this would be counted as a 0. So this is already baked in.

    13. Department’s

      capital letters?

    14. My marginal contribution would be small.

      I think you DHJ could possibly make a big contribution. BUT what does this have to do with this essay? What is the point you are making here?

    15. But enough other people are thinking about it already. I trust them to work it out. My marginal contribution would be small.

      Relative to other things and relative to the magnitude of the problem people claim this is ... few people are working on it, it's seen to be neglected.

    16. After a visit to Lesswrong, someone will probably associate EA more with preventing bad AIs than with expanding access to clean drinking water.

      But LW is not EA ... see https://www.lesswrong.com/posts/bJ2haLkcGeLtTWaD5/welcome-to-lesswrong ... doesn't mention EA.

      Also, I think most people who have heard of it still associate EA with the latter, and with the Giving What We Can 10% pledge (we actually have data on this), even though the most active EAs are in fact prioritizing longtermism these days.

    17. which makes them a clear badge of identity

      True. This might be why LT-ism is so ascendant in EA especially at university groups

    18. Contributing to the first topic requires discipline.

      Contributing research-wise and through your career, that is. You can always donate to the non-longtermist stuff, and that was at the core of the first-gen EA. And the whole GiveWell thing is about making it easy to know your gift makes the biggest difference per $

    19. More might come out of it than from all your earnest strivings.

      the implication here is that Gauguin's work contributed a lot. I guess?

    20. It’s questionable whether you even have the right.

      interesting ... maybe elaborate? "Right" in what sense?

    21. and admirable practices like steelmanning.

      yes, but are you doing this in the current essay?

    22. They contain contradictions. That makes them rich enough to cope with human life, which is contradictory.

      I think some examples in footnotes or links or a vignette would help here. Because I sort of feel like "no, the old religions really struggle to cope with modern life"

    23. g and pushed it to be more than this

      the link is long ... which bit should I read?

    24. the Axial religions

      I've never heard of this term

    25. moral demands

      also not sure it "imposes demands" ... it just suggests "this would be the best way to behave" I guess

    26. Either someone is maximizing utility, or they’re in the way.

      hmm, who said 'they're in the way'?

      Also, 'max util' is confusing here ... because in economics we think of it as maximizing our "personal" utility function. Maybe a distinction needs to be made at some point to make it clear that this is some weighted adding up.

    27. it imposes extreme moral demands on people. Sell all your possessions and give them to the poor.

      Not sure who this refers to. EA doesn't really ask this. The push now is ~try to find more effective/impactful careers. And even the well-known GWWC was 'only' advocating a 10% donation rate, and even that 'not for everyone'

    28. Utilitarians reduce all concerns to maximising utility. They can’t be swayed by argument, except about how to maximise utility. This makes them slightly like paperclip maximisers themselves.

      My guess is that the way you make your claim here might be seen as not Scout mindset, not fully reasoning transparent, not arguing in a generous spirit. It's not how people want to discourse on the EA forum anyways; not sure if effectiveideas would be ok with it or not.

      Would utilitarians say "we reduce all concerns to maximising utility"? If so, give a link/evidence to this statement.

    29. hey recycle humans into paperclips

      you definitely need a link here (again, not for EAs, but if this is outreach people will be like WTF)

    30. AIs which

      Do all your readers or the ones you are reaching out to know what "AIs" means? All the EA readers will, but still...

    31. The Effective Altruism

      wrong link, perhaps? Lesswrong is not EA per se if I understand, it's Rationalist ... certainly adjacent to EA though. I might link https://www.centreforeffectivealtruism.org/ or https://forum.effectivealtruism.org/ as the canonical EA link

    32. and turns them into numbers.

      links/references here could help to avoid straw man accusations

    33. as preferences,

      I'm not 100% sure that all schools of utilitarianism treat it as choice or preference based. That is familiar from economics, but I think other schools consider things like 'intensity of pleasure and pain'?

    1. Demonstrate the potential for impact via thought experiments like The Drowning Child (although use this sparingly and cautiously, as people can be turned off by obligation framing).

      I think people are also sometimes turned off or disturbed by having to make these difficult Sophie's choices

  11. Apr 2022
    1. 13.1.1 Other lists and categorizations of tools

      no third level numbering.

      Could also use some updating ... and leveraging what they have done

    2. Approaches to overcoming the barriers and biases discussed in previous chapters.

      give bolded titles to these categories and link to sections below

    1. The first intervention, surgical treatment, can’t even be seen on this scale, because it has such a small impact relative to other interventions. And the best strategy, educating high-risk groups, is estimated to be 1,400 times better than that. (It’s possible that these estimates might be inaccurate, or might not capture all of the relevant effects. But it seems likely that there are still big differences between interventions.)

      Some rigor might be helpful here

    1. I created R functions for TOST for independent t-tests, paired samples t-tests, and correlation

      what about binary outcomes?
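
      For binary outcomes, one option is a TOST on the difference between two proportions. A rough sketch, assuming a normal approximation with an unpooled standard error and equivalence bounds stated on the raw difference scale. The function below is a hypothetical helper, not one of the functions described in the quoted post.

```python
import math
from statistics import NormalDist


def tost_two_proportions(x1, n1, x2, n2, low, high, alpha=0.05):
    """Two one-sided tests for equivalence of two proportions.

    low and high are the equivalence bounds on p1 - p2 (low < 0 < high).
    Uses a normal approximation, so it is unreliable for small samples or
    proportions near 0 or 1 (and fails if the standard error is zero).
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    norm = NormalDist()
    p_lower = 1 - norm.cdf((diff - low) / se)  # tests H0: diff <= low
    p_upper = norm.cdf((diff - high) / se)     # tests H0: diff >= high
    p_tost = max(p_lower, p_upper)
    # Equivalence is declared only if BOTH one-sided nulls are rejected.
    return diff, p_tost, p_tost < alpha


# Made-up counts: 500/1000 versus 505/1000, bounds of +/- 10 percentage points.
print(tost_two_proportions(500, 1000, 505, 1000, low=-0.10, high=0.10))
```

      Because it works from counts alone, this also needs no raw data.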

    2. So my functions don’t require access to the raw data

      there's always a workaround to regenerate the 'equivalent raw data'... but it's annoying

    3. while we are more used to think in terms of standardized effect sizes (correlations or Cohen’s d).

      not so much in Economics, where we often focus on 'non-unitless' outcomes that have a context-specific interpretation

    4. After choosing the SESOI, you can even design your study to have sufficient power to reject the presence of a meaningful effect.


    1. your data set (e.g., means, standard deviations, correlations

      your expected data set.

      But simulation also allows you to use prior data... maybe worth mentioning?
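
      A minimal sketch of simulation-based power, assuming two equal-sized groups, normally distributed outcomes, and a two-sided z-test on the difference in means. The effect size, SD, and sample size are placeholders; in practice they could be set from expectations or estimated from prior data.

```python
import math
import random
from statistics import NormalDist


def mean_and_var(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)


def simulate_power(n_per_group, mean_diff, sd, alpha=0.05, n_sims=2000, seed=1):
    """Share of simulated experiments in which the null of no difference is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(mean_diff, sd) for _ in range(n_per_group)]
        m_c, v_c = mean_and_var(control)
        m_t, v_t = mean_and_var(treated)
        se = math.sqrt(v_c / n_per_group + v_t / n_per_group)
        if abs((m_t - m_c) / se) > z_crit:
            rejections += 1
    return rejections / n_sims


# Placeholder design: 64 per group, true difference of half a standard deviation.
print(simulate_power(n_per_group=64, mean_diff=0.5, sd=1.0))
```

      To use prior data instead of assumed parameters, the two `rng.gauss` lines could be replaced by resampling from earlier observations; the rest of the loop would stay the same.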

    2. think they should power their study, rather than the set of analyses they will conduct

      can you clarify that a bit?

    1. This section was written by David Reinstein and Luke Arundel.

      Annabel Rayner also contributed/is contributing

    2. An example of this is presented in the study by Lichenstein et al., (1978).

      seminal paper; if we want to dig into the evidence we should look at replication, review, post-replication crisis work.

    3. Experiential distance describes one’s proximity to a particular situation or feeling as a result of having seen something or been a part of it. In particular, a greater experiential distance is seen to make it more difficult to imagine a particular situation or feeling, therefore making it harder to empathize with someone going through it. This could create a barrier to giving effectively as most charitable giving is motivated by empathy and sympathy for victims (Lowenstein & Small, 2007). Experiential distance explains why it is easier for an individual to feel empathy for a victim if they have personally experienced the ailment or someone close to them has. As a result, they do not have to imagine the suffering it may have caused because they have directly or indirectly physically experienced it. In other words, the experiential gap is smaller. For example, people living in wealthy nations are more likely to be affected by cancer than malaria, leading to a greater support for that cause.

      This comes up in the context of 'availability bias'

      'Experiential distance' may be our own definition. If so, let's make it clear that this is the case

      "As most charity" ... "as it is claimed that..."

      "Ailment = illness" ... it could be outside of the medical domain ... make it more concise ... And if this is an example and not the general point, use a parenthetical "(e.g.)"


  12. Mar 2022
  13. pub-tools-public-publication-data.storage.googleapis.com
    1. predict the precision of an iROAS estimate that a proposed experimental design will deliver.

      'predict the precision' ... maybe that's the Bayesian equivalent of a power analysis

    2. design process featuring a statistical power analysis is a crucial step in planning of an effective experiment.

      this is great -- statistical power!

    3. If it is obvious from this plot that the incremental effects have ceased to increase beyond the end of the intervention period, there is no need to add a cooldown period, since doing so only adds more uncertainty to the estimate of iROAS

      this seems suspect: you are imposing a strong assumption based on casual empiricism, and this probably overstates the precision of the results.

      ideally, some uncertainty over that should be incorporated into the estimation and model, but maybe it's not important in practice?

    4. Due to the marketing intervention, incremental cost of clicks increased during the test period.

      This is, apparently, 'amount spent' not a marginal cost thing

    5. shows a similar TBR Causal Effect analysis for incremental ad spend (cost of clicks) ∆cost(t).

      same thing but for costs

    6. nd the 90% middle posterior intervals of the predicted counterfactual time series, \(y^*_t\).

      you have to zoom in, they are hard to see

    7. Bayesian inference is used to estimate the unknowns and to derive the posterior predictive distribution of \(y^*_t\) for each \(t\).

      this is Bayesian!

    8. The simplest possible relationship is given by the regression model, \(y_t = \alpha + \beta x_t + \epsilon_t\), \(t\) in pretest period

      maybe we want some geo-specific error term?
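
      A sketch of the basic idea, assuming the treatment-group series is regressed on the control-group series over the pretest period and the fitted line is then used as the counterfactual in the test period. This uses plain least squares and point estimates only; the paper's Bayesian posterior predictive step is not reproduced, there is no geo-specific error term, and all numbers are made up.

```python
def fit_ols(x, y):
    """Least-squares intercept and slope for y = alpha + beta * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    return alpha, beta


def cumulative_effect(alpha, beta, x_test, y_test):
    """Sum over the test period of observed minus predicted counterfactual."""
    counterfactual = [alpha + beta * xt for xt in x_test]
    return sum(yt - ct for yt, ct in zip(y_test, counterfactual))


# Made-up aggregated series: x is the control group, y is the treatment group.
x_pre, y_pre = [10, 12, 14, 16], [21, 25, 29, 33]
x_test, y_test = [11, 13], [28, 32]

alpha, beta = fit_ols(x_pre, y_pre)
print(alpha, beta, cumulative_effect(alpha, beta, x_test, y_test))
```

      Because the data are aggregated to one observation per period per group, a sketch like this has no way to express variation across geos, which is the concern raised in the comment above.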

    9. or TBR, data are aggregated across geos to provide one observation of response metric volume at every time interval for the control and treatment groups:

      the aggregation seems to throw out important variation here ... that might tell us something about how much things 'typically vary by without treatments' ... but maybe I'm missing something

    10. matched market tests, which may compare the behavior of users in a single control region with the behavior of users in a single test region.

      I'm considering a case with one test region and many control regions. Will this paper still apply?

    11. ime-Based Regression Framework

      Is this equivalent to an 'event study' or a 'difference in difference' as discussed in econometrics?

  14. Feb 2022
    1. This appendix provides a brief introduction to the several types of software and processes used to creating websites such as Increasing Effective Charitable Giving and Researching and writing for Economics students. We aim to encourage others to participate in this collaborative work, and to spin off their own projects. If you would like to provide feedback or ask a question about these projects then using ‘hypothes.is’ is an easy way to do so (please write directly in the html and contact me at daaronr at gmail dot com to let me know you’ve done so).

      Answering some questions about participation:

      How many hours are student researchers required to allocate for the project? It depends on how much you want me to engage with you and talk you through the processes, onboarding, etc. I think something like a minimum of 40-50 total hours seems about right, but 100+ hours would be better

      1. Is the project's intended audience the EA community? To an extent, yes. I guess 3 primary audiences.

      i. The EA community interested in learning more about (what they can do to promote effective) charitable giving and relevant attitudes,

      ii. Effective charities and organizations promoting effective giving and action (see the related EA Market testing team)

      iii. Academic researchers (Social science, economists, data scientists, human biology) interested in these issues, coming from a range of perspectives

      1. Which requirements constitute sufficient quality work for student co-authorship?

      It's hard to put this in writing succinctly. If your content is well written and reasoning-transparent, we can probably integrate it into the website and recognize you as the author of a particular section. In terms of peer-reviewed academic output (this project is not itself a 'paper', but I am very pro-feedback and evaluation; see bit.ly/eaunjournal), we have to discuss that more carefully.

    1. 10 Effect of analytical (effectiveness) information on generosity

      todo? maybe incorporate more lab work with clear strong 'analytical thinking' manipulations

    2. Much of this project is being openly presented in the (in-progress) “Impact of impact treatments on giving: field experiments and synthesis” bookdown, a project organised in the dualprocess repo.

      Todo -- integrate these better to remove overlap or make overlap directly synchronized

    1. We consider a range published work (from a non-systematised literature review) including laboratory and hypothetical experiments, and non-experimental analysis of related questions (such as ‘the impact of general charitable ratings).

      Consider: should I focus on the lab and otherwise framed work more?

      See private RP Slack thread here considering Moche et al, 2022

    1. After participants had completed the initial survey, which took most of them at least 10 minutes, we measured their volunteering behaviour by asking them whether they would be willing to fill in a ‘a few extra questions’ for charity (59.3% responded yes) rather than skipping directly to the final questions. The participants were informed that their choice would be completely anonymous and that 5 SEK (around 0.50 euro or 50 US cents) would be donated to a charity organization of their choosing if they would fill out the additional questions.

      This seems like a potentially meaningful measure. We need to read closely to consider the extent to which that 'desire for consistency with questionnaire response' could be driving/biasing this.

  15. Jan 2022
    1. Given the conjugacy of the beta for the binomial,


      In Bayesian probability theory, if the posterior distribution p(θ | x) is in the same probability distribution family as the prior probability distribution p(θ), the prior and posterior are then called conjugate distributions, and the prior is called a conjugate prior for the likelihood function p(x | θ).

      So, here, I guess, the combination of a binomial(\(\theta\)) distribution for the data and a Beta prior for \(\theta\), the probability of each positive outcome, implies that the posterior will also be a Beta distribution.

      However, the posterior density of the difference in the \(\theta\)s is something that would need to be computed.
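
      A small sketch of both steps, assuming independent Beta(1, 1) priors for the two groups: the conjugate update is just addition of counts, and the posterior for the difference can be approximated by Monte Carlo draws. The counts below are illustrative, not taken from the source.

```python
import random


def beta_posterior(a, b, successes, failures):
    """Conjugate update: Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return a + successes, b + failures


def prob_first_greater(post1, post2, n_draws=20000, seed=1):
    """Monte Carlo estimate of P(theta1 > theta2) from two independent Beta posteriors."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(*post1) > rng.betavariate(*post2)
        for _ in range(n_draws)
    )
    return wins / n_draws


# Illustrative counts: 9 of 52 in group 1 and 4 of 48 in group 2.
post_1 = beta_posterior(1, 1, successes=9, failures=43)
post_2 = beta_posterior(1, 1, successes=4, failures=44)
print(post_1, post_2, prob_first_greater(post_1, post_2))
```

      The same draws could be differenced and summarised (mean, quantiles) to get a credible interval for the difference in the two proportions.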

    2. The statistical hypothesis we wish to investigate is whether the proportion of left-handed men is greater than, equal to, or less than the proportion of left-handed women.

      Note this is not the 'lady tasting tea' case, where true outcome shares are known

    1. Proportion of Funding Available for Program

      The 'user input' here should be something like a mean and a dispersion ... most people won't know what the parameters of the Beta distribution mean.

      if necessary, we could explain what the parameters you input here will do, and have a graph of the distribution of this input to the model
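
      One way to let users enter a mean and a dispersion is to convert them to Beta parameters by matching moments. A sketch, assuming the dispersion is given as a standard deviation (the function name is hypothetical and not part of the tool being annotated):

```python
def beta_from_mean_sd(mean, sd):
    """Return (alpha, beta) of the Beta distribution with the given mean and SD."""
    if not 0 < mean < 1:
        raise ValueError("mean must be strictly between 0 and 1")
    variance = sd ** 2
    if variance <= 0 or variance >= mean * (1 - mean):
        raise ValueError("sd must be positive and below sqrt(mean * (1 - mean))")
    # For Beta(alpha, beta): mean = alpha / nu and
    # variance = mean * (1 - mean) / (nu + 1), where nu = alpha + beta.
    nu = mean * (1 - mean) / variance - 1
    return mean * nu, (1 - mean) * nu


# Example input: a proportion believed to be around 0.8, give or take 0.1.
print(beta_from_mean_sd(0.8, 0.1))
```

      The resulting parameters can then be passed to whatever Beta input the model expects, while the user only ever sees the mean and spread.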

    2. Transfers as a percentage of total costs. GiveDirectly has other costs! So how much of our money is going to people in need? The value is derived here: https://docs.google.com/spreadsheets/d/1L03SQuAeRRfjyuxy20QIJByOx6PEzyJ-x4edz2fSiQ4/edit#gid=537899494 This is calculated by finding the average proportion over the years. TODO: Create a predictive model, fitting a normal and a beta distribution to financials Cell: B5 Units: Unitless (percentage), 0-100%

      I love that this is described, but obviously the display here doesn't work. I asked Causal about code folding

    3. Arbitrary Donation Size, chosen to be $100,000 by GiveWell

      This doesn't seem to impact anything

  16. Dec 2021
    1. he current results are telling us more about the structure of the model than about the world. For real results, go try the Jupyter notebook!

      I would love for this to be made more user friendly and explained better! I was able to run it, but it's hard to wrap your head around all the parameters while you are doing it.

    2. minimally

      what do you mean 'minimally sensitive' here? This is a bit confusing ... you are highlighting what seem to be the LEAST important factors, then.

    3. Direct cash transfers

      the scatterplots are somewhat overwhelming ... so much information ... there must be a better way to depict this.