1,724 Matching Annotations
  1. Last 7 days
    1. About Summary Pivotal Questions Live Sessions Resources ▾ Readings Linear WELLBY Analysis DALY-WELLBY Conversion Metaculus Question

      Colored-on-black text here is very hard to read.

    1. How many WELLBYs equal 1 DALY?

      check and annotate -- what does e.g., 5 WELLBYs per DALY mean in context, and how does it compare with what people currently do?

    2. problem under consideration. So I'd resist doing a simple exchange rate."

      This seems like a valid objection, but I think we can still phrase the question so that you could give a meaningful answer: the value generated if you were forced to use a single conversion rate.

    3. PQ1B — Recommended measure for funders

      The discussion is likely to be more valuable than the response itself, particularly for this question.

    4. Composite well-being measure

      Let's do better at differentiating this from calibrated well-being. The distinction isn't fully clear to someone glancing at this briefly.

    5. How many WELLBYs equal 1 DALY?

      Make sure you can access the literal question from this interface to know exactly what the respondents are answering. If these things get very long, you can use tooltips.

    1. Interactive uncertainty model

      I don't think this is a stochastic model? Perhaps an extension of this should give these (correlated?) distributions. ... Squiggle-type modeling

      There should also be a display of the actual equation behind the model, and a folding box or linked page explaining it in more detail

    2. Organizations should distinguish runway decisions from upside options. If a project is valuable only under a fast-funding scenario, that dependence should be explicit rather than hidden inside local rumor. Funders and field builders should prioritize grantmaker capacity, plural donor relationships, legal vehicles, and evaluation infrastructure. These are the bottlenecks that convert paper wealth into usable grants.

      this advice seems on the overly generic side?

    1. The Unjournal evaluates research; it does not publish papers as journal articles, and evaluators do not issue accept/reject decisions. This matters because many author co

      links don't need to be bolded

    2. or policy relevance.

      remove 'or policy relevance' perhaps -- The Unjournal prioritizes research with global impact potential (although that's not what we mainly rate the research on)

    3. or the likely criticism is about taste, importance, novelty, or fit rather than checkable claims

      not sure I understand the logic behind the latter part

    4. r[qA - (1-q)L] + (1-r)[(1-q)A - qL] - k ≥ 0

      notation needs improvement, and it should be explained more -- how derived, how to interpret it? Tooltips and expanding sections could help

    5. Requesting a noisy public test is not the same as disclosing an already-known verifiable fact.

      I suspect another paper has dealt with this question ... 'when noisy signals help the seller' or some such

    6. e reader should not update from p0 after observing the evaluation res

      this needs clarification, I don't quite see why this is the case. Isn't it possible that the author's signal is positive so they submit, but the evaluator reading the paper gets a negative signal?

    7. binary quality Q in {H,L}, w

      is this 'binary' the relevant threshold? Where did it come from? Is it sort of generalizable? Consider if it misses some important nuance

    8. Public anonymity statistic. The anonymity choice is empirically important. Running python unjournal_anonymity_stats.py on the public data bundle gives 65 anonymous/generic public evaluator identifiers among 113 deduplicated evaluator-paper pairs with quantitative ratings: 57.5%. In the subset matched to published-evaluation status, the share is 63/105: 60.0%, with 7 unmatched title rows. This supports saying "a bit over half" choose anonymous/generic public identifiers, but the denominator should be stated. The wider evaluator_paper_level.csv denominator is not clean for this claim because survey-only rows are assigned generic Evaluator N labels.

      this should be a fold or footnote -- give a quick statistic and footnote how it was captured
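      As a quick check on the shares quoted in this passage, here is a minimal sketch that recomputes them from the stated counts and always prints the denominator. It does not reproduce unjournal_anonymity_stats.py; the helper name `share_with_denominator` is hypothetical, and the counts are simply those quoted above.

```python
def share_with_denominator(count: int, total: int) -> str:
    """Format a share as 'count/total (pct%)' so the denominator is always stated."""
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("need total > 0 and 0 <= count <= total")
    return f"{count}/{total} ({100 * count / total:.1f}%)"


# Counts as quoted in the highlighted passage (not recomputed from the data bundle).
all_pairs = share_with_denominator(65, 113)       # deduplicated evaluator-paper pairs
matched_subset = share_with_denominator(63, 105)  # subset matched to published status

print(all_pairs)
print(matched_subset)
```

      A one-line statistic in this form could go in the main text, with the method moved to the fold or footnote.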

    9. When the answer is unclear, the practical move is not immediate publicity. It is a fit-and-timing conversation, coauthor consen

      too much 'not this but that' AI speak. And 'publicity' is vague here

    10. After the author requests evaluation, update from p_D, not the raw prior p0: p+ = p_D q / [p_D q + (1-p_D)(1-q)]; p- = p_D(1-q) / [p_D(1-q) + (1-p_D)q]

      the latex/math is not rendering as well as before we moved this to Codex for editing. Can we recover the better formats and get the best of both worlds?
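      For checking the rendered formulas against numbers, here is a minimal sketch of the two updates quoted above. It assumes q is the probability that the evaluation signal matches the paper's true quality (the excerpt does not define q), and the function name is hypothetical.

```python
def posterior_after_evaluation(p_d: float, q: float) -> tuple[float, float]:
    """Return (p_plus, p_minus): beliefs after a positive or negative evaluation.

    p_d: belief that quality is high once the author has requested evaluation.
    q:   assumed probability that the evaluation signal matches true quality.
    """
    if not (0.0 < p_d < 1.0 and 0.0 < q < 1.0):
        raise ValueError("p_d and q must lie strictly between 0 and 1")
    p_plus = p_d * q / (p_d * q + (1 - p_d) * (1 - q))
    p_minus = p_d * (1 - q) / (p_d * (1 - q) + (1 - p_d) * q)
    return p_plus, p_minus


# Hypothetical inputs, chosen only to illustrate the direction of the update.
p_plus, p_minus = posterior_after_evaluation(p_d=0.5, q=0.8)
print(p_plus, p_minus)
```

      With an uninformative signal (q = 0.5) both posteriors equal p_D, which is a useful sanity check when comparing renderings.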

    11. 2. Is the main obstacle credibility, visibility, field fit, or network access?

      this needs further explanation and clarification -- 'usual channels' should already encompass clarification

    12. e relevant question is not whether public evaluation is always good; it is when a public signal improves expected outcomes relative to waiting, revising privately, or continuing th

      this is the 'AI language of dichotomy' overused

    13. a result that a credible public test strictly helps authors whose default standing sits below the bar — and we are precise about the downside it carries for those just above it;

      The language of this is a bit unclear. Try to make it easier to understand.

    14. r to a public-evaluation venue that pays expert evaluators

      I'm not sure if the fact that evaluators are paid here is relevant to this question - rather than giving these details, you could just say "The Unjournal is the focal example"

    1. Explicit crux Which specific uncertainties — AGI timing, takeoff speed, power-seeking tendency, offense-defense balance, pause feasibility — most shift expert p(doom) estimates?Community solicitation for explicit AI-risk cruxes: uncertainties whose resolution would significantly shift p(doom), including AGI arrival year, takeoff speed, power-seekin

      this is meta -- I don't want meta, or at least put that into an 'opt-in' list

    1. ee our early automated prioritization prototype, which is outside legal research and currently focuses mainly on economics and related work.

      We can swap in here the legal prioritization prototype -- https://uj-prioritization-prototype.netlify.app/legal/ -- please do this -- and note that we're looking for feedback and examples to help improve and train this. Note that we don't envision this prioritization to be mainly driven by AI models -- humans will be making the ultimate decisions -- but these tools can be very helpful in the process.

    1. Comment directly on this page using the Hypothes.is sidebar (the < tab on the right edge). Or use the rating buttons on each paper card — human ratings are how we will calibrate these scores.

      Give people the option to suggest/add content.

    2. Comment directly on this page using the Hypothes.is sidebar (the < tab on the right edge). Or use the rating buttons on each paper card — human ratings are how we will calibrate these scores.

      Let us know if you have any questions about this.

    1. How this was made. Drafted by GPT Pro from existing Unjournal research and discussion (the elasticity-validation survey, the Bray et al. evaluation materials, and the PBM substitution literature), then built and polished into this interactive report in Claude Code. It is currently being reviewed and adjusted by hand. Treat figures and attributions as provisional until that review is complete; the governing evaluation lives on PubPub.

      Make this a folding box - and the header should say AI/human collaboration in some way

      Another folding box should have the standard call out about how we want feedback, and you can use the hypothesis tool for that.

    1. Note: This workshop is in early planning. The framing, evidence base, and participant list are still being developed.

      Still considering how to frame this workshop, and it depends on interest and participation. One frame is directly targeting what we know about plant-based products, who consumes them, and what it suggests for potential substitution and animal welfare. However, that evidence seems to be rather thin, inconclusive, and perhaps premature. (See links to EA forum posts, etc.) Furthermore, our evaluation of Bray et al. on experimental versus standard quantitative marketing/I.O. estimates of own-price elasticities suggests perhaps deep uncertainty and a lack of ability to be confident in these parameters, not to mention cross-price effects and substitution patterns. This potentially motivates a pivot towards focusing on these methodological questions, as well as framing it in terms of "what can we know and what research is worth pursuing."

    1. Thank you for participating in The Unjournal's Plant-Based Substitution Pivotal Questions workshop. Your feedback helps us measure the workshop's impact and improve future workshops.

      Remove this page for now because it makes it seem like the workshop already happened.

    1. A major methodological innovation. The framework is elegant and the estimation strategy is sound. The empirical component would especially benefit from more diverse and reliable samples, and from direct comparisons against existing scale-correction methods so readers can judge incremental value. Logic and communication could be tightened in places — rated lower here than the other dimensions.

      This is not his full evaluation. He gave a very in-depth evaluation, and you've only taken one paragraph here.

    2. The cost of calibration questions The central tension is practical, not theoretical. Prati flags that the evidence rests on a large number of calibration questions. It is unclear how well the correction performs with the realistic two or three CQs — and even two can be a heavy burden in large surveys. He suspects this is “one crucial reason anchoring vignettes have not been implemented at scale in 20 years.” Kaiser rates the work highly but pushes for more diverse, reliable samples and direct comparisons against existing scale-correction methods, so readers can judge the incremental value. His lower marks fall on logic & communication and on claims & evidence.
      1. Firstly, the header does not fully describe the critiques here. It's only one of the critiques.
      2. Secondly, even in this scrollytelling depiction, we probably want a bit more about what the evaluators are saying, going into more than one theme very briefly, because this is the core of The Unjournal's value add.
    3. Two experts, eight criteria

      We probably want a brief transition here between the issue of measuring individuals' well-being through self-reports and what The Unjournal is now doing in rating the paper, which, funnily enough, also uses scales that may have subjective components themselves. Make the distinction clearer here.

    4. For decades, economists hesitated to use subjective well-being data for one stubborn reason: people use survey scales differently.

      This probably needs a little bit more context on why we're trying to measure people's well-being and happiness through self-reports.

    5. Estimated from a few extra calibration questions — not a full vignette battery.

      the diagram is not fully explained? what does each dot represent? Should we be giving 'names of people' (or IDs, or types of people) to make that clearer?

    6. data for one stubborn reason:

      I know this is meant for a public audience, but it's a little bit oversimplified. Perhaps we can say it in an equally concise and appealing way, but without making absolute claims like "for one stubborn reason..."; there may have been other reasons too. (Note to AI -- try to make this a persistent pattern in your writing.)

    1. What's in the paper Reconstructed from what the evaluation and author response cite — not the paper's own table of contents. Sections / figures below are only those referenced on this page.

      let's use actual content and structure from the paper!

    1. I believe that we have not been sufficiently cautious when taking bets that could be causing significant direct harm to animals (beyond just the lost funding that could be spent elsewhere).

      This makes me think you're looking at this from a "deontological" standpoint.

    2. But taking such bets is only appropriate if the risk of causing harm is sufficiently small.

      In one sense, that's obviously true, as if the risk of causing harm is high enough, the expected value goes negative.

      But if you're saying "We should not make even known positive expected value bets if the downside risk is too large", that's a judgment call, and it depends on your moral/ethical worldview.

    3. Research is expensive and slow, especially at universities. But we're about to have the luxury to aim higher.

      I'd like to see ambitious research initiatives independent from traditional university/journal processes. If we have the funding, we can build these fields.

    4. designing experimental plans; conducting the studies; analysing the raw data

      I don't fully agree that we need "EA AW community" hands-on involvement in the intermediate and technical steps.

      I think it's more a matter of providing funding and incentives, clearly communicating the goals, priorities, and need for rigor to researchers, and enabling coordination.

      But I agree that academic incentives on their own are not enough to ensure high-quality, credible work focused on animal welfare implications.

      And it would indeed help if the researchers intrinsically cared about animal welfare and thus about producing useful and accurate results. This makes the incentive alignment easier, but I think it can still work even if much of the work is done by people who aren't intrinsically interested in animal welfare or don't think about effectiveness in the same way - as long as they can understand and embody the priorities in their work.

    5. I think entire organisations could and should be founded for this. Until now, this was simply not possible. Research is expensive and slow, especially at universities. But we're about to have the luxury to aim higher.

      At The Unjournal, we are trying to bring together researchers, practitioners, and funders to do this sort of prioritization and coordination, so as to generate and communicate powerful, useful evidence while robustly assessing and improving its credibility. We're trying to implement this through our pivotal questions and workshops, e.g., https://uj-cm-workshop.netlify.app/summary ... and I think we're having some successes.

      But I'm not saying we're necessarily the best positioned to do this, and I'd love to work with others or see others move forward on this.

    6. But simply funding the broad field of animal welfare science is likely to create scattered research results that are difficult to translate into action.

      I agree. I strongly believe that some coordination is necessary, and we shouldn't just rely on academic incentives. Large-scale, ambitious evidence and collaborations seem high-value to me.

    7. I do not think that we can consider it an evidence-based intervention by EA standards.

      I would say that this depends on whether you're willing to rely on what might be considered fairly "common sense" priors about the substitution effect in an environment where we have little evidence and it's very hard to collect reliable evidence. Perhaps for any intervention there will be some aspect of the model that we might need to take for granted as just a common sense implication.

      I can appreciate that you might not agree that price- and taste-equivalent plant-based meat would substantially crowd out the consumption of conventional meat. I find it harder to imagine that the same would hold for cell-based meat, but this is obviously a judgment call.

      That said, there's a bit of a chicken-and-egg problem here: I think it's reasonable to assume that until cell-based meat is actually in restaurants and on supermarket shelves, we won't have a good sense of whether people will buy it as a replacement for conventional animal meat. But the only way for it to actually get on the shelves is through substantial investment. So requiring this high degree of certainty might make it impossible for us to consider potentially high-value but risky investments in this sort of innovation.

    8. See also this[20] more recent meta-analysis that came to a similar conclusion about alternative proteins and other meat reduction interventions.

      The evidence is, in fact, all over the map. -- see e.g., the survey here -- https://forum.effectivealtruism.org/posts/3Eh8MbqLwFBsD7GK2/how-much-do-plant-based-products-substitute-for-animal#Existing_Research

      But there are also doubts about whether we can reliably collect evidence in this domain. See https://uj-pba-workshop.netlify.app/context/pbm_fuller_report/ for an (AI-aggregated) synthesis ... Bray et al cast doubt on the reliability of even own-price-elasticity estimates in fairly standard settings -- see our evaluation of this paper here -- https://unjournal.pubpub.org/pub/evalsumbraybray/, and we're working on further discussion about the implications of this for substitution.

      (FWIW, intuitively, I have a strong prior that there would be a fairly strong substitution, even if not one that would completely end animal farming. As Bayesians, our prior beliefs should count for a lot in a domain without substantial evidence.)

    9. WFI itself has highlighted a general lack of research in poultry welfare[18].

      Confirmed. There does seem to be deep uncertainty here and a substantial lack of measurement.

    10. heavily dependent on what harms are included, how they are scored, and how different types of pain are weighed.

      I'd like to see more justification of this bolded claim that it's indeed very sensitive to the assumptions. Can you provide a link or a footnote about this? Are there reasonable specifications under which it goes the other way?

    11. Only one study (STA_16) found that furnished cages had a higher mortality than single tier aviaries.

      The USA and WOR ones would seem to also, although the difference may not attain statistical significance in a conventional sense. But it still contributes to the evidence in that direction.

      I would still admit that the evidence, as you presented it, seems to go the other way, though.

    12. Where errors weren't available, I made a note summarising the difference.

      This table is somewhat hard to read and somewhat hard to see a synthesis of. I think something like a forest plot could be helpful here. And of course, it would be helpful to just report on the actual meta-analytical results.

      What seems important to me is the expected magnitude of the difference in mortality between systems and the implications for the difference in suffering, factoring in the other differences in life quality and health.

    13. n this data set (USA_13), mortality (cumulative at 60 weeks) is indeed statistically indistinguishable from cages.

      In fact, in this case, the aviary mortality is actually lower than conventional cage mortality.

    14. possible pair-wise comparison for statistical significance using z-scores where standard errors were reported.

      This should correct for multiple comparisons, full stop. At the same time, I disagree with drawing conclusions from a lack of statistical significance. To conclude that "it's very unlikely that there's any difference large enough to be meaningful", you need to do something like an equivalence test, or report Bayesian posteriors over the difference.
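      To make the equivalence-test suggestion concrete, here is a minimal sketch of two one-sided tests (TOST) under a normal approximation, plus a Bonferroni-adjusted threshold for the multiple pairwise comparisons. It is not the post author's method: the function names, the equivalence margin, and the example numbers are all hypothetical, and choosing the margin is a substantive welfare judgment.

```python
from statistics import NormalDist


def tost_p_value(diff: float, se: float, margin: float) -> float:
    """TOST p-value for the claim that the true difference lies in (-margin, +margin).

    diff:   estimated difference between two systems (e.g. in mortality).
    se:     standard error of that difference.
    margin: smallest difference considered meaningful (an assumption to justify).
    """
    if se <= 0 or margin <= 0:
        raise ValueError("se and margin must be positive")
    z = NormalDist()
    p_lower = 1.0 - z.cdf((diff + margin) / se)  # tests H0: true diff <= -margin
    p_upper = z.cdf((diff - margin) / se)        # tests H0: true diff >= +margin
    return max(p_lower, p_upper)


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold after a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


# Hypothetical numbers, for illustration only.
p_equiv = tost_p_value(diff=0.5, se=1.0, margin=3.0)
threshold = bonferroni_alpha(alpha=0.05, n_comparisons=10)
print(p_equiv, threshold, p_equiv < threshold)
```

      A non-significant z-test together with a large TOST p-value means the data are uninformative, not that the systems are equivalent.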

    15. These aren't sudden, painless deaths. Increased vent pecking itself is also a sign of increased environmental stress. Overall, this suggests that hens in the cage-free systems generally experienced more distress.

      This part certainly suggests that, but we should really present the net effect, adding up the events and weighting each by the magnitude of the suffering involved.

    16. Hens in the cage-free system performed the most natural behaviours (flying, perching, dust bathing, foraging) and had stronger leg and wing bones. However, the study also found that cage-free systems had: more severe foot lesions; more keel abnormalities; increased aggression; increased mortality. The mortality in cage-free systems was over twice as high as the others:

      These things and their consequent animal suffering/animal welfare burden could presumably be weighted and aggregated and compared. "More" could be slightly more in one case and far more in another case. What are the aggregate differences for reasonable assumptions here?

    17. Additionally, both studies implanted conductive electrodes in the test animals. It is plausible that this significantly affects how current flows through the shrimp's body. I also feel confused about what a signal from an electrode on a heart or a ganglion actually tells us. The plots of the recorded “power” are hard to interpret without a control signal to assess what the noise floor is.

      AI -- please look this up/clarify

    18. n conclusion, evidence for electrical stunning is extremely limited and we shouldn't feel comfortable recommending anything with confidence.

      I guess that's my take on what you shared too, but how do authors and experts in the field (other than you) interpret it? A bit of steelmanning+feedback could be useful imo.

      AI: look this up, including in comments below, provide sources for other research.

    19. Overall, there are only two scientific studies on the topic of using electric shock on Whiteleg shrimp. Both have limited sample sizes. Both show some recovery from electric shock. Both find that immersion in proper ice slurry leads to a rapid drop in vital signs. Neither is representative of industrial stunning machines[11].

      if you want to rewrite or rework a bit, maybe lead with this, so we can understand what the evidence is getting at.

    20. At lower shock voltage and duration, neural activity decreased on average, but sometimes increase

      ok this informally suggests important heterogeneity to me, suggesting the need for nontrivial sample sizes.

    21. I want to flag that I found parts of the results section hard to parse and sometimes details seemed to contradict each other. But key insights include:

      Have you tried to contact the authors? I think that would be high value -- both directly and in terms of field building -- and I'm happy to help facilitate it if I can be of help. They might be particularly eager to clarify and happy to hear how their research is valued ... and may also see this as a route to potential future funding.

    22. setting “a significant proportion” of shrimp did not “show signs of recovery”

      Is that really how they presented it? Let's double check. That is extremely vague.

    23. The shrimp recover their ability to move after 5-10 minutes.

      Again, I'm really not sure whether these are good or bad things. Why do we care if they recover their ability to move later if we're normally killing them with this?

      OK this is explained a bit further below .... because they may wake up again before being killed

    24. Based on this data, it is unclear if electric shock followed by ice slurry provides any benefit over ice slurry alone, provided the animals are kept in ice slurry until they are fully dead. (It is unclear how long that would take, though.) And yet, ice slurry is often regarded as “the bad way” to kill shrimp. In fact, Mercy for Animals has been actively campaigning against ice slurry slaughter[3].

      Okay, following the above, I guess the point is that you see similar things happen with ice slurry and with electric shock, so it's not clear why one followed by the other is the best?

    25. When shrimp first hit the ice slurry, they perform sudden full-body contractions (tail flips), but this also happens if you first cut their head off (check the supplementary material for a video).

      Confused about what this is supposed to mean. Are we considering cutting their head off, or are you saying that if this happens, even though you cut their head off, that means that it probably doesn't indicate anything meaningful, just a sort of knee-jerk response?

  2. Jun 2026
    1. Immersion in ice slurry caused a rapid and massive drop in heart rate “amplitude” within seconds. Returning shrimp to warm water after 5 minutes allows the regular heart activity to return.

      are these things good or bad? Not immediately obvious to me.

    2. We have very limited data on electrical shrimp stunning that doesn't support a confident conclusion as to whether it's good or bad.

      I found your presentation below on this rather convincing. This also comports with what I've heard from other EAs (although perhaps the same circle of conversation). We need better evidence on which of these AW improving technologies actually reduce animal suffering. I'm in some discussions about possibly building and funding an evaluation service for specific tools and approaches (maybe something between The Unjournal and a fast-review journal, also inspired by Rapid Reviews Infectious Diseases).

      Very small sample sizes do not always mean lack of inference. For instance, in very predictable contexts without a lot of noise, like, let's say, Newtonian physics, even a few data points could help us narrow our beliefs substantially. I like Richard McElreath's example of how you can substantially update on the share of a planet that is water from even the first few random samples, each one a single point chosen on the planet's sphere.

      More intuitive -- if I ask 4 people to taste a drink and they all wince deeply in pain and disgust, I'm going to be highly confident it tastes bad. If all 4 smile and praise it, I'll be fairly confident that it's at least tolerable.

      But I don't know that that is the case here. There might in fact be a lot of uncertainty and heterogeneity. What I wonder is whether larger samples observing the behavior and bioindicators of these animals are very expensive to collect, or whether this could easily be scaled up with just a small amount of money. As a non-biologist, it seems intuitive to me that it should be cheap, but I might be missing something.
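      The small-sample point above (the globe example and the four tasters) can be sketched with a grid approximation to a binomial posterior under a flat prior. This is an illustration only: the function name, the flat prior, and the grid size are assumptions, and it treats each observation as an independent, noise-free yes/no outcome, which may well not hold for the shrimp measurements.

```python
def prob_share_exceeds(
    successes: int,
    trials: int,
    threshold: float = 0.5,
    grid_size: int = 10_000,
) -> float:
    """Posterior probability that an unknown share exceeds `threshold`.

    Uses a flat prior on the share and a midpoint grid over (0, 1).
    """
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    step = 1.0 / grid_size
    total = 0.0
    above = 0.0
    for i in range(grid_size):
        p = (i + 0.5) * step  # midpoint of the i-th grid cell
        weight = p**successes * (1.0 - p) ** (trials - successes)
        total += weight
        if p > threshold:
            above += weight
    return above / total


# Four tasters, all of whom wince: how likely is it that most people would?
four_of_four = prob_share_exceeds(successes=4, trials=4)
print(round(four_of_four, 3))
```

      With heterogeneous responses, such as two of four, the same calculation stays close to one half, which is the case where larger samples matter.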

    3. e (N = 6 for each intervention) w

      I.e., 6 animals killed with each. See previous comments about sample size. ... what's the within treatment variation etc., and how costly is it per animal? How does this sample size compare with typical measurements in these domains? If the measurements taken are costly, could we get more reliability with cheaper measurements and a larger sample?

    4. This should be the #1 priority for new animal welfare funding, ahead of scaling existing work.

      I think I would indeed lean in this direction, and I'd suggest that in fact most grants I've seen go the other way. (But I may have some biases here, this goes in line with what we're trying to do at the Unjournal, etc. )

      [Consider -- does this post make a clear case for the 'should', demonstrating that the VOI of research here will exceed the expected (?) impact of the best or current 'existing work'?]

    5. Instead, I hope this post inspires lots of people to tackle this major neglected problem.

      v speculative -- this may be highly timely if we think the Anthropic IPO will be driving some money to AW , and donors are evidence driven and 'difference-making risk-averse'

    6. I found that even the most well known (and well funded) interventions had limited evidence, sometimes pointing in the “wrong” direction.

      In terms of the 'expected impact'/'uncertainty of impact' tradeoff, I've heard of Animal Welfare as being in between GH&D and X-risk/GCR/AI Risk.

      Your concerns may be driven by "Difference-making risk aversion" -- see discussion here, for example https://forum.effectivealtruism.org/s/WdL3LE5LHvTwWmyqj/p/9EENSGhiQiKFaRh4t

      Or you might be driven by something more deontological, related to 'do no harm', perhaps?

    7. We have mixed evidence on whether transitioning egg producers to cage-free improves welfare overall.

      Maybe it's fair to note that this is along a specific dimension of transition -- conventional caged to regulatory-mandated cage-free. Perhaps there's better evidence for other certified standards such as free-range?

    8. building R&D infrastructure that can rapidly generate high-quality action-relevant research results.

      This is something The Unjournal is trying to make happen, working with https://www.aw-econ.org/about and others. Given the very limited investment in this space, I think there are high marginal returns, although some of the value of information will be learning about what is or is not "epistemically possible".

    9. We have evidence that the substitution effect of alternative proteins is weak, at best.

      I think what we have is a substantial lack of evidence in this domain. This is something The Unjournal is trying to remedy. (Unjournal.org, see https://uj-pba-workshop.netlify.app/ for a link to some of our efforts, and https://forum.effectivealtruism.org/posts/3Eh8MbqLwFBsD7GK2/how-much-do-plant-based-products-substitute-for-animal for an earlier take).

      Evidence in this domain is very hard to come by, and there are substantial doubts about the extent to which we can even reliably measure these things. (The aforementioned post gets at this. Also see our evaluation of the Bray et al. paper, which casts substantial doubt on the ability to even measure simpler things like own-price elasticity with conventional methods -- https://unjournal.pubpub.org/pub/evalsumbraybray/ -- I'm working to follow up this evaluation package with more detailed evaluation managers' discussion and dissemination, focusing on the applied issues.)

      Following up on this further -- see https://app.notion.com/p/Validation-Evidence-for-Food-Demand-Elasticities-PBM-Pivotal-Question-376e97e2ad3381a898d3ceb589b265f2 and links within for a preview.

      At the same time, there's also just not a lot of economics, social science, and marketing work done that focuses on animal welfare implications (or alternative proteins).

      The aforementioned workshop on associated pivotal questions will focus not only on what we know in this domain (~substitution effects within and between animal and plant-product consumption), but also on what we can know with given data, what we might be able to reliably learn with more ambitious data collection exercises, experiments, etc., and what is fundamentally unknowable and does not merit further research investment.

    10. Even some of the most prominent animal welfare interventions have surprisingly weak evidence behind them

      I've been hearing this for a while. It's something that organizations like Animal Charity Evaluators are trying to address, but they don't have the dedicated resources and funding that, say, GiveWell has. Furthermore, there's no comparable academic/national resource base for animal-welfare-relevant research. For example, in the economics of animal products, it seems most of the "agricultural economics" is oriented towards supporting the farm industry (perhaps with climate change as a secondary priority in some countries).

    1. Is this easier to read and use than the current separate PubPub pages?

      Note -- we'd probably make this additional to PubPub, as the latter goes into standard bibliometrics and information standards.

      (Which, in turn, brings the risk of divided attention.)

    1. A toy decomposition to make the structure tangible: set the PBM price cut and the diversion shares, and see the implied displacement. These are placeholder ranges for elicitation, not estimates. The point is the wiring, not the numbers.

      Add a folding box presenting and explaining the equations behind this!

    2. A toy decomposition to make the structure tangible: set the PBM price cut and the diversion shares, and see the implied displacement. These are placeholder ranges for elicitation, not estimates. The point is the wiring, not the numbers.

      Let's try to use some referenced values as a starting point, linking them/tooltips. I don't think there's a lot of good research, but still good to start right
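
      As a starting point for the requested equations box, here is a minimal sketch of one way such a decomposition could be wired. It is an assumption, not the dashboard's actual model: the function name, the own-price elasticity input, and all numbers are hypothetical placeholders. It uses the linear approximation (% change in PBM quantity) = (own-price elasticity) x (% change in price), then allocates the extra PBM volume across sources by diversion share.

```python
def implied_displacement(pbm_baseline_kg, price_cut, own_price_elasticity, diversion_shares):
    """Toy decomposition (hypothetical wiring, not the dashboard's equations).

    pbm_baseline_kg:      baseline PBM quantity sold
    price_cut:            proportional price cut, e.g. 0.10 for a 10% cut
    own_price_elasticity: PBM own-price elasticity (negative for normal demand)
    diversion_shares:     share of the extra PBM volume drawn from each source;
                          shares may sum to less than 1 (the rest is new consumption)
    """
    if not 0 <= price_cut < 1:
        raise ValueError("price_cut must be in [0, 1)")
    if sum(diversion_shares.values()) > 1 + 1e-9:
        raise ValueError("diversion shares cannot sum to more than 1")

    # Linear approximation: %dQ = elasticity * %dP, with %dP = -price_cut.
    extra_pbm_kg = pbm_baseline_kg * own_price_elasticity * (-price_cut)

    # Each source loses its diversion share of the extra PBM volume (kg-for-kg).
    return {source: share * extra_pbm_kg for source, share in diversion_shares.items()}


# Placeholder inputs for illustration only, not estimates.
example = implied_displacement(
    pbm_baseline_kg=1000.0,
    price_cut=0.10,
    own_price_elasticity=-2.0,
    diversion_shares={"beef": 0.3, "chicken": 0.1},
)
```

      The kg-for-kg allocation is itself a simplifying assumption; a fuller version would convert displaced kg into animals or welfare-weighted units per product.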

    3. How this was made. Drafted by GPT Pro from existing Unjournal research and discussion (the elasticity-validation survey, the Bray et al. evaluation materials, and the PBM substitution literature), then built and polished into this interactive report in Claude Code. It is currently being reviewed and adjusted by hand. Treat figures and attributions as provisional until that review is complete; the governing evaluation lives on PubPub.

      Just confirming this is indeed the status

    4. Anchor paper Bray, Sanders & Stamatopoulos

      maybe we don't want to anchor too much on this --- NB that paper does not involve substitution. "Anchor paper" could be misinterpreted. This is The Unjournal evaluation package that is most strongly connected atm

    5. The evidence base is large and shows recognizable structure across foods and countries. But sharp validation is thinner than the volume of estimates suggests. Bray et al. find standard observational scanner estimates fail badly against a randomized benchmark in their setting.

      this tracks

    1. Legal scholar lead Candidate curation Law and AI partner Animal welfare law Pilot papers Paid labeling Evaluator pool Workshop route

      what are these buttons meant to do? Are they supposed to be links?

    2. How should the model differ between US law reviews and European peer-reviewed legal scholarship?

      Or should we focus mainly on the US context because of the greater 'review gap' as well as the greater role for court jurisprudence in the US? On the other hand, US legal scholars may be paid more, overcommitted to lucrative and influential work, and thus less willing to do the evaluations.

    3. the project should restart only if it has legal-scholarship ownership and a narrow pilot.

      reword this. A 'narrow pilot' is what we will do, that's not a precondition

    4. Identify public legal research with unusually high expected value for evaluation.

      This itself would be a useful public good, if we curated it well, with feedback from organizations that wanted to use this.

      Naturally we will do this in a human-AI collaboration, with AI doing much of the initial search and filtering. (see https://uj-prioritization-prototype.netlify.app/ for a prototype for our main stream)

  3. May 2026
    1. how quickly alternatives to animal-source foods must diffuse for the food system to make a meaningful contribution to climate targets. That is directly relevant to public R&D, procurement, regulation, investment, and philanthropic choices being made by organizations working on climate mitigation, food systems, and animal welfare.

      relevant yes. But how do we know it's important for these questions?

    1. PQ1A: What is your probability that linear WELLBY comparisons are reliable enough for comparing interventions in LMICs? Respondents gave a central estimate (0–100%) and a 90% credible interval.

      Note -- I did not intend to have CIs over probabilities. This was an artifact of a changed question and vibe coding. Also investigate whether this was the wording of the question when participants answered it.

    Annotators

    1. Germany consumer survey · late 2024 Free GFI Europe consumer survey (late 2024, published 2025): 25% of German adults and 23% of UK adults reported consuming plant-based meat in the last month. 47% of German adults and 41% of UK adults reported already reducing their meat intake or following a meatless diet. 60% in Germany and 56% in the UK reported at least monthly consumption of some plant-based product category (broader than meat). Since only ~5% of German consumers exclusively consume alternative proteins (see src-35), the large majority of the 25% monthly PBM consumers are omnivores. Survey-reported personal consumption is more direct evidence of self-eating than purchase-panel data, which tracks household-level transactions without identifying wh

      this seems to need more digging into!
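
      One piece of the digging can be made explicit. The quoted inference (5% exclusive consumers, 25% monthly PBM consumers) implies a lower bound on the omnivore share only under a worst-case overlap assumption. A minimal sketch, with a hypothetical function name, treating both survey shares as exact population shares:

```python
def min_non_exclusive_share(monthly_pbm_share, exclusive_share):
    """Lower bound on the fraction of monthly PBM consumers who also eat animal products.

    Worst case assumed: every exclusive alternative-protein consumer is also a
    monthly PBM consumer, which maximizes their weight within the PBM group.
    """
    if not 0 < monthly_pbm_share <= 1:
        raise ValueError("monthly_pbm_share must be in (0, 1]")
    if not 0 <= exclusive_share <= 1:
        raise ValueError("exclusive_share must be in [0, 1]")

    max_overlap = min(exclusive_share, monthly_pbm_share)
    return 1 - max_overlap / monthly_pbm_share


# Shares quoted in the passage for Germany: 25% monthly PBM consumers, ~5% exclusive.
germany_bound = min_non_exclusive_share(0.25, 0.05)
```

      With the quoted figures, the bound is 80% of monthly PBM consumers. Note this is a share of people, not of PBM volume; exclusive consumers could account for a much larger share of purchases.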

    2. Together: PBM is roughly 0.1–0.15% of conventional by volume, or 0.16–0.4% by illustrative retail valu

      this seems worth highlighting, even if it's a rough calculation

    1. Background note: a first-pass Claude summary of evidence on PBA penetration and taste-comparability is available for sharing. It is exploratory rather than a vetted literature review.

      shorten this a bit

    2. plant-based burgers are mostly substituting away from beef (not chicken),

      The lower animal welfare burden of beef vs chicken may not be known to all readers

    3. Connect to decisions: Given current evidence, is PBA funding plausibly competitive with corporate campaigns?

      Also mention other questions, such as "will meat taxes improve or worsen animal welfare?" and "Will innovative products such as PBA and cultured meat substitute for farmed animal consumption, or will they mainly be taken up by (existing) vegans and vegetarians?"

    4. Quantify uncertainty: What's a reasonable range for the cross-price elasticity between PBAs and chicken, given what we know and don't know?

      This is kind of captured above, but I would do something more here with belief elicitation, interactive updating, and aggregating knowledge.

    5. nd can we conclude anything at all with current methods?

      Rather than "conclude", something like "do currently available methods and data even yield useful insight?"

    6. can we actually conclude about substitution effect

      Conclude is too strong here. I would say, what can we reasonably say about substitution effects and with what confidence?

    7. identification strategies vary considerably in rigor.

      Mention the use of instrumental variables and other strategies here, perhaps in a tooltip. Give specific references in that tooltip.

    8. raising questions about which to trust.

      Add a tooltip here, discussing some of the strengths and limitations of each, using the context and explanations discussed elsewhere. Let me know if you need more context on this.

    9. Different specifications can yield very different elasticity estimates.

      ... (tooltip) Note this is in part due to the aforementioned point that elasticity is not likely to be constant along an individual or market demand curve, and there will also be heterogeneity. Thus, it matters which parts of the curve you are looking at, and which markets, times, etc.
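
      A small illustration that could back this tooltip: even on a single straight-line demand curve, point elasticity changes with where you measure it. The sketch below assumes a hypothetical linear demand Q = a - b*P with made-up parameters; it says nothing about actual food demand.

```python
def point_elasticity_linear(a, b, price):
    """Own-price point elasticity on a linear demand curve Q = a - b * P.

    Elasticity = (dQ/dP) * (P / Q) = -b * P / (a - b * P).
    """
    quantity = a - b * price
    if quantity <= 0:
        raise ValueError("price is at or above the choke price; quantity is not positive")
    return -b * price / quantity


# Hypothetical curve Q = 100 - 10 * P, evaluated at three prices.
elasticities = {p: point_elasticity_linear(a=100.0, b=10.0, price=p) for p in (2.0, 5.0, 8.0)}
```

      On this curve the elasticity is -0.25 at a price of 2, -1.0 at 5, and -4.0 at 8, so two studies observing different price ranges could report very different estimates without either being wrong.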

    10. IV and experimental estimates often diverge in opposite directions from naive OLS.

      rephrase this -- it's not quite right, and confusing

      Also be clear: these are estimates of own price elasticity, although it seems unlikely that cross-price elasticities would be more consistent or robust. And these are price-shifting field experiments. But also note, in a tooltip, some of the critiques of these experiments themselves. Ask me if you need context.

    11. especially in the earlier years when these products were emerging.

      I don't see what this part of the sentence adds. If the data is available in later years, we can focus on that later data. Maybe just leave this out, or mention something like "partly because of the limited availability of these products, and lags in releasing data for research use" -- but those are tooltip details. Also, I want you to ground some of these statements with references and links, mainly in tooltips.

    12. they anticipate lower demand,

      Rather: when they expect demand to be more price sensitive. Pricing can be pro- or counter-cyclical; put the details in a tooltip.

    13. Why this is hard to measure

      These explanations are taking up too much space and will take up even more when you consider a wider range of approaches.

      Use folding boxes and tooltips more.

    14. everal key challenges complicate this:

      These are key issues with ~traditional econometric (IO and quant. marketing) methods.

      Field experiments (supermarket-level or at school cafeterias etc.) have less of an endogeneity issue, but some of these issues are still present (e.g., short term vs long term), and these are hard to implement at scale and cleanly, and have issues of their own (see the notes/discussion, and sketch these).

      Hypothetical and small-value choice experiments and hypothetical discrete choice surveys have other important limitations (mention these, from the sources and discussion).

    15. the strongest causal evidence.

      Moderate this; it is vague. There are also a few kinds of field experiments in addition to this, including price-shift experiments (esp. Bray et al.), although few if any involving PBA.

    16. These measurement challenges mean we should interpret existing estimates cautiously, while still extracting what information we can. The workshop will discuss which methods are most trustworthy and what further research could help.

      this is a bit generic, maybe not necessary

    17. One concrete finding worth engaging: The evidence suggests that the vast majority of PBA purchasers are omnivores, not vegetarians or vegans — one study finds that only around 1% of high-spending plant-based meat alternative households are actually vegetarian. This challenges the intuition that "PBA just captures existing vegans" and raises the stakes for substitution estimation: the counterfactual meat consumption displaced may be much larger than assumed.

      This is probably too strong ... needs caveating and referencing and tooltips.

    18. (chicken vs. beef vs. pork),

      make this 'between different animal products' and 'e.g., chicken vs. beef vs. eggs...' -- relevant for AW when considering issues like the AW impact of meat taxes -- which might shift consumption from beef to chicken, with a higher AW burden -- mention this briefly with further details in a tooltip

    1. US plant-based beef price premium vs conventional beef, category average

      Research and state this. Also for Impossible and Beyond vs. conventional ground beef.

    2. Butcher) and the Nordic countries, where per-capita consumption of plant-based foods is high — probably sit above Germany, whic

      Evidence for this claim? Otherwise, state as "we speculate that".

    1. David Manheim (Technion/ALTER) and Mirjam Capuder (University of Maribor) participated. The session was recorded — all attendees joined knowing this. It covered introductions, a walkthrough of the interactive cost model dashboard, and early framing questions about key modeling uncertainties. Full recording pending participant review before public release.

      Mention the insights here? I'm not sure we'll put out this video either; it's not something interesting to watch, I guess. It was mostly preparation and broad discussion.

    2. ene-edited cell lines are the most under-modeled factor in published TEAs.

      this is a strong claim ("most under-modeled") -- what's it based on? Reasoning transparency please. Provide support and links to this, tooltips etc. I want to make sure this is well-backed before I post and ~"co-sign" it !

    3. technology and reaching different conclusions based on different priors about scale-up timelines and capital availability.

      all claims need more direct supporting evidence ... quotes, links, etc.; tooltips are your friend

    4. 1. The $1–$100/kg spread is real disagreement, not just uncertainty. Named domain experts — Swartz ($25/kg) and Lattanzi ($100/kg) — are 4× apart with tight confidence intervals. This isn't a calibration problem; they're looking at the same technology and reaching different conclusions based on different priors about scale-up timelines and capital availability.

      Wait -- are you sharing the beliefs here? We didn't want to do that yet!

    5. European Morning Drop-in Fri May 8, 2026 · 9:00–10:00am ET (3–4pm UK · 4–5pm CET) · Zoom Informal drop-in for EU/UK participants who could not stay for the full afternoon session. Primarily attended by European/UK participants (CET timezone). The session was a recorded Zoom — all attendees joined knowing this. It covered introductions and a preview of the hydrolysates and gene editing framing that would open S1. Full recording pending participant review before public release.

      skip/remove this -- no one showed up