464 Matching Annotations
  1. Mar 2024
    1. A nice and easy way to report results of an ANOVA in R is with the report() function from the {report} package:
  2. Feb 2024
    1. Those times are better captured in the ten volumes, 414,825 entries, and 1,827,306 quotations that were finally published in 1928.

      The first edition of the Oxford English Dictionary was published in 1928 in 10 volumes containing 414,825 entries and 1,827,306 quotations.

    2. Surprisingly, the American author who is quoted most in the OED is not Mark Twain or Emily Dickinson or Edgar Allan Poe, but rather Edward H. Knight, a patent lawyer and expert in mechanics who wrote the American Mechanical Dictionary and The Practical Dictionary of Mechanics. Knight is the seventy-fourth-most cited author in the Dictionary, quoted more frequently than Percy Bysshe Shelley, George Eliot or Ralph Waldo Emerson (who comes in at 116, the next-most quoted American).
    3. Some Americans did write directly to Murray, and these – 196 of them – are the ones underlined in the address books. They represent 10 percent of all the Dictionary People with addresses and produced a total of 238,080 slips that crossed the ocean before coming to rest on Murray’s desk in the Scriptorium.
    4. Ranking below Thomas Austin, who sent in 165,061 slips, and William Douglas, who sent in 151,982, there is a big drop to the third-highest contributor Dr Thomas Nadauld Brushfield, who sent in 70,277 slips.

      repetition here from before to introduce mental health...

    5. And yet he desperately needed the help of Subeditors because the task was too massive to do alone. Two years into the job, Murray had estimated that he had sent out 817,625 blank slips to Readers. If they returned them with quotations, and if he spent a minimum of 30 seconds reading each one and allocating it to the correct sense of an entry, it would take him three working years to get through a third of the materials gathered.

      By the second year into his editing work on the OED, James Murray estimated that he had sent out 817,625 slips to readers.

      At the average price of $0.025 for bulk index cards in 2023, this would have cost $20,440, so one must wonder at the cost of having done it. How much would this have been in March 1879 when Murray took over editorship?

      How many went out in total? Who cut them all? Surely mass manufacture didn't exist at the time for them?

      Sending them out would have helped to ensure a reasonable facsimile of having cards of equal size coming back.
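
      The arithmetic behind the figures above can be sketched as follows. This is only a back-of-the-envelope check: the slip count and the 30-second minimum come from the quoted passage, the $0.025 unit price is the 2023 estimate from the note, and the function names are my own.

```python
# Figures taken from the quoted passage and the note above.
BLANK_SLIPS = 817_625        # blank slips Murray estimated he had sent out
SECONDS_PER_SLIP = 30        # minimum reading time per returned slip
PRICE_PER_CARD_USD = 0.025   # 2023 bulk index card price assumed in the note


def reading_hours(slips: int, seconds_each: int) -> float:
    """Hours needed to read `slips` slips at `seconds_each` seconds apiece."""
    return slips * seconds_each / 3600


def card_cost(slips: int, unit_price: float) -> float:
    """Total cost of `slips` cards at `unit_price` each."""
    return slips * unit_price


hours = reading_hours(BLANK_SLIPS, SECONDS_PER_SLIP)
cost = card_cost(BLANK_SLIPS, PRICE_PER_CARD_USD)
print(f"{hours:,.1f} hours of reading, about ${cost:,.0f} in cards")
```

      If every blank slip had come back filled in, reading them at the minimum pace would take roughly 6,800 hours, which gives a sense of why Subeditors were needed.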

    6. Murray received a poignant letter in 1906 from the wife of William Sykes of South Devon who had been a one-time assistant, and faithful Reader and Specialist for twenty-two years, sending in a total of 16,048 slips: ‘My dear husband died last Friday, the day he received your letter, he was able to read it, and wrote your name in one of the books I am going to send you eight hours before he died. It took him an hour to write it, but he made up his mind to do it, and did. The last words he ever wrote were to you.’ A poignant last line from the impoverished widow reads, ‘I shall send the books when the probate duty has been paid.’

      William Sykes: 16,048 slips over 22 years (approximately 2 notes per day)
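
      A quick check of the per-day rate, assuming 365-day years with no days off (my assumption, not something the source states):

```python
SLIPS = 16_048        # total slips William Sykes sent in
YEARS = 22            # years he contributed
DAYS_PER_YEAR = 365   # assumption: leap days ignored, every day counted

slips_per_day = SLIPS / (YEARS * DAYS_PER_YEAR)
print(f"{slips_per_day:.2f} slips per day")
```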

    7. From the moment in March 1879 when Murray signed the contract with Oxford University Press to be the next Editor of the Dictionary, and he took possession of 2 tons of slips at his house, his family was immediately part of the project (whether they liked it or not) sorting out the slips. Their house was a workplace and the family a workforce.

      Perhaps one of the first sources of counting slips in weight rather than number!

    8. The most prolific Reader in Europe – we might call him a ‘super-contributor’ – was Hartwig Helwich, a professor at the University of Vienna who wrote out the entire Cursor Mundi onto 46,599 slips. His efforts made the medieval poem the second-most-frequently cited work in the Dictionary after the Bible (though in the current OED, it has dropped to eleventh in the top sources).

      This practice of writing out everything onto slips sounds like that used later (double check the timing) by the Thesaurus Linguae Latinae in creating their slip corpus for later work.

  3. Nov 2023
    1. Get it right and we will see a lot less of our precious minerals, metals and resources dumped into landfill

      This line stood out to me in this article because it is hard to hear, but also very true to the world we live in. We toss things out the moment we no longer view them as valuable, but we don't toss things that are "precious". For example, we buy a new iPhone and value it highly, but a year later a new iPhone comes out and the old one gets tossed away as if it were worthless. Instead of just tossing things like this, we need to be more proactive in recycling valuable and hard-to-replace resources that one day we may not have.

    1. the suicide is up by 30% depression rates are skyrocketing 36% of Americans report feeling lonely frequently 45% of teenagers say they feel despondent and hopeless most of the time the number of people who have no who say they have no close personal friends has gone up by four times 36% more Americans are not in a romantic relationship uh the number of people Americans who rate themselves in the lowest happiness category has gone up by 50%
      • for: statistics - United States happiness indicators

      • statistics: United States happiness indicators

        • suicide is up by 30%
        • depression is skyrocketing
        • 36% of Americans report feeling lonely frequently
        • 45% of teenagers say they feel despondent and hopeless most of the time
        • the number of people who say they have no close personal friends has gone up by four times
        • 36% more Americans are not in a romantic relationship
        • the number of Americans who rate themselves in the lowest happiness category has gone up by 50%
  4. Oct 2023
  5. Jul 2023
    1. weakly informative approach to Bayesian analysis

      In [[Richard McElreath]]'s [[Statistical Rethinking]], he defines [[weakly informative priors]] (aka [[regularizing priors]]) as

      priors that gently nudge the machine [which] usually improve inference. Such priors are sometimes called regularizing or weakly informative priors. They are so useful that non-Bayesian statistical procedures have adopted a mathematically equivalent approach, [[penalized likelihood]]. (p. 35, 1st ed.)

    1. Science is not described by the falsification standard, as Popper recognized and argued.4 In fact, deductive falsification is impossible in nearly every scientific context. In this section, I review two reasons for this impossibility. (1) Hypotheses are not models. The relations among hypotheses and different kinds of models are complex. Many models correspond to the same hypothesis, and many hypotheses correspond to a single model. This makes strict falsification impossible. (2) Measurement matters. Even when we think the data falsify a model, another observer will debate our methods and measures. They don’t trust the data. Sometimes they are right. For both of these reasons, deductive falsification never works. The scientific method cannot be reduced to a statistical procedure, and so our statistical methods should not pretend.

      Seems consistent with how Popper used the terms [[falsification]] and [[falsifiability]] noted here

    2. So where do priors come from? They are engineering assumptions, chosen to help the machine learn. The flat prior in Figure 2.5 is very common, but it is hardly ever the best prior. You’ll see later in the book that priors that gently nudge the machine usually improve inference. Such priors are sometimes called regularizing or weakly informative priors. They are so useful that non-Bayesian statistical procedures have adopted a mathematically equivalent approach, penalized likelihood. These priors are conservative, in that they tend to guard against inferring strong associations between variables.

      p. 35 where [[Richard McElreath]] defines [[weakly informative priors]] aka [[regularizing priors]] in [[Bayesian statistics]]. Notes that non-Bayesian methods have a mathematically equivalent approach called [[penalized likelihood]].

    3. one assumes the population size and structure have been constant long enough for the distribution of alleles to reach a steady state

      The population size & structure being "constant" is what [[Richard McElreath]] means by "equilibrium" in \(\text{P}_{0\text{A}}\) (process model zero-A), which corresponds to the null hypothesis

      \(\text{H}_0: \text{``Evolution is neutral"}\)

    4. Andrew Gelman’s

      Per Andrew Gelman's wiki:

      Andrew Eric Gelman (born February 11, 1965) is an American statistician and professor of statistics and political science at Columbia University.

      Gelman received bachelor of science degrees in mathematics and in physics from MIT, where he was a National Merit Scholar, in 1986. He then received a master of science in 1987 and a doctor of philosophy in 1990, both in statistics from Harvard University, under the supervision of Donald Rubin.[1][2][3]

    1. The global illicit trade in wildlife may be worth up to $20,000,000,000 annually and the value of legal wildlife trade in the United States was recently estimated at $2,800,000,000 annually.

      What is the source of this figure?

  6. May 2023
    1. Minimum sample size for external validation of a clinical prediction model with a binary outcome

      Minimum sample size for external validation of a clinical prediction model with a binary outcome

  7. Apr 2023
    1. people with poor family relationships and no close friends “are ten times more likely to suffer from significant mental health challenges” compared to those with many close family bonds and friendships.
  8. Mar 2023
    1. Basic statistics regarding the TLL:
      • ancient Latin vocabulary words: ca. 55,000 words
      • 10,000,000 slips
      • ca. 6,500 boxes
      • ca. 1,500 slips per box
      • library: 32,000 volumes
      • contributors: 375 scholars from 20 different countries
      • 12 Indo-European specialists
      • 8 Romance specialists
      • 100 proof-readers
      • ca. 44,000 words published
      • published content: 70% of the entire vocabulary
      • print run: 1,350
      • publisher: consortium of 35 academies from 27 countries on 5 continents

      Longest remaining words:
      • non / 37 boxes of ca. 55,500 slips
      • qui, quae, quod / 65 boxes of ca. 96,000 slips
      • sum, esse, fui / 54.5 boxes of ca. 81,750 slips
      • ut / 35 boxes of ca. 52,500 slips

      Note that some of these words have individual zettelkasten for themselves approaching the size of some of the largest personal collections we know about!

      [18:51]
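
      The per-word slip counts above appear to be derived from the box counts at ca. 1,500 slips per box. A minimal sketch of that arithmetic, using only the figures listed in this note (the variable and function names are mine):

```python
SLIPS_PER_BOX = 1_500      # "ca. 1,500 slips per box" from the list above
TOTAL_SLIPS = 10_000_000   # total slips reported for the TLL
TOTAL_BOXES = 6_500        # "ca. 6,500 boxes"

# word -> (boxes, slip count as listed in the note)
LONGEST_WORDS = {
    "non": (37, 55_500),
    "qui, quae, quod": (65, 96_000),
    "sum, esse, fui": (54.5, 81_750),
    "ut": (35, 52_500),
}


def estimated_slips(boxes: float) -> float:
    """Slip estimate implied by the box count at SLIPS_PER_BOX per box."""
    return boxes * SLIPS_PER_BOX


# Difference between the box-based estimate and the listed figure.
differences = {
    word: estimated_slips(boxes) - listed
    for word, (boxes, listed) in LONGEST_WORDS.items()
}
average_per_box = TOTAL_SLIPS / TOTAL_BOXES

for word, diff in differences.items():
    print(f"{word}: estimate minus listed = {diff:,.0f}")
print(f"overall average: {average_per_box:,.0f} slips per box")
```

      Three of the four listed counts match the box-based estimate exactly. The figure for qui, quae, quod is 1,500 below 65 × 1,500, which may simply reflect rounding in the "ca." figures.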

    1. Statistics collected in hundreds of cities in the United States show that between a third and a half of the school children fail to progress through the grades at the expected rate; that from 10 to 15 per cent are retarded two years or more; and that from 5 to 8 per cent are retarded at least three years. More than 10 per cent of the $400,000,000 annually expended in the United States for school instruction is devoted to re-teaching children what they have already been taught but have failed to learn.

      I think this information is interesting because we are being told that between a third and a half of school children fail to progress through the grades at the expected rate. I think we need to incorporate different learning styles, because an individual may not understand a concept the way it is being taught. People learn in different ways, such as hands-on, auditory, and visual learning. I think one reason more than 10% of the $400,000,000 goes to re-teaching children what they have already been taught is that there may be something up ahead in their learning that they will need to understand. I have been retaught certain things when I moved up to the next grade level, and I think it was to help refresh my memory. Another reason is that the students didn't understand the concept and need to be retaught so they can use it in the future.

  9. Jan 2023
  10. Dec 2022
    1. are already done: \(E[S] = E[(y - \hat{y})^2] = (y - E[\hat{y}])^2 + E[(E[\hat{y}] - \hat{y})^2] = [\text{Bias}]^2 + \text{Variance}\)

      Bias-variance decomposition: if the two terms are not clearly defined, they are easy to confuse.
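
      To keep the two terms apart, here is a small numeric sketch of the identity E[(y − ŷ)²] = (y − E[ŷ])² + E[(E[ŷ] − ŷ)²] for a fixed target y. The predictions are made-up values standing in for models trained on different training sets, and the expectation is taken as a plain average over them.

```python
from statistics import fmean, pvariance


def decompose(y: float, predictions: list[float]) -> tuple[float, float, float]:
    """Return (expected squared error, squared bias, variance) for a fixed target y."""
    mean_pred = fmean(predictions)                     # E[y_hat]
    mse = fmean((y - p) ** 2 for p in predictions)     # E[(y - y_hat)^2]
    bias_sq = (y - mean_pred) ** 2                     # (y - E[y_hat])^2
    variance = pvariance(predictions, mu=mean_pred)    # E[(E[y_hat] - y_hat)^2]
    return mse, bias_sq, variance


# Hypothetical predictions of the same point from five different training sets.
y_true = 3.0
preds = [2.0, 2.5, 3.5, 4.0, 4.5]

mse, bias_sq, variance = decompose(y_true, preds)
print(f"MSE={mse:.2f}  Bias^2={bias_sq:.2f}  Variance={variance:.2f}")
```

      Bias measures how far the average prediction sits from the target; variance measures how much individual predictions scatter around that average. The two always sum to the expected squared error in this noise-free setting.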

    1. Aleatoric music (also aleatory music or chance music; from the Latin word alea, meaning "dice") is music in which some element of the composition is left to chance, and/or some primary element of a composed work's realization is left to the determination of its performer(s). The term is most often associated with procedures in which the chance element involves a relatively limited number of possibilities.
    1. formula

      Consider that:

      1. Sine and cosine are orthogonal to each other.
      2. Hence, you can rewrite -sin(Θ) = cos(Θ + π/2) and cos(Θ) = sin(Θ + π/2).
      3. Therefore, the angle between the standard basis vectors, and their orientation, are preserved!
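
      A numeric sketch of this argument, assuming the formula in question is the 2-D rotation matrix (the quoted text does not say so explicitly). The columns are written with the phase-shifted forms from step 2, and the angle 0.7 rad is arbitrary.

```python
import math


def rotation_columns(theta: float):
    """Images of the standard basis vectors e1 and e2 under rotation by theta."""
    first = (math.cos(theta), math.sin(theta))
    # Same expression shifted by a quarter turn: equals (-sin(theta), cos(theta)).
    second = (math.cos(theta + math.pi / 2), math.sin(theta + math.pi / 2))
    return first, second


def dot(u, v) -> float:
    """Dot product of two 2-D vectors."""
    return u[0] * v[0] + u[1] * v[1]


def cross(u, v) -> float:
    """Signed area spanned by u and v; positive means orientation is preserved."""
    return u[0] * v[1] - u[1] * v[0]


theta = 0.7  # arbitrary angle in radians
c1, c2 = rotation_columns(theta)

print("dot (0 means still perpendicular):", round(dot(c1, c2), 12))
print("cross (+1 means same orientation):", round(cross(c1, c2), 12))
```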

  11. Nov 2022
    1. the positivist paradigm

      Based in the belief that there is a tangible reality. Everything can be understood, measured, and/or identified. In this paradigm the researcher should focus on facts and they should look for fundamental laws.

  12. Sep 2022
    1. Sverigedemokraterna

      Number of parliamentary candidates that are connected to criminal biker gangs. This is only for the parliamentary election in 2022.

      Sverigedemokraterna make up 59% of the list; see the table on page 16 of this report.

    2. De politiska partierna

      Number of parliamentary candidates that were connected to criminal biker gangs in the last five elections.

      Sverigedemokraterna make up 58% of the list.

  13. Aug 2022
    1. ReconfigBehSci. (2021, December 9). a rather worrying development- a (local) newspaper “fact checking” the new German health minister simply by interviewing a virologist who happens to have a different view. There’s simply no established “fact” as to the severity of omicron in children at this point in time [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1469037817481334786

    1. we launched a service that’s now used by over a million people around the world who have made nearly 40 million annotations. In higher education, more than 1,200 colleges and universities use Hypothesis. And we’ve grown from a handful of people into a team of more than 35 passionate web builders.

      h. in 2022 has over 1 million users, who made nearly 40 million annotations. Early this year, 2 million annotated articles/sites were reached (2175298 is the number the API returns today). This sounds like a lot, but on its face it works out to an average of 40 annotations on 2 articles per user. This suggests to me the mode is 1 annotation on 1 article per user. How many of those 1 million were active last week / month?

    1. Otto Karl Wilhelm Neurath (German: [ˈnɔʏʀaːt]; 10 December 1882 – 22 December 1945) was an Austrian-born philosopher of science, sociologist, and political economist. He was also the inventor of the ISOTYPE method of pictorial statistics and an innovator in museum practice. Before he fled his native country in 1934, Neurath was one of the leading figures of the Vienna Circle.
  14. Jun 2022
    1. First, the so-called normal distribution of statistics assumes that there are default humans who serve as the standard that the rest of us can be accurately measured against.

      "so-called"?! wow! This is a massively divergent viewpoint.

  15. May 2022
    1. The highlights you made in FreeTime are preserved in My Clippings.txt, but you can’t see them on the Kindle unless you are in FreeTime mode. Progress between FreeTime and regular mode are tracked separately, too. I now pretty much only use my Kindle in FreeTime mode so that my reading statistics are tracked. If you are a data nerd and want to crunch the data on your own, it is stored in a SQLite file on your device under system > freetime > freetime.db.

      FreeTime mode on the Amazon Kindle will provide you with reading statistics. You can find the raw data as an SQLite file under system > freetime > freetime.db.
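
      As a first step in crunching that data, one could list the tables in the file. This is a minimal sketch using Python's standard sqlite3 module; the schema of freetime.db is not described in the source, so the code only queries SQLite's built-in sqlite_master catalog, and the mount point in the usage comment is hypothetical.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Return the table names stored in an SQLite file, sorted alphabetically."""
    # Open read-only via a file: URI so the database is never modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Usage (hypothetical mount point; adjust to wherever the Kindle is mounted):
# print(list_tables("/media/kindle/system/freetime/freetime.db"))
```

      Copying the file off the device before opening it is a sensible extra precaution.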

    1. According to a Pew study from last year, only 20 percent of K-12 students in America study a foreign language (compared with an average of 92 percent in Europe), and only 10 states and the District of Columbia make foreign-language learning a high school graduation requirement.

      use of statistics

    2. According to the Modern Language Association, enrollment in college-level foreign-language courses dropped 9.2 percent from 2013 to 2016.

      Use of statistics

  16. Apr 2022
    1. ReconfigBehSci. (2022, January 24). @STWorg @FraserNelson @GrahamMedley no worse- he took Medley’s comment that Sage model the scenarios the government asks them to consider to mean that they basically set out to find the justification for what the government already wanted to do. Complete failure to distinguish between inputs and outputs of a model [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1485625862645075970

    1. the Institute of Medicine had released a landmark report on patientsafety, To Err Is Human. The report found that as many as 98,000 Americanswere dying each year as a result of preventable medical errors occurring inhospitals—more people than succumbed to car accidents, workplace injuries, orbreast cancer. And some significant portion of these deaths involved mistakes inthe dispensing of drugs.

      Some might see the 98,000 preventable medical error deaths reported by the Institute of Medicine in To Err Is Human (1999) now and laugh at the farcical number of deaths due to coronavirus since 2020, a large proportion of which could have been prevented by better communication and coordination?

      What if a more pragmatic anthropological viewpoint could be given to the current fractured state of American politics? If anthropologists are taught not to make value judgements on the way other cultures have come to live their lives, but simply to appreciate and report on them accurately, then perhaps we should leave those on the far right who believe in top down, patriarchal rule to their devices?

      What if we nudged (forced) them all to actually live by their own rules by enforcing them to the nth degree? Republican politicians can only get away with badmouthing abortion or homophobic viewpoints because their feet are not held to the fire when those issues impinge upon their own families or even themselves. They have the wealth and the power to flout the laws and not face the direct consequences personally. Would their tunes change if forced by their own top down patriarchal perspectives applying to them?

    1. ReconfigBehSci. (2021, February 1). @MaartenvSmeden @richarddmorey 2/2 Having conducted experiments on lay understanding of arguments from ignorance, in my experience, people intuitively understand probabilistic impact of factors, such as quality of search, that moderate strength. Rather than build on that, we work against it with slogan! [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1356228495714746370

    1. ReconfigBehSci. (2021, February 1). @islaut1 @richarddmorey I think diff. Is that your first response seemed to indicate the evidence was the search itself (contra Richard) so turning an inference from absence of something into a kind of positive evidence ('the search’). Let’s call absence of evidence “not E”. 1/2 [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1356215051238191104

    1. ReconfigBehSci. (2021, February 2). @MichaelPaulEdw1 @islaut1 @ToddHorowitz3 @richarddmorey @MaartenvSmeden as I just said to @islaut1 if you want to force the logical contradiction you move away entirely from all of the interesting cases of inference from absence in everyday life, including the interesting statistical cases of, for example, null findings—So I think we now agree? [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1356530759016792064

    1. ReconfigBehSci. (2021, February 1). @islaut1 @richarddmorey I think of strength of inference resting on P(not E|not H) (for coronavirus case). Search determines the conditional probability (and by total probability of course prob of evidence) but it isn’t itself the evidence. So, was siding with R. against what I thought you meant ;-) [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1356216290847944706

    1. ReconfigBehSci. (2021, February 1). @MaartenvSmeden @richarddmorey you absolutely did (and I would have been disappointed if you hadn’t ;-)! It was a general comment prompted by the fact that the title of the article you linked to doesn’t (as is widespread), and I actually genuinely think this is part of the “problem” in pedagogical terms. 1/2 [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1356227423067664384

    1. Maarten van Smeden. (2021, February 1). Personal top 10 fallacies and paradoxes in statistics 1. Absence of evidence fallacy 2. Ecological fallacy 3. Stein’s paradox 4. Lord’s paradox 5. Simpson’s paradox 6. Berkson’s paradox 7. Prosecutors fallacy 8. Gambler’s fallacy 9. Lindsey’s paradox 10. Low birthweight paradox [Tweet]. @MaartenvSmeden. https://twitter.com/MaartenvSmeden/status/1356147552362639366

    1. ReconfigBehSci. (2021, February 2). @MichaelPaulEdw1 @islaut1 @ToddHorowitz3 @richarddmorey as this account is focussed on COVID, maybe time to move the discussion elsewhere- happy to discuss further if you want to get in touch by email—U.hahn" "https://t.co/HOGwHragEb [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1356529368630239232

    1. ReconfigBehSci on Twitter: “@MichaelPaulEdw1 @islaut1 @ToddHorowitz3 @richarddmorey @MaartenvSmeden and not just misguided (as too simplistic) but part of the problem....” / Twitter. (n.d.). Retrieved February 24, 2021, from https://twitter.com/SciBeh/status/1356528429211021319

    1. Youyang Gu. (2021, May 25). Is containing COVID-19 a requirement for preserving the economy? My analysis suggests: Probably not. In the US, there is no correlation between Covid deaths & changes in unemployment rates. However, blue states are much more likely to have higher increases in unemployment. 🧵 https://t.co/JrikBtawEb [Tweet]. @youyanggu. https://twitter.com/youyanggu/status/1397230156301930497

    1. Denise Dewald, MD 🗽. (2021, August 12). Here are some modeling predictions for the delta variant from COVSIM (group at North Carolina State): PLEASE CHECK THIS OUT - RESOURCES TO SHARE WITH YOUR SCHOOL DISTRICT School-level COVID-19 Modeling Results for North Carolina for #DeltaVariant https://t.co/zU5hB9bKlY [Tweet]. @denise_dewald. https://twitter.com/denise_dewald/status/1425626289399009288

    1. A New York Times article uses the same temperature dataset you have been using to investigate the distribution of temperatures and temperature variability over time. Read through the article, paying close attention to the descriptions of the temperature distributions.

      Unfortunately, like most NYT content, this article is behind a paywall. I'm partly reading this as I plan to develop a set of open education resources myself and the problem of how to manage dead/unavailable links looks like a key stumbling block.

    1. Tyler Black, MD. (2021, December 10). Statistics Canada has been asking kids about mental health during the pandemic. Initially, after the first 5 months (with school shutdowns, summer break, lots of restrictions), more kids said they were better than worse, most reported no change. 86% “No change or better” [/1] https://t.co/3shKtrxEVU [Tweet]. @tylerblack32. https://twitter.com/tylerblack32/status/1469380405451100162

  17. Mar 2022
    1. In 1925, Ronald Fisher advanced the idea of statistical hypothesis testing, which he called "tests of significance", in his publication Statistical Methods for Research Workers.[28][29][30] Fisher suggested a probability of one in twenty (0.05) as a convenient cutoff level to reject the null hypothesis.[31] In a 1933 paper, Jerzy Neyman and Egon Pearson called this cutoff the significance level, which they named α. They recommended that α be set ahead of time, prior to any data collection.[31][32] Despite his initial suggestion of 0.05 as a significance level, Fisher did not intend this cutoff value to be fixed. In his 1956 publication Statistical Methods and Scientific Inference, he recommended that significance levels be set according to specific circumstances.[31]

      The lofty p=0.05 is utter bullshit. It was just an arbitrary, made-up value with no real evidence behind it.

    1. only 2.5 sigma

      That's 99.38% chance of being correct, yet that's considered "weak". Would that we could do that in medicine or the social sciences.
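
      The 99.38% figure matches the one-sided standard normal probability at 2.5 sigma. A small sketch of where it comes from, using only the standard library (the two-sided value is added for comparison and is not in the note above):

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


one_sided = normal_cdf(2.5)                 # P(Z < 2.5)
two_sided = 1.0 - 2.0 * (1.0 - one_sided)   # P(|Z| < 2.5)

print(f"one-sided: {one_sided:.4%}")
print(f"two-sided: {two_sided:.4%}")
```

      Strictly speaking, this is the probability that a standard normal draw falls below 2.5 sigma, which is not quite the same as the probability that a finding is correct.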

  18. Feb 2022
  19. Jan 2022
    1. Consider, as well, the extent to which the tools of abstraction are themselves tied up in the history of the trans-Atlantic slave trade. As the historian Jennifer L. Morgan notes in “Reckoning With Slavery: Gender, Kinship, and Capitalism in the Early Black Atlantic,” the fathers of modern demography, the 17th-century English writers and mathematicians William Petty and John Graunt, were “thinking through problems of population and mobility at precisely the moment when England had solidified its commitment to the slave trade.”Their questions were ones of statecraft: How could England increase its wealth? How could it handle its surplus population? And what would it do with “excessive populations that did not consume” in the formal market? Petty was concerned with Ireland — Britain’s first colony, of sorts — and the Irish. He thought that if they could be forcibly transferred to England, then they could, in Morgan’s words, become “something valuable because of their ability to augment the population and labor power of the English.”This conceptual breakthrough, Morgan told me in an interview, cannot be disentangled from the slave trade. The English, she said, “are learning to think about people as ‘abstractable.’

      This deserves to be delved into more deeply. This sounds like a bizarre stop on the creation of institutional racism.

      How do these sorts of abstraction hurt the move towards equality?

    1. Poland’s Jewish population numbered well over three million before the war. At the end of the war, some 380,000 Polish Jews had survived, most of whom had fled in 1939 into the former Soviet Union. By 1950, only about 45,000 Jews were left living in Poland.
    1. An over-reliance on numbers often leads to bias and discrimination.

      By their nature, numbers can create an air of objectivity which doesn't really exist and may be hidden by the cultural context one is working within. Be careful not to create an over-reliance on numbers. Particularly in social and political situations this reliance on numbers and related statistics can create dramatically increased bias and discrimination. Numbers may create a part of the picture, but what is being left out or not measured? Do the numbers you have with respect to your area really tell the whole story?