 Jan 2023

grist.org grist.org

Around 40 million homes (or roughly 35 percent of all U.S. houses) use a gas stove to make food.
This is less than I thought it would be.


s4be.cochrane.org s4be.cochrane.org

High-level view of the 3 different types of heterogeneity (clinical, methodological, statistical). I used these definitions as the basis for some Anki cards.

 Dec 2022

projects.iq.harvard.edu projects.iq.harvard.edu

I came here to get the handout for Markov chains mentioned in Lecture 31: Markov Chains | Statistics 110. The lectures give great intuition behind the equations, their motivation, and their limitations.
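A minimal sketch (mine, not from the handout) of the kind of computation those lectures build up to: iterate a made-up two-state transition matrix until the distribution stops changing, which lands on the stationary distribution.

```python
# Toy 2-state Markov chain (rainy/sunny); the transition probabilities
# are invented for illustration.
P = [[0.7, 0.3],   # P(rainy -> rainy), P(rainy -> sunny)
     [0.4, 0.6]]   # P(sunny -> rainy), P(sunny -> sunny)

def step(dist, P):
    """One step of the chain: multiply the row vector dist by P."""
    return [sum(dist[i] * P[i][j] for i in range(len(P)))
            for j in range(len(P[0]))]

# Iterate until the distribution stabilizes: the stationary distribution.
dist = [1.0, 0.0]  # start in the rainy state
for _ in range(100):
    dist = step(dist, P)

print(dist)  # converges to the stationary distribution (4/7, 3/7)
```

Solving \(\pi = \pi P\) by hand gives \(\pi = (4/7, 3/7)\), matching the iteration.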

 Nov 2022

www.cs.ucr.edu www.cs.ucr.edu

Dr. Miho Ohsaki re-examined work she and her group had previously published and confirmed that the results are indeed meaningless in the sense described in this work (Ohsaki et al., 2002). She has subsequently been able to redefine the clustering subroutine in her work to allow more meaningful pattern discovery (Ohsaki et al., 2003)
Look into what Dr. Miho Ohsaki changed about the clustering subroutine in her work and how it allowed for "more meaningful pattern discovery"

Eamonn Keogh is an assistant professor of Computer Science at the University of California, Riverside. His research interests are in Data Mining, Machine Learning and Information Retrieval. Several of his papers have won best paper awards, including papers at SIGKDD and SIGMOD. Dr. Keogh is the recipient of a 5-year NSF Career Award for “Efficient Discovery of Previously Unknown Patterns and Relationships in Massive Time Series Databases”.
Look into Eamonn Keogh's papers that won "best paper awards"

http://www.cs.ucr.edu/~eamonn/meaningless.pdf Paper that argues clustering time series subsequences is "meaningless". tl;dr: radically different distributions end up converging to translations of basic sine or trig functions. Wonder if constructing a simplicial complex does anything?
Note that one researcher changed the algorithm to produce potentially meaningful results


cccrg.cochrane.org cccrg.cochrane.org

PDF summary by Cochrane for planning a meta-analysis at the protocol stage. Gives guidance on how to anticipate & deal with various types of heterogeneity (clinical, methodological, & statistical). Link to paper
Covers:
- ways to assess heterogeneity
- courses of action if substantial heterogeneity is found
- methods to examine the influence of effect modifiers (either to explore heterogeneity or because there's good reason to suggest specific features of participants/interventions/study types will influence effects of the intervention); methods include subgroup analyses & meta-regression

Statistical heterogeneity is the term given to differences in the effects of interventions and comes about because of clinical and/or methodological differences between studies (ie it is a consequence of clinical and/or methodological heterogeneity). Although some variation in the effects of interventions between studies will always exist, whether this variation is greater than what is expected by chance alone needs to be determined.
If the statistical heterogeneity is larger than what's expected by chance alone, then what does that imply? That there's either clinical or methodological heterogeneity within the pooled studies.
What's the impact of the presence of clinical heterogeneity? The statistical heterogeneity (variation of effects/results of interventions) becomes greater than what's expected by chance alone
What happens if methodological heterogeneity is present? The statistical heterogeneity (variation of effects/results of interventions) becomes greater than what's expected by chance alone
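To make the "greater than expected by chance" idea concrete, here's a small sketch (mine) of Cochran's Q and the I² statistic, the usual way statistical heterogeneity is quantified in a meta-analysis. The study effects and variances below are invented.

```python
# Fixed-effect pooling with Cochran's Q and I².
effects   = [0.30, 0.45, 0.10, 0.52]   # per-study effect estimates (invented)
variances = [0.02, 0.03, 0.02, 0.04]   # per-study sampling variances (invented)

weights = [1 / v for v in variances]            # inverse-variance weights
pooled  = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted squared deviations from the pooled effect.
Q  = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
df = len(effects) - 1

# I²: share of total variation attributable to heterogeneity rather than chance.
I2 = max(0.0, (Q - df) / Q) * 100 if Q > 0 else 0.0

print(f"pooled={pooled:.3f}  Q={Q:.2f}  I2={I2:.1f}%")
```

If Q is much larger than its degrees of freedom (its expected value under homogeneity), that's the "more variation than chance alone" signal.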


www.cisco.com www.cisco.com

Quadrants I and II: The average student’s scores on basic skills assessments increase by 21 percentiles when engaged in non-interactive, multimodal learning (includes using text with visuals, text with audio, watching and listening to animations or lectures that effectively use visuals, etc.) in comparison to traditional, single-mode learning. When that situation shifts from non-interactive to interactive, multimedia learning (such as engagement in simulations, modeling, and real-world experiences – most often in collaborative teams or groups), results are not quite as high, with average gains at 9 percentiles. While not statistically significant, these results are still positive.
I think this is what Thomas Frank was referring to in his YT video when he said "direct hands-on experience ... is often not the best way to learn something. And more recent cognitive research has confirmed this and shown that for basic concepts a more abstract learning model is actually better."
By "more abstract", I guess he meant what this paper calls "non-interactive". However, even though Frank claims this (which is suggested by the percentile increases shown in Quadrants I & II), no variance is given, and in the case of Q II (the percentile increase of interactive multimodal learning compared to interactive unimodal learning), the authors state that "results are not quite as high [as the non-interactive comparison], with average gains at 9 percentiles. While not statistically significant, these results are still positive." (emphasis mine)
Common significance levels are \(\alpha =.20,~.10,~.05,~.01\)
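A trivial helper (mine, with an invented p-value) showing what those alpha levels are used for: a result is "significant at level alpha" when its p-value falls below alpha.

```python
# Which conventional alpha levels would a given p-value clear?
ALPHAS = [0.20, 0.10, 0.05, 0.01]

def significant_at(p_value):
    """Return the alpha levels at which p_value would be declared significant."""
    return [a for a in ALPHAS if p_value < a]

print(significant_at(0.07))  # -> [0.2, 0.1]: clears .20 and .10, not .05 or .01
```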


stats.stackexchange.com stats.stackexchange.com

The random process has outcomes
Notation of a random process that has outcomes
The "universal set" aka "sample space" of all possible outcomes is sometimes denoted by \(U\), \(S\), or \(\Omega\): https://en.wikipedia.org/wiki/Sample_space
Probability theory & measure theory
From what I recall, the notation \(\Omega\) was mainly used in higher-level grad courses on probability theory, ie, when framing things in probability theory as a special case of measure-theory things/ideas/processes. eg, a probability space \((\Omega, \cal{F}, P)\) where \(\cal{F}\) is a \(\sigma\text{-field}\) aka \(\sigma\text{-algebra}\) and \(P\) is a probability measure that assigns a value to any element of \(\cal{F}\), with \(P(\Omega)=1.\)
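Writing out the standard definition I'm half-remembering (textbook material, not from this page):

```latex
% A probability space (\Omega, \mathcal{F}, P):
%   \Omega                          -- sample space of outcomes
%   \mathcal{F} \subseteq 2^\Omega  -- a sigma-algebra of events
%   P : \mathcal{F} \to [0,1]       -- a probability measure
\begin{align*}
&\text{(i)}\quad \Omega \in \mathcal{F} \\
&\text{(ii)}\quad E \in \mathcal{F} \implies E^c \in \mathcal{F} \\
&\text{(iii)}\quad E_1, E_2, \ldots \in \mathcal{F} \implies
  \bigcup_{n=1}^{\infty} E_n \in \mathcal{F} \\
&\text{(iv)}\quad P(\Omega) = 1, \qquad
  P\Big(\bigcup_{n=1}^{\infty} E_n\Big) = \sum_{n=1}^{\infty} P(E_n)
  \ \text{ for pairwise disjoint } E_n
\end{align*}
```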
Somehow, the definition of a sigma-field captures the notion of what we want out of something that's measurable, but it's unclear to me why, so let's see where writing through this takes me.
Working through why a sigma-algebra yields a coherent notion of measurable
A sigma-algebra \(\cal{F}\) on a set \(\Omega\) is defined somewhat like the definition of a topology \(\tau\) on some space \(X\). They're both collections of subsets of the set/space of reference (ie, \(\tau \subseteq 2^X\) and \(\cal{F} \subseteq 2^\Omega\)). Also, they're both defined to contain their underlying set/space (ie, \(X \in \tau\) and \(\Omega \in \cal{F}\)).
Additionally, they both contain the empty set, but for (maybe) different reasons, definitionally. For a topology, it's simply defined to contain both the whole space and the empty set (ie, \(X \in \tau\) and \(\emptyset \in \tau\)). In a sigma-algebra's case, it's defined to be closed under complements, so since \(\Omega \in \cal{F}\) the complement must also be in \(\cal{F}\)... but the complement of the universal set \(\Omega\) is the empty set, so \(\emptyset \in \cal{F}\).
I think this might be where the similarity ends, since a topology need not be closed under complements (but probably has a special property when it is, although I'm not sure what; oh wait, the complement of open is closed in topology, so it'd be clopen! Not sure what this would really entail though 🤷♀️). Moreover, a topology is closed under arbitrary unions (which includes uncountable ones), but a sigma-algebra is only closed under countable unions. Hmm... Maybe this restriction to countable unions is what gives a coherent notion of being measurable? I suspect it also has to do with the Banach-Tarski paradox, ie, cutting a sphere into 5 pieces and rearranging them in a clever way so that you get 2 spheres that each have the volume of the original sphere; I mean, WTF, if 1 sphere's volume equals the volume of 2 spheres, then we're definitely not able to measure stuff any more.
And now I'm starting to vaguely recall that this is what sigma-fields essentially outlaw/ban from being possible. It's also related to something important in measure theory called the Lebesgue measure, although I'm not really sure what that is (something about doing a Riemann-style integral but picking the partition on the y-axis/codomain instead of on the x-axis/domain, maybe?)
And with that, I think I've got some intuition about how fundamental sigma-algebras are to letting us handle probability and uncertainty.
Back to probability theory
So then events like \(E_1\) and \(E_2\) are elements of the collection of subsets, \(\cal{F}\), of the possibility space \(\Omega\). Like, maybe \(\Omega\) is the set of all possible outcomes of rolling 2 dice, but \(E_1\) could be a simple event (ie, just one outcome, like rolling a 2) while \(E_2\) could be a compound(?) event (ie, more than one, like rolling an even number). Notably, \(E_1\) & \(E_2\) are NOT elements of the sample space \(\Omega\); they're elements of the powerset of our possibility space (ie, the set of all possible subsets of \(\Omega\), denoted by \(2^\Omega\)). So maybe this explains why "closed under complements" is needed; if you can roll a 2, you should also be able to NOT roll a 2. And the property that a sigma-algebra must "contain the whole space" might be what's needed to give rise to a notion of a complete measure (conjecture about complete measures: everything in the measurable space can be assigned a value where that part of the measurable space does, in fact, represent some constitutive part of the whole).
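Making the dice example concrete (my own sketch, not from the page): outcomes are pairs, events are subsets of \(\Omega\), and under the uniform measure the probability of an event is just \(|E|/|\Omega|\).

```python
# Two-dice sample space: events are SUBSETS of omega (elements of its
# power set), not elements of omega itself.
from fractions import Fraction

omega = {(i, j) for i in range(1, 7) for j in range(1, 7)}  # 36 outcomes

E1 = {w for w in omega if sum(w) == 2}       # simple event: only (1,1)
E2 = {w for w in omega if sum(w) % 2 == 0}   # compound event: even total

def prob(event):
    """Uniform probability measure: |E| / |Omega|."""
    return Fraction(len(event), len(omega))

print(prob(E1), prob(E2), prob(omega - E1))  # complement is also an event
```

Note that `omega - E1` (NOT rolling a total of 2) is itself an event, which is exactly the "closed under complements" requirement in action.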
But what about these "random events"?
Ah, so that's where random variables come into play (and probably why in probability theory they prefer to use \(\Omega\) for the sample space instead of \(X\) like a base space in topology). There's a function, that is, a mapping from outcomes of this "random event" (eg, a roll of 2 dice) to a space in which we can associate (ie, assign) a sense of distance (ie, our sigma-algebra). What confuses me is that we see things like "\(P(X=x)\)" which we interpret as "the probability that our random variable, \(X\), ends up being some particular outcome \(x\)." But it's also said that \(X\) is a real-valued function, ie, takes some arbitrary elements (eg, events like rolling an even number) and assigns them a real number (ie, some \(x \in \mathbb{R}\)).
Aha! I think I recall the missing link: the notation "\(X=x\)" is really shorthand for "\(X(\omega)=x\)" where \(\omega \in \Omega\). But something that still feels unreconciled is that our probability measure, \(P\), is just taking some real value to another real value... So which one is our sigma-algebra, the inputs of \(P\) or the inputs of \(X\)? 🤔 Hmm... Well, I guess it has to be the inputs of \(P\), since \(X\text{'s}\) input is a small omega \(\omega\) (an element of big omega \(\Omega\), based on the convention of small notation being elements of big notation), so \(X\text{'s}\) domain must be the sample space while \(P\text{'s}\) domain is the sigma-algebra?
Let's try to generate a plausible example of this in action... Maybe something with an inequality like "\(X\ge 1\)". Okay, yeah, how about \(X\) is a random variable for the random process of how long it takes a customer to get through a grocery line. So \(X\) is mapping the elements of our sample space (ie, what customers actually end up experiencing in the real world) into a subset of the reals, namely \([0,\infty)\) because their time in line could be 0 minutes or infinite minutes (geesh, 😬 what a life that would be, huh?). Okay, so then I can ask a question like "What's the probability that \(X\) takes on a value greater than or equal to 1 minute?" which I think translates to "\(P\left(X(\omega)\ge 1\right)\)" and which is really attempting to model this whole "random event" of "What's gonna happen to a particular person on average?"
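A quick sketch of the grocery-line example. The exponential waiting-time distribution and its 2-minute mean are invented modeling assumptions, just to have something to compute with: \(X\) maps each outcome \(\omega\) to a waiting time, and we estimate \(P(X \ge 1)\) by sampling.

```python
# Monte Carlo estimate of P(X >= 1) for the grocery-line random variable,
# assuming (invented) exponential waiting times with mean 2 minutes.
import math
import random

random.seed(0)
mean_wait = 2.0

# Sample many "outcomes" and apply the random variable X to each.
samples = [random.expovariate(1 / mean_wait) for _ in range(100_000)]
estimate = sum(x >= 1 for x in samples) / len(samples)

# For Exponential(rate = 1/2), P(X >= 1) = exp(-1/2) exactly.
exact = math.exp(-1 / mean_wait)
print(f"estimate={estimate:.3f}  exact={exact:.3f}")
```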
So this makes me wonder... Is this fact that \(X\) can model this "random event" (at all) what people mean when they say something is a stochastic model? That there's a probability distribution it generates which affords us some way of dealing with navigating the uncertainty of the "random event"? If so, then sigma-algebras seem to serve as a kind of gateway and/or foundation into specific cognitive practices (ie, learning to think & reason probabilistically) that afford us a way out of being overwhelmed by our anxiety or fear and can help us reclaim some agency and autonomy in situations with uncertainty.


en.wikipedia.org en.wikipedia.org

the moments of a function are quantitative measures related to the shape of the function's graph
Vaguely recall that these "uniquely determine" some (but not all) functions. Later on, the article says all moments from \(0\) to \(\infty\) do uniquely determine bounded functions. Guess you can't judge a book (or graph) by its cover; you have to wait moment by moment for it to reveal itself
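A small sketch (mine, on made-up data) of the first few moments as shape descriptors: the first raw moment is the mean, the second central moment is the variance, and the standardized third central moment is the skewness.

```python
# Raw and central moments of a sample, and the shape quantities they encode.
data = [2.0, 3.0, 3.5, 4.0, 4.5, 5.0, 9.0]  # arbitrary data with a right tail

def raw_moment(xs, k):
    """k-th raw moment: average of x^k."""
    return sum(x ** k for x in xs) / len(xs)

def central_moment(xs, k):
    """k-th central moment: average of (x - mean)^k."""
    m = raw_moment(xs, 1)
    return sum((x - m) ** k for x in xs) / len(xs)

mean     = raw_moment(data, 1)                       # 1st raw moment
variance = central_moment(data, 2)                   # 2nd central moment
skewness = central_moment(data, 3) / variance ** 1.5 # standardized 3rd moment

print(f"mean={mean:.3f}  var={variance:.3f}  skew={skewness:.3f}")
```

The outlier at 9.0 gives a positive skewness, ie, the graph's right tail shows up in the third moment.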

 Apr 2022

digital.autocare.org digital.autocare.org

Supply chains were disrupted early in the pandemic, with about half of companies reporting supply chain/sourcing related disruptions in April 2020 (top three: 47%
creating note for key stat


twitter.com twitter.com

Tom Whipple on Twitter. (n.d.). Twitter. Retrieved 29 October 2021, from https://twitter.com/whippletom/status/1442226972491333641

 Jul 2021

engl201.opened.ca engl201.opened.ca

It’s up to you, the statistician, programmer, designer, or data scientist, to decide how to tell the story
This is a comment on the whole concept really, but the best thing I ever did for myself in terms of gaining a better understanding of data and how to interpret it was to take a research and methods design class, and of course statistics as well. It helped me understand why researchers choose certain ways to represent data, and understand that to the untrained eye, data can be manipulated to seemingly prove almost any point. It is our responsibility to be clear and honest in our presentation of data. Kind of a "with great power comes great responsibility" moment. Because unfortunately, if you throw some statistics around people assume you must know what you are talking about, and often take it at face value without doing their own research, so it is incredibly easy to mislead and misinform the masses in this way.

 Feb 2021

www.washingtonpost.com www.washingtonpost.com

The Quest for Truth
The quest for Truth is everywhere and not limited to the economic topics linked here. This is just a topic that started a thought process where I had access to a convenient tool (Hypothesis) to bookmark my thoughts and research.
Primary thought is: The Quest for Truth. Subcategories would provide a structured topic for the thought. In this case the subcategory would be: US Economy, Inflation
The TRUTH is a concept comprised of inconsistencies and targets that frequently move.
Targets (data, methods, people, time, semantics, agenda, demographic, motive, means, media, money, status) hold a position in time long enough to fulfill a purpose or agenda. Sometimes they don't consciously change, but history over time shines light and opens cracks in the original narrative that lead to new truths, real or imagined.
Verifying and validating certain Truth is very difficult. Why is That?

 Dec 2020

imaging.mrccbu.cam.ac.uk imaging.mrccbu.cam.ac.uk

Rules of thumb on magnitudes of effect sizes
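For reference, a tiny sketch (mine) of one common effect size, Cohen's d, with the usual rule-of-thumb labels (0.2 small, 0.5 medium, 0.8 large). The group data below is invented.

```python
# Cohen's d: standardized mean difference using the pooled standard deviation.
import math

def cohens_d(a, b):
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled_sd

def label(d):
    """Conventional rule-of-thumb magnitude labels."""
    d = abs(d)
    if d < 0.2: return "negligible"
    if d < 0.5: return "small"
    if d < 0.8: return "medium"
    return "large"

treatment = [5.1, 6.2, 5.8, 7.0, 6.5]   # invented scores
control   = [4.8, 5.0, 5.5, 4.9, 5.6]
d = cohens_d(treatment, control)
print(f"d={d:.2f} ({label(d)})")
```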

 Jun 2020

Local file Local file

Informal mentorship was captured using the following retrospective question from Wave 3 of the AddHealth data: "Other than your parents or stepparents, has an adult made an important positive difference in your life at any time since you were 14 years old?" Based on this question, I created a binary indicator for mentorship coded 1 if the young person had an informal mentor and 0 if they did not. Respondents were then asked "How is this person related to you?", and given response options like "family," "teacher/counselor," "friend's parent," "neighbor," and "religious leader."
Defining informal mentorship in the survey data

Middle-income subsample 3,158
Middle-income subsample for analysis was 3,158

1. "Middle-income" is defined as anyone living in a household making two-thirds to double the median income (Pew Research Center, 2016). In 1994, the median income for a family of four was $46,757 (US Bureau of Statistics, 1996). Thus, "middle-income" families would be those making between $30,860 and $93,514. Because I only have data available in $25,000 increments, I am defining middle-income families as those making between $25,000 and $100,000 a year in Wave 1.
Middle-income = families making $25k-$100k a year in Wave 1

Defining low-, middle-, and high-income groups. Due to the limitation in the data described above, all incomes had to be converted into categorical responses, with the smallest possible category size of $25,000. This created five categories for all incomes:
Defining income groups: under $25k, $25k-$49,999, $50k-$74,999, $75k-$99,999, and $100k+.

Wave 1 income was collected as a continuous variable, with an average of $45,728 (N=15,351, SD=$51,616). Low-income respondents (with incomes below $25,000) had an average of $9,837 (N=3,049, SD=$4,633). Wave 4 income was recorded as a categorical variable, however, where respondents indicated if they made under $5,000, between $5,000 and $10,000, between $10,000 and $15,000, etc. These categories were of different sizes, getting larger as the income grew larger. Therefore, in order to create comparable measures between Wave 1 and Wave 4, both incomes were converted to 5 groups: (1) household income of less than $25,000, (2) household income of $25,000 to $49,999, (3) household income of $50,000 to $74,999, (4) household income of $75,000 to $99,999, and (5) household income of over $100,000
Upward mobility (dependent variable); data surrounding household incomes of Wave 1 and Wave 4
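The five-category coding described above can be sketched as follows (thresholds are from the text; the function name and bracket labels are mine):

```python
# Convert a continuous household income into the study's five $25k categories.
BRACKETS = [
    (25_000,  "under $25k"),
    (50_000,  "$25k-$49,999"),
    (75_000,  "$50k-$74,999"),
    (100_000, "$75k-$99,999"),
]

def income_category(income):
    """Return the category label for a household income in dollars."""
    for upper, name in BRACKETS:
        if income < upper:
            return name
    return "$100k+"

# e.g. the Wave 1 average of $45,728 falls in the second category
print(income_category(45_728))
```

Upward mobility could then be coded by comparing a respondent's Wave 4 category against their Wave 1 category.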

stratum. This sampling method yielded a sample of 20,745 students in 7th to 12th grade, with oversampling of some minority racial-ethnic groups, students with disabilities, and twins (Harris, 2018). Data were also collected from the parents of the in-home survey respondents, with an 85% success rate (Chen & Chantala, 2014). Wave 1 participants also reported their home address, which was then linked to a number of state, county, and Census tract-level variables from other sources. The present study used the school survey data, the in-home interview data, the parent survey data, and the data that was linked to states, counties, and census tracts, as described above. This study also used data from two subsequent waves of in-home interviews, specifically waves 3 and 4 (no new information relevant to the present study was collected in Wave 2). For each subsequent wave, AddHealth survey administrators recruited from the pool of Wave 1 respondents, no matter if they had responded to any wave since Wave 1. The present study used Wave 1 data for information about the youth’s socioeconomic status, social capital and other related variables. This wave was collected from 1994 to 1995, when most respondents were between 11 and 19 years old (n=20,745 youth) (Harris, 2013). This study also used information from the third wave of in-home interview data, namely all questions on informal mentoring. This wave was collected in 2001 and 2002 when the youth (N=15,197) were 18 to 26 years old. The fourth wave of data was collected in 2008 and 2009, when the respondents were 25 to 33 years old (n=15,701). Data from the fourth wave were used to calculate economic mobility, the key dependent variable for this study.
Data source

Data. To address these questions, this study used three waves of the restricted-use version of the National Longitudinal Study of Adolescent Health (AddHealth). AddHealth is a multi-wave, longitudinal, nationally representative study of youth who have been followed from adolescence through to adulthood. The AddHealth data were collected by sampling 80 high schools stratified across region, school type, urbanicity, ethnic mix, and school size during the 1994-1995 academic year. Fifty-two feeder schools (commonly middle schools whose students were assumed to go to these study high schools) were also sampled, resulting in a total of 132 sample schools (Chen & Chantala, 2014; Harris, 2013). When sample high schools had grades 7 to 12, feeder schools were not recruited, as the lower grades served the role of feeding in younger students (Chen, 2014). Seventy-nine percent of schools approached agreed to be in the study (Chen & Chantala, 2014). An in-school survey was then administered to over 90,000 students from these 132 schools. This survey was given during a single day within a 45- to 60-minute class period (Chen & Chantala, 2014). Subsequent recruitment for in-home interviews was done by stratifying students in each school by grade and sex and then randomly choosing 17 students from each
Data source

Figure 1: Potential Ways Mentors Can Promote Mobility
Figure depicts effects of mentors providing social support and social capital

The third function mentors play in promoting upward mobility for young people is the direct effect the provision of social capital (both bridging and bonding capital) has on building blocks of mobility (Ellwood et al., 2016). Bonding capital from a mentor who is also a teacher could foster feelings of school connectedness, which has been demonstrated to lead to academic engagement and ultimately, educational attainment (Ashtiani & Feliciano, 2018; Li, Lerner, & Lerner, 2010). An employer could have a similar effect by providing bonding capital. If a young person feels connected to the workplace or mission of the workplace through their mentoring relationships with their employer, they are likely to have higher job satisfaction and more opportunities for promotion (Ghosh & Reio, 2013). Bridging capital can also have a direct effect on key links in the chain. Studies have shown that bridging mentors (commonly teachers and school personnel) were likely to promote educational attainment and employment
Social capital (bridging and bonding) can "foster feelings of school connectedness, which has been demonstrated to lead to academic engagement and ultimately, educational attainment"; similar in workplaces, bonding with mentors in settings can create sense of connectedness with setting overall

Those who report feeling emotionally supported have higher rates of academic competence (Sterrett, Jones, Mckee, & Kincaid, 2011) and strong academic outcomes (Wentzel, Russell & Baker, 2016). Additionally, adults who have achieved upward mobility are more likely to report instrumentally supportive relationships than those who were not mobile (Chan, 2017). Clearly, social support has a direct influence on some of the building blocks of mobility
Social support leads to higher rates of academic competence, strong academic outcomes; has a direct influence on some of the building blocks of mobility

compensate for the lack of other resources their peers have, such as expansive connected social networks.
Youth from disadvantaged neighborhoods make greater strides than more-resourced peers when mentored by someone outside the family; this can potentially compensate for the lack of other resources in the youth's life

A young person's neighborhood context is associated with their chance of being mentored and their chance of being economically mobile. Young people living in under-resourced neighborhoods are also unlikely to be upwardly mobile (Chetty & Hendren, 2016a; Chetty & Hendren, 2016b; Chetty, Hendren, Kline & Saez, 2014b; Goldsmith, Britton, Reese, & Velez, 2017). Low-income children are more likely to live in neighborhoods with higher crime and drug use (Abelev, 2009). Young people from these neighborhoods are more likely to have lower test scores (McCullock & Joshi, 2001), drop out of high school, and be unemployed (Ainsworth, 2002). This neighborhood effect is cumulative: the more time spent in under
- Neighborhood is associated with chance of being mentored
- youth in under-resourced neighborhoods are less likely to be upwardly mobile
- these neighborhoods tend to have higher crime and drug use; their youth are likelier to have lower test scores, drop out of high school, and be unemployed

young people from more advantaged homes and communities as more likely to have an informal mentor.
Youth in more advantaged homes are more likely to have an informal mentor

Black non-Hispanic youth and girls are most likely to be mentored (Bruce & Bridgeland, 2014), as are youth who have a two-parent home with educated parents (Erickson et al., 2009) and are not on public assistance (McDonald & Lambert, 2014). Place matters, as having lived in safe neighborhoods (Miranda-Chan, Fruiht, Dubon, & Wray-Lake, 2016) and neighborhoods with higher rates of white, employed individuals not receiving public assistance and living above the poverty line (McDonald & Lambert, 2014) are all associated with a greater chance of reporting a mentor. A young person’s participation in hobbies, organizations, and religious services also leads to higher rates of informal mentorship (Thompson & Greeson, 2017; Schwartz, Chan, Rhodes, & Scales, 2013). Individual qualities such as prosocial behavior (Hagler, 2017), a secure attachment style (Zinn, Palmer, & Nam, 2017), and a likeable personality (Erickson et al., 2009) are associated with having a natural mentor, as is having more friends
Typical mentorship demographics

In one study, a low-income child was twice as likely to graduate college when mentored. This is in contrast to previous literature that demonstrates consistent but small associations between informal mentoring and college completion for middle-income children (Reynolds & Parrish, 2018). This suggests that youth from low-income families benefit more from mentorship than those who may have a plethora of positive resources in their life
Lowincome families benefit more from mentorship; one study suggests that mentored lowincome children are 2x as likely to graduate college

For instance, much attention has been paid to informal mentoring and educational outcomes: mentored youth are more likely to feel connected to their school (Black, Grenard, Sussman, & Rohrbach, 2010), have better grades (Chang et al., 2010), attend college (DuBois & Silverthorn, 2005a; Reynolds & Parrish, 2017) and receive a bachelor’s degree (Miranda-Chan, Fruiht, Dubon, & Wray-Lake, 2016; Erickson, McDonald, & Elder, 2009). Cumulatively, these studies, along with a 2018 meta-analysis (Van Dam et al.), suggest a strong and consistent relationship between having an informal mentor and positive educational outcomes.
Informal mentors can result in and influence positive educational outcomes, help promote ability to "feel connected to their school"

Literature has established that informal mentoring is most commonly associated with psychosocial outcomes such as lower stress levels, higher life satisfaction, and lower rates of depression (DuBois & Silverthorn, 2005a; Chang et al., 2010; Munson & McMillen, 2009) and socioemotional outcomes, including improved social skills, perceived social support, and higher self-esteem (Van Dam et al., 2018; Miranda-Chan et al., 2016). These associations are strong and consistent across studies, suggesting that informal mentoring is positively correlated with positive psychosocial and socioemotional outcomes.
Informal mentoring is positively correlated with positive psychosocial and socioemotional outcomes

Informal mentoring relationships are also more prevalent than formal ones. One study found that 62% of youth had an informal mentoring relationship, compared to just 15% who reported having a formal mentoring relationship (Bruce & Bridgeland, 2014). There are similar differences in prevalence when asking adults if they have mentored young people: 67% of those who reported mentoring someone in the past year did so informally, while only 31% did so through a formal program (Oosthuizen, 2017). While coming from a low-income family is one of several risk factors associated with lower exposure to informal mentors, it is clear that many of these youth are still able to identify caring adults in their lives
- 62% of youth had an informal mentoring relationship
- 15% reported a formal mentoring relationship
- 67% of adults claimed to have informally mentored someone in the last year
- 31% did so in a formal program
- even youth from low-income families can identify caring adults in their lives

Persistent immobility also disproves the idea of the U.S. being a land of equal opportunity. Since the term "the American Dream" was first coined in 1931, it has become a persistent cultural ethos, a wish list of sorts, with a consistent main tenet being the idea that each generation can achieve more than their parents (Samuel, 2012). Yet we know this tenet of the American Dream is no longer true: the chances that a child earns more than their parents have decreased in the past 40 years, especially for low-income families
chances of earning more than parents has decreased in past 40yrs for lowincome families

The associations between childhood poverty and upward mobility are cumulative: each year of childhood spent in poverty lowers an individual's chances of being upwardly mobile, as they are less likely to be consistently employed or in school
Each year in childhood poverty = less likely to be upwardly mobile, consistently employed/in school

Children who experienced any childhood poverty are less likely to be economically mobile than their middle-income peers (Chetty et al., 2016c; Mitnik et al., 2015) and are more than five times likelier to remain poor in adulthood than to make it to the top income quintile
Any childhood poverty = less likely to be economically mobile, 5x likelier to remain poor in adulthood

Even a child who spent just one year in poverty is less likely to have a high school diploma, a key step towards economic success
1 yr of poverty already = less likely to have a high school diploma

In 2016, 18% of American children were living in poverty, defined for a household of four as living with an annual income of less than $24,755 (Semega, Fontenot & Kollar, 2017). Although this is just one snapshot in time, up to 39% of all American children will experience poverty at some point during their childhood (Ratcliffe, 2015). Childhood poverty is linked to low educational attainment, socioemotional issues, and developmental delays. Poor families are likelier to be exposed to food insecurity, homelessness, and unsafe neighborhoods. They are also likelier than their middle-income peers to have poorer health and access to health care
In 2016,
- 18% of American children lived in poverty
- poverty = less than $24,755
- up to 39% of all American children will experience poverty
- childhood poverty is linked to low educational attainment, socioemotional issues, and developmental delays
- poor families more likely to be exposed to food insecurity, homelessness, and unsafe neighborhoods
- more likely to have poorer health and access to health care

There are over 13 million children and adolescents in poverty in the United States today.
13 mil children and adolescents live in poverty in US

In 2016, close to one-fifth of American children were living in poverty (Semega, Fontenot & Kollar, 2017). These millions of children are likely to remain poor throughout their lives, and are less likely to be upwardly mobile than their middle-income peers (Ratcliffe, 2015; Mitnik, Bryant, Weberb & Grusky, 2015).
1/5 of American children were living in poverty in 2016; likely to remain poor and less likely to be upwardly mobile

Lowincome youth, however, were less likely to have an informal mentor, and only 45% of those who were mentored had the type that could promote mobility.
Statistical finding: low-income youth were less likely to have an informal mentor, and only 45% of those who were mentored had the type of mentor that could promote mobility.



Because subject matter expertise goes a long way towards helping you spot interesting patterns in your data faster, the best analysts are serious about familiarizing themselves with the domain. Failure to do so is a red flag. As their curiosity pushes them to develop a sense for the business, expect their output to shift from a jumble of false alarms to a sensibly curated set of insights that decision-makers are more likely to care about.
Analysts have domain expertise, or at least domain knowledge.

While statistical skills are required to test hypotheses, analysts are your best bet for coming up with those hypotheses in the first place. For instance, they might say something like “It’s only a correlation, but I suspect it could be driven by …” and then explain why they think that. This takes strong intuition about what might be going on beyond the data, and the communication skills to convey the options to the decision-maker, who typically calls the shots on which hypotheses (of many) are important enough to warrant a statistician’s effort. As analysts mature, they’ll begin to get the hang of judging what’s important in addition to what’s interesting, allowing decision-makers to step away from the middleman role.
More formal and detailed version of the above. Also, the difference between being important and being interesting should be noted. Maybe search for a thread on this.

For example, not “we conclude” but “we are inspired to wonder”. They also discourage leaders’ overconfidence by emphasizing a multitude of possible interpretations for every insight.
Data analysts are the inspiration team.

Analysts are data storytellers. Their mandate is to summarize interesting facts and to use data for inspiration.
This is actually what I do in my reviews too, so I may define myself as a qualitative analyst now.

Excellence in analytics: speed
The best analysts are lightning-fast coders who can surf vast datasets quickly, encountering and surfacing potential insights faster than those other specialists can say “whiteboard.” Their semi-sloppy coding style baffles traditional software engineers — but leaves them in the dust. Speed is their highest virtue, closely followed by the ability to identify potentially useful gems. A mastery of visual presentation of information helps, too: beautiful and effective plots allow the mind to extract information faster, which pays off in time-to-potential-insights. The result is that the business gets a finger on its pulse and eyes on previously unknown unknowns. This generates the inspiration that helps decision-makers select valuable quests to send statisticians and ML engineers on, saving them from mathematically impressive excavations of useless rabbit holes.
Analysts are diggers: they dig into data fast and loosely, perhaps finding some directions, which statisticians can then study rigorously and ML engineers can turn into sustainable, automated solutions.

Performance means more than clearing a metric — it also means reliable, scalable, and easy-to-maintain models that perform well in production. Engineering excellence is a must. The result? A system that automates a tricky task well enough to pass your statistician’s strict testing bar and deliver the audacious performance a business leader demanded.
What machine learning / AI engineers do is scale a statistically rigorous solution to a system-wide, complex problem.

In other words, they use data to minimize the chance that you’ll come to an unwise conclusion.
Role of statisticians


outline.com outline.com

The p-value says, “If I’m living in a world where I should be taking that default action, how unsurprising is my evidence?” The lower the p-value, the more the data are yelling, “Whoa, that’s surprising, maybe you should change your mind!”
Put more simply: a low p-value means the observed data would be very improbable if the default (null) situation were true.
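This intuition can be sketched with a tiny Monte Carlo simulation. The coin-flip setup and all the numbers here are my own illustration, not from the article: the default action is to treat a coin as fair, and the p-value is the fraction of simulated "null worlds" that look at least as extreme as the evidence.

```python
import random

random.seed(0)

# Default action: treat the coin as fair (the null world).
# Evidence: 62 heads in 80 flips. How unsurprising is that under the null?
observed_heads, n_flips = 62, 80

def simulate_null_heads(n_flips: int) -> int:
    """Flip a fair coin n_flips times and count heads."""
    return sum(random.random() < 0.5 for _ in range(n_flips))

n_sims = 20_000
# One-sided p-value: fraction of null worlds at least as extreme as the data.
p_value = sum(simulate_null_heads(n_flips) >= observed_heads
              for _ in range(n_sims)) / n_sims
print(f"p-value ≈ {p_value:.4f}")
```

With evidence this extreme, almost no simulated null world reaches 62 heads, so the p-value comes out near zero — the data are "yelling" that the fair-coin default looks wrong.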

 Dec 2019

www.washingtonpost.com www.washingtonpost.com

“Every data point was altered to present the best picture possible,” said Bob Crowley, an Army colonel who served as a senior counterinsurgency adviser (Lessons Learned interview, 8/3/2016).
Juke the stats.

 May 2019

www.reddit.com www.reddit.com

Brook Lopez this season had more blocks than Kevin Garnett had in his best season and more 3-pointers than Kobe Bryant had in his best season...
Mindblowing


www.gwern.net www.gwern.net

statistical modelling problems  relevant to measurement

 Apr 2019

statistics.laerd.com statistics.laerd.com


There are two tests that you can run that are applicable when the assumption of homogeneity of variances has been violated: (1) Welch or (2) Brown and Forsythe test. Alternatively, you could run a Kruskal-Wallis H test. For most situations it has been shown that the Welch test is best. Both the Welch and Brown and Forsythe tests are available in SPSS Statistics (see our One-way ANOVA using SPSS Statistics guide).
ANOVA is robust against violation of the assumption of equal variances, but...

However, platykurtosis can have a profound effect when your group sizes are small. This leaves you with two options: (1) transform your data using various algorithms so that the shape of your distributions become normally distributed or (2) choose the non-parametric Kruskal-Wallis H test which does not require the assumption of normality.
ANOVA is robust against violation of normality, but...
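The quotes describe SPSS, but the same Kruskal-Wallis route is easy to try in Python. A minimal sketch assuming `scipy` is available; the three groups are made-up numbers, with the middle group given a deliberately larger spread:

```python
from scipy import stats

# Three made-up groups; g2 has clearly larger variance than the others,
# i.e. the homogeneity-of-variances assumption is violated.
g1 = [4.1, 4.3, 3.9, 4.0, 4.2]
g2 = [5.0, 6.8, 3.1, 7.4, 2.9]
g3 = [6.0, 6.2, 5.9, 6.1, 6.3]

# Kruskal-Wallis H test: rank-based, needs neither equal variances nor normality.
h_stat, p_kw = stats.kruskal(g1, g2, g3)

# Classic one-way ANOVA for comparison (assumes homogeneity of variances).
f_stat, p_anova = stats.f_oneway(g1, g2, g3)

print(f"Kruskal-Wallis: H={h_stat:.2f}, p={p_kw:.3f}")
print(f"One-way ANOVA:  F={f_stat:.2f}, p={p_anova:.3f}")
```

For the two-group case, Welch's correction is available as `stats.ttest_ind(a, b, equal_var=False)`; SciPy has no built-in one-way Welch ANOVA, which is why the sketch shows the Kruskal-Wallis alternative instead.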

 Mar 2019

statistics.laerd.com statistics.laerd.com

Testing for Normality using SPSS Statistics


www.ncbi.nlm.nih.gov www.ncbi.nlm.nih.gov

We performed some manipulation checks to examine the internal validity of the perceptualcognitive skill tests and any learning effects as a result of watching the same video clips multiple times

 Feb 2019

www.sciencedirect.com www.sciencedirect.com

Due to our emotional distress measure having little prior validation, and our physical distress measure being entirely new, we first provide data to support the appropriateness of the two measures.
An example of survey validation using Cronbach's alpha.


statistics.laerd.com statistics.laerd.com

You may believe that there is a relationship between 10,000 m running performance and VO2max (i.e., the larger an athlete's VO2max, the better their running performance), but you would like to know if this relationship is affected by wind speed and humidity (e.g., if the relationship changes when taking wind speed and humidity into account since you suspect that athletes' performance decreases in more windy and humid conditions).
An example of partial correlation.
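The running example can be sketched numerically. Everything below is simulated for illustration (the variable names and effect sizes are my assumptions, not from the guide); the partial correlation is computed the standard way, as the correlation between the residuals of x and y after regressing each on the covariates:

```python
import numpy as np

def partial_corr(x, y, covars):
    """Correlation between x and y after removing the linear effect of covars."""
    design = np.column_stack([np.ones(len(x)), covars])  # add an intercept
    # Residuals of x and y after regressing each on the covariates.
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(42)
n = 200
wind = rng.normal(size=n)
humidity = rng.normal(size=n)
vo2max = rng.normal(size=n)
# Simulated: performance depends on VO2max, wind, and humidity, plus noise.
performance = 2.0 * vo2max - 1.0 * wind - 0.5 * humidity + rng.normal(size=n)

r = partial_corr(vo2max, performance, np.column_stack([wind, humidity]))
print(f"partial r(VO2max, performance | wind, humidity) = {r:.3f}")
```

Because wind and humidity are partialled out, the result recovers the strong VO2max-performance relationship that the extra noise variables would otherwise dilute.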

 Nov 2017

medium.com medium.com

Developers are an important demographic. Apple says they are the biggest segment of Macbook Pro users, which means they spend a lot of money. And they’re a demographic underserved by Chromebooks today.


docs.statwing.com docs.statwing.com

Heteroscedasticity
Heteroscedasticity is a hard word to pronounce, but it doesn't need to be a difficult concept to understand. Put simply, heteroscedasticity (also spelled heteroskedasticity) refers to the circumstance in which the variability of a variable is unequal across the range of values of a second variable that predicts it.
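A quick simulated illustration of that definition (the data-generating model here is my own toy example): the noise around the trend grows with the predictor, so residual variability is unequal across its range.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000
x = rng.uniform(0, 10, size=n)
# Heteroscedastic noise: its scale grows with x.
y = 3.0 * x + rng.normal(scale=0.5 + 0.5 * x, size=n)

# Residuals around the known true trend (3.0 * x in this toy model).
residuals = y - 3.0 * x
var_low = residuals[x < 5].var()
var_high = residuals[x >= 5].var()
print(f"residual variance, low x:  {var_low:.2f}")
print(f"residual variance, high x: {var_high:.2f}")
```

The residual variance in the upper half of the x range comes out several times larger than in the lower half — exactly the "variability of a variable is unequal across the range of a second variable" pattern the quote describes.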

 Apr 2017

bangordailynews.com bangordailynews.com

The annual drop in Maine wood demand since 2014 would fill that imaginary 1,770-mile caravan. The loss equals about 350 fewer truckloads of wood a day, every day of the year.

 Mar 2017

bangordailynews.com bangordailynews.com

A typical acre of blueberry barrens will yield about 2,000 to 4,000 pounds of berries, depending on pollination and other factors.

 Feb 2017

twitter.com twitter.com

Toxic air now kills almost as many people as high cholesterol and even more than excessive salt or being overweight.

 Jan 2017

static1.squarespace.com static1.squarespace.com

especially those of figure and number, of which men have so clear and distinct ideas
Add in some Lemos here as well. And some Mark Twain.

 Feb 2016

bangordailynews.com bangordailynews.com

He expects that the logging project near Quimby’s land will likely generate about $755,250 at the state’s average sale price, $50.35 per cord of wood. The land has about 1,500 harvestable acres that contain about 30 cords of wood per acre, or 45,000 cords, but only about a third of that will be cut because the land is environmentally sensitive, Denico said. The Bureau of Parks and Lands expects to generate about $6.6 million in revenue this year selling about 130,000 cords of wood from its lots, Denico said. Last year, the bureau generated about $7 million harvesting about 139,000 cords of wood. The Legislature allows the cutting of about 160,000 cords of wood on state land annually, although the LePage administration has sought to increase that amount.
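The revenue figure in the quote follows from its own numbers, which is easy to check: a third of 45,000 cords at the stated average price reproduces the $755,250 estimate.

```python
acres = 1_500
cords_per_acre = 30
total_cords = acres * cords_per_acre       # 45,000 cords on the land
cords_cut = total_cords // 3               # only about a third will be cut
price_per_cord = 50.35                     # state's average sale price, $/cord
revenue = cords_cut * price_per_cord
print(f"{cords_cut:,} cords × ${price_per_cord}/cord = ${revenue:,.2f}")
# 15,000 cords × $50.35/cord = $755,250.00 — matching the quoted figure
```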

 Jan 2016

blogs.scientificamerican.com blogs.scientificamerican.com

P(B|E) = P(B) × P(E|B) / P(E), with P standing for probability, B for belief and E for evidence. P(B) is the probability that B is true, and P(E) is the probability that E is true. P(B|E) means the probability of B if E is true, and P(E|B) is the probability of E if B is true.

The probability that a belief is true given new evidence equals the probability that the belief is true regardless of that evidence times the probability that the evidence is true given that the belief is true divided by the probability that the evidence is true regardless of whether the belief is true. Got that?

Initial belief plus new evidence = new and improved belief.
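The "initial belief plus new evidence" slogan can be written as a three-line function. The numbers below are illustrative, not from the article: a belief held with a 1% prior, and evidence that is 90% likely if the belief is true but only 10% likely if it is false.

```python
def bayes_update(prior: float, likelihood: float, likelihood_if_false: float) -> float:
    """P(B|E) = P(B) * P(E|B) / P(E), with P(E) expanded by total probability."""
    evidence = likelihood * prior + likelihood_if_false * (1 - prior)
    return likelihood * prior / evidence

# Illustrative numbers: 1% prior, evidence 90% likely if the belief is true,
# 10% likely if it is false.
posterior = bayes_update(prior=0.01, likelihood=0.90, likelihood_if_false=0.10)
print(f"posterior = {posterior:.3f}")  # roughly 0.083
```

Even strongly belief-favoring evidence leaves the posterior below 10% here, because the prior was so low — a useful corrective to overweighting a single piece of evidence.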

 Oct 2013

rhetoric.eserver.org rhetoric.eserver.org

The things that happen by chance are all those whose cause cannot be determined, that have no purpose, and that happen neither always nor usually nor in any fixed way.
This is not how statistics works.
