I've mastered being trans and all *I've* learned about gender is that it's kind of stupid (sometimes bad stupid, like hegemony, and sometimes good stupid, like monster movies)
Imogen Binnie
Spinning is creating an environment of increasing innocence. Innocence does not consist in simply "not harming." This is the fallacy of ideologies of nonviolence. Powerful innocence is seeking and naming the deep mysteries of interconnectedness. It is not mere helping, defending, healing, or "preventive medicine." It must be nothing less than successive acts of transcendence and Gyn/Ecological creation. In this creation, the beginning is not "the Word." The beginning is hearing. Hags hear forth new words and new patterns of relating. Such hearing forth is behind, before, and after the phallocratic "creation." [Pp. 413-14]
The innocence of Astraea.
Although the technique was originally developed for analysis of two-color single laser measurements
...by a team that included the first author 29 years earlier!
After fluorescence compensation, some cell populations will have low means and include events with negative data values
Because compensation attempts to remove the background autofluorescence, if a cell is negative for the marker the immunofluorophore binds to and also exhibits less autofluorescence than average, it will have a negative value after compensation (and this population will have a low mean both before and after compensation).
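A toy numpy sketch of the mechanism (the spillover matrix and event values are made up for illustration):

```python
# Compensation inverts the spillover matrix, which can push a dim event
# below zero. All numbers here are illustrative, not real instrument values.
import numpy as np

# Rows are true dyes, columns are detectors: 20% of dye-1 signal spills
# into detector 2, and 5% of dye-2 signal spills into detector 1.
spillover = np.array([[1.00, 0.20],
                      [0.05, 1.00]])

# One event: bright on detector 1, but slightly *less* signal on detector 2
# than the spillover model predicts for a dye-2-negative cell.
measured = np.array([10_000.0, 1_900.0])

compensated = measured @ np.linalg.inv(spillover)
print(compensated)  # detector-2 value comes out around -101: negative data
```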
Flow Cytometry Instrumentation and Measurements
Overview from Handbook of Experimental Immunology
Log-Normal Distribution of Single Molecule Fluorescence Bursts in Micro/Nano-Fluidic Channels
answers why distribution of fluorescence in flow cytometry data tends to be lognormal
How serotonin shapes moral judgment and behavior
Possible mechanisms of how serotonin increases the aversion to harming others.
I shall largely speak of mice, but my thoughts are on man, on healing, on life and its evolution. Threatening life and evolution are the two deaths, death of the spirit and death of the body. Evolution, in terms of ancient wisdom, is the acquisition of access to the tree of life. This takes us back to the white first horse of the Apocalypse which with its rider set out to conquer the forces that threaten the spirit with death. Further in Revelation (ii.7) we note: 'To him who conquers I will grant to eat of the tree of life, which is in the paradise of God' and further on (Rev. xxii.2): 'The leaves of the tree were for the healing of nations.' This takes us to the fourth horse of the Apocalypse (Rev. vi.7): 'I saw ... a pale horse, and its rider's name was Death, and Hades followed him; and they were given power over a fourth of the earth, to kill with the sword and with famine and with pestilence and by wild beasts of the earth' (italics mine). This second death has gradually become the predominant concern of modern medicine. And yet there is nothing in the earlier history of medicine, or in the precepts embodied in the Hippocratic Oath, that precludes medicine from being equally concerned with healing the spirit, and healing nations, as with healing the body. Perhaps we might do well to reflect upon another of John's transcriptions (Rev. ii.11): 'He who conquers shall not be hurt by the second death.'
Wow - I have not read many papers which are so... Biblical.
spectrally compensated
See Roederer, 2001, Spectral Compensation for Flow Cytometry: Visualization Artifacts, Limitations, and Caveats
https://onlinelibrary.wiley.com/doi/10.1002/1097-0320(20011101)45:3%3C194::AID-CYTO1163%3E3.0.CO;2-C
raw flow cytometry standard (FCS) files
This is just a data format, essentially an array of detection events.
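For instance, a sketch of loading one with the third-party `fcsparser` package (the path is a placeholder):

```python
# An FCS file parses into metadata plus a table with one row per detection
# event and one column per detector channel.
import fcsparser

meta, events = fcsparser.parse("sample.fcs")  # placeholder filename
print(events.columns)  # channel names (scatter + fluorescence detectors)
print(events.head())   # per-event values
```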
Ito et al., 2019)
Summary of the 3 previous studies.
Zhao et al., 2019
PD-1 blockade. Includes study of what differentiates successful from less successful treatment.
Schalper et al., 2019
Also a PD-1 blockade
(Cloughesy et al., 2019
PD-1 blockade
(Reardon et al., 2016
Targeting 3 immune checkpoint molecules + a combination therapy yielded a meaningful survival difference.
Bloch et al., 2013)
Many tumors express B7-H1, which triggers apoptosis in activated T cells.
(Wainwright et al., 2014
Shows that GBM relies on immune checkpoint molecules IDO, CTLA-4, and PD-L1. (IDO attracts regulatory T cells; CTLA-4 is expressed by T cells and CD80 on dendritic cells interacts with it during T cell activation; PD-L1 on a tumor binds to PD-1 on a T cell to tell the T cell not to kill the tumor cell.)
orthotopic
ie, a brain tumor cell line implanted in brain tissue.
2014
GBM evades NK cell response in early tumor formation by overexpression of galectin-1.
Zhou et al., 2015
GBM secretes periostin, a signaling protein involved in cell adhesion, wound healing, and the endothelial-mesenchymal transition. Overexpression of periostin also recruits immunosuppressive immune cells.
Wainwright et al., 2012
GBM expresses indoleamine 2,3 dioxygenase (IDO). IDO regulates the function & expansion of regulatory T cells. GBM recruits a ton of regulatory T cells that express GITR and seem to inhibit the immune response.
Crane et al., 2014
Many different tumors secrete the protein LDH5, which causes many healthy myeloid cells to produce NKG2D ligands, which causes NK cells to be less aggressive toward NKG2D ligands (down-regulates NKG2D receptors), which allows tumors expressing NKG2D ligands to get away scot-free.
Huettner et al., 1997
"Interleukin 10 (IL-10) is a cytokine with a broad spectrum of immunosuppressive activity" and is produced by GBM cells.
Maxwell et al., 1992
transforming growth factor-\(\beta\)2 is produced by GBM and is immunosuppressive
CD45R/B220+
CD45R/B220 appears prior to apoptosis (is the tumor shielding itself with a bunch of zombie T cells?)
From historic data from June 2021 to October 2021, when Delta was dominant in Scotland, we have estimated that 75% of admissions within 14 days of a positive test were admitted for SARS-CoV-2. This percentage was constant over this five-month period.
this data is not right-censored
proportional hazard models
Has the usual shortcoming that proportional-hazards assumptions are biologically unlikely. We'll see if this becomes material later.
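For reference, the proportional-hazards form (e.g. the Cox model): covariates scale a shared baseline hazard by a factor that's constant over time, and that constancy is the biologically suspect part.

\[ h(t \mid x) = h_0(t)\, e^{\beta^\top x} \]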
We used S gene status as a surrogate for Delta and Omicron VOCs, with S gene positive status indicating Delta whereas S gene negative indicated Omicron
appears sensible when Delta & Omicron are the two VOCs. Alpha also has S gene dropout. BA.2 does not have this deletion, but that version of Omicron was not detected in the UK as of 12/7 (https://www.theguardian.com/world/2021/dec/07/scientists-find-stealth-version-of-omicron-not-identifiable-with-pcr-test-covid-variant)
Bernd Bodenmiller
U of Zurich
Jacob Harrison Levine
Columbia/Sloan Kettering. Computational/systems bio.
Stéphane Chevrier
University of Zurich
Hence, word embeddings may serve as a means to extract implicit gender associations from a large text corpus similar to how Implicit Association Tests [11] detect automatic gender associations possessed by people, which often do not align with self reports
This sidesteps the findings that IAT results don't seem to correlate with biased behavior. (https://qz.com/1144504/the-world-is-relying-on-a-flawed-psychological-test-to-fight-racism/ for a summary)
gender stereotypes present in broader society
Fails to distinguish between a mathematical understanding of bias and the commonplace one.
would exhibit little gender bias because many of its authors are professional journalists
This seems to represent a misunderstanding of what the embedding represents - news articles just need to quote more librarians who use 'she' and more philosophers who use 'he' to generate this. The writing need not be stereotyped - all that's necessary is for the world to be biased.
However, none of these papers have recognized how blatantly sexist the embeddings are
Was this really true - especially given that a key example embedding in the word2vec paper is about gender?
remove gender stereotypes, such as the association between the words *receptionist* and *female*, while maintaining desired associations such as between the words *queen* and *female*
This seems like an ill-specified task? We'll see.
disturbing extent
"Disturbing" implies some element of surprise, which seems unwarranted. (Doesn't make it less important, but the results aren't at all surprising based on the source texts.)
Lewis et al., 2020)
BART
(Peters et al., 2018
ELMo
Jacob Andreas
Small NLP lab (6-8)
From Berkeley NLP group.
Just got PhD in 2018.
Maxwell Nye
4th yr PhD student
Belinda Z. Li
MIT, 2nd yr PhD student
Initialization
This talks about the initialization for the transducers, but how is the domain model initialized? That is, where do the priors come from?
Evaluation of sentiment transfer is difficult and is still an open research problem (Mir et al., 2019)
...because in its full generality measuring sentiment requires a complete understanding of social interaction, and is highly subculturally specific. This is not just an open research problem, it seems impossible without GAI.
BLEU score on the test set which contains 100K parallel sentences.
BLEU score is an unintuitive metric here - wouldn't some way of measuring how well we discovered the codebook be better?
Note that the loss used in previous work does not include the negative entropy term, \(-H_q\). Our objective results in this additional “regularizer”, the negative entropy of the transduction distribution, \(-H_q\). Intuitively, \(-H_q\) helps avoid a peaked transduction distribution
This is critical.
we introduce further parameter tying between the two directions of transduction: the same encoder is employed for both \(x\) and \(y\), and a domain embedding \(c\) is provided to the same decoder to specify the transfer direction, as shown in Figure 2
Ooh, this is interesting.
Therefore, we use the same architecture for each inference network as used in the transduction models, and tie their parameters
This seems like a lot of text on one of the simpler ideas in the paper?
\(\mathcal{L}_{\mathrm{ELBO}}\)
From eq. 3 in the Kingma & Welling Auto-encoding Variational Bayes paper (https://arxiv.org/pdf/1312.6114.pdf)
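Transcribing that bound for reference:

\[ \log p(x) \;\ge\; \mathbb{E}_{q(z|x)}\big[\log p(x \mid z)\big] \;-\; \mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big) \;=\; \mathcal{L}_{\mathrm{ELBO}} \]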
emissions are one-to-one
Not generally true of an HMM - but maybe this is an assumption that is often made when doing inference? I don't know the usual HMM inference techniques.
Markov assumption on the latent sequence
But because the sequence is latent and corresponds arbitrarily to outputs, this does not actually seem like a strong independence assumption to me.
\(\log p(X, Y; \theta_{x|\bar{y}}, \theta_{y|\bar{x}}) = \log \sum_{\bar{X}} \sum_{\bar{Y}} p(X, \bar{X}, Y, \bar{Y}; \theta_{x|\bar{y}}, \theta_{y|\bar{x}})\) (2)
This is nothing fancy, just marginalizing the distribution.
\(p(X, \bar{X}, Y, \bar{Y}; \theta_{x|\bar{y}}, \theta_{y|\bar{x}}) = \left( \prod_{i=1}^{m} p(x^{(i)} \mid \bar{y}^{(i)}; \theta_{x|\bar{y}})\, p_{D_2}(\bar{y}^{(i)}) \right) \left( \prod_{j=m+1}^{n} p(y^{(j)} \mid \bar{x}^{(j)}; \theta_{y|\bar{x}})\, p_{D_1}(\bar{x}^{(j)}) \right)\) (1)
bitext
General term for merged text of source and translation/transduction.
amortized variational inference (Kingma &Welling, 2013)
another plausible ancestor paper
typically enforce overly strong independence assumptions about data to make exact inference tractable
The three dangerous statistical assumptions:
the noisy channel model (Shannon, 1948)
Here Shannon establishes a relationship between the amount of noise and the maximum transmission efficiency under an error-correcting code that compensates for the noise with high confidence.
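The best-known special case is the Shannon-Hartley theorem for a band-limited channel with Gaussian noise:

\[ C = B \log_2\!\left(1 + \frac{S}{N}\right) \]

where \(B\) is the bandwidth in Hz, \(S/N\) the signal-to-noise ratio, and \(C\) the maximum reliable rate in bits per second. Capacity only grows logarithmically as noise shrinks.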
Style transfer has historically referred to sequence transduction problems that modify superficial properties of text – i.e. style rather than content
It is not totally clear what is in this category, is it? I wonder if the metaphor of "style" is limiting our imagination for how these models are used?
e.g.the HMM
Hidden Markov Model.
A Hidden Markov Model in its full generality assumes very little about the data it generates - I wonder what this means in this context?
ICLR 2020
ICLR talk video here: https://papertalk.org/papertalks/4014
\(\delta^{(i)} = \arg\min_{\delta,\, \|\delta\|_2 \le \epsilon} \log p_\theta(y^{(i)} \mid x^{(i)}; H^{(i)} + \delta)\)
It's not clear to me how \(p_{\theta}\) is conditioned on the modified hidden state.
\(g(y^{(i)}_{t-1}, M^{(i)}; \theta)\)
Is there a typo here? Is it not \(y_{\lt t}\) instead of \(y_{t-1}\)?
(1)
This is just regular Transformer training.
\(\mathcal{L}_{\mathrm{MLE}}(\theta) = \sum_{i=1}^{N} \log p_\theta(y^{(i)} \mid x^{(i)})\)
This is what we're aiming to maximize: the sum of the log likelihoods of the observations under the model parameterized by \(\theta\).
(Is there a bit of a problem here? In situations where x translates to multiple good y's, the ground truth probability p(y|x) is lower. Are we overweighting situations with fewer right answers?)
We maximize the conditional log likelihood \(\log p_\theta(y|x)\) for given \(N\) observations \(\{(x^{(i)}, y^{(i)})\}_{i=1}^{N}\) as follows
Training over a dataset of N paired sequences.
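A minimal sketch of this objective as teacher-forced cross-entropy (the model interface here is a hypothetical stand-in, not the paper's code):

```python
import torch.nn.functional as F

def mle_loss(model, x, y):
    # Negative of L_MLE for one batch: -sum_t log p_theta(y_t | y_<t, x).
    # `model` is any seq2seq net returning per-step vocab logits.
    logits = model(x, y[:, :-1])               # predict y_t from the *gold* y_<t
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),   # (batch * steps, vocab)
        y[:, 1:].reshape(-1),                  # next-token targets
        reduction="sum",
    )
```

The teacher forcing is the `y[:, :-1]` input: training always conditions on gold prefixes, which is exactly where the exposure-bias complaint below comes from.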
which hurts its generalization to unseen inputs, that is known as the “exposure bias” problem
There's some question about to what extent this is a first-order problem and to what extent this is just difficulty in generalization. See: Quantifying Exposure Bias for Open-Ended Language Generation, He et al., and Generalization in Generation: A Closer Look at Exposure Bias, Schmidt.
Multiplying the output probabilities of this predictor with G’s original probabilities and then renormalizing yields a model for the desired \(P(X|a)\) via Bayes’ Rule.
Here's the key math: \(P(X|a) = P(a|X) \cdot P(X) / P(a)\). Since we already have the constraint that the probabilities sum to 1, we can treat \(P(a)\) as just a normalizing factor and we don't actually have to know or derive it.
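A toy numpy version of that step (all probability values made up):

```python
import numpy as np

def rerank_step(lm_probs, attr_probs):
    # lm_probs:   P(x_t | x_<t) over candidate next tokens, from G.
    # attr_probs: P(a | x_<t, x_t) for the same candidates, from the predictor.
    unnorm = lm_probs * attr_probs  # proportional to P(x_t | x_<t, a)
    return unnorm / unnorm.sum()    # renormalizing absorbs the unknown P(a)

lm = np.array([0.5, 0.3, 0.2])      # made-up generator probabilities
attr = np.array([0.1, 0.9, 0.5])    # made-up attribute-predictor probabilities
print(rerank_step(lm, attr))        # mass shifts toward attribute-bearing tokens
```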
Dan Klein, UC Berkeley
Wide-ranging NLP contributions, very busy as a last author.
Kevin Yang, UC Berkeley
Other work includes: the breakthrough deep learning antibiotic discovery paper, several other molecular ML papers, and an algorithmic improvement for beam search (applied to machine translation). In 2nd or 3rd year of PhD as of 2021. Undergrad & MSc at MIT. Regina Barzilay = MIT adviser, Dan Klein = UCB adviser.
The remaining 35 cards would allow Bob’s aces to hold
Alice's aces.
Now if both players knew each other’s cards, they would agree that if the last card is a 3 or 8 of any suit, Bob wins, otherwise Alice wins.
No - Bob wins if the river is a heart, and loses otherwise.
Alice has a made hand of a pair of aces and Alice has a draw to a straight.
Alice does not have a draw to a straight.
USING GOSSIPS TO SPREAD INFORMATION: THEORY AND EVIDENCE FROM TWO RANDOMIZED CONTROLLED TRIALS
Key takeaway: community members can identify people with high centrality in their social network.
[while]
This excerpt is similarly misleading, with that one "[while]" replacing a paragraph of text. (Again, yes Darwin was sexist, but these quotes misrepresent the text, and Darwin's stances).
The western nations of Europe . . . now so immeasurably surpass their former savage progenitors [that they] stand at the summit of civilization. . . . [T]he civilised races of man will almost certainly exterminate, and replace, the savage races throughout the world.
I'm not going to defend Charles Darwin and I think we can all agree racism and sexism inhere to his worldview, but the quote is from The Descent of Man, not Origin of Species, and those ellipses are very misleading - they cut out more than 20 pages. As excerpted, it looks like Darwin is advocating genocide, but I don't think anyone would conclude that from the quotes in context.
Unlike this constant value — which is our expectation if there were no degree correlations — the solid line increases from near 300 for low degree individuals to nearly 820 for individuals with a thousand friends confirming the network’s positive assortativity.
Would be interesting to analyze the source of this. My first guess would be that it's generational, reflecting different usage patterns, rather than connectedness being "causal" (though I'm not positive I can nail down what the difference is).
A naive approach to counting friends-of-friends
Interesting - the linear model seems like a more naive approach to me. The approach described here doesn't become sensible until you tie in degree assortativity, which is not that intuitive, and not nearly strong enough to justify \(k^2\).
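A toy contrast between the two counts (graph and names are mine):

```python
# "Naive": sum each friend's degree, double-counting shared friends and
# back-edges; versus counting distinct friends-of-friends.
import networkx as nx

g = nx.Graph([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")])

def naive_fof(g, v):
    return sum(g.degree(u) for u in g[v])

def distinct_fof(g, v):
    fof = set().union(*(set(g[u]) for u in g[v]))
    return len(fof - set(g[v]) - {v})

print(naive_fof(g, "a"), distinct_fof(g, "a"))  # 5 vs. 1 on this toy graph
```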
The second-largest connected component only has just over 2000 individuals
Who is this second-largest component!?
And has it been joined to the main component since 2011?
to read
to read
gradient
General observation about gradient descent unrelated to this paper - if you look at the partial derivative of any particular parameter, it's got two components, which, slightly metaphorically, correspond to how strongly it was activated and how much influence it had over the error. This differs from my introspective feeling about how my own learning works, where the understanding I'm most sure of, the part of my model which is most strongly activated, is not the thing that changes the most. It introspectively (and so unreliably) feels like I'm more likely to try to spin up a patch for the error - take an under-activated part of the model, shift its outbound connections to try to better match the shape of the error, and crank up the weight going into it. Gradient descent sort of only punishes mistakes rather than rewarding growth (though maybe that sense is just a consequence of the arbitrary choice of sign: minimizing loss vs. maximizing its opposite).
Basically, what if we looked for a highly activated cell that didn't affect the outcome much one way or another and see if we could make it affect the loss more and better? Once a cell gets a tiny weight going into it and tiny weights in its output stream, is there any hope for it to matter? How often does this sort of "self-pruning" zero-ing happen? Are these metaphors at all sensible?
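To make the two components concrete: for a weight \(w_{ij}\) carrying unit \(i\)'s activation \(a_i\) into unit \(j\) with pre-activation \(z_j\), backprop factors the gradient as

\[ \frac{\partial L}{\partial w_{ij}} = a_i \cdot \delta_j, \qquad \delta_j = \frac{\partial L}{\partial z_j}, \]

the "how activated" term times the "how much influence over the error" term. A weight with tiny \(a_i\) and a tiny outbound stream gets a near-zero gradient from both factors, which is the self-pruning worry above.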
We decided to use the Wasserstein GAN objective [1] since it led to good visual results
But what else did you try??? More data on what doesn't work please!
point view
typo -> "point of view"?
We will refer to \(P\) as the perceptual latent space
It seems like we could choose an arbitrary layer in the critic and use its input as this latent space - what choices of layer make this "perceptual"?
\(z \sim \mathcal{N}(0, 1)\)
General GAN question - what happens if we make a different assumption about the z distribution?
The weights and biases in the first layer can scale and translate this along any dimension into any normal distribution, but what breaks if it's uniform instead? If we make our z distribution multimodal, does that help to learn multimodal domains better? Like ImageNet instead of faces, or non-centered objects, or scenes with more than one thing, all of which StyleGAN has a relatively hard time with?
current best methods are based on deep learning (DL).
Recent survey: https://arxiv.org/pdf/1902.06068.pdf
\(G(x, z; \theta) : \mathbb{R}^y \times \mathbb{R}^z \mapsto \mathbb{R}^x\)
Should this be G(y,z; theta)?
https://github.com/google-research/lag
Doesn't seem to be currently public as of 5/12/20.
The problem we address, while close to the formulation of the single-image super-resolution problem, is in fact rather different.
Predicting the highest likelihood higher-resolution image vs. finding a probability distribution of higher-resolution images.
Ian Goodfellow
GAN originator, adversarial attacks.
Peyman Milanfar
Head of Computational Imaging team at Google. Looks like he was PI of a lab at UCSC that produced some very highly cited superresolution work in the 2000s.
Survey paper on image filtering.
Recent work focused on superresolution & denoising.
David Berthelot
At Google since 2013. But on a research tear since ~2016-7.
Important work: Boundary Equilibrium GAN (https://arxiv.org/abs/1703.10717), MixMatch (semi-supervised learning algorithm) (https://arxiv.org/abs/1905.02249).
When we look at humans, we see them as plotters or schemers or competition. But when we look at puppies, or kittens, or other animals, none of that social machinery kicks in. We're able to see them as just creatures, pure and innocent things, exploring an environment they will never fully understand, just following the flow of their lives.
When I look at myself, when do I apply the schemer paradigm and when the kitten?
The unmade bed and unkempt piles of laundry, mail, and dirty dishes say, you’re not the kind of person who deserves to be cared for.
Ouch, is this true?
Note that the ‘supremum’ in the definition of \(c_{k|\ell}\) is actually a ‘maximum’.
This seems like a subtle point, and it's not obvious to me at first blush that it should be so. Why couldn't this be irrational, say? There are an infinite number of families in this class of families.
k|ℓ-separated
Translation: The family is \(k|\ell\)-separated if, for any \(k-\ell\) elements of the ground set and any \(\ell\) other ones, you can find a set in the family that has all of the \(k-\ell\) elements and none of the \(\ell\) other ones.
Seems like the notation would be better with this interpretation in mind. n|n-separated isn't as obviously silly as 0|n-separated. And then 1|1-separated would translate into the usual definition for separable, instead of 2|1.
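Writing the translation above out formally (my notation, following the note rather than quoting the paper): \(\mathcal{F}\) on ground set \(X\) is \(k|\ell\)-separated iff

\[ \forall\, S, T \subseteq X \text{ with } S \cap T = \emptyset,\ |S| = k-\ell,\ |T| = \ell: \quad \exists\, F \in \mathcal{F} \text{ with } S \subseteq F \text{ and } F \cap T = \emptyset. \]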
As a trivial example, for every finite set \(X\) with \(|X| \ge k\), any union-closed family on \(X\), containing the union-closed family \(\{A \subseteq X : |A| \ge \ell\} \cup \{\emptyset\}\), is \(k|\ell\)-separated.
If it's got every set of size \(l\), then no matter what \(k\) elements you pick, the set that contains exactly the first \(l\) is in there.
Is that the trivial example, or the only example? It's not the only example: {01234, 12345, 23450, 34501, 45012, 50123} is 5|4 separated (despite not having all the size 4 sets in it), but not 6|4 separated.
any union-closed family is 1|1-separated
Because n|n separation doesn't say anything meaningful (there's nothing to check for absence).
In epidemiology, A is commonly referred to as exposure or treatment.
Is it a Harvard convention that A is the treatment variable and Y is the outcome? Or a health care policy convention?
as early as the 1850’s by John Snow
...in the course of founding epidemiology by isolating the cause of cholera to contaminated water supplies. (Compared cholera cases in houses supplied by two different water companies, before and after one switched the source of their water.)
See also this overview from the Columbia school of public health: https://www.mailman.columbia.edu/research/population-health-methods/difference-difference-estimation
See directed acyclic graphs (DAGs)
Is this a Bayesian Network?
.
homework question: what makes diff-in-diff different from non-randomized trials (like, when trying to figure out the effect of surgical interventions) more generally?
only permit two treatment histories
This seems normal, why do we say so? Maybe the usual machinery accounts for people dropping out of the treatment group? Or are we preparing the way for later considerations about when there are a bunch of treatments of different populations that hit at different times?
unit
units are the members of groups (control/treated).
\(\beta\) coefficient, on the other hand, is specific to the regression estimator
Is it? We could use a different estimator than OLS, like something that tries to compensate for heteroscedasticity (WLS, GLS), and still be looking for \(\beta\).
units in the control group
Oops, nope, units and groups are not the same thing. Units are in groups.
covariate X
OK, this doesn't mean the simple thing I hoped it might.
.
I can make guesses, but it seems like it might be useful to say why we might choose one or the other, or under what circumstances incorporating the extra data is helpful or not. (It's not obvious to me that more observations necessarily helps - seems like it increases the odds that an exogenous factor is going to hit one group and not the other. And if some of these are ongoing factors and not shocks that dissipate, the performance of each group is decreasingly useful for estimating the other as time goes on.)
.
Yeah, this chart makes me sure that we're assuming that the treatment has no effect on the untreated group & the lack of treatment has no effect on the treated group.
.
From context, I'm guessing this paragraph is trying to teach me a difference between the notation used in this exposition & how it might 'usually' be used? I don't understand what new information was introduced.
Arrow of time
This also seems like it ought to be part of the definition of \(Y^{a}\) rather than an assumption.
many statistical methods: parametric, non-parametric and everything in between.
The distinction between parametric & non-parametric methods is pretty far over my head. Wikipedia gives me a very gross understanding, but how something could be in-between is DEFINITELY over my head.
Instead of a regression coefficient, we can define the target estimand as the difference between potential outcomes under treatment versus no treatment
Yeah! This seems much more straightforward! Why were we talking about a regression coefficient before? So confused.
Pre-intervention times
this notation suggests that \(t=2\) is a pre-intervention time or \(T_{0}\), but below it stands in for the post-intervention period.
notation
Know it's usual practice to do notation conventions first, but I'd prefer in an expository situation like this to see it defined as it comes up - I think I could pick up the definitions better in context. Would also help because there's more notation along the way than this table defines.
\(Y(t) = (1-A) \cdot Y^0(t) + A \cdot Y^1(t)\)
1) there must be a simpler way to write this.
2) couldn't this just be part of the definition of \(Y^{a}\) rather than an additional assumption?
observe the potential outcomes both with treatment and with no treatment, estimating the ATT would be easy. We would simply calculate the difference in these two potential outcomes for each treated unit, and take the average
I have some fundamentally different understanding about the nature of truth here that's making this hard to read for me - if we were to observe the potential outcomes, how would they still be potential? if we calculate the effect, how is it that we're still estimating the ATT and not calculating its actual value?
\(\mathrm{ATT} \equiv E[\,Y^1(2) - Y^0(2) \mid A = 1\,]\)
I know this is not a very complex formula, but it still seems unnecessarily complex for such a simple idea. It also obscures the fact that \(Y^{1}(2) | A=1\) is a fact that we have and \(Y^{0}(2) | A=1\) is an estimation. (right?)
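Under parallel trends, the unobservable \(Y^{0}(2) \mid A=1\) gets imputed from the control group's change, and the standard diff-in-diff estimator collapses to four averages:

\[ \widehat{\mathrm{ATT}} = \big(\bar Y_{\text{treated}}^{\,\text{post}} - \bar Y_{\text{treated}}^{\,\text{pre}}\big) - \big(\bar Y_{\text{control}}^{\,\text{post}} - \bar Y_{\text{control}}^{\,\text{pre}}\big) \]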
\(t = 2\)
magic number, ick.
.
These definitions I think I get, but they don't seem to fit the paragraphs above.
In the example I'd want the estimand to be "How much did California's inpatient spending change due to the new laws?"
in the post-treatment period
I'm not sure why this is italicized or included in the sentence; when else would one expect treatment to have an effect?
In this example, the target estimand might be a regression coefficient, \(\beta\), that quantifies the differential change in California spending after the new law compared to the change in Nevada spending. We could use the ordinary least squares estimator to get an estimate, \(\hat{\beta}\), from observed data.
This is sensitive to the time interval around \(T_{0}\) that we choose to include in the analysis, which makes it seem to me like it's not helping to answer the question of how much the intervention helped.
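A minimal sketch of that regression with statsmodels (column names and numbers are made up; the interaction coefficient is the \(\beta\) in question):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Toy two-state, two-period panel with made-up spending numbers.
df = pd.DataFrame({
    "spending": [10.0, 11.0, 9.5, 9.8],
    "treated":  [1, 1, 0, 0],   # 1 = California, 0 = Nevada
    "post":     [0, 1, 0, 1],   # 1 = after the law change
})

# spending ~ treated + post + treated:post; the interaction is beta-hat.
fit = smf.ols("spending ~ treated * post", data=df).fit()
print(fit.params["treated:post"])  # (11 - 10) - (9.8 - 9.5) = 0.7
```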
Finally, we define a method to estimate this using data, such as a linear model fit to inpatient spending in California and Nevada before and after the law change.
I'm definitely all turned around by now - we estimate a question that's answerable from the data by building a model that approximates the data?
mathematical function (or algorithm)
just "function" would do, right?
.
animation could be clearer here, the nubs on the arrows are a little ugly and if the treated/control labels & graphs faded out, the brackets wouldn't have to fly so far.
covariates
I think the meaning here is ambiguous? We don't need a covariate to be continuous like it is in ANOVA, right?
estimand
This italics looks like estimand is being defined, but it isn't actually defined here, we just get an example. From context and Latin roots, I think it's "the thing we're estimating".
I'm also not clear how this question is statistical, or more statistical than the previous question - the change in the difference in spending in California vs. Nevada before and after the law changed seems like an observed fact.
estimate causal effects of non-randomized interventions
Followup homework: diff-in-diff looks closely related to regression discontinuity - can it be viewed as a generalization of it? Are the observations in this explanation transferable?
Observed outcome
hopefully this'll be clear from context later, but what is the output of \(Y\) exactly? A function from populations (eg CA, NV) to outcomes?
\(t = T_0 + 1, \dots, T\)
I get from context that \(T_{0}\) is when the intervention happens, but what's \(T\)?
\(t = 1, \dots, T_0\)
this notation seems weird - why not \(T_{0}-n\)?
difference
subtract? divide? depends on the context?
I'll probably find out in a bit.
How to cut a bagel into two equal halves that are linked together.
Probably not a useful conjecture for getting at FUNC.
numer of 1
"number of 1s"? Not sure that makes sense, any row or column can have at most 3 1s in it, all rows and columns are small if \(n\geq 5\).
tridiagonal matrix
One which can be non-zero only on the main diagonal, the diagonal above, and the diagonal below.
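For instance, a \(4 \times 4\) tridiagonal matrix has the shape

\[ \begin{pmatrix} a_1 & b_1 & 0 & 0 \\ c_1 & a_2 & b_2 & 0 \\ 0 & c_2 & a_3 & b_3 \\ 0 & 0 & c_3 & a_4 \end{pmatrix} \]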
Pointer to Erdos-Ko-Rado and an extension by Hilton & Milner - could look at those to get ideas for tools?
Plus some history that's better covered in the Bruhn & Schaudt survey.
A couple of strengthenings of FUNC by looking at something even more general than a partial order, that seem not to be true (but maybe could be adjusted and re-opened in useful ways).
Don't think this direction is super likely to give leverage on FUNC because it moves things to a land with verrrry little structure.
Let \(n = \{0, 1, \dots, n-1\}\), \(n > 1\), and \(\mathcal{F} = \{F_i \subset n+1,\ i \in n\}\) such that, for any \(i \in n\), \(i \in F_i \subset i+1\), (∗)
I find this notation very confusing, but breaking it down:
The rest of the question seems to imply the way to think about this is as a sort of truth table for an order? We have n items and each \(F_{i}\) tells you which elements are \(\leq i\).
It's reflexive because each \(F_{i}\) contains \(i\), antisymmetric because no \(F_{i}\) contains anything larger than \(i\), but it's not necessarily transitive. So it's not necessarily a partial order.
But you can take the partial order given by a lattice (or a intersection-closed family of sets) and represent it with one of these things.
without loss of generality suppose that \(y\) is the only member of \(g^+_*(B)\) such that \(\{y\}\) is not a component of \(g_*(B)\)
this seems like we're losing a lot of generality! but "only" isn't necessary for the rest of the argument.
Therefore, some \(x \in \Omega^+_*(B)\) must be \(\Omega\)-abundant
I don't follow this leap.
\(\le 2^{|\Omega^+_*(g^+_*(B) \cup \{x\})|}\)
I don't see how this half of the inequality is motivated.
\(\Omega_*(g^+_*(B))\) is complete
This isn't true under the assumptions we've made - the subcase 2 condition and the case 2 condition together means this isn't complete.
(Not that I'm clear why its completeness matters here. The previous statement, that everything in the powerset is present, is true, but because \(\Omega\) with each set intersected with some set is also union-closed, not because of completeness.)
CASE I: \(g^+_*(B) \cap A_{k+1} \neq \emptyset\)
the thrust of case I is correct - if we have a minimum set that overlaps with everything in \(\Omega - A_{k+1}\) and it also overlaps with \(A_{k+1}\), then the minimum set that overlaps with everything in \(\Omega\) is the same size.
maximum
minimum? only? \(g_{*}^{+}\) only has one value.
≤
Is this meant to be an '='?
Lemma 17
True, but written strangely. (No need for the ceiling operator on the right side, or the log operator - it's just n+1.)
This lemma is important because if for some \(B \subseteq U(\Omega)\), \(\Omega_*(B)\) is complete and \(|\Omega^+_*(B)| < \lceil \log_2 |\Omega_*(B)| \rceil\) then some \(x \in B\) belongs to \(\Gamma(\Omega)\). This is due to the fact that if \(|\Omega^+_*(B)| < \lceil \log_2 |\Omega_*(B)| \rceil\) then \(\Omega_*(B)\) contains more components than the number of elements in the power-set \(\wp(\Omega^+_*(B))\).
This doesn't follow. Each element of \(\Omega_{*}(B)\) is not, in general, distinct.
32
Think this inequality is the other way around by the earlier definition?
Lemma 16
In other words, with m elements in the ground set, the most sets we can have in the family is 2^m. Yep!
We seek \(B \subseteq U(\Omega)\) such that \(\Omega_*(B)\) is union-closed but at most one component of \(\Omega_*(B)\) is the empty set.
Either there's an element or elements that are in every single set in \(\Omega\), or B is the whole universe of \(\Omega\).
5)
In general, this is just B (so long as everything in the ground set is in some member of \(\Omega\), which it might as well be, or you could pick a smaller ground set).
\(\Omega^+_*(B)\) is complete}
We haven't defined what it means for this thing to be complete, though. It's a set, not an ordered tuple of sets.
imply that there exists \(n \in \mathbb{N}\) such that \(A, B, C \in \wp[n]\)
I don't think this follows without a change to the definition of \(\lambda\).
\(\lambda(A, B) = 1 \implies |A| = |B|\)
This doesn't follow from the definition without some additional constraints. Some of the \(A_{i} \setminus C_{i}\) might coincide.
Ex: \((\emptyset),(1),(2),(12) \) reduces to \((\emptyset),(1)\) if you take \(C_{3}=C_{4}=(12)\).
Let \(A = \{A_1, A_2, \dots, A_n\} \in \wp[n]\) and \(B \in \wp[\infty]\). Consider the function \(\lambda : \wp[\infty] \times \wp[\infty] \to \{0, 1\}\) defined as such: \(\lambda(A, B) = 1\) if there exist sets \(C_1, C_2, \dots, C_n \subset U(A)\) (not all empty) such that \(B = \{A_i \setminus C_i : A_i \in A,\ 1 \le i \le n\}\); (16) otherwise \(\lambda(A, B) = 0\).
\(\lambda\) answers: is union-closed family B formed by taking A and taking away elements from some of its sets?
Lemma 9: For each \(n \in \mathbb{N}\), \(\Omega^{(1)} \in \wp[n]\) and \(\Omega^{(2)} \in \wp[n+1]\) there exist \(A, B \subset T_\infty\) such that \(\Omega^{(2)} \setminus \{B\} \in \wp[n]\) and \(\Omega^{(1)} \cup \{A\} \in \wp[n+1]\)
If one union-closed family has one more set than another, then you can transform one into the other by adding or subtracting that set.
Lemma 8: If \(\Omega \in \wp[\infty]\) then there exists \(A \in \Omega\) such that \(A\) is \(\Omega\)-prime.
Union-closed families have a basis.
Ω−prime
Other writers call this a basis set.
Therefore, if we prove the FC for the elements of \(\wp(n, m)\), \(m \ge 1\), we can be sure that we have established the result for all valid finite union-closed collections.
Or in other words, the labels that we stick on the ground set don't matter.
(9)
And this is "the set of all abundant elements in the union-closed family \(\Omega\)"
(8)
So this is just "all union-closed families".
\(T(F) = \inf\{1 \le k \le n : F \cap M_k \neq \emptyset\}.\)
The size of the smallest set in \(\mathcal{F}\).
I believe that v6 of this paper has unpatchable flaws.