35 Matching Annotations
  1. Sep 2021
    1. $P_\theta(A) = \prod_{j > i} p^{a_{ij}}(1 - p)^{1 - a_{ij}}$

      Maybe you describe this in an earlier section, but what does $\theta$ mean? Also, I would make it clearer that this equation gives the probability that $A$ is a realization of an ER network with edge probability $p$.
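      Something like this might make the equation concrete (a minimal numpy sketch of my own; the function name and toy inputs are mine, not from the book):

      ```python
      import numpy as np

      def er_log_likelihood(A, p):
          """Log of P_theta(A) for an ER network: each edge above the
          diagonal is an independent Bernoulli(p) coin flip."""
          upper = A[np.triu_indices_from(A, k=1)]   # entries with j > i
          return np.sum(upper * np.log(p) + (1 - upper) * np.log(1 - p))

      # Toy undirected network: mirror a random upper triangle.
      rng = np.random.default_rng(0)
      A = np.triu((rng.random((5, 5)) < 0.3).astype(int), k=1)
      A = A + A.T
      print(er_log_likelihood(A, 0.3))   # log-probability, to avoid underflow
      ```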

    2. $\binom{n}{2}$

      Do you ever mention what "n-choose-2" means? You don't have to explain it, but put the phrase "n-choose-2" somewhere so the reader has something to Google. It is difficult to Google a math term if you don't know what it is called.
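      Even a one-liner would give the reader something concrete (the numbers here are hypothetical):

      ```python
      from math import comb

      n = 50
      print(comb(n, 2))   # "n-choose-2" = n * (n - 1) / 2 = 1225 possible edges
      ```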

    3. Above, we visualize the network using a heatmap. The dark squares indicate that an edge exists between a pair of nodes, and white squares indicate that an edge does not exist between a pair of nodes.
      1. What is the point of the figure on the right? With 50 nodes it is so cluttered that it doesn't really provide any additional information.
      2. Is the relative distance between nodes in the figure on the right meaningful? For example, is it meaningful that node 27 sits on the perimeter?
      3. If you wanted to convey information with the figure on the right, why not use a network of only 10 nodes? That would be much easier for a reader to check manually (see the sketch below). As it stands, I'm just trusting that this figure is correct, because checking 50 rows is a bit much.
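      For example, something along these lines (assuming networkx and matplotlib, which may not match your plotting stack) would be small enough to verify by hand:

      ```python
      import matplotlib.pyplot as plt
      import networkx as nx

      G = nx.erdos_renyi_graph(n=10, p=0.3, seed=0)   # 10 nodes: checkable by eye
      A = nx.to_numpy_array(G)

      fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 4))
      ax1.imshow(A, cmap="gray_r")                    # dark square = edge exists
      ax1.set_title("Adjacency heatmap")
      nx.draw(G, ax=ax2, with_labels=True)            # node-link layout
      ax2.set_title("Layout view")
      plt.show()
      ```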
    4. Given a probability and a number of nodes, we can easily describe the properties we would expect to see in a network if that network were ER. For instance, we know how many edges on average the nodes of an ER nework should have. We can reverse this idea, too: given a network we think might not be ER, we could check whether it’s different in some way from a network which is ER.

      This is phrased strangely, below is an attempt at cleaning it up:

      "Given a network, we can use the number of nodes to try and find a probability which would result in a similar ER network. Similarly, given a network we think might not be ER, we can use the simple properties of an ER network to check."

    1. network model

      You are defining significant vocabulary in sections that you said non-technical people might want to skip. Will these terms only appear in technical sections, or will people who skipped this not know what's going on later?

    2. Throughout many of the succeeding sections, we will attempt to make the content accessible to readers with, and without, a more technical background. To this end, we have added sections with trailing asterisks (*). While we believe these sections build technical depth, we don’t think they are critical to understanding many of the core ideas for network machine learning. In contrast with unstarred sections, these sections will assume familiarity with more advanced mathematical and probability concepts.

      I don't like how this is set up. As a reader, being told to skip entire sections that you have claimed build technical depth is frustrating. I'd still like to know what Equivalence Classes are, even if I wouldn't understand the math. I suggest opening each technical section with a paragraph summarizing the ideas in a way a non-technical person would understand, then following with the full explanation.

    3. this model.

      What is "this model"? How does it make these networks? You just say that a model "can make networks". Is the same model used for all of these figures? What makes it simple?

    4. undirected (meaning, edges connect nodes both ways)

      Define this better; "connecting both ways" is hard to envision. I'd point to the figure with "Three small, simple networks" as an example.

    1. Utility: The model of interest possesses the level of refinement or complexity that we need to answer our scientific question of interest, Estimation: The data has the level of breadth to facilitate estimation of the parameters of the model of interest, and Appropriateness: The model is appropriate for the data we are given.

      Make each item in the list a full sentence (to be consistent with the previous list). Having it all be one sentence, with several capitalized words in the middle, is weird to read.

    2. We learn from uncertainty and simplicity: When we do statistical inference, it is rarely the case that we prioritize a complex, convoluted model that mirrors our data suspiciously closely. Instead, we are usually interested in knowing how faithfully a simpler, more generally applicable model might describe the network. This relates directly to the concept of the bias-variance tradeoff from machine learning, in which we prefer a model which isn’t too specific (lower bias) but still describes the system effectively (lower variance).

      Give a social media example, as you did for the first two. Something like: a model that memorizes every individual friendship in a Facebook snapshot mirrors the data suspiciously closely, while a simpler model that only captures broad community structure generalizes better.

    3. deterministic, rather than stochastic

      I'm not sure whether the difference between these two is covered earlier, but for the sake of clarity I would show what the stochastic version of the social network would look like, something like the sketch below.
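      A minimal numpy sketch of the contrast (the networks and probabilities here are made up by me):

      ```python
      import numpy as np

      # Deterministic: the friendships we actually observed, fixed once and for all.
      A_observed = np.array([[0, 1, 0],
                             [1, 0, 1],
                             [0, 1, 0]])

      # Stochastic: each pair is friends with some probability, so every
      # draw produces a different plausible network.
      P = np.array([[0.0, 0.9, 0.1],
                    [0.9, 0.0, 0.7],
                    [0.1, 0.7, 0.0]])
      rng = np.random.default_rng(0)
      A_random = np.triu((rng.random(P.shape) < P).astype(int), k=1)
      A_random = A_random + A_random.T   # keep it symmetric (undirected)
      print(A_random)
      ```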

    4. In the context of network science, this means that even if we have a model we think describes our network very well, it is not the case that the model we select actually describes the network precisely and correctly.

      I'd rephrase this to distinguish more clearly between a model that describes an observed network (a representation) well and one that describes the underlying random network well.

    5. To determine the probability of the coin landing on heads, we will assume that the outcome of the coin is random, and that each time we toss the coin, we are realizing a random variable $\mathbf{x}$.

      This example helps in understanding realizations; I would include it (or a similar example) in the previous paragraph.
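      A tiny simulation would also make "realizing a random variable" concrete (my own sketch, not from the book):

      ```python
      import numpy as np

      rng = np.random.default_rng(42)
      p_heads = 0.6                                    # the quantity we want to estimate

      flips = rng.binomial(n=1, p=p_heads, size=100)   # 100 realizations of x
      print(flips[:10])                                # the first ten realizations
      print(flips.mean())                              # sample mean approximates p_heads
      ```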

    6. Our realizations are not $n$ disparate observations in $d$ dimensions; a realization in network machine learning is the full network itself, consisting of nodes, edges, and potential network attributes.

      I'm having a hard time understanding what a "realization" is from this and the previous paragraph. Above you describe it as being like a sample of an existing network, but here you say that it is the full network?
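      My best reading is that one draw from the model is an entire adjacency matrix, something like this (a sketch under that assumption):

      ```python
      import numpy as np

      def sample_er(n, p, rng):
          """One realization of an ER(n, p) network: the whole n x n
          adjacency matrix is a single sample, not n separate points."""
          A = np.triu((rng.random((n, n)) < p).astype(int), k=1)
          return A + A.T                 # undirected: mirror the upper triangle

      rng = np.random.default_rng(0)
      A1 = sample_er(10, 0.3, rng)       # realization #1
      A2 = sample_er(10, 0.3, rng)       # realization #2: a different network
      ```

      If that's right, stating it this plainly in the text would resolve my confusion.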

  2. Apr 2021
    1. You combine these all into a single classifier

      What does this look like? Does each of the classifiers contribute a number that is averaged? Is it like a random forest? An example would be really helpful for understanding.
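      For instance, is it something like majority voting? A scikit-learn sketch of one plausible reading (this is my guess at the mechanism, not necessarily yours):

      ```python
      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier, VotingClassifier
      from sklearn.linear_model import LogisticRegression
      from sklearn.neighbors import KNeighborsClassifier

      X, y = make_classification(n_samples=200, random_state=0)

      # Each base classifier votes; the ensemble predicts the majority class.
      combined = VotingClassifier(
          estimators=[
              ("logreg", LogisticRegression()),
              ("knn", KNeighborsClassifier()),
              ("forest", RandomForestClassifier(random_state=0)),
          ],
          voting="hard",
      )
      combined.fit(X, y)
      print(combined.predict(X[:5]))
      ```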