heatmap
range -1 +1
heatmap
range -1 +1
Since most of these metrics already exist in networkx, you'll just pull from there. You can check the networkx documentation for details.
use graspologic
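A minimal sketch of what "pulling from networkx" could look like (the note above prefers graspologic where possible; the graph G below is just a stand-in):

import networkx as nx

G = nx.erdos_renyi_graph(50, 0.3)      # stand-in graph; any networkx graph works

density = nx.density(G)                # fraction of possible edges that exist
clustering = nx.average_clustering(G)  # mean local clustering coefficient
transitivity = nx.transitivity(G)      # global proportion of closed triangles
print(density, clustering, transitivity)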
community
estimated community
features
these features
for reference, gives an indication of how clustered the network is, and works by counting the proportion of triangles out of all edges that share a node.
refer to previous definition
triangles
clarify earlier that we always mean closed triangles afterward
statistics
for the 4 statistics we show
$10^{350}$
cite # atoms in universe, number of chess games, number of go games on a standard go board
How do you define a distance between one node and another node?
How do you divide a network by the number 6
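For the distance question, the usual answer is shortest-path length; a hedged networkx sketch (the stand-in graph is arbitrary, and we restrict to the largest connected component so distances are defined):

import networkx as nx

G = nx.erdos_renyi_graph(50, 0.1)                           # stand-in graph
Gcc = G.subgraph(max(nx.connected_components(G), key=len))  # largest component
nodes = list(Gcc)
d = nx.shortest_path_length(Gcc, source=nodes[0], target=nodes[-1])
avg_d = nx.average_shortest_path_length(Gcc)   # the "six degrees" quantity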
largest
mention strongly and weakly connected (in a sentence)
(SI, BK), (BK, MH), (MH, Q), (MH, BX), (Q, BX)
use 1 letter acronyms for each borrough
Island
,
$A$
for undirected graphs
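For concreteness, a sketch of the borough network and its adjacency matrix as decoded above (the two-letter codes SI, BK, MH, Q, BX are assumptions from context; the note above wants them shortened further):

import networkx as nx

edges = [("SI", "BK"), ("BK", "MH"), ("MH", "Q"), ("MH", "BX"), ("Q", "BX")]
G = nx.Graph(edges)   # undirected borough network
A = nx.to_numpy_array(G, nodelist=["SI", "BK", "MH", "Q", "BX"])
# A is symmetric because the graph is undirected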
figure
move MH
see
change layout so they are not linear
degree
do directed case too
matrix
do directed case too
$degree(i) = \sum_{j = 1}^n a_{ij} = \sum_{j : a_{ij} = 1} a_{ij} + \sum_{j : a_{ij} = 0} a_{ij} = \sum_{j : a_{ij} = 1} 1 + 0$
remove
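A small numpy sketch of what the equation above says (degree as a row sum), plus the directed case the notes ask for:

import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
degrees = A.sum(axis=1)          # degree(i) = sum_j a_ij, here [2, 1, 1]

A_dir = np.array([[0, 1, 0],     # directed: row sums are out-degrees,
                  [0, 0, 1],     # column sums are in-degrees
                  [1, 0, 0]])
out_deg, in_deg = A_dir.sum(axis=1), A_dir.sum(axis=0)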
adjacency
weighted
$j$
for example
matrix
hollow
trite
why?
In the context of this book, you will usually only worry about the undirected case, or when the presence of an arrow implies that the other direction exists, too. A network is undirected if an edge between node $i$ and node $j$ implies that node $j$ is also connected to node $i$. For this reason, you will usually omit the arrows entirely, like you show below:
we
red
super not salient
you look
we see
The
An Adjacency Matrix not The Adjacency Matrix
Matrix
there are n! different adjacency matrices, and we are looking at 1 of them.
Laplacian
mention directed graphs
and mention all the ones in graspologic
This lets you make inferences with $L^{normalized}$ that you couldn't make with $L$.
true?
code
there should be a function in graspologic to convert an adjacency to a laplacian and back.
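graspologic does cover the forward direction via graspologic.utils.to_laplacian; as far as I know there is no built-in inverse, but for the unnormalized Laplacian the reverse is a one-liner (a sketch, assuming a hollow adjacency matrix A):

import numpy as np
from graspologic.utils import to_laplacian

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
L_dad = to_laplacian(A, form="DAD")   # normalized Laplacian D^{-1/2} A D^{-1/2}

D = np.diag(A.sum(axis=1))            # unnormalized form L = D - A
L = D - A
A_back = np.diag(np.diag(L)) - L      # diag(L) = D because A is hollow
assert np.allclose(A, A_back)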
below
?
AttributeError Traceback (most recent call last)
fix
4.1.2. The Incidence Matrix

Instead of having values in a symmetric matrix represent possible edges, like with the Adjacency Matrix, we could have rows represent nodes and columns represent edges. This is called the Incidence Matrix, and it's useful to know about, although it won't appear too much in this book. If there are $n$ nodes and $m$ edges, you make an $n \times m$ matrix. Then, to determine whether a node is a member of a given edge, you'd go to that node's row and the edge's column. If the entry is nonzero (1 if the network is unweighted), then the node is a member of that edge, and if there's a 0, the node is not a member of that edge. You can see the incidence matrix for our network below. Notice that with incidence plots, edges are (generally arbitrarily) assigned indices as well as nodes.

from networkx.linalg.graphmatrix import incidence_matrix

cmap = sns.color_palette("Purples", n_colors=2)
I = incidence_matrix(nx.Graph(A)).toarray().astype(int)

fig, axs = plt.subplots(1, 2, figsize=(12, 6))
plot = sns.heatmap(I, annot=True, linewidths=.1, cmap=cmap, cbar=False,
                   xticklabels=True, yticklabels=True, ax=axs[0])
plot.set_xlabel("Edges")
plot.set_ylabel("Nodes")
plot.set_title("Incidence matrix", fontsize=18)

nx.draw_networkx(G, with_labels=True, node_color="tab:purple", pos=pos,
                 font_size=10, font_color="whitesmoke", arrows=True,
                 edge_color="black", width=1, ax=axs[1])
ax2 = plt.gca()
ax2.text(.24, 0.2, s="Edge 1", color='black', fontsize=11, rotation=65)
ax2.text(.45, 0.01, s="Edge 0", color='black', fontsize=11)
ax2.set_title("Layout plot", fontsize=18)
sns.despine(ax=ax2, left=True, bottom=True)

When networks are large, incidence matrices tend to be extremely sparse, meaning their values are mostly 0's. This is because each column must have exactly two nonzero values along its rows: one value for the first node its edge is connected to, and another for the second. Because of this, incidence matrices are usually represented computationally in Python as scipy's sparse matrices rather than as numpy arrays, since this data type is much better-suited for matrices which contain mostly zeroes. You can also add orientation to incidence matrices, even in undirected networks, which we'll discuss next.

4.1.3. The Oriented Incidence Matrix

The oriented incidence matrix is extremely similar to the normal incidence matrix, except that you assign a direction or orientation to each edge: you define one of its nodes as being the head node, and the other as being the tail. For undirected networks, you can assign directionality arbitrarily. Then, for the column in the incidence matrix corresponding to a given edge, the tail node has a value of $-1$, and the head node has a value of $1$. Nodes which aren't a member of a particular edge are still assigned values of $0$. We'll give the oriented incidence matrix the name $N$.
from networkx.linalg.graphmatrix import incidence_matrix

cmap = sns.color_palette("Purples", n_colors=3)
N = incidence_matrix(nx.Graph(A), oriented=True).toarray().astype(int)

fig, axs = plt.subplots(1, 2, figsize=(12, 6))
plot = sns.heatmap(N, annot=True, linewidths=.1, cmap=cmap, cbar=False,
                   xticklabels=True, yticklabels=True, ax=axs[0])
plot.set_xlabel("Edges")
plot.set_ylabel("Nodes")
plot.set_title("Oriented Incidence matrix $N$", fontsize=18)
plot.annotate("Tail Node", (.05, .95), color='black', fontsize=11)
plot.annotate("Head Node", (.05, 1.95), color='white', fontsize=11)

nx.draw_networkx(G, with_labels=True, node_color="tab:purple", pos=pos,
                 font_size=10, font_color="whitesmoke", arrows=True,
                 edge_color="black", width=1, ax=axs[1])
ax2 = plt.gca()
ax2.text(.24, 0.2, s="Edge 1", color='black', fontsize=11, rotation=65)
ax2.text(.45, 0.01, s="Edge 0", color='black', fontsize=11)
ax2.set_title("Layout plot", fontsize=18)
sns.despine(ax=ax2, left=True, bottom=True)
ax2.text(-.1, -.05, s="Tail Node", color='black', fontsize=11)
ax2.text(.9, -.05, s="Head Node", color='black', fontsize=11)

Although we won't use incidence matrices, oriented or otherwise, in this book too much, we introduced them because there's a deep connection between incidence matrices, adjacency matrices, and a matrix representation that we haven't introduced yet called the Laplacian. Before we can explore that connection, we'll discuss one more representation: the degree matrix.
put in appendix
ax2.text(0, 0.2, s="Nodes 0 and 2 \nare connected", color='black', fontsize=11, rotation=63)
add nodes 0 and 1 are connected
The most
A
1.1.2. Viewing Networks Statistically
Network machine learning v. network data science
1.1. What Is A Network?
What is Network Machine Learning
features
give example, but then show something else
For instance, each node might have a set of features attached to it: extra information that comes in the form of a table. For instance
2 for instances
refer to this in preface, don't put it here
Acknowledgements
add all the other crap he included
models
i wouldn't use the word 'model' in this book
The theoretical underpinnings of the techniques described in the book
i'd put in appendix
network data science
maybe make a venn diagram including networks, data science, and machine learning
Network Machine Learning and You
start with something awesome that happened in the world because of network data science?
a statistical object
avoid 'statistics' everywhere
This book assumes you know next to nothing about how networks can be viewed as a statistical object
about how you can learn from network data.
not 'viewed as a stats object'
With the nodes as employees, edges as team co-assignment between employees, and node covariates as employee performance, you can isolate groups within your company which are over or underperforming employee expectations.
move to end before brains
So
delete 'so'
Representations of
dos and donts
, but are often more generalizable
remove
Connection
provide intuition or move to appendix
$n$ nodes
permutation invariant
this book
first in sentence
2
1
without
so we estimate them?
$\rho$
let's mention Rho too
5.5.3. Correlated Network Models
put time-series after this
$R^{(1)}$ and
let these be the identity matrix. i think you can just QR these, and then inverse QR R^3
score matrices
clarify that they will be dfeind below
Let's try this first by looking at the latent position matrices for $\mathbf A^{(1)}$ and $\mathbf A^{(2)}$ from the random networks for Monday and Tuesday:
they are implicit and not necessarily the same, rather, only X'X is the same.
$P$
M
$P = XX^\top$!
"!"
Pseudoinverse
go end to end
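One way to "go end to end" in a sketch: build P = X X^T from toy latent positions, then recover them (up to sign/rotation) spectrally, with the pseudoinverse closing the loop (the toy numbers are assumptions):

import numpy as np

X = np.array([[0.9], [0.6], [0.3]])           # n = 3 nodes, d = 1
P = X @ X.T

evals, evecs = np.linalg.eigh(P)              # ascending eigenvalues
X_hat = evecs[:, -1:] * np.sqrt(evals[-1])    # top eigenpair recovers X up to sign
assert np.allclose(np.abs(X_hat), np.abs(X))

# the pseudoinverse view: X^+ P = X^T whenever P = X X^T
assert np.allclose(np.linalg.pinv(X) @ P, X.T)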
array([0.74745577, 0.49019859])
too many sig digs
If you think about it, however, you can think of this adjacency vector as a kind of estimate for edge probabilities. Say you sample a node's adjacency vector from an RDPG, then you sample again, and again. Averaging all of your samples will get you closer and closer to the actual edge connection probabilities.
not quite
The average values were .775 and .149 - so, pretty close!
say we computed
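The averaging claim is easy to demo; a sketch reusing the two probabilities echoed in the excerpt (.775 and .149 appear only for continuity):

import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.775, 0.149])                     # true edge probabilities
samples = rng.binomial(1, p, size=(10_000, 2))   # repeated adjacency-vector draws
print(samples.mean(axis=0))                      # approaches p as draws accumulate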
took a long time
computation
$ER$
ER_n(p) or ER random network every time
their paper is called "On a two-truths phenomenon in spectral graph clustering" - it's an interesting read for the curious)
up
- in 2018 -
,
Adjacency
combine with the below
useful information
careful
them
explicitly refer to the figure
Heck
goes earlier
Laplacian
never write 'the Laplacian'; always write 'normalized (etc.) Laplacian'
two
is this a rank 2 matrix?
Stochastic Block Model
no caps
Spectral Embedding
no caps
$\lambda_1 \vec u_1 \vec u_1^\top + \lambda_2 \vec u_2 \vec u_2^\top + ... + \lambda_n \vec u_n \vec u_n^\top$
confusing
-
close
in MSE
ninety-degree
maybe not so correct, eg, 90 degrees in high dimensions does not specify the other angles?
Singular Value Decomposition
lower case
and another which rotates them bac
i wouldn't say this
$\begin{bmatrix} x_{11} & & & " \\ & x_{22} & & \\ & & \ddots & \\ " & & & x_{nn} \end{bmatrix}$
makes me think this is diagonal
you'll
the alg
submatrices
not them
break a single matrix
decompose a matrix
or factorize?
default
for undirected graphs
Laplacians
such a thing as directed laplacians
The
comes out of nowhere?
started
present tense
The
make more high level
5.2.7. Network models for networks which aren't simple
make part of ER section
whether you're using a neural network, or a decision tree, or whether your goal is to classify observations or to predict a value using regression -
those 'or's are not parallel
machine learning
CS and physics
6.3.1. The Coin Flip Example
starred? maybe mention MLE in section title?
$p^*$
ptilde
two plots look almost nothing alike
i wouldn't say that
$\mathbb P_\theta(\mathbf A = A) = \mathbb P(\mathbf a_{11} = a_{11}, \mathbf a_{12} = a_{12}, ..., \mathbf a_{nn} = a_{nn}) = \prod_{i,j} \mathbb P(\mathbf a_{ij} = a_{ij})$
don't forget punctuation
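The factorization above is what makes the likelihood computable edge-by-edge; a sketch for the single-parameter (ER) case, where every potential edge shares one probability p (toy values assumed):

import numpy as np

p = 0.3
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
iu = np.triu_indices_from(A, k=1)   # count each undirected edge once
likelihood = np.prod(p ** A[iu] * (1 - p) ** (1 - A[iu]))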
(in network statistics, often just one)
not quite
we can never hope to understand the true distribution of $\mathbf A$.
strong
$\langle \vec x - \vec y, \vec x - \vec y \rangle$
remove
With $\vec x$ defined as above in the sum example, we would have that: $\prod_{i = 1}^3 x_i = 6.12$. If we were to use $\mathcal I = \{1, 3\}$, then: $\prod_{i \in \mathcal I} x_i = 3.4$
redundant
...ted to multiply all the elements of $\vec x$, we would write: $\prod_{i = 1}^d x_i = x_1 \times x_2 \times ... \times x_d$. Where $\times$ is just multiplication like you are probably used to. Again, we have the exact same indexing conventions, where: $\prod_{i \in [d]} x_i = \prod_{i = 1}^d x_i = x_1 \times x_2 \times ... \times x_d$
redundant
In the general case, for a set $\mathcal S$, we would say that $S \in \mathcal S^{r \times c}$ if (think this through!) for any $i \in [r]$ and any $j \in [c]$, $s_{ij} \in \mathcal S$
overkill?
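The running product-notation example checks out numerically (the vector below is back-solved from the excerpt's 6.12 and 3.4, so treat it as an assumption):

import numpy as np

x = np.array([1.0, 1.8, 3.4])
print(np.prod(x))          # 6.12, the product over all entries
print(np.prod(x[[0, 2]]))  # 3.4, the product over I = {1, 3} (1-indexed)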
. We Like previously, there are two types of RDPGs: one in which $X$ is treated as known, and another in which $X$ is treated as unknown.
delete
Symbolically
this is just 1 word.
such
such as
300
be consistent. let's stick with 100 students
, since they are undirected; if we relax the requirement of undirectedness (and allow directed networks) $B$ no longer need be symmetric.
delete
simple
undirected
Imagine that we are flipping a single fair coin. A fair coin is a coin in which the probability of seeing either a heads or a tails on a coin flip is $\frac{1}{2}$. Let's imagine we flip the coin 20 times, and we see 10 heads and 10 tails. What would happen if we were to flip 2 coins, which had a different probability of seeing heads or tails? Imagine that we flip each coin 10 times. The first 10 flips are with a fair coin, and we might see an outcome of five heads and five tails. On the other hand, the second ten flips are not with a fair coin, but with a coin that has a $\frac{4}{5}$ probability to land on heads, and a $\frac{1}{5}$ probability of landing on tails. In the second set of 10 flips, we might see an outcome of nine heads and one tails. In the first set of 20 coin flips, all of the coin flips are performed with the same coin. Stated another way, we have a single group, or a set of coin flips which are similar. On the other hand, in the second set of twenty coin flips, ten of the coin flips are performed with a fair coin, and ten of the coin flips are performed with a different coin which is not fair. Here, we have two clusters of coin flips: those that occur with the first coin, and those that occur with the second coin. Since the first cluster of coin flips are with a fair coin, we expect that coin flips from the first cluster will not necessarily have an identical number of heads and tails, but at least a similar number of heads and tails. On the other hand, coin flips from the second cluster will tend to have more heads than tails.
but a simplified version of this after the description.
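Per the note, a simplified version could be a two-line simulation (a sketch; the seed and flip counts are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
fair_heads = rng.binomial(n=10, p=1/2)     # cluster 1: fair coin, 10 flips
biased_heads = rng.binomial(n=10, p=4/5)   # cluster 2: biased coin, 10 flips
print(fair_heads, biased_heads)            # roughly 5 vs. 8-9 heads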
happens
but below equation on 1 line
ER network
ER random network
nework
network
Also, we let $\mathbf a_{ii} = 0$, which means that all self-loops are always unconnected
Also, we assume it is not possible for anyone to be friends with themselves, which means that a_ii = NaN for all i.
When $i > j$, we allow $\mathbf a_{ij} = \mathbf a_{ji}$. This means that the connections across the diagonal of the adjacency matrix are all equal, which means that we have built-in the property of undirectedness into our networks
too complicated.
We assume here that edges are undirected meaning that if there is an edge from i to j then there is also an edge from j to i
Erdös Rényi network
no such thing as an ER network
The Erdös Rényi model formalizes this relatively simple model with a single parameter:
change something, either this sentence or the table, iid belongs somewhere
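Sampling from the single-parameter model is one line in graspologic (a sketch; n and p are arbitrary):

from graspologic.simulations import er_np

A = er_np(n=50, p=0.3)   # one draw of an ER_50(0.3) random network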
model
use a different word
If $A$ and $A''$ are members of different equivalence classes; that is, $A \in E$ and $A'' \in E'$ where $E, E'$ are equivalence classes, then $\mathbb P_\theta(A) \neq \mathbb P_\theta(A'')$.
delete
($A$
otherwise, ...
$E$
use different notation
For the case of one observed network $A$, an estimate of $\theta$ (referred to as $\hat\theta$) would simply be for $\hat\theta$ to have a 1 in the entry corresponding to our observed network, and a 0 everywhere else. Inferentially, this would imply that the network-valued random variable $\mathbf A$ which governs realizations $A$ is deterministic, even if this is not the case.
MLE
variaable
typo
parameters
distribution
,
above figure goes before section header
To
directed with loops for simplicity
think
first think about nodes
first
first edge?
edges are not ordered
Fundamentally
and it borrows strength
Now, the only question is how to actually pull the separate latent positions for each network from this matrix.
clarify
big
do we want to use 4 different symbols for the 4 different numbers?
rows
blocks
you don't have to rotate or flip your points to line them up across embeddings.
assuming no multiplicity of eigenvalues
big circle
big red circle. i prefer no outlines
bad
not very bad
-
remove '-' here
rotations of each other
i don't think this is english
none of the embeddings are rotations of each other
not quite right. talk to pedigo.
two
d_i
So
ask pedigo
In this example they have eight (the number of columns in our combined matrix), but remember that in general we'd have $m \times d$. T
the problem is that they are not in the same subspace
averaging
where did .5 go
great
often a great idea (for example....)
node
these embeddings should look different
he edges of
remove
Remember
new section heading: averaging
Grouping the Networks Separately
different ways to do joint analysis
normal
remove
Below, we turn our list of networks into a 3-D numpy array, and then do the above multiplication to get a new 3-D numpy array of score matrices. Because we embedded into two dimensions, each score matrix is $2 \times 2$, and the four score matrices are "slices" along the 0th axis of the numpy array.
generate multiple networks
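A hedged numpy sketch of that stacking-and-multiplying step (the shapes and the stand-in V are assumptions; in the book V comes from the joint embedding):

import numpy as np

rng = np.random.default_rng(0)
As = rng.random((4, 20, 20))
As = (As + As.transpose(0, 2, 1)) / 2      # four symmetric stand-in networks
V = np.linalg.qr(rng.random((20, 2)))[0]   # stand-in joint embedding, 20 x 2

scores = V.T @ As @ V                      # R^(i) = V^T A^(i) V, stacked
print(scores.shape)                        # (4, 2, 2): four 2 x 2 score matrices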
Any
explain how to get each V first
Combining the networks
add omni as another one
Combining the classifiers
remove
Multiple Adjacency Spectral Embedding
3.5.1
communities
we are learning representations and we can do whatever we want with them; we illustrate with classification just to show something,
product
find a better word
Model
clarify label swapping
Adjacency Spectral Embedding
only under some strict models. clarify
For example, say you have nine networks of normal brains and nine networks of abnormal brains.
nine networks from each population.
To illustrate this point, we simulate 18 brains using the below code.
The normal and abnormal brains are all drawn from stochastic block models, but they come from different distributions. If you look at the code, youโll see that the normal brains were set up to have strong connections within communities, whereas the abnormal brains were set up to have strong connections between communities. Below is a plot of the adjacency networks for every normal and every abnormal brain weโve created.
this paragraph goes below this figure?
types
groups
create
estimate
algorithm
please no
[0, 1]
(0,1)
($\vec x_i^\top \vec x_j$
define transpose notation
2
maybe 1
We will call the matrix $P$ the probability matrix, whose $i^{th}$ row and $j^{th}$ column is the entry $p_{ij}$, as defined above. Stated another way, $P = (p_{ij})$
this goes before any of the actual models, it is the fundamental object associated with any independent edge model
$P$
for each model, write out the P matrix.
We showed above the model for the random node-assignment variable, $\vec{\pmb \tau}$, above.
remove
process
variable
$b_{11}$
connect to tau better. and remove the \mathcal{E}
R.V.
write out random variable
50
what happened to the other 100?
Likelihood
can you also write out
P = tau' B tau
use this later to show that you can always rewrite the above as
P = x' x, where x = tau * sqrt(B)
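That identity is quick to verify with one-hot assignments; here sqrt(B) is realized as a Cholesky factor, which requires B to be positive definite (an assumption of this sketch):

import numpy as np

T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])   # one-hot tau, n = 4, K = 2
B = np.array([[0.8, 0.2],
              [0.2, 0.6]])
P = T @ B @ T.T                                  # P = tau B tau'

Lc = np.linalg.cholesky(B)                       # B = Lc Lc'
X = T @ Lc
assert np.allclose(P, X @ X.T)                   # P = X X'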
$K$
little k
$\vec\tau$
not a parameter, given realization of a random variable
known
forgot an asterisk
Logically
not logic
unique
remove
cluster
group
2
write out numbers < 10
$E_i$,
maybe add the union of these equivalence classes is A_n
1. and 2.
remove periods
if it is totally unambiguous what $\theta$ refers to,
remove
As there are no diagonal entries to our matrix $A$, this means that there are $n^2 - n = n(n - 1)$ values that are definitely not 0 in $A$, since there are $n$ diagonal entries of $A$ (due to the fact that it is hollow)
wrong, might not be 0
don't talk about AdjMat as the network
To summarize
before this, put the 'formally, ...." stuff
$A$ is an $n \times n$ matrix with 0s and 1s, $A$ is symmetric, $A$ is hollow
this should be formal if you write 'formally'
$a_{ij} \in \{0, 1\}$, $a_{ij} = a_{ji}$, $a_{ii} = 0$, $\forall i, j \in [n]$
$|\mathcal E|$
m, and then use it, not m(A)
This means that all the networks are hollow.
hollow is a property of the adjacency matrix
right
appropriate/ justified / reasonable
learn
explore and exploit
If you care about what's under the hood you should also have a reasonable understanding of college-level math, such as calculus, linear algebra, probability, and statistics.
add, "but you won't need it to understand the content, except certain sections that we will highlight as 'advanced material'"
http://docs.neurodata.io/graph-stats-book/intro.html
hyperlink
Johns Hopkins University
the NeuroData lab at
for example
capitalize.
for each example, define the nodes and edges
Network Machine Learning
no caps