 Mar 2020

cs231n.github.io

Since Neural Networks are nonconvex
Neither convex nor concave.
"The fact that J has multiple minima can also be interpreted in a nice way. In each layer, you use multiple nodes which are assigned different parameters to make the cost function small. Except for the values of the parameters, these nodes are the same. So you could exchange the parameters of the first node in one layer with those of the second node in the same layer, and accounting for this change in the subsequent layers. You'd end up with a different set of parameters, but the value of the cost function can't be distinguished by (basically you just moved a node, to another place, but kept all the inputs/outputs the same)."
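The node-swapping argument in the quote can be checked numerically. The following is a minimal sketch, assuming a hypothetical one-hidden-layer network with tanh units and made-up weights (none of these names or values come from the source): exchanging two hidden nodes, and the matching weights in the next layer, gives a different parameter set with the same output.

```python
import math

def forward(x, W1, b1, W2, b2):
    """Tiny one-hidden-layer network: tanh hidden units, linear scalar output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden)) + b2

def swap_hidden(W1, b1, W2, i, j):
    """Exchange hidden nodes i and j, adjusting the next layer to match."""
    W1s, b1s, W2s = [row[:] for row in W1], b1[:], W2[:]
    W1s[i], W1s[j] = W1s[j], W1s[i]   # incoming weights of the two nodes
    b1s[i], b1s[j] = b1s[j], b1s[i]   # their biases
    W2s[i], W2s[j] = W2s[j], W2s[i]   # outgoing weights in the next layer
    return W1s, b1s, W2s

# Hypothetical parameters: 2 inputs, 3 hidden nodes, 1 output.
W1 = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8]]
b1 = [0.1, -0.2, 0.05]
W2 = [2.0, -1.0, 0.5]
b2 = 0.3

W1s, b1s, W2s = swap_hidden(W1, b1, W2, 0, 1)
x = [0.4, -1.2]
y_original = forward(x, W1, b1, W2, b2)
y_swapped = forward(x, W1s, b1s, W2s, b2)
```

Since the network output agrees for every input, any cost computed from it agrees too, so each minimum comes with permuted copies elsewhere in parameter space.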

 Apr 2018

wiki.c2.com

ConvexHull
In mathematics, the convex hull or convex envelope of a set X of points in the Euclidean plane or in a Euclidean space (or, more generally, in an affine space over the reals) is the smallest convex set that contains X. For instance, when X is a bounded subset of the plane, the convex hull may be visualized as the shape enclosed by a rubber band stretched around X. Wikipedia
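For a finite set of points in the plane, the "rubber band" picture can be made concrete. The sketch below uses Andrew's monotone chain, one standard algorithm among several and not something the quoted definition prescribes; the function names and sample points are illustrative assumptions.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

# A square with one interior point and one point on an edge.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(points)  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

The interior point (1, 1) and the edge point (1, 0) are dropped: only the corners the rubber band would catch on remain.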

 Jul 2016

arxiv.org

Ax=b
Apparently \(Ax = b\) is not required, but is used as a technical prop in subsequent proof construction.
If there are no linear constraints, either \(A\) and \(b\) can be viewed as zero, or \(Ax = b\) can be viewed as describing the smallest affine set that includes \(S\). In both cases, this effectively makes the constraint trivial.
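Written out, under my reading of the note (with \(aff(S)\) denoting the smallest affine set containing \(S\), and \(x \in \mathbb{R}^n\) assumed), both choices leave the feasible set equal to \(S\):

```latex
% Choice 1: A = 0, b = 0, so the equation holds for every x.
A = 0,\; b = 0 \quad\Longrightarrow\quad \{x : Ax = b\} = \mathbb{R}^n,
\qquad \{x \in S : Ax = b\} = S.

% Choice 2: (A, b) chosen so that the equation describes aff(S), which contains S.
\{x : Ax = b\} = \mathrm{aff}(S) \supseteq S
\quad\Longrightarrow\quad \{x \in S : Ax = b\} = S.
```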

A
How does one go about choosing (a good) A, especially since it seems to not be necessary to have a nontrivial (A,b) pair?

x()=2Feas, that is,1(x)< , establishing the claim
So it appears that we don't need to get feasible points at each iteration?
Answer: correct. This is pointed out a couple of pages later, where it is stated that \(x\) need not be feasible, but \(x'\) will be.

int(dom(f))
Why isn't \(x \in dom(f)\) sufficient?
Ah, I think it is sufficient for \(t' = f(x')\), but we also need \(x \in int(dom(f))\) for the subsequently mentioned subdifferentials.
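A standard one-dimensional example, not taken from the paper, of why membership in \(dom(f)\) alone does not guarantee a subgradient: a convex function can have an empty subdifferential at a boundary point of its domain.

```latex
f(x) =
\begin{cases}
  -\sqrt{x}, & x \ge 0, \\
  +\infty,   & x < 0,
\end{cases}
\qquad 0 \in \mathrm{dom}(f), \quad 0 \notin \mathrm{int}(\mathrm{dom}(f)).

% A subgradient g at 0 would need f(y) >= f(0) + g y for all y, that is,
-\sqrt{y} \;\ge\; g\,y \quad \text{for all } y > 0
\quad\Longleftrightarrow\quad
g \;\le\; -\frac{1}{\sqrt{y}} \quad \text{for all } y > 0,

% which no real g satisfies, since -1/\sqrt{y} \to -\infty as y \to 0^+.
\partial f(0) = \emptyset.
```

At interior points of the domain a convex function always has at least one subgradient, which is consistent with the requirement \(x \in int(dom(f))\).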

\((x', t') \in bdy(dom(f))\)
Confirmed that this should be \(bdy(epi(f))\), not \(bdy(dom(f))\)

because \(\alpha(x,t) = \alpha_1(x)\)
Why?

a value we now know to be positive and finite
Where is the proof that if \( \alpha_1 (x) \) is infinite, then \( \alpha_2 (x,t) \) is finite?
I now see that it starts at the beginning of page 3, but without any forward reference it seems that it is being skipped.

Approximating \(\alpha_2(x,t)\) is the same as approximating the solution to () = 0,
Not completely clear to me why this is the case.

Assume \(e\) is a known feasible point contained in \(int(S \cap dom(f))\).
How to pick? Perhaps a reference to strategies, or note that strategies will be discussed later.
