70 Matching Annotations
  1. Nov 2022
    1. latent variable

      latent variables are variables that cannot be observed

    2. One way to detect overfitting in practice is to observe that the model has low training risk but high test risk during cross-validation

      overfitting = high accuracy during training and low accuracy during testing

    3. Model Fitting

      how well a model is learning

    4. cross-validation

      technique used to evaluate how well your model is doing
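      A minimal sketch of K-fold cross-validation with scikit-learn (my own example; the data, estimator, and number of folds are arbitrary choices):

      ```python
      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import cross_val_score

      # toy data: y = 3*x + noise (illustrative only)
      rng = np.random.default_rng(0)
      X = rng.uniform(-1, 1, size=(100, 1))
      y = 3 * X[:, 0] + 0.1 * rng.normal(size=100)

      # 5-fold CV: train on 4 folds, evaluate on the held-out fold, repeat
      scores = cross_val_score(LinearRegression(), X, y, cv=5)
      print(scores.mean())  # average validation score over the 5 folds
      ```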

    5. validation set

      I have always been confused by the validation set. It is a set used to provide a glimpse of how your model will react to unseen data. Usually you hold out a portion of the training set to create the validation set

    6. Regularization

      technique used to reduce overfitting
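      A common concrete form of this (my own sketch, an L2/ridge-style penalty, not quoted from the page): add a term penalizing large parameters to the training objective,

      $$\min_{\theta} \; \frac{1}{N}\sum_{n=1}^{N} \big(y_n - \theta^\top x_n\big)^2 + \lambda \|\theta\|^2$$

      Larger \(\lambda\) shrinks the parameters more aggressively and so reduces overfitting.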

    7. overfitting

      overfitting = during training the error is small, whereas during testing it is large

    8. Another phrase commonly used for expected risk is “population risk”

      From what I know, population risk is the number of individuals at risk. Is it the same thing as expected risk?

    9. independent and identically distributed

      What is a set of examples here? I am thinking of it as features rather than anything else. But features are dependent upon one another, so I am not sure what this means

    10. Affine functions are often referred to as linear functions in machine learning

      affine function = linear function

    11. Training or parameter estimation

      adjust the predictive model based on training data.

      In order to find good predictors, do one of two things: 1) find the best predictor based on some measure of quality (known as finding a point estimate), or 2) use Bayesian inference

    12. Prediction or inference

      predict on unseen test data. 'Inference' can mean prediction for non-probabilistic models, or parameter estimation

    13. goal of learning is to find a model and its corresponding parameters such that the resulting predictor will perform well on unseen data

      important

    14. noisy observation

      real-life data is always noisy

    15. example or data point

      I thought rows were observations or instances?

    16. we do not expect the identifier (the Name) to be informative for a machine learning task

      This is a good reminder to only query the columns or data that are relevant to the exercise

    17. features, attributes, or covariates
    18. What dowe mean by good models?

      This is a great question. I usually think of models as algorithms

    19. good models should perform well on unseendata

      The main idea in implementing a machine learning model

  2. Oct 2022
    1. This derivation is easiest to understand by drawing the reasoning as it progresses.

      The reasoning of the derivative?

    2. Example of a convex set

      an easy example for identifying convex sets. One way to determine whether a set is convex is to keep in mind that if, for any two points within the set, the line segment between them is also within the set, then the set is convex
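      Written out (standard definition, not quoted from this page): a set \(\mathcal{C}\) is convex iff

      $$\forall\, x, y \in \mathcal{C},\ \forall\, \theta \in [0, 1]: \quad \theta x + (1 - \theta) y \in \mathcal{C}$$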

    3. Lagrange multiplier

      Lagrange multipliers aim to find the local minima and maxima of a function subject to constraints
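      A minimal sketch for a single equality constraint \(g(x) = 0\) (standard setup, written from memory): form the Lagrangian and set its gradient to zero,

      $$\mathfrak{L}(x, \lambda) = f(x) + \lambda\, g(x), \qquad \nabla_x \mathfrak{L}(x, \lambda) = 0, \quad g(x) = 0$$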

    4. The step-size is also called the learning rate.

      when implementing a neural net, the learning rate is a hyperparameter that controls how much to adjust the weights w.r.t. the gradient
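      The basic gradient descent update that the learning rate \(\gamma\) controls (standard form; the transpose is there because this book writes gradients as row vectors):

      $$x_{k+1} = x_k - \gamma\, \big(\nabla f(x_k)\big)^\top$$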

    5. We use theconvention of rowvectors forgradients

      so a matrix? or just a row like this: [a b c]?
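      For a scalar-valued \(f\) it is a single row, not a matrix. A small example of my own: for \(f(x) = x_1^2 + x_2\),

      $$\nabla f = \begin{bmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 2x_1 & 1 \end{bmatrix} \in \mathbb{R}^{1 \times 2}$$

      For a vector-valued \(f: \mathbb{R}^n \to \mathbb{R}^m\), these rows stack into an \(m \times n\) Jacobian matrix.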

    6. \(\min_{x} f(x)\)

      This is important. For all optimization problems, the end goal is to minimize the function

    7. Linear Program

      interesting example, showing how linear programs can be plotted.

    8. relative frequencies of events of interest to the total number of events that occurred

      isn't this the definition of mean?

    9. abducted by aliens

      lol

    10. Theorem 4.3. A square matrix \(A \in \mathbb{R}^{n \times n}\) has \(\det(A) \neq 0\) if and only if \(\mathrm{rk}(A) = n\). In other words, A is invertible if and only if it is full rank

      refer to section 2.6.2 for rank definition

    11. \(p_A(\lambda) := \det(A - \lambda I)\)
    12. \((-1)^{k+j} \det(A_{k,j})\) a cofactor
    13. \(\det(A_{k,j})\) is called a minor
    14. \(\det(A) = \sum_{k=1}^{n} (-1)^{k+j}\, a_{kj} \det(A_{k,j})\)
    15. Adding a multiple of a column/row to another one does not change det(A)
    16. Swapping two rows/columns changes the sign of det(A)
    17. \(\det(\lambda A) = \lambda^{n} \det(A)\)
    18. If A is regular (invertible), then \(\det(A^{-1}) = \frac{1}{\det(A)}\)
    19. (4.7)

      determinant of 3x3 matrix
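      Writing out Sarrus' rule for the \(3 \times 3\) case (standard formula, reproduced from memory rather than from the page):

      $$\det\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33}\end{pmatrix} = a_{11}a_{22}a_{33} + a_{21}a_{32}a_{13} + a_{31}a_{12}a_{23} - a_{31}a_{22}a_{13} - a_{11}a_{32}a_{23} - a_{21}a_{12}a_{33}$$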

    20. (4.6)

      determinant of 2x2 matrix
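      For reference, the \(2 \times 2\) formula (standard):

      $$\det\begin{pmatrix} a_{11} & a_{12}\\ a_{21} & a_{22}\end{pmatrix} = a_{11}a_{22} - a_{12}a_{21}$$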

    21. is invertible if and only if det(A) \(\neq 0\)

      Invertible: det(A) \(\neq 0\)

    22. determinant of a square matrix A ∈ Rn×n is a function that maps A

      determinant

  3. Sep 2022
    1. rotation matrix

      coordinates of the rotation expressed in terms of the basis vectors

    2. rotation

      linear mapping that rotates a plane by angle \(\theta\) with respect to the origin

      if angle \(\theta > 0\), rotate counterclockwise
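      The standard 2D rotation matrix (writing it out here for reference; standard result, not quoted from the page):

      $$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$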

    3. orthogonal basis

      $$\langle b_{i}, b_{j} \rangle = 0, \quad i \neq j$$

    4. orthogonal complement

      Let W be a subspace of a vector space V. Then the orthogonal complement of W is also a subspace of V. Furthermore, the intersection of W and its orthogonal complement is just the zero vector.
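      In symbols (standard definition, not quoted from this page):

      $$W^{\perp} = \{\, v \in V : \langle v, w \rangle = 0 \ \ \forall\, w \in W \,\}$$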

    5. normal vector

      vector with magnitude 1, \(\|w\| = 1\), that is perpendicular to the surface

    6. Gram-Schmidt process

      concatenate the basis vectors (non-orthogonal and unnormalized) into a matrix, apply Gaussian elimination, and obtain an orthonormal basis
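      For comparison, the classical iterative (projection-based) version of Gram-Schmidt, written from memory rather than the Gaussian-elimination construction mentioned in the text:

      $$u_1 = b_1, \qquad u_k = b_k - \sum_{i=1}^{k-1} \frac{\langle b_k, u_i \rangle}{\langle u_i, u_i \rangle}\, u_i, \qquad e_k = \frac{u_k}{\|u_k\|}$$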

    7. Orthonormal Basis

      basis vectors = a set of linearly independent vectors; an orthonormal basis is also an orthogonal basis (its vectors additionally have unit norm)

    8. 3.32

      distances are preserved under an orthogonal matrix

    9. \(\|Ax\|^2 = (Ax)^\top (Ax) = x^\top A^\top A x = x^\top I x = x^\top x = \|x\|^2\)

      this is an important proof that an orthogonal matrix preserves the length (norm) of a vector

    10. Orthogonal Matrix

      $$AA^T = I = A^TA \Rightarrow A^{-1} = A^T$$ orthonormal columns

    11. 〈x, y〉

      this is equal to 1, which does not meet the requirement of orthogonality

    12. Orthogonality

      if \(\langle x, y \rangle = 0\); if additionally \(\|x\| = \|y\| = 1\), they are orthonormal. Geometrically: any two lines that are perpendicular, i.e. at a 90 degree angle

    13. cos ω

      used to find the angle between two vectors
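      The formula (standard, in the notation of this chapter), plus a small example of my own:

      $$\cos\omega = \frac{\langle x, y \rangle}{\|x\|\,\|y\|}$$

      e.g. for \(x = (1, 0)\) and \(y = (1, 1)\) with the dot product: \(\cos\omega = \frac{1}{\sqrt{2}}\), so \(\omega = 45^{\circ}\).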

    14. \((x, y) \mapsto d(x, y)\)

      if x and y are two points in a vector space, then you can find the distance between them

    15. \(d(x, y) := \|x - y\| = \sqrt{\langle x - y, x - y \rangle}\)

      so a Euclidean distance is the distance from point x to point y, i.e. the shortest path (a straight line). I don't understand the difference between distance and Euclidean distance. Isn't distance also a dot product? How would you do the calculation? (see the worked example below)
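      A small worked example (my own, using the dot product as the inner product): for \(x = (1, 2)\) and \(y = (4, 6)\),

      $$d(x, y) = \|x - y\| = \sqrt{\langle x - y, x - y \rangle} = \sqrt{(-3)^2 + (-4)^2} = 5$$

      The distance is called Euclidean when the dot product is used as the inner product; with a different inner product the same formula gives a different notion of distance.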

    16. inner product returns smaller values than the dot product if x1 and x2 have the same sign

      this is interesting

    17. satisfies (3.11) is called symmetric, positive definite

      symmetric positive definite

    18. (3.9)

      The inner product must be positive definite, symmetric and bilinear. Test for the inner product: let v = (1,2), then \(\langle v, v \rangle\) = (1)(1) − (1)(2) − (2)(1) + 2(2)(2) = 1 − 2 − 2 + 8 = 5 > 0 (symmetric, bilinear and positive definite)

      test for dot product: as per (3.5) the right side does not equal the left side
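      The computation above matches an inner product of the form (reconstructed from the arithmetic in my note; check against (3.9) in the text):

      $$\langle x, y \rangle := x_1 y_1 - (x_1 y_2 + x_2 y_1) + 2\, x_2 y_2$$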

    19. use the dot product defined in (3.5), we call (V, 〈·, ·〉) a Euclidean vector space

      euclidean vector space

    20. The pair (V, 〈·, ·〉) is called an inner product space

      inner product space

    21. positive definite, symmetric bilinear mapping Ω : V × V → R is called an inner product on V
    22. positive definite if \(\forall x \in V \setminus \{0\}: \Omega(x, x) > 0\), \(\Omega(0, 0) = 0\)
    23. symmetric if Ω(x, y) = Ω(y, x)

      recall: a symmetric matrix satisfies \(A = A^\top\) (and for any invertible matrix, \((A^{-1})^\top = (A^\top)^{-1}\))

    24. \(x^\top y = \sum_{i=1}^{n} x_i y_i\)

      inner product and dot product interchangeable here

    25. 3.4

      distance of a vector from the origin

    26. Positive definite: \(\|x\| \geq 0\) and \(\|x\| = 0 \iff x = 0\)
    27. Triangle inequality: \(\|x + y\| \leq \|x\| + \|y\|\)
    28. Absolutely homogeneous: \(\|\lambda x\| = |\lambda|\, \|x\|\)
    29. A norm on a vector space V is a function
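      Two standard examples that satisfy these properties (my own addition): the Manhattan (\(\ell_1\)) and Euclidean (\(\ell_2\)) norms,

      $$\|x\|_1 = \sum_{i=1}^{n} |x_i|, \qquad \|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}$$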