24 Matching Annotations
  1. Last 7 days
    1. If we’re not careful, learning algorithms will generalize based on the majority culture, leading to a high error rate for minority groups. Attempting to avoid this by making the model more complex runs into a different problem: overfitting to the training data, that is, picking up patterns that arise due to random noise rather than true differences. One way to avoid this is to explicitly model the differences between groups, although there are both technical and ethical challenges associated with this.

      Challenging to address high error rate for minority groups

    2. machine learning might perform worse for some groups than others is sample size disparity. If we construct our training set by sampling uniformly from the training data, then by definition we’ll have fewer data points about minorities. Of course, machine learning works better when there’s more data, so it will work less well for members of minority groups, assuming that members of the majority and minority groups are systematically different in terms of the prediction task.

      Under-representation of minorities in the training data leads to performance issues
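
The sample-size disparity above can be sketched with a small simulation. This is my own illustration, not from the text: the label-flipping rule is an assumed, deliberately extreme form of "systematically different" groups, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a 1-D feature predicts the label, but the
# relationship is reversed for the minority group (the groups are
# systematically different with respect to the prediction task).
def make_group(n, minority):
    x = rng.normal(0.0, 1.0, n)
    y = (x > 0).astype(int)
    if minority:
        y = 1 - y
    return x, y

# Uniform sampling yields 950 majority vs. 50 minority training points.
x_maj, y_maj = make_group(950, minority=False)
x_min, y_min = make_group(50, minority=True)
x = np.concatenate([x_maj, x_min])
y = np.concatenate([y_maj, y_min])

# One shared rule: predict the majority training label on each side of 0.
pos_label = int(y[x > 0].mean() > 0.5)
neg_label = int(y[x <= 0].mean() > 0.5)
predict = lambda xs: np.where(xs > 0, pos_label, neg_label)

def error_rate(minority):
    xt, yt = make_group(10_000, minority)
    return float((predict(xt) != yt).mean())

maj_err = error_rate(False)  # near 0: the rule fits the majority
min_err = error_rate(True)   # near 1: the minority is mispredicted
print(maj_err, min_err)
```

Because the pooled fit is dominated by the 950 majority points, the shared rule matches the majority pattern and gets the minority almost entirely wrong.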

    3. A telling example of this comes from machine translation.

      Would this issue persist if two human translators (one translating English -> Turkish, the other translating Turkish -> English) were involved instead?

    4. Absent specific intervention, machine learning will extract stereotypes, including incorrect and harmful ones, in the same way that it extracts knowledge.

      ML emphasizes societal stereotypes.

    5. photography technology involves a series of choices about what is relevant and what isn’t, and transformations of the captured data based on those choices.

      Further discussions can be had about cameras that capture infra-red (thermal/night vision) and ultra-violet or x-ray wavelengths. Often these images are "colorized" based on the preference of some graphic artist (e.g., colours added to black-hole images, the cosmic microwave background, etc.). The colour scheme used can also be subjective.

    6. Race is not a stable category; how we measure race often changes how we conceive of it, and changing conceptions of race may force us to alter what we measure.

      Another, perhaps more contemporary, example has to do with gender. Gender-based data from the early 2000s and the mid-2020s would look very different!

    7. In fact, measurement is fraught with subjective decisions and technical difficulties.

      "How to Lie with Statistics" is a classic book that explores measurement problems as well as display issues.

    8. Kate Crawford points out that the data reflect the patterns of smartphone ownership, which are higher in wealthier parts of the city compared to lower-income areas and areas with large elderly populations.

      I have a similar gripe about the multi-factor authentication systems (MFAs) being used ubiquitously.

    9. The figure below shows the stages of a typical system that produces outputs using machine learning.

      This directed graph is similar to ones used for mathematical modelling, except that "Individuals" is usually replaced by "Computations"

    10. Amazon argued that its system was justified because it was designed based on efficiency and cost considerations and that race wasn’t an explicit factor.

      Defenses of prejudiced systems often resort to claims of blindness

    11. younger defendants are statistically more likely to re-offend, judges are loath to take this into account in deciding sentence lengths, viewing younger defendants as less morally culpable.

      Important example of using experience and intuition over data

    12. But there are serious risks in learning from examples. Learning is not a process of simply committing examples to memory. Instead, it involves generalizing from examples: honing in on those details that are characteristic of (say) cats in general, not just the specific cats that happen to appear in the examples. This is the process of induction: drawing general rules from specific examples—rules that effectively account for past cases, but also apply to future, as yet unseen cases, too. The hope is that we’ll figure out how future cases are likely to be similar to past cases, even if they are not exactly the same.

      flaws in only relying on examples.

    13. We cannot hand code a program that exhaustively enumerates all the relevant factors that allow us to recognize objects from every possible perspective or in all their potential visual configurations.

      The need for "learning" over "branching"

    14. In many head-to-head comparisons on fixed tasks, data-driven decisions are more accurate than those based on intuition or expertise.

      Data-driven decisions triumph over intuition or experience. However, intuition and experience play an important role in addressing or resolving unusual (outlier) situations.

    1. a supervised deep learning algorithm will generally achieve acceptable performance with around 5,000 labeled examples per category and will match or exceed human performance when trained with a dataset containing at least 10 million labeled examples.

      Does this sentence need a qualifier about the type of task?

    2. The field of deep learning is primarily concerned with how to build computer systems that are able to successfully solve tasks requiring intelligence, while the field of computational neuroscience is primarily concerned with building more accurate models of how the brain actually works.

      key difference between deep learning and computational neuroscience.

    3. A comprehensive history of deep learning is beyond the scope of this textbook.

      There is a lot of mention of the biological and neurological sciences in the history of deep learning. However, a group that perhaps should get some recognition in this field is the computational chemists. Since the 1970s, computational chemists have been using sophisticated techniques (LCAO, density functional theory, coupled cluster theory, etc.) to develop models for multi-particle systems. These have aspects of parameter fitting, updating weights through layers of computation, and feedback loops for updating models.

    4. there is no single correct value for the depth of an architecture, just as there is no single correct value for the length of a computer program. Nor is there a consensus about how much depth a model requires to qualify as “deep.”

      This is interesting. How important is the concept of 'depth' in deep learning?

    5. This is because the system’s understanding of the simpler concepts can be refined given information about the more complex concepts.

      A feedback loop that updates the prior based on new information to eventually reach a good posterior.
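
The note above frames refinement as prior-to-posterior updating; a minimal beta-binomial sketch of that idea (my own illustration, not from the book — the coin-flip data are invented):

```python
from fractions import Fraction

# Beta(a, b) prior over a coin's heads-probability; Beta(1, 1) is uniform.
a, b = Fraction(1), Fraction(1)

# Each new observation (1 = heads, 0 = tails) refines the belief via
# the conjugate update: a += heads, b += tails.
observations = [1, 1, 0, 1, 1, 1, 0, 1]
for obs in observations:
    a += obs
    b += 1 - obs

posterior_mean = a / (a + b)
print(posterior_mean)  # 7/10 after six heads and two tails
```

Each observation feeds back into the belief state, so the estimate is revised incrementally rather than fixed once.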

    6. Deep learning solves this central problem in representation learning by introducing representations that are expressed in terms of other, simpler representations.

      Are the simpler representations some sort of 'building blocks'?

    7. Example of different representations: suppose we want to separate two categories of data by drawing a line between them in a scatterplot. In the plot on the left, we represent some data using Cartesian coordinates, and the task is impossible. In the plot on the right, we represent the data with polar coordinates and the task becomes simple to solve with a vertical line.

      The 'representations' displayed here are just transformations of the dataset. With multidimensional data, it is perhaps also important to recognize how the data were generated, and whether there are causal hints as to which representation to utilize.
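
The Cartesian-vs-polar example can be reproduced in a few lines. This is my own sketch of the figure's idea, with invented data: two concentric rings that no straight line separates in (x, y), but that a single radius threshold separates in polar coordinates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: two concentric rings, one per class.
n = 100
theta = rng.uniform(0.0, 2 * np.pi, 2 * n)
radius = np.concatenate([rng.normal(1.0, 0.1, n),   # inner ring: class 0
                         rng.normal(3.0, 0.1, n)])  # outer ring: class 1
labels = np.concatenate([np.zeros(n), np.ones(n)])

# Cartesian representation: the rings are concentric, so no straight
# line in the (x, y) plane separates the two classes.
x = radius * np.cos(theta)
y = radius * np.sin(theta)

# Polar representation: the same points become separable by a single
# vertical line at r = 2.
pred = (radius > 2.0).astype(float)
accuracy = float((pred == labels).mean())
print(accuracy)
```

The transformation does not add information; it only re-expresses the same points so that the structure the task needs becomes linearly accessible.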