6 Matching Annotations
  1. Aug 2022
  2. Mar 2020
    1. Here’s a very simple example of how a VQA system might answer the question “what color is the triangle?”
      1. Look for shapes and colours using CNN.
      2. Understand the question type with NLP.
      3. Determine strength for each possible answer.
      4. Convert each answer strength to a % probability.
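
Steps 3 and 4 above can be sketched in a few lines. The quoted passage does not say how strengths become probabilities; softmax is assumed here because it is a common choice, and the answer set and strength values are made up for illustration.

```python
import math

def softmax(strengths):
    """Convert raw answer strengths into probabilities that sum to 1."""
    # Subtracting the largest strength keeps exp() numerically stable
    # and does not change the resulting probabilities.
    peak = max(strengths.values())
    exps = {answer: math.exp(s - peak) for answer, s in strengths.items()}
    total = sum(exps.values())
    return {answer: e / total for answer, e in exps.items()}

# Hypothetical strengths for "what color is the triangle?" (not from the source).
strengths = {"red": 2.0, "green": 0.5, "blue": -1.0}

probabilities = softmax(strengths)
best_answer = max(probabilities, key=probabilities.get)

for answer, p in probabilities.items():
    print(f"{answer}: {p:.1%}")
print("predicted answer:", best_answer)
```

The answer with the highest strength always ends up with the highest probability; softmax only rescales the strengths so they can be read as percentages.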
    2. Visual Question Answering (VQA): answering open-ended questions about images. VQA is interesting because it requires combining visual and language understanding.

      Visual Question Answering (VQA) = visual + language understanding

    3. Most VQA models would use some kind of Recurrent Neural Network (RNN) to process the question input
      • Most VQA models use an RNN to process the question input.
      • Easier VQA datasets can get by with a bag-of-words (BOW) encoding, whose vector is fed into a standard (feedforward) NN.
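
A minimal sketch of the BOW idea from the note above: each question becomes a fixed-length vector of word counts that a feedforward network could take as input. The tokenizer, the helper names and the sample questions are illustrative assumptions, not taken from the annotated article.

```python
def tokenize(question):
    """Lowercase the question, drop the question mark, split on whitespace."""
    return question.lower().replace("?", "").split()

def build_vocab(questions):
    """Map every distinct word in the training questions to a fixed index."""
    words = sorted({word for q in questions for word in tokenize(q)})
    return {word: index for index, word in enumerate(words)}

def bag_of_words(question, vocab):
    """Count how often each vocabulary word occurs; word order is discarded."""
    vector = [0] * len(vocab)
    for word in tokenize(question):
        if word in vocab:  # words outside the vocabulary are ignored
            vector[vocab[word]] += 1
    return vector

# Hypothetical training questions (not from the source).
questions = ["What color is the triangle?", "Is there a circle?"]
vocab = build_vocab(questions)

vector = bag_of_words("What color is the circle?", vocab)
print(vocab)
print(vector)
```

Because the vector has the same length for every question, no recurrent layer is needed. The cost is that word order is lost, which is why this tends to suit only simpler datasets.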
    4. The standard approach to performing VQA looks something like this:
      1. Process the image.
      2. Process the question.
      3. Combine features from steps 1/2.
      4. Assign probabilities to each possible answer.

      The standard four-step approach to handling VQA problems.
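
The combine-and-classify part of this approach might look like the sketch below. Everything concrete in it is an assumption for illustration: the feature vectors stand in for the outputs of a real image model and question model, the merge is an element-wise product (one option among several, such as concatenation), and the weights are hand-picked rather than learned.

```python
import math

def combine(image_features, question_features):
    """Assumed merge step: element-wise product of two equal-length vectors."""
    return [i * q for i, q in zip(image_features, question_features)]

def score_answers(merged, weights):
    """One linear layer: a dot product of the merged vector with each answer's weights."""
    return [sum(w * m for w, m in zip(row, merged)) for row in weights]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Stand-ins for steps 1 and 2 (hypothetical values, not real model outputs).
image_features = [0.9, 0.1, 0.4]
question_features = [1.0, 0.0, 0.5]

# Hypothetical answer set and hand-picked weights.
answers = ["red", "green", "blue"]
weights = [[2.0, 0.0, 0.0],
           [0.0, 2.0, 0.0],
           [0.0, 0.0, 2.0]]

merged = combine(image_features, question_features)  # step 3
probs = softmax(score_answers(merged, weights))      # step 4
prediction = answers[probs.index(max(probs))]
print(prediction, [round(p, 3) for p in probs])
```

In a trained model the weights and both feature extractors would be learned jointly; only the overall shape of the computation is meant to carry over.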


  3. Feb 2020
    1. Of the three primary color channels, red, green and blue, green contributes the most to luminosity.

      Of the RGB channels, green contributes more to luminosity than red or blue.
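
To put rough numbers on this: the quoted passage gives no weights, so the sketch below assumes the Rec. 601 luma coefficients (0.299, 0.587, 0.114), one widely used convention. Other standards such as Rec. 709 use different values but still weight green the most.

```python
# Rec. 601 luma coefficients for (red, green, blue).
# Assumed for illustration; the source does not name a standard.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)

def luma(r, g, b):
    """Weighted sum of the three channels, each in the range 0-255."""
    weight_r, weight_g, weight_b = LUMA_WEIGHTS
    return weight_r * r + weight_g * g + weight_b * b

pure_red = luma(255, 0, 0)
pure_green = luma(0, 255, 0)
pure_blue = luma(0, 0, 255)

print(f"red:   {pure_red:.1f}")
print(f"green: {pure_green:.1f}")
print(f"blue:  {pure_blue:.1f}")
```

With these weights a fully saturated green pixel comes out roughly twice as bright as a fully saturated red one and about five times as bright as a blue one.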