Preview: quick before and after
Check out the Preview section to see how much better the blog post images are when generated by DALL·E 2 for $45.
Here’s a very simple example of how a VQA system might answer the question “what color is the triangle?”
Visual Question Answering (VQA): answering open-ended questions about images. VQA is interesting because it requires combining visual and language understanding.
Visual Question Answering (VQA) = visual + language understanding
Most VQA models use some kind of Recurrent Neural Network (RNN) to process the question input.
The standard approach to performing VQA looks something like this:
1. Process the image.
2. Process the question.
3. Combine features from steps 1 and 2.
4. Assign probabilities to each possible answer.
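To make the four steps concrete, here is a toy sketch in plain Python. It is not a trained model: the vocabulary, the answer list, the hand-set scaling factor, and every function name (`image_features`, `question_features`, `answer`) are assumptions made up for illustration. A real system would learn the image and question features with neural networks instead of hard-coding them.

```python
import math

# Hypothetical answer set and vocabulary, chosen only for this toy example.
ANSWERS = ["red", "green", "blue"]
VOCAB = ["what", "color", "is", "the", "triangle", "circle"]


def image_features(pixels):
    """Step 1: average RGB of the non-black pixels, scaled to [0, 1]."""
    shape = [p for row in pixels for p in row if p != (0, 0, 0)]
    n = max(len(shape), 1)  # avoid dividing by zero on an empty image
    return [sum(p[c] for p in shape) / (255 * n) for c in range(3)]


def question_features(question):
    """Step 2: bag-of-words vector over the tiny fixed vocabulary."""
    words = question.lower().strip("?").split()
    return [1.0 if w in words else 0.0 for w in VOCAB]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def answer(pixels, question):
    img = image_features(pixels)
    q = question_features(question)
    # Step 3: combine. Here the question simply gates the image features:
    # the color channels only count if the question asks about color.
    asks_color = q[VOCAB.index("color")]
    combined = [asks_color * v for v in img]
    # Step 4: one probability per possible answer.
    # The factor 10.0 is a hand-set stand-in for learned weights.
    probs = softmax([10.0 * v for v in combined])
    return dict(zip(ANSWERS, probs))


# A 2x3 "image" containing a small blue shape on a black background.
image = [
    [(0, 0, 0), (0, 0, 255), (0, 0, 0)],
    [(0, 0, 255), (0, 0, 255), (0, 0, 255)],
]
result = answer(image, "What color is the triangle?")
print(max(result, key=result.get))  # "blue" gets the highest probability
```

The gating in step 3 is deliberately crude. Learned models often combine the two feature vectors by element-wise multiplication or concatenation before a final classification layer.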
Approach to handle VQA problems:
Of the three primary color channels, red, green and blue, green contributes the most to luminosity.
Green colour vs red and blue (RGB)
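One common way to put numbers on this is the relative luminance formula from the Rec. 709 (sRGB) standard, which weights linear RGB values as 0.2126 for red, 0.7152 for green, and 0.0722 for blue. The text above doesn't say which standard it has in mind, so treat the choice of Rec. 709 as an assumption. The function name `relative_luminance` is just illustrative.

```python
# Rec. 709 luminance weights for linear (not gamma-encoded) RGB values.
# Using this particular standard is an assumption made for this sketch.
WEIGHTS = {"red": 0.2126, "green": 0.7152, "blue": 0.0722}


def relative_luminance(r, g, b):
    """Relative luminance of a linear RGB color with channels in [0, 1]."""
    return WEIGHTS["red"] * r + WEIGHTS["green"] * g + WEIGHTS["blue"] * b


# Compare the three pure primaries at full intensity.
for name, rgb in [("red", (1, 0, 0)), ("green", (0, 1, 0)), ("blue", (0, 0, 1))]:
    print(name, relative_luminance(*rgb))
```

With these weights, pure green is more than three times as luminous as pure red and roughly ten times as luminous as pure blue.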