2 Matching Annotations
  1. Last 7 days
    1. Why testing is much harder than "computer use": Screenshots, video verification, and the "I know it works" merge moment

      The 'I know it works' merge moment captures something real: human engineers have a holistic intuition about whether a change is safe, and current agents lack it. Video-based verification is a fascinating workaround that uses visual confirmation of a running application as a proxy for correctness. This suggests that testing for async agents is a fundamentally different problem from unit testing, because it requires environmental validation and not just logical assertion.

  2. Oct 2018
    1. One of the men being beaten in this video is speaking Foulfoulde, which is a commonly spoken language in the Far North region of Cameroon.

      Unlike image verification, video lets us check which languages are spoken, which can help us determine a location.