8 Matching Annotations
  1. Apr 2026
    1. One time, during a security fix, the model's code introduced a non-obvious DoS vector. Well, obvious from the perspective of how the code would be deployed, but not from the code itself. That's exactly why reading each change was so important. Once the issue was pointed out, the model produced code that both addressed the security issue and avoided the DoS.

      this is a core issue: the algogen has no concept of 'deployment' and only has the code itself. Even for simple things, not just security like here, it will not be able to see the intention of a project from outside the project. Is this a better anchor for the human in the loop: the connection to reality / intention?

    2. I do know the code very well by way of careful reading of the code, the relevant libraries' documentation, and the proposed changes during the code's creation. But that safety comes down to human discipline. It is entirely possible (probable?) to take the easy road and trust the model to do the right thing.

      n:: having a human in the loop for vibe coding comes down entirely to discipline. Which is a recipe for it not happening. (And for me points to having a fixed starter set of instructions etc., as opposed to coming up with them each time.)

  2. Nov 2025
    1. AI checking AI inherits vulnerabilities, Hays warned. "Transparency gaps, prompt injection vulnerabilities and a decision-making chain becomes harder to trace with each layer you add." Her research at Salesforce revealed that 55% of IT security leaders lack confidence that they have appropriate guardrails to deploy agents safely.

      abstracting away responsibilities is a dead-end. Over half of IT security leaders now lack confidence that agentic AI can be deployed safely.

    2. When two models share similar data foundations or training biases, one may simply validate the other's errors faster and more convincingly. The result is what McDonagh-Smith describes as "an echo chamber, machines confidently agreeing on the same mistake." This is fundamentally epistemic rather than technical, he said, undermining our ability to know whether oversight mechanisms work at all.

      Similarity between models / training data creates an epistemic issue. Using them to control each other creates an echo chamber. Cf. [[Deontologische provenance 20240318113250]]

    3. Yet most organizations remain unprepared. When Bertini talks with product and design teams, she said she finds that "almost none have actually built it into their systems or workflows yet," treating human oversight as nice-to-have rather than foundational.

      Suggests that almost no AI-using companies are actively preparing for the AI Act's rules wrt human oversight.

    4. "We're seeing the rise of a 'human on the loop' paradigm where people still define intent, context and accountability, whilst co-ordinating the machines' management of scale and speed," he explained.

      Human on the loop vs human in the loop.