18 Matching Annotations
  1. Jan 2026
    1. why asynchronous agents deserve more attention than they currently receive, provides practical guidelines for working with them effectively, and shares real-world experience using multiple agents to refactor a production codebase.

      3 things in this article:
      - why async agents deserve more attention
      - practical guidelines for effective deployment
      - real-world examples

    1. While the initial results fall short, the AI field has a history of blowing through challenging benchmarks. Now that the APEX-Agents test is public, it’s an open challenge for AI labs that believe they can do better — something Foody fully expects in the months to come.

      expectation that models will get trained against the tests they currently fail.

    2. “The way we do our jobs isn’t with one individual giving us all the context in one place. In real life, you’re operating across Slack and Google Drive and all these other tools.” For many agentic AI models, that kind of multi-domain reasoning is still hit or miss.

      I understand this para but the phrasing is off: Slack and Google Drive are not 'multi-domain', they are tools. It seems like two arguments are joined up here, multi-tool and multi-domain, meaning AI agents can't switch between either. (In practice I see people build small agents for each facet and then chain / join them; rough sketch below.)
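
      A rough sketch of that chaining pattern, purely as my own illustration: the agent prompts, the run_agent helper, and the model id are assumptions, not something from the article.

      ```python
      # Sketch of "one small agent per facet, then chain them".
      # All names, prompts and the model id are illustrative assumptions.
      from anthropic import Anthropic

      client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

      def run_agent(system_prompt: str, task: str) -> str:
          """One narrow agent = one fixed role prompt applied to one task."""
          response = client.messages.create(
              model="claude-sonnet-4-20250514",  # example model id
              max_tokens=1024,
              system=system_prompt,
              messages=[{"role": "user", "content": task}],
          )
          return response.content[0].text

      # Placeholder input; in practice this would come from the Slack API.
      slack_thread_text = "…exported thread…"

      # Two single-facet agents, chained: the output of the first feeds the second.
      summary = run_agent(
          "You summarise Slack threads into decisions and open questions.",
          slack_thread_text,
      )
      brief = run_agent(
          "You turn a list of decisions into a one-page project brief.",
          summary,
      )
      print(brief)
      ```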

    3. The new research looks at how leading AI models hold up doing actual white-collar work tasks, drawn from consulting, investment banking, and law. The result is a new benchmark called APEX-Agents — and so far, every AI lab is getting a failing grade. Faced with queries from real professionals, even the best models struggled to get more than a quarter of the questions right. The vast majority of the time, the model came back with a wrong answer or no answer at all.

      In consulting, investment banking, and law, AI agents scored 18-24% or worse (and in real-life circumstances you don't know which answers are right and which are wrong, so you need to check all output).

  2. Dec 2025

    1. This type of thing sounds like what I thought wrt my annotation of [[AI agents als virtueel team]]. The example prompts with questions make me think of [[Filosofische stromingen als gereedschap 20030212105451]], which already contains a line of questioning per school of thought. Making personas of different thinking styles, lines of questioning. The same for reviews, or starting a project, etc.

    1. They are markdown files with a personality, frameworks, and output templates. I didn't write those myself - I asked Claude to create them. "Make a Product Owner agent who is good at prioritising and can do impact/effort analyses." Claude then writes the full file, including working method and examples. If I then say "ask Tessa about this", Claude loads that file and becomes Tessa.

      Seems like these agent .md files contain a description of a role that is then included in the prompt (sketch below).
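
      A minimal sketch of how that could work, assuming a hypothetical agents/tessa.md file; the author's setup has Claude load the file itself, so this API-based loader is only my own illustration of the same idea.

      ```python
      # Sketch: a persona .md file used as the system prompt.
      # agents/tessa.md, the function names and the model id are hypothetical.
      from pathlib import Path

      from anthropic import Anthropic

      AGENTS_DIR = Path("agents")

      def load_agent(name: str) -> str:
          """Read an agent description (role, frameworks, output templates) from markdown."""
          return (AGENTS_DIR / f"{name}.md").read_text(encoding="utf-8")

      def ask(agent: str, question: str) -> str:
          """Send a question to Claude with the agent file as the system prompt."""
          client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
          response = client.messages.create(
              model="claude-sonnet-4-20250514",  # example model id
              max_tokens=1024,
              system=load_agent(agent),  # Claude "becomes Tessa" by adopting the role text
              messages=[{"role": "user", "content": question}],
          )
          return response.content[0].text

      # e.g. agents/tessa.md describes a Product Owner persona
      print(ask("tessa", "Prioritise these three features by impact/effort."))
      ```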

    1. In my working folder I have a collection of "agents" - text files that tell Claude how to behave. Tessa is one of them. When I "load" her, Claude thinks from the perspective of a product owner.

      Author has .md files that describe separate 'agents' she involves in her coding work, for each of the roles in a dev team. Would something like that work for K-work? #openvraag E.g. for project management roles, or for facets you're less fond of yourself?

  3. Nov 2025
    1. AI checking AI inherits vulnerabilities, Hays warned. "Transparency gaps, prompt injection vulnerabilities and a decision-making chain becomes harder to trace with each layer you add." Her research at Salesforce revealed that 55% of IT security leaders lack confidence that they have appropriate guardrails to deploy agents safely.

      Abstracting away responsibilities is a dead end. Over half of IT security leaders are not confident they have the guardrails to deploy agentic AI safely.

  4. Jun 2025
    1. https://web.archive.org/web/20250630134724/https://www.theregister.com/2025/06/29/ai_agents_fail_a_lot/

      'Agent washing': agentic AI underperforms, getting at most 30% of tasks right (Gemini 2.5 Pro), but mostly under 10%.

      The article contains examples of what I think we should call agentic hallucination: when it can't find a solution, the agent takes steps to alter reality to fit the solution (e.g. renaming a user so it became the right user to send a message to, because the right user could not be found). Meredith Whittaker is mentioned, but compared to the statement of hers I saw, a key element is missing here: most of that access will be in clear text, as models can't do encryption. Meaning not just the model, but the very fact that the access exists, is a major vulnerability.