AI Agent · Productivity Automation
AI Personal Assistant
Building an agentic AI to understand where it earns trust, and where it doesn't.
Agentic AI is one of the most overhyped and under-examined categories in product development. Every AI company is pitching agents that can 'do work autonomously', but few have been honest about where agents break trust, require human oversight, and cause more harm than help when they act without confirmation. I built a personal AI assistant not because I needed one, but because operating one in a real context was the only way to develop an accurate intuition about agentic failure modes.
- 1. Task Orchestration: Build an agent that can break compound tasks into steps, execute them in sequence, and handle partial failures without abandoning the full task.
- 2. Context Management: Design the context window strategy for multi-turn, multi-day workflows, where the agent needs to remember decisions made in previous sessions.
- 3. Trust Boundary Mapping: Document empirically where agentic AI earns trust (routine, low-stakes, reversible actions) versus where it requires a human in the loop (irreversible, high-stakes, or ambiguous intent).
- 4. Guardrail Architecture: Design a guardrail system that applies different confirmation requirements based on action category, not a blanket 'confirm everything' that defeats the purpose of automation.
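The orchestration and guardrail goals above can be made concrete with a small sketch. This is an illustration under stated assumptions, not the assistant's actual implementation: the `Action` fields, the `Autonomy` levels, and the rule in `required_autonomy` are hypothetical names and thresholds chosen to mirror the trust boundaries described in the list (reversible and low-stakes versus irreversible, high-stakes, or ambiguous).

```python
from dataclasses import dataclass
from enum import Enum


class Autonomy(Enum):
    FULL_AUTO = "full_auto"
    CONFIRM = "confirmation_required"
    HUMAN_ONLY = "human_only"


@dataclass(frozen=True)
class Action:
    # Hypothetical risk attributes; a real system would derive these per tool.
    name: str
    reversible: bool
    high_stakes: bool
    ambiguous_intent: bool


def required_autonomy(action: Action) -> Autonomy:
    """Map an action's risk attributes to a confirmation requirement (assumed rule)."""
    if not action.reversible and action.high_stakes:
        return Autonomy.HUMAN_ONLY
    if not action.reversible or action.high_stakes or action.ambiguous_intent:
        return Autonomy.CONFIRM
    return Autonomy.FULL_AUTO


def run_plan(steps, execute, confirm):
    """Run steps in order; record each outcome instead of aborting on a failure."""
    results = {}
    for step in steps:
        level = required_autonomy(step)
        if level is Autonomy.HUMAN_ONLY:
            results[step.name] = "deferred_to_human"
            continue
        if level is Autonomy.CONFIRM and not confirm(step):
            results[step.name] = "declined"
            continue
        try:
            execute(step)
            results[step.name] = "done"
        except Exception as exc:
            # A partial failure is logged; the remaining steps still run.
            results[step.name] = f"failed: {exc}"
    return results
```

The design choice worth noting is that the confirmation requirement is decided per action category rather than per task, so a compound task can proceed automatically through its routine steps and pause only at the risky ones.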
- Real Workflow Integration: Deployed the assistant into actual personal and professional workflows for 90 days. This was not a demo environment, but real tasks with real consequences.
- Failure Logging: Maintained a structured log of every agent failure, categorising by failure type, task category, and whether a human-in-the-loop checkpoint would have prevented it.
- Trust Calibration: Developed a trust scoring framework for action categories, mapping each to an autonomy level (full auto, confirmation required, human-only).
- Guardrail Iteration: Iterated the guardrail design based on 90-day failure log analysis, converging on a minimal set of hard stops that prevented the high-severity failures without killing automation value.
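One way the failure log could feed trust calibration is sketched below. Everything specific here is an assumption made for illustration: the `FailureRecord` fields follow the three logged dimensions named above, but the 1 to 3 severity scale, the `max_auto_rate` threshold, and the decision rule in `calibrate` are hypothetical and not taken from the project's actual framework.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureRecord:
    task_category: str
    failure_type: str
    severity: int                    # assumed scale: 1 (minor) to 3 (severe)
    checkpoint_would_prevent: bool   # would a human-in-the-loop step have caught it?


def calibrate(log, runs_per_category, max_auto_rate=0.05):
    """Assign an autonomy level to each task category from its logged failures."""
    by_category = defaultdict(list)
    for record in log:
        by_category[record.task_category].append(record)

    levels = {}
    for category, runs in runs_per_category.items():
        failures = by_category.get(category, [])
        severe = [f for f in failures if f.severity >= 3]
        # Treat a category with no recorded runs as unproven.
        rate = len(failures) / runs if runs else 1.0
        if any(not f.checkpoint_would_prevent for f in severe):
            # A confirmation step would not have helped, so keep it human-only.
            levels[category] = "human_only"
        elif severe or rate > max_auto_rate:
            levels[category] = "confirmation_required"
        else:
            levels[category] = "full_auto"
    return levels
```

Under this rule, a category is promoted to full autonomy only when it has no severe failures and a low overall failure rate, which keeps the set of hard stops small while still covering the high-severity cases.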
The failure mode that shows up most in production agents isn't the dramatic one, such as the agent sending the wrong email or deleting the wrong file. It's confident mediocrity: the agent completing a task in a way that's technically correct but wrong in context. It answered the literal question instead of the implied question. It scheduled the meeting without checking the agenda. It summarised the document without noting the anomaly that a human reader would have flagged. These failures are hard to prevent with guardrails alone; they require the agent to have a model of what 'good' looks like, not just 'done'.
The most useful output from this project wasn't the assistant itself; it was clarity on where agentic AI earns trust and what guardrails actually need to look like in practice. That understanding directly shapes how I evaluate and spec agentic AI products.
Hands-on intuition about agentic AI failure modes that no amount of reading can replace.