In my previous post, I argued that full specification fails with AI — and that component-based architecture with clear interfaces is the right model. Since then, several readers pointed me to a similar argument by Javi Lopez, who draws a sharp parallel to the CASE tools of the late 1980s: the same promise, the same illusion, a new mask.
Lopez is right. And I want to go one step further — not just to say what goes wrong, but to show what it looks like when it goes wrong, and what it takes to recover.
Because I lived it.
You can find the project on GitHub.
The Project: A Concrete Example
A few months ago I built an AI agent for my Obsidian vault — a tool that automatically tags notes, generates embeddings, links related content, and answers questions about my entire knowledge base via RAG.
It started the way most things start: with small proof-of-concepts. I needed to explore the basics — how to connect to a language model, how to structure a SQLite database, how ChromaDB works. I fed the AI incremental specifications. It found code snippets, extended the application, and things moved fast.
Very fast.
Too fast.
The Chaos Phase
What emerged was a single Python file. One file containing everything: database logic, LLM calls, embedding pipelines, configuration, RAG retrieval, file watching. Constants and URLs were defined multiple times. Responsibilities were tangled together with no discernible boundary.
When I tried a first refactoring — splitting functions into separate modules — it failed. The assignment was arbitrary. Functions ended up in modules based on intuition, not on coherent responsibility. The result was the same chaos, just spread across more files.
Then I switched to a different LLM. Errors cascaded. I could not trace them. I did not understand the code anymore.
That was the moment.
The Reset
I stopped. Not to fix the code — to fix the thinking.
I went back to the very beginning of the development process and did something I should have done first: I sat down with the AI and designed an architecture. No code. Just structure.
We defined:
- What are the distinct responsibilities in this system?
- What abstraction level does each responsibility operate at?
- Where are the boundaries?
The result was a clear separation: an orchestrator (agent.py), a provider dispatch layer (core.py), configuration management (config.py), utility functions (utils.py), and two focused services for embeddings and RAG. Each module with a single, coherent concern.
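To make that separation concrete, here is a minimal sketch of how the three core layers might relate. It is not the project's actual code: the `Config` fields, the `register` decorator, the `complete` function, the `tag_note` helper, and the "echo" provider are all hypothetical names invented for illustration, and everything is collapsed into one file so it runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# --- config.py (hypothetical): every setting is defined exactly once ---
@dataclass(frozen=True)
class Config:
    provider: str = "echo"
    model: str = "example-model"
    base_url: str = "http://localhost:8000"  # placeholder, not a real endpoint


# --- core.py (hypothetical): provider dispatch; knows nothing about notes ---
ProviderFn = Callable[[Config, str], str]
_PROVIDERS: Dict[str, ProviderFn] = {}


def register(name: str) -> Callable[[ProviderFn], ProviderFn]:
    """Register a provider function under a name used in Config.provider."""
    def decorator(fn: ProviderFn) -> ProviderFn:
        _PROVIDERS[name] = fn
        return fn
    return decorator


@register("echo")
def _echo(config: Config, prompt: str) -> str:
    # Stand-in for a real LLM call, so the sketch needs no network access.
    return f"[{config.model}] {prompt}"


def complete(config: Config, prompt: str) -> str:
    """Route a prompt to whichever provider the configuration names."""
    try:
        provider = _PROVIDERS[config.provider]
    except KeyError:
        raise ValueError(f"unknown provider: {config.provider!r}") from None
    return provider(config, prompt)


# --- agent.py (hypothetical): orchestration only; no provider details ---
def tag_note(config: Config, note_text: str) -> str:
    return complete(config, f"Suggest tags for: {note_text}")
```

In a layout like this, switching providers means changing one field in `Config` and registering one new function. The orchestrator never needs to know which model sits behind `complete`.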
Once that map existed, assigning the existing code was straightforward. Almost mechanical. The architecture made the right answer obvious.
The application now has clean external interfaces, is easy to configure for different LLM providers, and — crucially — has unit tests. Things that were impossible in the chaos phase.
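A rough illustration of why the boundaries make testing possible: once retrieval and the LLM call are passed in rather than hard-wired, a test can swap both for fakes. The `answer_question` function and its parameters are assumptions made up for this sketch, not the project's real API.

```python
import unittest
from typing import Callable, List


def answer_question(
    question: str,
    retrieve: Callable[[str], List[str]],
    complete: Callable[[str], str],
) -> str:
    """Hypothetical RAG step: fetch context, build a prompt, ask the model."""
    chunks = retrieve(question)
    context = "\n".join(chunks)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return complete(prompt)


class AnswerQuestionTest(unittest.TestCase):
    def test_prompt_contains_retrieved_context(self) -> None:
        seen_prompts: List[str] = []

        def fake_retrieve(question: str) -> List[str]:
            return ["note A", "note B"]

        def fake_complete(prompt: str) -> str:
            seen_prompts.append(prompt)  # record what the model would receive
            return "stub answer"

        result = answer_question("What is X?", fake_retrieve, fake_complete)

        self.assertEqual(result, "stub answer")
        self.assertIn("note A", seen_prompts[0])
        self.assertIn("What is X?", seen_prompts[0])
```

Saved as a test module, this runs with `python -m unittest` and needs neither a vector store nor a model. In the single-file version there was no seam where such fakes could be inserted.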
What This Taught Me About the Architect’s Role
Here is the lesson I want to add to the conversation:
Interfaces are the result of good architecture — not its starting point.
In my previous post, I wrote about “one AI per component” and explicit interfaces as the solution. That is true. But there is a step before that step, and it is the one that actually matters:
The abstraction level must be coherent across all components before anyone writes a single line of code.
When I tried to refactor by splitting functions into modules, I failed — not because the modules were wrong, but because I had not decided what kind of thing each module was. Was core.py a database module? A client wrapper? A provider abstraction? Until that question had a clear answer, any assignment was arbitrary.
What the architect must define is not just the interface, but the conceptual identity of each component. What does it know? What does it not know? At what level of the system does it operate?
Only then do the interfaces follow naturally.
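One way to write such an identity down, sketched here with Python's `typing.Protocol`. The `EmbeddingService` name, its `embed` method, and the toy `CharCountEmbedder` are hypothetical and only illustrate the idea; the docstring carries the "knows / does not know" decision, and the method signature follows from it.

```python
from typing import List, Protocol, Sequence


class EmbeddingService(Protocol):
    """Knows: how to turn text into vectors.

    Does not know: where notes are stored, which LLM answers questions,
    or how search results are ranked.
    """

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        ...


class CharCountEmbedder:
    """Toy implementation; it satisfies the protocol structurally."""

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        # Two crude features per text: length and number of spaces.
        return [[float(len(t)), float(t.count(" "))] for t in texts]


def index_notes(notes: Sequence[str], embedder: EmbeddingService) -> int:
    """Caller depends only on the protocol, never on a concrete embedder."""
    vectors = embedder.embed(notes)
    return len(vectors)
```

The signature here is almost an afterthought. Deciding what the component may and may not know was the actual design work.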
What Remains Irreplaceable
Lopez warns: don’t lose the capacity to surgically change the code. He is right. But I would frame it slightly differently.
The risk is not just losing the ability to change code. The risk is losing the mental model of the system — the map in your head that tells you where a change belongs, what it will affect, and what it cannot touch.
AI can implement a component. AI can accelerate the refinement. But AI cannot hold the coherent system image across all components simultaneously — because that image lives in the architect’s mind, shaped by decisions that were made for reasons that are never fully captured in code or documentation.
Three things remain irreducibly human:
- Architectural decisions: the trade-offs, the priorities, the values embedded in the structure.
- Conceptual coherence: ensuring that every component operates at the right abstraction level and speaks the same conceptual language as the rest of the system.
- Responsibility: the one who gets called at 3 AM still needs to understand what they built.
The Practical Takeaway
If I were to start the Obsidian agent project again, I would spend the first session with AI not writing code, but drawing a map. Domains, responsibilities, abstraction levels, boundaries. Abstract — not detailed. Orientation, not a construction plan.
Only once that map is stable would I let AI touch code — one component at a time, within clear boundaries, with local context only.
The chaos phase was not a failure of AI. It was a failure of process. AI did exactly what I asked: it extended the application fast. The problem was that I had not yet answered the question that only I could answer: what should this system actually be?
That question cannot be specified away. It can only be thought through.
This post is part of an ongoing series on working with AI in software and system architecture.
