I am shipping today. Specifically, I am shipping the core narrative engine for Inky, an AI-driven storytelling application.
When you start building an AI story app, you quickly realize that the challenge isn't the prompt. Anyone can write a prompt that says, "Write a story about a dragon." The challenge is the system: the architecture that manages state, maintains character consistency, and handles narrative branching without the whole thing collapsing into a hallucinated mess.
I run a multi-product studio where AI is the team. Inky is a product of that environment. It isn't just a wrapper; it is a demonstration of how agentic engineering changes the way we think about software architecture.
The Architecture of Narrative State
The core challenge of building an AI story app is state management. LLMs are stateless by nature. If you ask an LLM to continue a story, it only knows what is in the current context window. As the story grows, the context window fills up, and the earliest chapters have to be truncated or summarized to make room. Eventually, the model forgets that the protagonist lost their sword in chapter two.
I learned the hard way that you cannot rely on the model's memory. You have to build an external brain.
In Inky, I use a structured narrative database. Every character, location, and plot point is an object with its own metadata. When a user interacts with the story, the system doesn't just send the last few paragraphs to the LLM. It queries the database, retrieves the relevant state, and injects it into the context. This is the difference between a toy and a product.
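A minimal sketch of that retrieve-and-inject loop, to make the idea concrete. This is not Inky's actual code: the `Entity`, `NarrativeStore`, and `build_context` names are hypothetical, the store is an in-memory dictionary standing in for the database, and relevance is a naive name match rather than a real query.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One narrative object: a character, location, or plot point."""
    name: str
    kind: str  # e.g. "character", "location", "plot_point"
    facts: dict[str, str] = field(default_factory=dict)


class NarrativeStore:
    """In-memory stand-in for the narrative database (hypothetical)."""

    def __init__(self) -> None:
        self._entities: dict[str, Entity] = {}

    def upsert(self, entity: Entity) -> None:
        self._entities[entity.name.lower()] = entity

    def relevant_to(self, text: str) -> list[Entity]:
        # Naive relevance: an entity counts if its name appears in the text.
        lowered = text.lower()
        return [e for key, e in self._entities.items() if key in lowered]


def build_context(store: NarrativeStore, recent_prose: str, user_input: str) -> str:
    """Assemble the prompt: retrieved state, then recent prose, then the turn."""
    entities = store.relevant_to(recent_prose + " " + user_input)
    state_lines = [
        f"- {e.name} ({e.kind}): " + "; ".join(f"{k}={v}" for k, v in e.facts.items())
        for e in entities
    ]
    return "\n".join(
        ["KNOWN STATE:", *state_lines, "", "RECENT STORY:", recent_prose,
         "", "USER:", user_input]
    )


store = NarrativeStore()
store.upsert(Entity("Mira", "character", {"weapon": "none (sword lost in chapter two)"}))
store.upsert(Entity("Ashfall Keep", "location", {"status": "abandoned"}))

context = build_context(store, "Mira reached the gate.", "Mira draws her weapon.")
print(context)
```

The point of the sketch is the ordering: the fact about the lost sword reaches the model because the store was queried, not because chapter two still fits in the context window. Entities that the current turn does not mention (here, Ashfall Keep) stay out of the prompt.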
Moving Beyond the Prompt
Most people building in this space spend their time on prompt engineering. I spend my time on agentic engineering.
In my studio, I use a custom orchestration layer I built called VERA. For Inky, VERA manages a fleet of specialized agents. One agent is the Architect—it handles the high-level plot structure. Another is the Chronicler—it updates the world state after every turn. A third is the Prose Stylist—it ensures the tone remains consistent.
By decoupling these concerns, the system becomes more predictable. If the prose starts getting repetitive, I don't have to rewrite a 2,000-word prompt. I just tune the Prose Stylist.
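A rough sketch of what that decoupling can look like, under heavy assumptions: this is not VERA, the agents are stub functions with no model calls, and `Turn`, `run_turn`, and `make_prose_stylist` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Turn:
    """Everything one story turn carries between agents."""
    user_input: str
    plan: str = ""
    prose: str = ""
    world_state: dict[str, str] = field(default_factory=dict)


Agent = Callable[[Turn], Turn]


def architect(turn: Turn) -> Turn:
    # Stub: a real agent would ask a model for the next plot beat.
    turn.plan = f"Next beat: respond to '{turn.user_input}'"
    return turn


def make_prose_stylist(tone: str) -> Agent:
    # The tone is the only knob; tuning it leaves the other agents untouched.
    def prose_stylist(turn: Turn) -> Turn:
        turn.prose = f"[{tone}] {turn.plan}"
        return turn
    return prose_stylist


def chronicler(turn: Turn) -> Turn:
    # Stub: record what happened so later turns can retrieve it.
    turn.world_state["last_event"] = turn.user_input
    return turn


def run_turn(agents: list[Agent], turn: Turn) -> Turn:
    """Pass the turn through each agent in order."""
    for agent in agents:
        turn = agent(turn)
    return turn


gothic = run_turn(
    [architect, make_prose_stylist("gothic"), chronicler],
    Turn(user_input="open the crypt"),
)
whimsical = run_turn(
    [architect, make_prose_stylist("whimsical"), chronicler],
    Turn(user_input="open the crypt"),
)
print(gothic.prose)
print(whimsical.prose)
```

Swapping the stylist changes the prose while the plan and the recorded world state stay identical, which is the property the separation is meant to buy.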
The Stack and the Studio Mindset
I am an architect of systems, not a devotee of any one stack. For Inky, the choice of tools was driven by the need for speed and durability.
I use a monorepo structure because I am a solo operator running a studio. I need to move between the frontend, the backend, and the agent orchestration layer without friction. The stack includes:
- Claude API and Gemini: I use different models for different agents based on their strengths in reasoning versus creative writing.
- PostgreSQL with pgvector: Essential for storing narrative state and performing semantic searches on past story events.
- Custom Agent Orchestration: This is the glue that allows the agents to talk to the database and each other.
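To illustrate the semantic-search item in the list above, here is a small in-memory sketch of ranking past story events by cosine distance, which is what pgvector's `<=>` operator computes. The three-dimensional vectors are hand-written toy values (real embeddings come from an embedding model and have far more dimensions), and the event texts, `nearest_events` helper, and the table and column names in the SQL comment are all assumptions for illustration.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


# Toy (text, embedding) pairs standing in for rows in a story-events table.
events = [
    ("Mira loses her sword in the river", [0.9, 0.1, 0.0]),
    ("The innkeeper serves stew", [0.0, 0.2, 0.9]),
    ("Mira buys a dagger", [0.8, 0.3, 0.1]),
]


def nearest_events(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the k event texts closest to the query embedding."""
    ranked = sorted(events, key=lambda ev: cosine_distance(ev[1], query_vec))
    return [text for text, _ in ranked[:k]]


# Roughly equivalent query with pgvector (hypothetical table and column names):
#   SELECT text FROM story_events ORDER BY embedding <=> %s LIMIT 2;

top = nearest_events([1.0, 0.2, 0.0])
print(top)
```

With these toy vectors, the two weapon-related events rank ahead of the unrelated tavern scene, which is the behavior you want when a new turn mentions the protagonist's weapon.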
Building an AI story app in 2026 means recognizing that the code is often the easiest part. The hard part is the feedback loops. I’ve spent more time designing the monitoring systems, watching how agents interact and where they diverge, than I have writing UI components.



