I am shipping Inky today. It is a digital storytelling platform where the narrative isn't just generated; it is architected.
When I started building an AI story app, I quickly realized that the market is flooded with thin wrappers. Most of them send a single prompt to an LLM and call it a product. That approach breaks the moment you need narrative consistency, character depth, or a plot that doesn't collapse under its own weight after three chapters.
Inky is different. It is built on a system of agentic engineering where AI isn't just an autocomplete feature; it is the team. Here is how the system works, what I learned the hard way, and why the architecture matters more than the model.
The Architecture of Agentic Storytelling
Building an AI story app requires moving away from the 'one prompt' mentality. In Inky, the storytelling process is broken down into discrete roles handled by specialized agents. VERA, a custom orchestration layer I built, manages the handoffs between them.
The Director Agent
This agent holds the high-level state. It doesn't write prose. It manages the story arc, ensures the pacing is correct, and decides when a new character needs to be introduced. It acts as the source of truth for the narrative's direction.
The Archivist
One of the biggest hurdles in building an AI story app is the context window. Even with a 200k-token window, a long-form story will eventually lose its thread. The Archivist agent manages a vector database of 'world facts' and 'character memories.' When the Writer agent needs to know what color a character's eyes were in chapter one, the Archivist fetches that specific fact instead of replaying the whole chapter.
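A minimal sketch of that lookup, under loud assumptions: the `Archivist` class, its `remember` and `recall` methods, and the sample facts are all hypothetical, and the bag-of-words `embed` function is a toy stand-in for a real embedding model and vector database. It only illustrates the idea of fetching the single most relevant fact rather than the whole story.

```python
from __future__ import annotations

import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Archivist:
    """Hypothetical in-memory fact store; a real one would sit on a vector database."""

    facts: list[tuple[str, Counter]] = field(default_factory=list)

    def remember(self, fact: str) -> None:
        # Store the fact next to its vector so recall does not re-embed it.
        self.facts.append((fact, embed(fact)))

    def recall(self, query: str, k: int = 1) -> list[str]:
        # Rank every stored fact by similarity to the query and keep the top k.
        q = embed(query)
        ranked = sorted(self.facts, key=lambda f: cosine(q, f[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


archivist = Archivist()
archivist.remember("Mara has green eyes and a scar on her left hand.")
archivist.remember("The city of Vell floods every spring.")
archivist.remember("Tobin owes the harbor guild forty silver.")

top = archivist.recall("What color are Mara's eyes?")
print(top)
```

The Writer's prompt then carries one sentence about Mara, not three chapters of backstory.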
The Writer
This agent is the only one focused on prose. By isolating the writing task from the structural task, the quality of the output stays high. It receives a brief from the Director and context from the Archivist, then executes the scene.
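The handoff between the three roles can be sketched roughly as follows. This is not VERA itself; every name here (`Brief`, `direct`, `retrieve`, `build_writer_prompt`, `run_scene`, `fake_model`) is a hypothetical illustration, and the model call is stubbed out so the example runs without an API key.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Brief:
    """What the Director hands to the Writer: intent, not prose."""

    scene_goal: str
    characters: tuple[str, ...]


def direct(outline: list[str], scene_index: int, cast: tuple[str, ...]) -> Brief:
    """Director role: pick the next beat from the outline. Writes no prose."""
    return Brief(scene_goal=outline[scene_index], characters=cast)


def retrieve(brief: Brief, world_facts: dict[str, list[str]]) -> list[str]:
    """Archivist role: return only the facts about characters in this scene."""
    return [fact for name in brief.characters for fact in world_facts.get(name, [])]


def build_writer_prompt(brief: Brief, context: list[str]) -> str:
    """Assemble the Writer's input from the brief and the retrieved context."""
    facts = "\n".join(f"- {fact}" for fact in context)
    return (
        f"Scene goal: {brief.scene_goal}\n"
        f"Characters: {', '.join(brief.characters)}\n"
        f"Established facts:\n{facts}\n"
        "Write the scene in prose."
    )


def run_scene(
    outline: list[str],
    scene_index: int,
    cast: tuple[str, ...],
    world_facts: dict[str, list[str]],
    call_model: Callable[[str], str],
) -> str:
    """One pass through the pipeline: Director, then Archivist, then Writer."""
    brief = direct(outline, scene_index, cast)
    context = retrieve(brief, world_facts)
    return call_model(build_writer_prompt(brief, context))


def fake_model(prompt: str) -> str:
    """Stub for the Writer's LLM call: echoes the prompt it was given."""
    return f"[prose generated from]\n{prompt}"


outline = ["Mara arrives in Vell", "Mara confronts Tobin at the harbor"]
world_facts = {
    "Mara": ["Mara has green eyes."],
    "Tobin": ["Tobin owes the harbor guild forty silver."],
    "Ferren": ["Ferren keeps the lighthouse."],
}

scene = run_scene(outline, 1, ("Mara", "Tobin"), world_facts, fake_model)
print(scene)
```

Because the Writer only ever sees the brief and the retrieved facts, a character who is not in the scene (Ferren here) never reaches its prompt.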
Managing State in a Generative Environment
In a traditional CRUD app, state is predictable. In a generative system, state is fluid. I learned the hard way that you cannot rely on the LLM to remember the state of the world. You have to externalize it.
I use a monorepo to keep the frontend, backend, and agent logic tightly coupled. The state is stored in a structured Firestore database, not just in a chat history. Every character record follows a JSON schema. Every location has a set of attributes. When an agent modifies the world, it updates the database first. The next prompt is then built from that structured data.
This ensures that if a character loses a sword in chapter two, the system knows they don't have it in chapter five, regardless of how many tokens have passed in between. Architecting systems like this is what separates a toy from a tool.
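Here is the sword example as a small sketch. It assumes an in-memory `WorldStore` standing in for the database, and the `Character`, `lose_item`, and `build_state_block` names are hypothetical; the point is only the ordering: write the change to the store first, then build the next prompt from the store.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class Character:
    """Structured character record; the fields are illustrative."""

    name: str
    inventory: list[str] = field(default_factory=list)


class WorldStore:
    """In-memory stand-in for the database (hypothetical interface)."""

    def __init__(self) -> None:
        self._characters: dict[str, Character] = {}

    def put(self, character: Character) -> None:
        self._characters[character.name] = character

    def get(self, name: str) -> Character:
        return self._characters[name]


def lose_item(store: WorldStore, name: str, item: str) -> None:
    """Record the change in the store before any further prose is generated."""
    character = store.get(name)
    if item not in character.inventory:
        raise ValueError(f"{name} does not have a {item}")
    character.inventory.remove(item)


def build_state_block(store: WorldStore, names: list[str]) -> str:
    """Serialize the current state into the text that leads the next prompt."""
    return json.dumps([asdict(store.get(n)) for n in names], indent=2)


store = WorldStore()
store.put(Character("Mara", ["sword", "lantern"]))

# Chapter two: the agent's change goes to the store, not into chat history.
lose_item(store, "Mara", "sword")

# Chapter five: the prompt is rebuilt from the store, so the sword stays gone.
prompt_state = build_state_block(store, ["Mara"])
print(prompt_state)
```

A side effect of this design is that an impossible action, such as losing a sword the character never had, fails in code before it can reach the page.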


