I am shipping Inky today. It is not a wrapper, and it is not a weekend experiment. It is a digital product built within a multi-product studio where AI functions as the operating layer rather than just a tool for autocomplete.
When you start building an AI story app, the temptation is to focus on the prompt. You think the value lies in the specific sequence of words you send to an LLM to generate a narrative. This is a mistake. The prompt is a commodity. The value is in the system architecture: the way you manage state, handle context windows, and orchestrate agents so the story remains coherent over ten thousand words instead of two hundred.
This is what I have learned the hard way while architecting Inky.
The System Behind the Story
Most people approaching this space see a linear path: User Input -> LLM -> Output. In a production environment, that path is brittle. It breaks the moment the user wants to backtrack, branch a narrative, or maintain a complex world-state that exceeds the immediate context window of the model.
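To make the backtracking and branching problem concrete, here is a minimal sketch of one way to model it. It is not Inky's actual code: StoryNode, StoryTree, and the shape of world_state are hypothetical names chosen for illustration. The idea is that every generated passage becomes a node carrying its own snapshot of the world, so stepping back or branching never overwrites an earlier path.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoryNode:
    """One generated passage plus the world-state snapshot that follows it."""
    node_id: int
    text: str
    world_state: dict
    # Excluded from repr/compare so the parent/child links never recurse.
    parent: Optional["StoryNode"] = field(default=None, repr=False, compare=False)
    children: list = field(default_factory=list, repr=False, compare=False)


class StoryTree:
    """Hypothetical container: passages form a tree, so no path is ever lost."""

    def __init__(self, opening_text: str, initial_state: dict):
        self._next_id = 0
        self.root = self._make_node(opening_text, initial_state, parent=None)
        self.cursor = self.root  # the passage the reader is currently on

    def _make_node(self, text, state, parent):
        # Deep-copy the state so later edits cannot mutate an earlier snapshot.
        node = StoryNode(self._next_id, text, copy.deepcopy(state), parent)
        self._next_id += 1
        if parent is not None:
            parent.children.append(node)
        return node

    def advance(self, text: str, new_state: dict) -> StoryNode:
        """Append a passage under the cursor; after a backtrack this creates a branch."""
        self.cursor = self._make_node(text, new_state, self.cursor)
        return self.cursor

    def backtrack(self, steps: int = 1) -> StoryNode:
        """Move the cursor toward the root, stopping at the root."""
        for _ in range(steps):
            if self.cursor.parent is None:
                break
            self.cursor = self.cursor.parent
        return self.cursor

    def path_to_cursor(self) -> list:
        """Passages from the opening to the cursor: the text worth sending to the model."""
        path = []
        node = self.cursor
        while node is not None:
            path.append(node.text)
            node = node.parent
        return list(reversed(path))
```

With this shape, calling advance, then backtrack, then advance again leaves two children under the same parent, and path_to_cursor returns only the branch the reader is actually on.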
In building Inky, I treated the application as a series of feedback loops. I do not think of myself as the author of a single stack; I am an architect of systems. Inky lives in a monorepo, which lets me share types and logic between the core engine and the delivery layers. I use the Claude API for the heavy lifting of narrative synthesis and Gemini for high-throughput, low-latency world-building tasks.
The operating layer is VERA, the custom agent orchestration layer I built for my studio. VERA handles the background tasks that a human team would normally manage: consistency checks, character arc tracking, and metadata generation. When you are building an AI story app, you are actually building a small, automated publishing house.
Agentic Engineering in Practice
I use the term agentic engineering to describe how this studio functions. It means I am not just writing code; I am designing agents that write, test, and monitor code. For Inky, this meant building a 'Narrative Architect' agent.
This agent doesn't write the story. It maintains the 'Source of Truth': a structured JSON object that tracks every character's current location, emotional state, and known facts. Before the 'Writer' agent generates a single sentence, the Architect validates the state. If the Writer tries to put a character in two places at once, the system catches it before the user ever sees the output.
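A rough sketch of what that validation step could look like follows. The names CharacterState, ProposedBeat, and validate_beat are assumptions made for illustration, not VERA's or Inky's real interfaces. The sketch also assumes the Writer declares what it intends to narrate as structured data before producing prose, so the check can run without parsing generated text.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterState:
    """One entry in the source-of-truth object (hypothetical shape)."""
    location: str
    emotional_state: str
    known_facts: set = field(default_factory=set)


@dataclass
class ProposedBeat:
    """What the Writer intends to narrate, declared before any prose exists."""
    character: str
    acts_at: str  # location where the character acts in this beat
    references_facts: set = field(default_factory=set)


def validate_beat(world: dict, beat: ProposedBeat) -> list:
    """Return human-readable violations; an empty list means the beat is consistent."""
    state = world.get(beat.character)
    if state is None:
        return [f"unknown character: {beat.character}"]

    errors = []
    # A character cannot act somewhere other than their recorded location.
    if beat.acts_at != state.location:
        errors.append(f"{beat.character} is in {state.location}, not {beat.acts_at}")
    # A character cannot rely on a fact they have not yet learned.
    for fact in sorted(beat.references_facts - state.known_facts):
        errors.append(f"{beat.character} does not know: {fact}")
    return errors
```

If validate_beat returns anything other than an empty list, the orchestrator can send the violations back to the Writer as a correction instead of showing the passage to the user.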
This is how you move from a toy to a product. You stop relying on the model to 'remember' and start building external memory systems that the model can query.
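As an illustration of external memory, here is a deliberately simple sketch. It assumes facts are stored per subject and retrieved by exact subject match, most recent first. A production system might use embeddings or a database instead. StoryMemory and its methods are hypothetical names, not part of Inky.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str  # lower-cased character or place the fact is about
    text: str
    turn: int     # story turn at which the fact was recorded


class StoryMemory:
    """Hypothetical fact store that lives outside the model's context window."""

    def __init__(self):
        self._facts = []

    def record(self, subject: str, text: str, turn: int) -> None:
        self._facts.append(Fact(subject.lower(), text, turn))

    def query(self, subjects, limit: int = 5) -> list:
        """Return up to `limit` facts about the given subjects, newest first."""
        wanted = {s.lower() for s in subjects}
        hits = [f for f in self._facts if f.subject in wanted]
        hits.sort(key=lambda f: f.turn, reverse=True)
        return hits[:limit]

    def as_prompt_context(self, subjects, limit: int = 5) -> str:
        """Format the retrieved facts as a short block to prepend to a prompt."""
        lines = [f"- {f.text}" for f in self.query(subjects, limit)]
        return "Known facts:\n" + "\n".join(lines)
```

The point of the limit parameter is that the prompt receives only the handful of facts relevant to the current scene, however long the story has grown.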



