Building an AI story app is often framed as a prompt engineering exercise. It isn't. When I started building Inky, my goal wasn't to create a wrapper for a large language model. It was to architect a system where AI functions as the operating layer for narrative generation.
I am working in public on this project because the gap between AI hype and shipping software is widening. Most of what you read online is about what AI might do. This is about what it is doing in my studio today.
The Architecture of an Agentic Storytelling System
When you are building an AI story app, the first thing you realize is that a single prompt cannot maintain the coherence required for a long-form narrative. A story is a system of constraints: character arcs, world-building rules, and plot pacing. If you dump all of that into one context window, the model eventually drifts.
Inky uses what I call agentic engineering. Instead of one monolithic call, the system is broken down into specialized agents.
Moving Beyond Simple Prompts
In the Inky architecture, we have distinct agents for different layers of the craft:
- The Architect: Responsible for the structural integrity of the plot. It doesn't write prose; it manages the outline and ensures the narrative beats align with the user's intent.
- The Chronicler: This agent maintains the 'world state.' If a character loses a key in chapter two, the Chronicler ensures they don't magically use it in chapter five.
- The Stylist: This is the only agent that touches the final prose. It takes the instructions from the Architect and the constraints from the Chronicler to generate the text.
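To make the Chronicler's job concrete, here is a minimal sketch of world-state tracking using the lost-key example. It is not Inky's actual code: the `WorldState` class, its method names, the `constraints_for` helper, and the character "Mara" are all hypothetical, and a real Chronicler would track far more than inventory.

```python
from dataclasses import dataclass, field


@dataclass
class WorldState:
    """Hypothetical world state: which items each character currently holds."""
    inventory: dict[str, set[str]] = field(default_factory=dict)

    def gain(self, character: str, item: str) -> None:
        # Record that the character picked up an item.
        self.inventory.setdefault(character, set()).add(item)

    def lose(self, character: str, item: str) -> None:
        # Record that the character no longer has the item (no-op if absent).
        self.inventory.get(character, set()).discard(item)

    def can_use(self, character: str, item: str) -> bool:
        # A character can only use what they currently hold.
        return item in self.inventory.get(character, set())


def constraints_for(state: WorldState, character: str) -> str:
    """Render the state as a plain-text constraint for the prose-writing agent."""
    items = sorted(state.inventory.get(character, set()))
    held = ", ".join(items) if items else "nothing"
    return f"{character} currently holds: {held}."


state = WorldState()
state.gain("Mara", "key")   # chapter one: Mara finds the key
state.lose("Mara", "key")   # chapter two: the key is lost

# Chapter five: check before letting the prose rely on the key.
key_available = state.can_use("Mara", "key")
stylist_constraint = constraints_for(state, "Mara")
```

The point of the design is that the constraint string is derived from recorded state rather than from the model's memory of earlier chapters, so the check stays reliable no matter how long the story gets.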
By separating these concerns, the system becomes more predictable. You can swap out the Stylist model (perhaps using Claude 3.5 Sonnet for its nuance) while keeping the Architect on a faster, cheaper model like Gemini 1.5 Flash.
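One way to picture that swap is to treat each agent's model as a plain function from prompt to text. The sketch below is an illustration under that assumption, not the real Inky pipeline: `Pipeline`, `write_scene`, and the two stub models are hypothetical names, and the stubs stand in for whatever provider calls you would actually make.

```python
from typing import Callable

# Assumption: a "model" is any function from prompt text to generated text.
# In a real system each of these would wrap a provider SDK call.
Model = Callable[[str], str]


def fast_stub(prompt: str) -> str:
    # Placeholder for a faster, cheaper model.
    return f"[outline for: {prompt}]"


def nuanced_stub(prompt: str) -> str:
    # Placeholder for a slower, more nuanced model.
    return f"[prose from: {prompt}]"


class Pipeline:
    """Hypothetical wiring: the Architect plans, the Stylist writes."""

    def __init__(self, architect: Model, stylist: Model) -> None:
        self.architect = architect
        self.stylist = stylist

    def write_scene(self, intent: str, constraints: str) -> str:
        # The Architect turns user intent into a narrative beat.
        beat = self.architect(intent)
        # The Stylist is the only step that produces final prose, and it
        # receives the world-state constraints alongside the beat.
        return self.stylist(f"{beat} | constraints: {constraints}")


pipeline = Pipeline(architect=fast_stub, stylist=nuanced_stub)
draft = pipeline.write_scene("Mara reaches the locked door", "Mara holds nothing")

# Swapping the Stylist's model leaves the Architect untouched.
cheaper = Pipeline(architect=pipeline.architect, stylist=fast_stub)
```

Because each agent only depends on the `Model` signature, changing providers is a constructor argument rather than a rewrite of the pipeline.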



