Building an AI Story App: The Systems Behind Inky

Building an AI story app is often framed as a prompt engineering exercise. It isn't. If you are building a system that generates a cohesive, multi-chapter narrative with consistent characters and visual assets, you aren't writing prompts—you are architecting a state machine.

I am currently building Inky, a multi-product studio project designed to handle long-form storytelling. The goal was to move past the 'chatbot' interface and create a system that functions as a production house. This required moving away from simple API calls and toward what I call agentic engineering.

Here are the architecture, the stack, and the lessons I learned the hard way while shipping today.

The Shift from Prompts to Agentic Engineering

When you start building an AI story app, the temptation is to send one massive prompt to Claude or GPT-4 and ask for a story. This works for a few paragraphs. It fails for a book. Over a long generation the model loses track of earlier details, the narrative arc flattens, and the 'AI voice' becomes repetitive.

Inky uses VERA, my custom agent orchestration layer, to break the production into discrete roles. Instead of one prompt, the system runs a sequence of specialized agents:

The Architect: Defines the narrative arc, themes, and world-building constraints.

The Casting Director: Generates detailed character descriptions and visual 'seeds' to ensure consistency across chapters.

The Scribe: Writes the actual prose, one scene at a time, constrained by the Architect’s outline.

The Continuity Editor: Reviews the output against the global state to ensure a character who lost a sword in Chapter 2 doesn't suddenly have it in Chapter 4.

This modularity allows me to swap models based on the task. I might use Claude 3.5 Sonnet for the prose because of its nuance, but use a faster, cheaper model for the initial structural outlining. Treating the AI as a team rather than a single oracle makes the system more resilient.
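To make the role split concrete, here is a minimal sketch of a sequential agent pipeline. It is an illustration, not VERA's actual code: the `Agent` class, `run_pipeline`, the model names, and the `fake_model` stand-in are all hypothetical, and a real version would call an LLM API where `fake_model` is used.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: (model_name, prompt) -> generated text.
ModelCall = Callable[[str, str], str]


@dataclass
class Agent:
    role: str         # e.g. "architect"
    model: str        # placeholder model identifier, swappable per role
    instruction: str  # the role's standing brief

    def run(self, call_model: ModelCall, context: str) -> str:
        return call_model(self.model, f"{self.instruction}\n\n{context}")


def run_pipeline(agents: list[Agent], call_model: ModelCall, premise: str) -> dict[str, str]:
    """Run agents in order; each one sees the premise plus every earlier output."""
    outputs: dict[str, str] = {}
    for agent in agents:
        earlier = [f"{role.upper()}:\n{text}" for role, text in outputs.items()]
        context = "\n\n".join([f"PREMISE:\n{premise}", *earlier])
        outputs[agent.role] = agent.run(call_model, context)
    return outputs


# Placeholder model names; swap any of them without touching the pipeline.
AGENTS = [
    Agent("architect", "fast-cheap-model", "Outline the narrative arc and world rules."),
    Agent("casting_director", "fast-cheap-model", "Describe each character and their visual seed."),
    Agent("scribe", "nuanced-prose-model", "Write the next scene, following the outline."),
    Agent("continuity_editor", "fast-cheap-model", "List contradictions with earlier material."),
]


def fake_model(model: str, prompt: str) -> str:
    """Stand-in for a real API call so the sketch runs offline."""
    return f"[{model}] responded to {len(prompt)} chars of context"


if __name__ == "__main__":
    results = run_pipeline(AGENTS, fake_model, "A lighthouse keeper finds a map.")
    for role, text in results.items():
        print(role, "->", text)
```

Because the model is just a field on each agent, changing which model writes the prose is a one-line edit that leaves the other roles alone.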

Managing Narrative State Across Long Contexts

Consistency is the primary friction point when building an AI story app. If the reader is ten chapters deep, the system must remember the emotional weight of a previous scene without re-sending the entire book text in every API call—which is both expensive and prone to noise.

I solved this by implementing a 'Narrative Ledger.' This is a structured JSON object stored in a Postgres database that tracks:

Character State: Current location, inventory, and relationship status.

Plot Points: Resolved vs. unresolved threads.

Visual Anchors: Specific physical descriptions used to generate consistent image prompts.

Before the Scribe agent writes a single word, the system queries the Ledger for the relevant context. This keeps the prompt size down and the focus sharp. I learned the hard way that relying on the LLM's 'memory' is a recipe for hallucinations. The database is the source of truth; the LLM is just the processor.
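As a rough sketch of the idea, the ledger can be modeled as a plain dictionary with one function that selects the slice a scene needs and another that writes changes back. The field names, the sample characters, and both helper functions are assumptions made for illustration; the Postgres storage itself is left out so the example runs on its own.

```python
import json

# Hypothetical ledger shape; in production this would be loaded from Postgres.
LEDGER = {
    "characters": {
        "Mira": {
            "location": "harbor",
            "inventory": ["lantern"],
            "relationships": {"Tobin": "estranged"},
            "visual_anchor": "short silver hair, green oilskin coat",
        },
        "Tobin": {
            "location": "lighthouse",
            "inventory": [],
            "relationships": {"Mira": "estranged"},
            "visual_anchor": "tall, red beard, eyepatch",
        },
    },
    "plot_points": [
        {"id": "lost-sword", "summary": "Tobin lost his sword at sea.", "resolved": False},
        {"id": "storm", "summary": "The storm has passed.", "resolved": True},
    ],
}


def context_for_scene(ledger: dict, cast: list[str]) -> dict:
    """Select only what the next scene needs: its cast and the open threads."""
    return {
        "characters": {n: ledger["characters"][n] for n in cast if n in ledger["characters"]},
        "open_threads": [p for p in ledger["plot_points"] if not p["resolved"]],
    }


def apply_state_updates(ledger: dict, updates: dict) -> None:
    """Merge per-character field changes; assumes each character already exists."""
    for name, fields in updates.items():
        ledger["characters"][name].update(fields)


if __name__ == "__main__":
    context = context_for_scene(LEDGER, ["Mira"])
    # This compact slice, not the whole book, is what goes into the prompt.
    print(json.dumps(context, indent=2))
```

The point of the split is that the model only ever receives `context_for_scene(...)` and only ever changes state through `apply_state_updates(...)`, so the stored ledger stays authoritative.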

The Media Pipeline: Beyond Text

Inky isn't just text. It’s a visual and auditory experience. Integrating image generation (Stable Diffusion via Replicate) and audio synthesis (ElevenLabs) into the workflow introduced a new set of engineering challenges.

To keep characters looking the same, the Casting Director agent generates a 'Visual DNA' string for each character. This string is prepended to every image generation prompt. I also implemented a feedback loop where the system analyzes the generated image to ensure it matches the DNA. If the hair color is wrong, the system catches it before the user does.
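The prepend-then-verify loop described above might look something like this. The `generate` and `describe` callables are hypothetical stand-ins for the image model call and for whatever vision step inspects the result, and the trait dictionary is an assumed representation of the Visual DNA, not Inky's actual format.

```python
from typing import Any, Callable


def build_image_prompt(visual_dna: str, scene: str) -> str:
    """Prepend the character's Visual DNA string to the scene description."""
    return f"{visual_dna}. {scene}"


def mismatched_traits(expected: dict[str, str], observed: dict[str, str]) -> list[str]:
    """Return the trait names whose observed value differs from the DNA."""
    return [t for t, v in expected.items() if observed.get(t, "").lower() != v.lower()]


def generate_consistent_image(
    visual_dna: str,
    traits: dict[str, str],
    scene: str,
    generate: Callable[[str], Any],              # hypothetical: prompt -> image
    describe: Callable[[Any], dict[str, str]],   # hypothetical: image -> observed traits
    max_attempts: int = 3,
) -> tuple[Any, int]:
    """Regenerate until the image matches the DNA or attempts run out."""
    prompt = build_image_prompt(visual_dna, scene)
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt)
        if not mismatched_traits(traits, describe(image)):
            return image, attempt
    raise RuntimeError(f"no matching image after {max_attempts} attempts")


if __name__ == "__main__":
    # Stubs: the first "image" has the wrong hair color, the second is right.
    fake_results = iter([{"hair": "blonde"}, {"hair": "silver"}])
    image, attempts = generate_consistent_image(
        "Mira: short silver hair, green oilskin coat",
        {"hair": "silver"},
        "standing on the harbor wall at dusk",
        generate=lambda prompt: next(fake_results),
        describe=lambda img: img,
    )
    print(image, attempts)
```

Capping the retries matters here: each attempt is a paid inference call, so a hard limit keeps one stubborn character from running up the bill.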

For audio, the challenge was latency. Generating high-quality narration for a 2,000-word chapter can take 30 seconds. I moved this to an asynchronous worker pattern. The text is generated, the user starts reading, and the audio is streamed in chunks as it becomes available. Shipping today means managing user expectations around the 'wait time' inherent in heavy inference tasks.
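One way to picture the asynchronous worker pattern is a producer that synthesizes narration chunk by chunk and a consumer that receives each chunk as soon as it is ready. This is a sketch using Python's `asyncio`, not the production worker: `synthesize` is a stand-in for a real text-to-speech request, and the chunk size and sentence-splitting rule are assumptions.

```python
import asyncio


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Group sentences into chunks of roughly max_chars so audio can start early."""
    chunks: list[str] = []
    current = ""
    for sentence in text.split(". "):
        candidate = f"{current}. {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


async def synthesize(chunk: str) -> bytes:
    """Stand-in for a text-to-speech request; returns fake audio bytes."""
    await asyncio.sleep(0.01)
    return chunk.encode("utf-8")


async def narration_worker(chunks: list[str], queue: asyncio.Queue) -> None:
    """Producer: push each synthesized chunk as soon as it is ready."""
    for chunk in chunks:
        await queue.put(await synthesize(chunk))
    await queue.put(None)  # sentinel: narration finished


async def stream_narration(text: str, max_chars: int = 200) -> list[bytes]:
    """Consumer: receive chunks in order while the worker keeps synthesizing."""
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(narration_worker(split_into_chunks(text, max_chars), queue))
    played: list[bytes] = []
    while (audio := await queue.get()) is not None:
        played.append(audio)  # a real client would begin playback here
    await worker
    return played


if __name__ == "__main__":
    print(asyncio.run(stream_narration("One. Two. Three", max_chars=8)))
```

The reader never waits for the full chapter: the first chunk is playable after one synthesis call, and the rest arrive behind it.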

What I Learned the Hard Way

I’ve spent my career building systems—from Army logistics to e-commerce platforms with 8,000 SKUs. Software is just another dialect of that same impulse. Here are the specific technical lessons from the Inky build:

Rate limits are the real ceiling: When you have four agents calling APIs simultaneously for a single user request, you hit tier limits fast. I had to build a robust queuing system to prevent the entire app from locking up during peak usage.

Structured output is non-negotiable: Never ask an LLM for 'a story.' Ask for a JSON object with keys for `prose`, `visual_prompt`, and `state_updates`. It makes the downstream integration much cleaner.

The 'AI' is 20% of the code: The other 80% is standard, boring, reliable engineering. Auth, database migrations, error handling, and UI state. Don't get so caught up in the model that you forget to build a good app.
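On the rate-limit lesson: the simplest form of queuing is a cap on how many API calls may be in flight at once. The sketch below uses an `asyncio.Semaphore` for that. It is a stand-in for the idea rather than the queuing system Inky uses, and the `LimitedClient` class, the limit of two, and the simulated request are all assumptions.

```python
import asyncio


class LimitedClient:
    """Caps concurrent API calls; extra callers wait their turn instead of failing."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = asyncio.Semaphore(max_concurrent)
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, for demonstration

    async def call(self, agent: str, prompt: str) -> str:
        async with self._slots:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                await asyncio.sleep(0.01)  # stand-in for the real API request
                return f"{agent}: done"
            finally:
                self.in_flight -= 1


async def handle_request(client: LimitedClient, agents: list[str]) -> list[str]:
    """Fire all agent calls at once; the client decides how many actually run."""
    return await asyncio.gather(*(client.call(a, "...") for a in agents))


async def demo() -> tuple[list[str], int]:
    # Build the client inside the running loop so the semaphore binds to it.
    client = LimitedClient(max_concurrent=2)
    agents = ["architect", "casting_director", "scribe", "continuity_editor"]
    results = await handle_request(client, agents)
    return results, client.peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

A semaphore only bounds concurrency. Respecting a requests-per-minute tier would additionally need timing logic or retry with backoff, which this sketch leaves out.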
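On the structured-output lesson: asking for JSON only helps if the response is checked before anything downstream touches it. Here is a minimal validator for the three keys named above, using only the standard library. The expected types for each key and the `parse_scene` name are assumptions for illustration.

```python
import json

# Assumed types: prose and visual_prompt are text, state_updates is an object.
REQUIRED_KEYS = {"prose": str, "visual_prompt": str, "state_updates": dict}


def parse_scene(raw: str) -> dict:
    """Parse a model response and reject it unless it has the expected shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        if not isinstance(data[key], expected_type):
            raise ValueError(f"{key} should be {expected_type.__name__}")
    return data


if __name__ == "__main__":
    raw = (
        '{"prose": "The tide turned.", '
        '"visual_prompt": "harbor at dusk", '
        '"state_updates": {"Mira": {"location": "harbor"}}}'
    )
    scene = parse_scene(raw)
    print(scene["state_updates"])
```

A failed parse becomes an ordinary `ValueError`, which the calling code can turn into a retry instead of letting malformed output reach the database or the image pipeline.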

Shipping Today

Inky is currently running in a private beta. It isn't a 'paradigm shift' or a 'game-changer.' It is a tool for people who want to build worlds. By treating AI as an operating layer rather than a magic wand, I’ve been able to build a system that produces consistent, high-quality results without a massive team.

If you are building an AI story app, stop focusing on the prompts and start focusing on the state. The value isn't in the model you use—it's in the system you architect around it.

I’m working in public on this and other studio products. If you want to see the specific implementation details of the VERA orchestration layer, I’ve documented the process in my resources.

Happy to talk.

Full implementation in The Builder's Playbook — justintsugranes.dev/resources/builders-playbook
