Building an AI Story App: Architecture and Lessons from Inky

I am building Inky, an AI storytelling app designed to handle complex, multi-chapter narratives. Most people building an AI story app start with a single prompt and a text area. That works for a paragraph, but it breaks for a book. I learned the hard way that narrative consistency requires more than a better model; it requires a better system.

Inky is part of my multi-product studio. It is not a side project; it is a test of the operating layer I have been refining. The goal is to move from simple text generation to a structured, agentic workflow that understands plot, character arcs, and world-building constraints.

The Architecture of Agentic Engineering

When you are building an AI story app, the first hurdle is realizing that a single LLM call is insufficient for long-form content. If you ask a model to write a 50,000-word novel, it will hallucinate, lose the plot by chapter three, and ignore the character development you established in the prologue.

I architected Inky using agentic engineering. Instead of one prompt, I use a network of specialized agents managed by VERA, my custom orchestration layer.

The Director-Writer-Editor Pattern

In Inky, the work is split across three primary roles:

The Director: This agent holds the high-level outline. It manages the story Bible—a structured JSON object containing character traits, locations, and plot beats. It ensures that if a character loses their left arm in chapter two, they aren't playing piano with both hands in chapter ten.

The Writer: This agent focuses on the prose. It receives a specific scene objective from the Director and a slice of the story Bible. Its only job is to produce high-quality narrative text.

The Editor: This agent reviews the Writer’s output against the Director’s constraints. It looks for continuity errors and tone shifts. If the prose is off, it sends it back for a rewrite.

This system mimics a real production house. It is a feedback loop, not a linear pipeline.
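As a rough illustration of that feedback loop (a sketch, not Inky's actual code), the three roles can be modeled as plain functions passed into a bounded retry loop. Every name here (StoryBible, SceneBrief, runScene, the toy agents, the maxAttempts cap) is hypothetical, and the stubs are synchronous only to keep the example short; real model calls would be asynchronous.

```typescript
// Hypothetical shapes. The post only says the story Bible is structured JSON
// holding character traits, locations, and plot beats.
interface StoryBible {
  characters: Record<string, string[]>; // name -> traits
  locations: string[];
  plotBeats: string[];
}

interface SceneBrief {
  objective: string;
  bibleSlice: StoryBible;
}

interface Review {
  approved: boolean;
  notes: string[];
}

// Agents are injected as functions so the loop itself stays model-agnostic.
type Writer = (brief: SceneBrief, notes: string[]) => string;
type Editor = (draft: string, brief: SceneBrief) => Review;

interface SceneResult {
  draft: string;
  approved: boolean;
  attempts: number;
}

// Write, review, and rewrite until the Editor approves or attempts run out.
function runScene(
  brief: SceneBrief,
  write: Writer,
  edit: Editor,
  maxAttempts = 3,
): SceneResult {
  let notes: string[] = [];
  let draft = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    draft = write(brief, notes);
    const review = edit(draft, brief);
    if (review.approved) {
      return { draft, approved: true, attempts: attempt };
    }
    notes = review.notes; // the Editor's objections feed the next draft
  }
  return { draft, approved: false, attempts: maxAttempts };
}

// Toy stand-ins for real model calls.
const brief: SceneBrief = {
  objective: "Mara plays piano at the inn",
  bibleSlice: {
    characters: { Mara: ["lost her left arm in chapter two"] },
    locations: ["the inn"],
    plotBeats: ["Mara returns home"],
  },
};

const toyWriter: Writer = (_brief, notes) =>
  notes.length === 0
    ? "Mara played with both hands."
    : "Mara picked out the melody with her right hand.";

const toyEditor: Editor = (draft) =>
  /both hands/.test(draft)
    ? { approved: false, notes: ["Mara has one arm"] }
    : { approved: true, notes: [] };

const result = runScene(brief, toyWriter, toyEditor);
console.log(result.attempts, result.draft);
```

The attempt cap matters in a sketch like this: without it, a Writer and Editor that never agree would loop forever and keep spending tokens.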

Solving the Context Window Problem

Context is the most expensive and volatile resource in building an AI story app. Even with 200k-token windows, you cannot simply dump an entire book into the prompt and expect the model to maintain focus. The signal-to-noise ratio degrades as the window fills.

I solved this by implementing a sliding window of "active memory" and a vector database for "latent memory."

Active Memory: The last two chapters and the current scene outline. This stays in the immediate context.

Latent Memory: The story Bible and previous plot points stored as embeddings. When the Writer agent needs to reference a character's backstory from ten chapters ago, VERA performs a similarity search and injects only the relevant snippet into the prompt.

This keeps the context lean and the output sharp. I learned the hard way that over-stuffing the context window leads to "lazy writing" from the model, where it starts summarizing instead of dramatizing.
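To make the two memory tiers concrete, here is a minimal sketch under stated assumptions: an in-memory array stands in for the vector database, the embeddings are hand-written three-dimensional toys rather than output from a real embedding model, and all names (MemoryEntry, retrieveLatent, buildWriterContext) are invented for illustration.

```typescript
// Hypothetical record for one stored plot point or story Bible entry.
interface MemoryEntry {
  text: string;
  embedding: number[];
}

// Cosine similarity; returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Latent memory: rank stored entries by similarity and keep the top K texts.
function retrieveLatent(
  query: number[],
  store: MemoryEntry[],
  topK: number,
): string[] {
  return store
    .map((entry) => ({
      text: entry.text,
      score: cosineSimilarity(query, entry.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((hit) => hit.text);
}

// Active memory (last two chapters + scene outline) plus recalled snippets.
function buildWriterContext(
  chapters: string[],
  sceneOutline: string,
  queryEmbedding: number[],
  store: MemoryEntry[],
  topK = 1,
): string {
  const recent = chapters.slice(-2);
  const recalled = retrieveLatent(queryEmbedding, store, topK);
  return [
    "Recent chapters:",
    ...recent,
    "Scene outline:",
    sceneOutline,
    "Recalled facts:",
    ...recalled,
  ].join("\n");
}

// Toy data: in practice the embeddings would come from an embedding model.
const chapters = ["Chapter 1 text", "Chapter 2 text", "Chapter 3 text"];
const store: MemoryEntry[] = [
  { text: "Mara grew up in a lighthouse.", embedding: [0.9, 0.1, 0] },
  { text: "The inn burned down in winter.", embedding: [0, 1, 0] },
  { text: "The duke owes Mara a debt.", embedding: [0.1, 0.2, 0.9] },
];

const context = buildWriterContext(
  chapters,
  "Mara revisits her childhood home.",
  [1, 0, 0], // pretend this is the embedded query "Mara backstory"
  store,
);
console.log(context);
```

Only the two most recent chapters and the single best-matching snippet reach the prompt; everything else stays in the store until a query pulls it in.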

The Stack: Monorepo and VERA

I run a multi-product studio as a solo operator. To do this, I use a monorepo architecture. Inky shares the same core logic for agent orchestration, billing, and authentication as my other products.

I use TypeScript for the entire stack, but I don't call myself an expert. It is simply the most efficient tool for the job. The backend runs on a series of MCP (Model Context Protocol) servers that allow my agents to interact directly with the file system and the database.

VERA, the orchestration layer, handles the handoffs between Claude 3.5 Sonnet (for logic and editing) and Gemini 1.5 Pro (for long-context retrieval). By using the right model for the right sub-task, I have reduced API costs by 40% while increasing the quality of the narrative output.
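One simple way to express that kind of routing is a static table keyed by sub-task. This is a sketch rather than VERA's real configuration: the SubTask names and the routeFor helper are assumptions, and the model strings are the display names used in this post, not provider API identifiers.

```typescript
// Hypothetical sub-task labels based on the split described above.
type SubTask = "logic" | "editing" | "long-context-retrieval";

interface ModelRoute {
  model: string; // display name from the post, not an API model id
  reason: string;
}

// Record<SubTask, ...> makes the compiler reject a table with a missing task.
const ROUTES: Record<SubTask, ModelRoute> = {
  logic: { model: "Claude 3.5 Sonnet", reason: "logic and editing" },
  editing: { model: "Claude 3.5 Sonnet", reason: "logic and editing" },
  "long-context-retrieval": {
    model: "Gemini 1.5 Pro",
    reason: "long-context retrieval",
  },
};

function routeFor(task: SubTask): ModelRoute {
  return ROUTES[task];
}

console.log(routeFor("editing").model);
console.log(routeFor("long-context-retrieval").model);
```

Keeping the table exhaustive at the type level means that adding a new sub-task forces an explicit routing decision instead of a silent default.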

Lessons Learned the Hard Way

Building in public means being honest about what broke. Early in the development of Inky, I relied too heavily on the model to maintain its own state. I assumed the LLM would "remember" the character's motivation if I mentioned it once in the system prompt. It didn't.

I had to rebuild the state management layer three times. The lesson: Never trust the model to be the source of truth for your application state. The database is the source of truth; the model is just a processor.
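A minimal sketch of that principle, assuming a hypothetical StateStore (an in-memory object standing in for the database) and a hypothetical buildPrompt helper: the prompt is rebuilt from stored state on every call, so nothing depends on the model remembering an earlier message.

```typescript
// Hypothetical slice of application state for one character.
interface CharacterState {
  name: string;
  motivation: string;
}

// In-memory stand-in for the database. Copies on read and write so callers
// cannot mutate stored state by accident.
class StateStore {
  private characters: Record<string, CharacterState> = {};

  save(character: CharacterState): void {
    this.characters[character.name] = { ...character };
  }

  load(name: string): CharacterState | undefined {
    const found = this.characters[name];
    return found ? { ...found } : undefined;
  }
}

// Rebuild the prompt from stored state every time; never assume the model
// still "remembers" something from a previous call.
function buildPrompt(
  store: StateStore,
  characterName: string,
  sceneObjective: string,
): string {
  const character = store.load(characterName);
  if (!character) {
    throw new Error(`Unknown character: ${characterName}`);
  }
  return [
    `Character: ${character.name}`,
    `Motivation: ${character.motivation}`,
    `Scene objective: ${sceneObjective}`,
  ].join("\n");
}

const db = new StateStore();
db.save({ name: "Mara", motivation: "find her brother" });
const before = buildPrompt(db, "Mara", "Mara questions the innkeeper");

// A plot beat changes her motivation; the next prompt reflects it
// because it is read from the store, not from model memory.
db.save({ name: "Mara", motivation: "avenge her brother" });
const after = buildPrompt(db, "Mara", "Mara confronts the duke");

console.log(before);
console.log(after);
```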

Another lesson: UI matters more in AI apps than in traditional SaaS. Because the underlying process is non-deterministic, the user needs to see the "thinking" of the agents. I added a telemetry view in Inky that shows the Director and Editor agents communicating. It turns out that seeing the system work builds more trust than a simple loading spinner.
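A telemetry view like that needs some stream of agent events to render. The sketch below is one possible shape, not Inky's implementation: the Telemetry class, the AgentEvent fields, and the example messages are all hypothetical.

```typescript
type AgentName = "Director" | "Writer" | "Editor";

// Hypothetical event describing one handoff between agents.
interface AgentEvent {
  from: AgentName;
  to: AgentName;
  summary: string;
  at: number; // epoch milliseconds
}

type Listener = (event: AgentEvent) => void;

class Telemetry {
  private listeners: Listener[] = [];
  private history: AgentEvent[] = [];

  // Returns an unsubscribe function so a UI view can detach cleanly.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  emit(from: AgentName, to: AgentName, summary: string): void {
    const event: AgentEvent = { from, to, summary, at: Date.now() };
    this.history.push(event);
    this.listeners.forEach((listener) => listener(event));
  }

  getHistory(): readonly AgentEvent[] {
    return this.history;
  }
}

// A UI would render these lines; here they are collected into an array.
const telemetry = new Telemetry();
const lines: string[] = [];
const unsubscribe = telemetry.subscribe((e) => {
  lines.push(`${e.from} -> ${e.to}: ${e.summary}`);
});

telemetry.emit("Director", "Writer", "Scene objective sent");
telemetry.emit("Editor", "Writer", "Continuity error, rewrite requested");
unsubscribe();
telemetry.emit("Editor", "Director", "Scene approved");

console.log(lines.join("\n"));
```

The history array keeps every event even after a view unsubscribes, so a late-opening panel could still replay what happened.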

Shipping Today

Inky is currently in a closed beta. I am shipping updates daily, focusing on the character relationship graph and the automated world-building module.

Building an AI story app has taught me more about system architecture than any CRUD app ever could. It is about managing entropy. If you can control the chaos of a generative model, you can build anything.

I am working in public on this and other studio products. If you are interested in the specific implementation of the VERA orchestration layer or how I manage the monorepo, I am happy to talk.

Work through this for your product in a 1:1 — justintsugranes.dev/booking
