Building an AI Story App: Systems Over Prompts

I am currently building an AI story app called Inky.

Most people think building an AI story app is about writing a clever system prompt. It isn't. It's about architecting a system that can handle state, memory, and narrative logic across multiple turns without hallucinating the plot into a corner. Most 'AI apps' are thin wrappers around a single API call: brittle, expensive to run, and with no real moat.

Inky is different. It is a product of my studio’s operating model: AI as the team, not just a feature. I am building this in public to show the difference between a prompt-engineered toy and an architected system.

The Artifact: Why Inky Exists

Inky is a digital product designed to help users co-create long-form narratives. The goal isn't to have the AI write a book for you; that’s boring. The goal is to build a system that acts as a creative partner, maintaining the 'world state' while the user drives the intent.

When you are building an AI story app, the core challenge is consistency. If a character has blue eyes in chapter one, they cannot have green eyes in chapter four. If the story takes place in a desert, it shouldn't start raining because the LLM forgot the setting. Solving this requires more than a large context window; it requires a structured data layer that sits between the user and the model.
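As a rough illustration of what such a data layer might check, here is a minimal TypeScript sketch of a fact store that refuses contradictory facts. It is not Inky's actual schema: the `Fact` and `Conflict` shapes, the `FactStore` class, and the character name are all assumptions made up for the example.

```typescript
// Hypothetical shape for one established fact; field names are illustrative.
interface Fact {
  entity: string;    // e.g. a character name
  attribute: string; // e.g. "eyeColor"
  value: string;     // e.g. "blue"
  chapter: number;   // where the fact was established
}

interface Conflict {
  existing: Fact;
  incoming: Fact;
}

class FactStore {
  private facts = new Map<string, Fact>();

  private key(entity: string, attribute: string): string {
    return `${entity.toLowerCase()}::${attribute}`;
  }

  // Store a new fact, or return a Conflict if it contradicts an earlier one.
  // The earlier fact always wins; the caller decides how to resolve the clash.
  record(incoming: Fact): Conflict | null {
    const k = this.key(incoming.entity, incoming.attribute);
    const existing = this.facts.get(k);
    if (existing && existing.value !== incoming.value) {
      return { existing, incoming };
    }
    if (!existing) this.facts.set(k, incoming);
    return null;
  }

  get(entity: string, attribute: string): string | undefined {
    return this.facts.get(this.key(entity, attribute))?.value;
  }
}

const store = new FactStore();
store.record({ entity: "Mara", attribute: "eyeColor", value: "blue", chapter: 1 });
// A later beat tries to change the eye color; the store reports the clash.
const conflict = store.record({ entity: "Mara", attribute: "eyeColor", value: "green", chapter: 4 });
```

The point of the design is that the check happens in ordinary code, outside the model, so a contradiction is caught no matter how much of the story still fits in the context window.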

Moving Beyond the Wrapper

I learned the hard way that relying on a single 'God Prompt' is a recipe for failure. As the story grows, the prompt becomes bloated, the model loses focus, and the cost per turn skyrockets because every request resends more tokens.

Instead, I use agentic engineering. Inky is powered by VERA, the custom orchestration layer I built for my studio. VERA treats different LLMs and specialized prompts as workers in a factory line.

Agentic Engineering in Practice

In the Inky architecture, the work is split across several specialized agents:

The Archivist: This agent doesn't write prose. Its only job is to extract facts from the current turn and update the 'World Bible' (a structured JSON object in the database).
The Director: This agent analyzes the user's input against the current narrative arc and decides the tone and direction of the next beat.
The Weaver: This is the prose engine. It takes the instructions from the Director and the facts from the Archivist to generate the actual text.

By decoupling these concerns, I can use a cheaper, faster model for the Archivist and a high-reasoning model for the Weaver. This makes the system more durable and significantly more profitable.
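The split described above can be sketched as three small functions behind a shared model interface. This is a minimal sketch, not VERA's real code: the `ModelCall` type, the prompt wording, the `runBeat` function, and the choice to give the Director the stronger model are all assumptions, and the models are stubbed so the example runs without network access.

```typescript
// A model call is just a function, so cheap and expensive models are interchangeable.
type ModelCall = (prompt: string) => Promise<string>;

interface Models {
  fast: ModelCall;      // assumption: cheap, low-latency model for extraction
  reasoning: ModelCall; // assumption: stronger model for direction and prose
}

interface Beat {
  facts: string[];
  direction: string;
  prose: string;
}

// Archivist: extracts facts only, never writes prose.
async function archivist(models: Models, userInput: string): Promise<string[]> {
  const raw = await models.fast(`Extract facts, one per line:\n${userInput}`);
  return raw.split("\n").map((line) => line.trim()).filter((line) => line.length > 0);
}

// Director: decides tone and direction for the next beat.
async function director(models: Models, userInput: string, arc: string): Promise<string> {
  return models.reasoning(`Arc: ${arc}\nInput: ${userInput}\nDecide tone and direction.`);
}

// Weaver: turns direction plus facts into prose.
async function weaver(models: Models, direction: string, facts: string[]): Promise<string> {
  return models.reasoning(`Direction: ${direction}\nFacts:\n${facts.join("\n")}\nWrite the next beat.`);
}

async function runBeat(models: Models, userInput: string, arc: string): Promise<Beat> {
  // Archivist and Director do not depend on each other, so they run concurrently.
  const [facts, direction] = await Promise.all([
    archivist(models, userInput),
    director(models, userInput, arc),
  ]);
  const prose = await weaver(models, direction, facts);
  return { facts, direction, prose };
}

// Stub models with canned replies, standing in for real API clients.
const stubModels: Models = {
  fast: async () => "Mara has blue eyes\nThe setting is a desert\n",
  reasoning: async (prompt) =>
    prompt.startsWith("Arc:") ? "tense, rising action" : "The dunes were silent.",
};
```

Because each agent only sees a `ModelCall`, swapping the model behind `fast` or `reasoning` is a one-line change in the object passed to `runBeat`.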

The Technical Stack: Choosing Instruments

I don't choose a stack based on what is trending. I choose instruments that allow me to ship today and maintain the system alone.

For Inky, the stack is a monorepo designed for speed:

Backend: Node.js with a custom agent orchestration layer.
Database: PostgreSQL with pgvector. Vector storage is essential for building an AI story app because it allows the system to perform semantic searches over the entire story history without stuffing the entire text into every prompt.
LLMs: A mix of Claude 3.5 Sonnet for reasoning and Gemini 1.5 Flash for high-volume extraction tasks.
Infrastructure: Serverless functions, kept small so cold starts stay short and costs stay tied directly to usage.
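To make the semantic search idea concrete, here is a minimal in-memory TypeScript sketch of top-k retrieval by cosine similarity. It stands in for the database query rather than reproducing it: the `Passage` shape, the `topK` helper, and the toy three-dimensional embeddings are assumptions, and real embeddings would come from an embedding model.

```typescript
interface Passage {
  id: number;
  text: string;
  embedding: number[]; // assumption: produced elsewhere by an embedding model
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Return the k passages most similar to the query embedding.
function topK(query: number[], passages: Passage[], k: number): Passage[] {
  return passages
    .map((passage) => ({ passage, score: cosineSimilarity(query, passage.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((scored) => scored.passage);
}

// Toy story history with hand-written embeddings.
const history: Passage[] = [
  { id: 1, text: "Mara's eyes were a pale blue.", embedding: [1, 0, 0] },
  { id: 2, text: "The caravan crossed the dunes at dusk.", embedding: [0, 1, 0] },
  { id: 3, text: "She blinked, blue eyes narrowing.", embedding: [0.9, 0.1, 0] },
];

// A query about eye color should pull back the two eye-related passages.
const relevant = topK([1, 0, 0], history, 2);

// With pgvector, the same ranking can be pushed into SQL using its cosine
// distance operator. Table and column names here are hypothetical:
//   SELECT id, text FROM passages ORDER BY embedding <=> $1 LIMIT 2;
```

Only the passages returned by the search go into the prompt, which is what keeps the prompt size roughly constant as the story grows.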

This architecture allows me to swap out models as better ones become available. I am not an author of one stack; I am an architect of a system that serves the product.

State Management and Narrative Loops

When building an AI story app, your state machine is your most important asset. In Inky, the 'story' isn't just a string of text. It is a collection of nodes. Each node contains the prose, the world state at that moment, and the metadata about what changed.

This allows for features like 'branching' and 'time travel.' If a user doesn't like the direction a story is taking, they can jump back to a previous node. Because I store the world state at every step, the Archivist can simply revert the World Bible to that point in time. A simple chat-based wrapper, which keeps only a flat transcript, cannot do this.
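A node-based story with per-node snapshots can be sketched in a few lines of TypeScript. This is a minimal sketch under stated assumptions, not Inky's real data model: the `StoryNode` fields, the `StoryTree` class, and the flat key-value `WorldState` are invented for the example.

```typescript
type WorldState = Record<string, string>;

interface StoryNode {
  id: number;
  parentId: number | null;
  prose: string;
  worldState: WorldState; // full snapshot of the world at this node
  changes: WorldState;    // metadata: what this beat changed
}

class StoryTree {
  private nodes = new Map<number, StoryNode>();
  private nextId = 1;

  // Add a beat under any existing node. Appending to an older node creates a branch.
  append(parentId: number | null, prose: string, changes: WorldState): StoryNode {
    const parent = parentId === null ? undefined : this.nodes.get(parentId);
    if (parentId !== null && !parent) throw new Error(`unknown node ${parentId}`);
    const node: StoryNode = {
      id: this.nextId++,
      parentId,
      prose,
      worldState: { ...(parent?.worldState ?? {}), ...changes },
      changes,
    };
    this.nodes.set(node.id, node);
    return node;
  }

  // "Time travel": the world state at a node is simply its stored snapshot.
  stateAt(id: number): WorldState {
    const node = this.nodes.get(id);
    if (!node) throw new Error(`unknown node ${id}`);
    return { ...node.worldState };
  }
}

const tree = new StoryTree();
const opening = tree.append(null, "Mara reached the oasis.", { location: "oasis", weather: "dry" });
const storm = tree.append(opening.id, "A sandstorm rolled in.", { weather: "sandstorm" });
// The user rejects the storm, jumps back to the opening, and takes another path.
const detour = tree.append(opening.id, "She turned toward the canyon.", { location: "canyon" });
```

Storing a full snapshot per node trades storage for simplicity: reverting is a lookup, not a replay of every change since the start of the story.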

Lessons Learned the Hard Way

Building this has taught me a few things about the current state of AI development:

Context is a liability, not just an asset. Just because a model can take 200k tokens doesn't mean you should give them to it. The more noise you provide, the less signal you get in the output. Curated context beats raw volume every time.
Latency kills creativity. If a user has to wait 30 seconds for a story beat, the flow is broken. I’ve spent a significant amount of time optimizing the parallel execution of agents to get response times under 5 seconds.
Profit before vanity. It is easy to build a cool demo that costs $2.00 per session in API fees. It is much harder to build a sustainable business. I track the token cost of every story beat to ensure the unit economics work from day one.
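Tracking cost per beat comes down to simple arithmetic over token counts. The TypeScript sketch below shows one way to do it; the model names, the per-million-token prices, and the budget figure are placeholders chosen for the example, not real pricing or Inky's actual numbers.

```typescript
interface Usage {
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface Price {
  inputPerMillion: number;  // USD per million input tokens
  outputPerMillion: number; // USD per million output tokens
}

// Placeholder prices for two hypothetical models.
const PRICES: Record<string, Price> = {
  "fast-model": { inputPerMillion: 0.1, outputPerMillion: 0.4 },
  "reasoning-model": { inputPerMillion: 3, outputPerMillion: 15 },
};

function costOf(usage: Usage): number {
  const price = PRICES[usage.model];
  if (!price) throw new Error(`no price configured for ${usage.model}`);
  return (
    (usage.inputTokens * price.inputPerMillion +
      usage.outputTokens * price.outputPerMillion) / 1e6
  );
}

// Total cost in USD of every model call made for one story beat.
function beatCost(calls: Usage[]): number {
  return calls.reduce((sum, call) => sum + costOf(call), 0);
}

const calls: Usage[] = [
  { model: "fast-model", inputTokens: 2000, outputTokens: 300 },      // extraction
  { model: "reasoning-model", inputTokens: 1500, outputTokens: 200 }, // direction
  { model: "reasoning-model", inputTokens: 3000, outputTokens: 800 }, // prose
];

const BUDGET_PER_BEAT_USD = 0.05; // placeholder budget
const total = beatCost(calls);
const overBudget = total > BUDGET_PER_BEAT_USD;
```

Logging `total` alongside each beat makes it possible to see which agent dominates the bill before deciding where a cheaper model is worth trying.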

Shipping Today

Inky is currently in private beta. I am not interested in hype or 'disrupting' the publishing industry. I am interested in building a tool that works and a studio that scales through systems rather than headcount.

If you are building an AI story app or any complex agentic system, focus on the data layer and the orchestration logic. The models will get better and cheaper, but the architecture of your system is what provides the value.

I am working in public on this and other studio products. If you want to see the specific schemas or the VERA orchestration logic, I am happy to talk.

Check out the Builder's Playbook for the full implementation details of this architecture: totalventures.io/resources/builders-playbook

