Product Managing a Team of AI Agents: A 4-Week MVP Sprint
Summary: As the sole product owner on Heart of the Cosmos, I led end-to-end development of a digital tabletop RPG from zero to a live, playable MVP in four weeks. The constraint that shaped every decision: my “team” was made up entirely of AI agents and LLMs, each assigned a functional role in the product development process.
This project was an exercise in product thinking as much as it was in building. The goal was to validate a product hypothesis, manage scope under real constraints, and demonstrate that a structured PM process could drive coherent output even without human collaborators.
MVP: Live Demo (working proof of concept, some assets still in progress)

The Problem & Opportunity
The tabletop RPG space is narratively rich but structurally exclusionary. Most digital TTRPGs demand long time commitments, prior knowledge, and dedicated groups, creating friction that locks out casual or time-constrained players.
My hypothesis: There’s an underserved segment of players who want the emotional payoff of collaborative storytelling without the overhead of traditional RPG setup. A short-session, mobile-first, GM-free experience could meet that need.
Success would mean: Delivering a functional MVP that validates the core loop (collaborative tension, resource tradeoffs, and shared consequence) within a genuinely engaging, compressed session.
AI Team Structure
Rather than treating AI tools as generic assistants, I deliberately assigned each one a functional product role, mirroring a real cross-functional team:
| Role | Tool | Responsibilities |
|---|---|---|
| PM Partner | Perplexity | Scope definition, research strategy, PRD drafting |
| Technical PM | Gemini | User stories, acceptance criteria, implementation planning |
| Design Partner | ChatGPT + Copilot | UI/UX ideation, persona development, content generation |
| QA | DeepSeek + Cline | Flow evaluation, logic testing, edge case review |
This structure kept outputs organized and gave me a repeatable framework for delegating work, which is essential when operating as a team of one.
Discovery & Research
Week 1 — Divergent Exploration
I opened with a structured discovery phase to pressure-test the idea before committing to a direction.
Using Perplexity to synthesize forum discussions, app store reviews, and market trends around Roll20, One More Multiverse, and mobile RPGs, I surfaced consistent pain points:

- Sessions too long for casual play
- Steep mechanical and technical learning curves
- Short-form games that felt emotionally shallow
Outcome: Identified a primary persona — the “Quick-Session Tactician” — a player who wants high-stakes collaborative decisions and narrative depth in under an hour. This shaped every subsequent scoping decision.
I also validated a genuine gap: short, emotionally resonant digital RPGs designed around shared consequences were largely absent from the market.
Scoping the MVP
With research grounding the direction, I used a MoSCoW prioritization framework to define a lean, testable scope:
Must-Have:
- Distinct, playable characters with personality
- Escalating story structure across multiple acts
- Branching encounter decisions
- Trackable core mechanics: crew morale, ship integrity, treasure
Deferred:
- Player progression and save states
- Special ability conditions
- In-game inventory system
The deferred features weren’t cut arbitrarily; they were deprioritized because they added development complexity without proving the core value proposition. The core loop had to work first.
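As a rough illustration of that core loop (tracked stats plus branching encounter choices), here is a minimal vanilla JS sketch. The stat ranges, the encounter text, the effect values, and names such as `applyChoice` are hypothetical and not taken from the actual game code.

```javascript
// Hypothetical sketch: names and numbers are illustrative, not from the game.
const initialState = Object.freeze({ morale: 5, integrity: 5, treasure: 0 });

// One branching encounter: each choice trades one tracked stat against another.
const encounter = {
  id: "act1-derelict",
  prompt: "A derelict freighter drifts ahead. Board it or keep moving?",
  choices: [
    { id: "board", effects: { integrity: -2, treasure: 3 }, next: "act1-aftermath" },
    { id: "pass", effects: { morale: -1 }, next: "act1-quiet-night" },
  ],
};

// Keep every stat inside an assumed 0..10 range.
const clamp = (value, min = 0, max = 10) => Math.min(max, Math.max(min, value));

// Return a new state plus the id of the next encounter; the input is not mutated.
function applyChoice(state, enc, choiceId) {
  const choice = enc.choices.find((c) => c.id === choiceId);
  if (!choice) throw new Error(`Unknown choice: ${choiceId}`);
  const nextState = { ...state };
  for (const [stat, delta] of Object.entries(choice.effects)) {
    nextState[stat] = clamp(nextState[stat] + delta);
  }
  return { state: nextState, next: choice.next };
}

const result = applyChoice(initialState, encounter, "board");
console.log(result.state, result.next);
```

Keeping encounters as plain data and the state update as a pure function is one way to let story content grow without touching the mechanics, which fits a scope where progression and inventory are deferred.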
Using Perplexity to draft the initial PRD and Gemini to translate that into user stories and logic trees, I established a shared source of truth that guided every subsequent AI collaborator, effectively acting as a briefing document for my team.

Prototyping & Iteration
Week 2 — Wireframes to Feedback
I started with low-fidelity Figma wireframes to map the encounter flow before writing a line of code. These served a dual purpose: clarifying the interaction model for myself and giving Gemini a visual reference to structure the project’s file architecture.
Early playtesting with peers surfaced a calibration issue: the morale and ship integrity mechanics weren’t landing with enough weight. Players weren’t feeling the stakes. I rebalanced both systems to make outcomes feel more consequential. This small change gave decisions more weight and, in turn, improved engagement.

Strategic Pivot: Story Over Mechanics
Midway through Week 2, I made a deliberate prioritization call: shift remaining sprint capacity toward story depth and emotional experience rather than expanding the mechanical system.
This wasn’t a scope retreat; it was a product decision backed by user feedback. Playtesters consistently flagged narrative tension as the most compelling part of the experience. Doubling down on what was already working was the highest-leverage move available.
Development
- Built in VSCode using vanilla JS/HTML/CSS — keeping the stack lightweight and mobile-compatible
- Used Gemini and Cline to scaffold architecture and guide implementation
- Integrated character art, story continuity, and sound through ChatGPT and Copilot
- Treated Copilot as a functional art department — output quality was appropriate for MVP validation without requiring specialist tools.

Results
What shipped:
- Fully playable prototype on mobile and desktop
- 3 structured story acts with branching, emergent consequences
- Consistent early feedback: “short, punchy, emotional”
“The second encounter in Act 1 had me and my friends seriously questioning each other’s morals. It was fun to argue about what was best for the crew — and then immediately regret the decision we made.”
What was validated:
- Core loop works: collaborative tension + resource tradeoffs + shared storytelling produced genuine engagement
- The framework is genre-portable: future variants in mystery, fantasy, or eLearning are credible next steps
- The game engine itself is a reusable asset — not just a one-off product.

Reflection & Takeaways
On AI as a product process, not just a tool: The most valuable shift in this project was treating LLMs as role-players in a process, not just autocomplete engines. Assigning each tool a specific function and giving them shared context via the PRD produced dramatically more coherent outputs than open-ended prompting. Clear briefs drive better output, whether your team is human or AI.
On prioritization under constraints: Every scoping decision in this project was a trade-off between learning value and execution cost. For example, by deferring inventory and progression systems, I made an intentional bet that the emotional core mattered more than feature breadth at the MVP stage.
On leading without a team: Product management is fundamentally about coordination, clarity, and decision-making. This project reinforced that those skills transfer regardless of who (or what) is executing.
Next Steps
🟢 Refinement: Polish UX/UI and instrument the experience with engagement metrics: decision completion rates, drop-off points, and act progression
🟡 Genre Expansion: Test the core framework in mystery and eLearning contexts to validate reusability
🔴 Go-to-Market: Explore community seeding on Steam, Reddit, and Discord; evaluate freemium and licensing models
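For the Refinement step, the metrics named above could be derived from a simple event log. The sketch below is only a starting point under stated assumptions: the event shape (`session`, `type`, `act`), the event type names, and the `summarize` helper are all hypothetical and do not exist in the current build.

```javascript
// Hypothetical event shape: { session, type, act }.
// type is "decision_shown", "decision_made", or "act_completed".
function summarize(events) {
  const shown = events.filter((e) => e.type === "decision_shown").length;
  const made = events.filter((e) => e.type === "decision_made").length;

  // Furthest act each session completed, keyed by session id.
  const furthestAct = new Map();
  for (const e of events) {
    if (e.type !== "act_completed") continue;
    furthestAct.set(e.session, Math.max(furthestAct.get(e.session) ?? 0, e.act));
  }

  // Count sessions by the last act they completed (0 = left before finishing Act 1).
  const sessions = new Set(events.map((e) => e.session));
  const lastActCounts = {};
  for (const s of sessions) {
    const act = furthestAct.get(s) ?? 0;
    lastActCounts[act] = (lastActCounts[act] ?? 0) + 1;
  }

  return {
    decisionCompletionRate: shown === 0 ? 0 : made / shown,
    lastActCounts,
  };
}

// Made-up sample data for illustration only.
const sampleEvents = [
  { session: "a", type: "decision_shown", act: 1 },
  { session: "a", type: "decision_made", act: 1 },
  { session: "a", type: "act_completed", act: 1 },
  { session: "b", type: "decision_shown", act: 1 },
];
const summary = summarize(sampleEvents);
console.log(summary);
```

Here `lastActCounts` doubles as a drop-off view: a large count under `0` would suggest players are leaving before the first act ends.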