Why fixed scope matters for AI work
AI projects have a reputation for scope creep. The promise is vast — “we could do so much with this data” — and the technology makes it easy to keep adding things. The result is a three-month build that delivers a prototype nobody uses in production.
We scope differently. Every engagement is a single sprint: one well-defined AI workflow, a fixed price, and a hard end date. You get something deployed and in production at the end of four weeks — not a roadmap, not a Figma file, not a demo that needs six more months. A working thing.
Here’s exactly what those four weeks look like.
Week 1 — Discovery and architecture
We spend the first week understanding the problem before we write a line of code. That means:
- Workflow mapping. We trace exactly where AI enters, what context it needs, and what it hands back to your system.
- Data audit. We look at real examples of the inputs the model will see. Bad data discovered in week one costs nothing. Bad data discovered in week three costs a week.
- Model selection. We pick the right model for the task, not the most impressive one. For classification, a fast, cheap model is usually better than a slow, expensive one.
- Scope lock. We write a one-page scope document. Everything in it is in. Everything not in it is out. Both sides sign off before week two begins.
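As an illustration of the data audit step, here is a minimal sketch of the kind of check that can be run over a sample of real inputs. The field names (`subject`, `body`), the `max_chars` limit, and the sample records are all hypothetical; a real audit would be shaped by the actual workflow.

```python
from collections import Counter


def audit_inputs(records, required_fields, max_chars=8000):
    """Count common problems in a sample of input records.

    `records` is a list of dicts. The field names and the length limit
    are illustrative assumptions, not a real schema.
    """
    issues = Counter()
    seen = set()
    for record in records:
        for field in required_fields:
            value = record.get(field)
            # Treat None and whitespace-only strings as missing.
            if value is None or (isinstance(value, str) and not value.strip()):
                issues[f"missing:{field}"] += 1
            elif isinstance(value, str) and len(value) > max_chars:
                issues[f"too_long:{field}"] += 1
        # Exact duplicates across the required fields.
        key = tuple(str(record.get(f)) for f in required_fields)
        if key in seen:
            issues["duplicate"] += 1
        seen.add(key)
    return dict(issues)


# Hypothetical sample of support tickets.
sample = [
    {"subject": "Refund request", "body": "Order 1123 arrived damaged."},
    {"subject": "", "body": "Where is my parcel?"},
    {"subject": "Refund request", "body": "Order 1123 arrived damaged."},
    {"subject": "Login", "body": None},
]
report = audit_inputs(sample, required_fields=["subject", "body"])
print(report)
```

A report like this is cheap to produce in week one, and it shows early whether the inputs need cleaning before any prompt work starts.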
Weeks 2 & 3 — The build
Weeks two and three are heads-down development. By the end of week two you have a working internal version: the model is integrated, the prompt is drafted, and the output is being written to your database. By the end of week three:
- The prompt is tuned against real data, not synthetic examples
- Error handling and fallback behaviour are in place
- Costs are instrumented and within budget
- A basic eval set is passing — typically 50 hand-labelled examples we run against every prompt change
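To make the eval set concrete, here is a minimal sketch of a harness that could be rerun after every prompt change. The `stub_classify` function stands in for the real prompt and model call, and the examples, labels, and pass threshold are hypothetical. A real set would hold around 50 hand-labelled examples rather than four.

```python
def run_eval(classify, eval_set, pass_threshold=0.9):
    """Run `classify` over hand-labelled examples and report accuracy."""
    if not eval_set:
        raise ValueError("eval_set must contain at least one example")
    failures = []
    for example in eval_set:
        predicted = classify(example["input"])
        if predicted != example["expected"]:
            failures.append(
                {
                    "input": example["input"],
                    "expected": example["expected"],
                    "got": predicted,
                }
            )
    accuracy = 1 - len(failures) / len(eval_set)
    return {
        "accuracy": accuracy,
        "passed": accuracy >= pass_threshold,
        "failures": failures,
    }


def stub_classify(text):
    # Stand-in for the real prompt + model call (assumption for this sketch).
    return "refund" if "refund" in text.lower() else "other"


# Hypothetical hand-labelled examples.
EVAL_SET = [
    {"input": "I want a refund for my order", "expected": "refund"},
    {"input": "How do I reset my password?", "expected": "other"},
    {"input": "Please send my money back", "expected": "refund"},
    {"input": "Refund still not received", "expected": "refund"},
]

result = run_eval(stub_classify, EVAL_SET, pass_threshold=0.7)
print(result["accuracy"], result["passed"])
for failure in result["failures"]:
    print("FAILED:", failure)
```

Keeping the failures alongside the score matters more than the score itself: the failing examples show what the next prompt change needs to address.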
We share progress async throughout. No weekly standups. A Loom walkthrough at the end of each week, a shared doc with notes, and a channel where you can ask questions.
Week 4 — Integration, evaluation, and deploy
Week four is about making it real. We integrate the AI endpoint with your existing product, run end-to-end tests against production data, and deploy to your infrastructure (or ours, if you don’t have any). At the end of the week you get:
- A production-deployed AI feature accessible to real users
- A handover document covering the prompt, the architecture, the eval set, and how to tune it going forward
- One week of post-launch monitoring included — we watch costs and output quality and fix anything that needs fixing
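Watching costs after launch can be as simple as recording token usage per call and comparing the running total to a budget. The sketch below assumes per-token pricing; the prices, budget, and token counts are placeholder numbers rather than any real provider's rates.

```python
from dataclasses import dataclass


@dataclass
class CostTracker:
    """Accumulates estimated spend from token counts.

    All prices are placeholders (USD per 1,000 tokens), not real rates.
    """

    input_price_per_1k: float
    output_price_per_1k: float
    budget_usd: float
    spent_usd: float = 0.0

    def record(self, input_tokens: int, output_tokens: int) -> float:
        """Add one call's estimated cost to the total and return it."""
        cost = (
            (input_tokens / 1000) * self.input_price_per_1k
            + (output_tokens / 1000) * self.output_price_per_1k
        )
        self.spent_usd += cost
        return cost

    def over_budget(self) -> bool:
        return self.spent_usd > self.budget_usd


tracker = CostTracker(input_price_per_1k=0.25, output_price_per_1k=1.0, budget_usd=5.0)

first_cost = tracker.record(input_tokens=4000, output_tokens=1000)
within_budget_after_first = not tracker.over_budget()

second_cost = tracker.record(input_tokens=8000, output_tokens=2000)
print(f"spent ${tracker.spent_usd:.2f}, over budget: {tracker.over_budget()}")
```

In practice the token counts would come from the usage data returned with each model response, and an over-budget result would trigger an alert rather than a print statement.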
What we won’t scope in
A few things we deliberately exclude from a first sprint:
- Fine-tuning. It’s rarely necessary and adds weeks. Prompt engineering on a frontier model outperforms a fine-tuned smaller model on almost every task we’ve tried.
- A custom UI. The first sprint integrates into what you have. A new interface is a second sprint.
- Multiple workflows at once. One workflow, done well. The second sprint is cheaper because the infrastructure is already there.
How to start
Send us two paragraphs: what the product does and which workflow you want to improve. We’ll come back within one business day with whether we think it’s a good first-sprint candidate and a rough scope. Start the conversation here.