Question 1

Which AI models do you work with?

Accepted Answer

We're model-agnostic. We ship with OpenAI's GPT-4o family, Anthropic's Claude family, Google's Gemini, and open-source models (Llama, Mistral, Qwen) when self-hosting matters. We pick per task based on cost, latency, accuracy on your eval set, and any data-residency constraints you have. The choice is never permanent — model swaps are usually a one-line config change in our pipelines.

Question 2

How much does it cost to run AI features in production?

Accepted Answer

We design every system with explicit cost controls — token budgets per request, caching, batching, and the cheapest viable model for the task. A typical small AI feature runs $50-300/month in API costs. A heavy RAG workload over a large corpus runs $500-3,000/month. We give you a per-1,000-call cost estimate before you commit, and we wire up dashboards so the number stops being a surprise.
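
As a rough illustration of how a per-1,000-call estimate can be worked out, here is a minimal sketch. The prices, token counts, and cache hit rate are made-up inputs, not quotes from any provider, and the names `ModelPrice`, `cost_per_1000_calls`, and `monthly_cost` are hypothetical. It also assumes, for simplicity, that a cached call costs nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical per-million-token prices in USD; use your provider's current rates."""
    input_per_million: float
    output_per_million: float


def cost_per_1000_calls(price: ModelPrice, input_tokens: int, output_tokens: int,
                        cache_hit_rate: float = 0.0) -> float:
    """Estimate the API cost of 1,000 calls, assuming cached calls are free."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    per_call = (input_tokens * price.input_per_million
                + output_tokens * price.output_per_million) / 1_000_000
    return 1000 * per_call * (1.0 - cache_hit_rate)


def monthly_cost(per_1000: float, calls_per_month: int) -> float:
    """Scale a per-1,000-call figure to a monthly call volume."""
    return per_1000 * calls_per_month / 1000


if __name__ == "__main__":
    price = ModelPrice(input_per_million=2.50, output_per_million=10.00)  # illustrative only
    per_1000 = cost_per_1000_calls(price, input_tokens=1500, output_tokens=300,
                                   cache_hit_rate=0.2)
    print(f"${per_1000:.2f} per 1,000 calls")
    print(f"${monthly_cost(per_1000, 40_000):.2f} per month at 40,000 calls")
```

With these illustrative inputs the estimate works out to $5.40 per 1,000 calls and $216 per month. A real estimate would also account for retries, embedding calls, and any discounted rate the provider applies to cached input.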

Question 3

Can you add AI to our existing application?

Accepted Answer

Yes — that's the most common shape of engagement. We integrate via your existing APIs, queues, or webhooks, and we work in the codebase you already own. Laravel, Next.js, Django, Rails, Node, FastAPI — we've shipped AI features into all of them. We don't ask you to migrate stacks to use AI.

Question 4

What if the AI makes mistakes?

Accepted Answer

Every system we ship includes confidence scoring, structured-output validation, fallbacks for low-confidence outputs, an audit log of every model call, and a human-in-the-loop review surface for anything above your defined risk threshold. We design for graceful failure, not blind automation. You decide where the line is — we wire the system to enforce it.
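
To make the routing idea concrete, here is a minimal sketch, not our production code. It assumes the model returns JSON with a `category` and a self-reported `confidence`, and that the caller supplies a `risk` score for the action. The schema, the category list, and the names `validate`, `route`, and `Decision` are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional

ALLOWED_CATEGORIES = {"billing", "technical", "other"}  # hypothetical schema


@dataclass
class Decision:
    route: str                      # "auto", "fallback", or "human_review"
    reason: str
    payload: Optional[dict] = None


def validate(raw: str) -> dict:
    """Parse model output and check it against the expected structure."""
    data = json.loads(raw)  # JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    category = data.get("category")
    confidence = data.get("confidence")
    if not isinstance(category, str) or category not in ALLOWED_CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return {"category": category, "confidence": float(confidence)}


def route(raw: str, min_confidence: float, risk: float, max_risk: float,
          audit_log: list) -> Decision:
    """Decide what happens to one model output, and record the decision."""
    try:
        payload = validate(raw)
    except ValueError as exc:
        decision = Decision("fallback", f"invalid output: {exc}")
    else:
        if payload["confidence"] < min_confidence:
            decision = Decision("fallback", "confidence below threshold", payload)
        elif risk > max_risk:
            decision = Decision("human_review", "risk above threshold", payload)
        else:
            decision = Decision("auto", "passed validation", payload)
    # Every call is logged, whatever the outcome.
    audit_log.append({"raw": raw, "route": decision.route, "reason": decision.reason})
    return decision
```

The two thresholds are the "line" you set: `min_confidence` governs when the system falls back, and `max_risk` governs when a person has to approve the action.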

Question 5

How long does an AI integration take?

Accepted Answer

A focused single-feature integration ships in 3-6 weeks. A full RAG system or workflow with multiple integrations runs 6-10 weeks. The two-week AI Strategy & Audit is a lighter-weight starting point if you don't yet know which feature to build first. We commit to dates during scoping and ship to them.

Question 6

Do you handle data privacy and compliance?

Accepted Answer

We default to providers with zero-data-retention contracts (OpenAI Enterprise, Anthropic ZDR, Azure OpenAI, Amazon Bedrock) for any sensitive data flow, and we self-host open-source models when that's the right call. We've shipped AI work under HIPAA, SOC 2, and EU GDPR constraints, and we're happy to sign a BAA or DPA before we start.

Question 7

Can you build agents that take actions, not just answer questions?

Accepted Answer

Yes — that's a different shape of engagement we call AI agents. They use tool calling to write to your CRM, send emails, query your database, or trigger workflows. Mabbly is a published example: an agent that researches, drafts, and ships marketing case studies end-to-end. See our AI agent development service for that kind of work specifically.
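
For a sense of what tool calling looks like on the application side, here is a minimal sketch. It assumes the model emits a JSON tool call with a `name` and an `arguments` object, which is a simplification of what provider APIs return. The tools are stubs, and every name here (`tool`, `dispatch`, `update_crm_contact`, `send_email`) is hypothetical.

```python
import inspect
import json

TOOLS = {}  # allowlist of tools the agent may call


def tool(name):
    """Register a function as a callable tool under the given name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("update_crm_contact")
def update_crm_contact(contact_id: str, status: str) -> dict:
    # Stub: a real implementation would call your CRM's API here.
    return {"contact_id": contact_id, "status": status, "updated": True}


@tool("send_email")
def send_email(to: str, subject: str) -> dict:
    # Stub: a real implementation would enqueue the message with your mail service.
    return {"to": to, "subject": subject, "queued": True}


def dispatch(tool_call_json: str) -> dict:
    """Validate one model-issued tool call and run it if it is allowed."""
    try:
        call = json.loads(tool_call_json)
    except json.JSONDecodeError as exc:
        return {"error": f"malformed tool call: {exc}"}
    if not isinstance(call, dict):
        return {"error": "tool call must be a JSON object"}
    name = call.get("name")
    if not isinstance(name, str) or name not in TOOLS:
        return {"error": f"unknown tool: {name!r}"}
    arguments = call.get("arguments", {})
    if not isinstance(arguments, dict):
        return {"error": "arguments must be a JSON object"}
    fn = TOOLS[name]
    try:
        # Check the arguments against the function signature before running it.
        inspect.signature(fn).bind(**arguments)
    except TypeError as exc:
        return {"error": f"bad arguments for {name}: {exc}"}
    return {"result": fn(**arguments)}
```

Errors are returned as data rather than raised, so they can be fed back to the model for another attempt. The allowlist is the important design choice: the agent can only take actions that were registered explicitly.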

Question 8

What if the model improves after we ship?

Accepted Answer

Good problem to have. Our pipelines abstract the model behind a single config, so swapping GPT-4o for the next thing is one PR plus an eval-set rerun. We document this swap procedure in the runbook we hand off, and our retainer clients get model upgrades as part of monthly maintenance.
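
A minimal sketch of what "the model behind a single config" can look like, under stated assumptions: the adapters below are stand-ins that return a tagged string instead of calling a provider SDK, and `PipelineConfig`, `ADAPTERS`, and `complete` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    provider: str
    model: str  # the value that changes when you swap models


# Each adapter hides one provider behind the same (model, prompt) -> str signature.
# Stand-ins only: a real adapter would call that provider's client library.
def _openai_adapter(model: str, prompt: str) -> str:
    return f"[openai:{model}] {prompt}"


def _anthropic_adapter(model: str, prompt: str) -> str:
    return f"[anthropic:{model}] {prompt}"


def _self_hosted_adapter(model: str, prompt: str) -> str:
    return f"[self_hosted:{model}] {prompt}"


ADAPTERS = {
    "openai": _openai_adapter,
    "anthropic": _anthropic_adapter,
    "self_hosted": _self_hosted_adapter,
}


def complete(config: PipelineConfig, prompt: str) -> str:
    """Send a prompt to whichever model the config names."""
    try:
        adapter = ADAPTERS[config.provider]
    except KeyError:
        raise ValueError(f"no adapter registered for provider {config.provider!r}") from None
    return adapter(config.model, prompt)


if __name__ == "__main__":
    current = PipelineConfig(provider="openai", model="gpt-4o")
    print(complete(current, "Summarize this ticket."))
```

The rest of the pipeline only ever calls `complete`, so a swap means editing the `PipelineConfig` values and rerunning the eval set against the new model.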

Question 9

Can you train a custom model on our data?

Accepted Answer

We can fine-tune open-source models, or use hosted fine-tuning (OpenAI's fine-tuning API, or Claude fine-tuning through Amazon Bedrock), when the use case actually needs it, but most teams don't need it. Modern base models with good RAG and prompt strategy beat fine-tuning for the majority of business workloads, ship faster, and cost less to maintain. We'll tell you honestly which side of that line your problem sits on, usually within the first scoping call.

Question 10

How do you measure that an AI feature is actually working?

Accepted Answer

Every engagement starts with us building an evaluation set — 30 to 100 real examples from your data, with the right answer labeled for each one. That eval suite runs in CI on every commit, gives us an objective accuracy number per release, and stops us from regressing in places no one would otherwise notice. In production, we instrument the feature with the business metric you actually care about (deflection rate, time saved per ticket, conversion uplift, review-override rate). If we can't define that metric on day one, we usually recommend not shipping the feature yet.
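
Here is a minimal sketch of an eval check of this kind, assuming a classification-style task where an exact match after trimming and lowercasing counts as correct. `Example`, `accuracy`, and `check_release` are hypothetical names, and `keyword_classifier` is a stand-in for the real model call so the sketch runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Example:
    input: str
    expected: str  # the labeled right answer


def accuracy(predict: Callable[[str], str], examples: List[Example]) -> float:
    """Fraction of examples where the prediction matches the label."""
    if not examples:
        raise ValueError("eval set is empty")
    correct = sum(
        1 for ex in examples
        if predict(ex.input).strip().lower() == ex.expected.strip().lower()
    )
    return correct / len(examples)


def check_release(predict: Callable[[str], str], examples: List[Example],
                  min_accuracy: float) -> bool:
    """Return True when the eval score meets the bar set for this release."""
    score = accuracy(predict, examples)
    print(f"accuracy: {score:.2%} on {len(examples)} examples")
    return score >= min_accuracy


def keyword_classifier(text: str) -> str:
    """Stand-in for the model under test."""
    lowered = text.lower()
    if "refund" in lowered or "invoice" in lowered:
        return "billing"
    if "crash" in lowered or "error" in lowered:
        return "technical"
    return "other"
```

In CI, a small wrapper script would load the labeled examples, call `check_release`, and exit non-zero when it returns False, which fails the build. Open-ended outputs such as summaries need a different scoring function than exact match, but the harness shape stays the same.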

Question 11

Will we get locked into a single AI provider?

Accepted Answer

No. The whole pipeline is built behind a model adapter so OpenAI, Anthropic, Gemini, and self-hosted Llama or Qwen models swap with a config change. Prompts are versioned in your codebase, eval sets are vendor-neutral, and any vector store we use stores raw text alongside embeddings so you can re-index with a different embedding model later without losing your data. Vendor flexibility is a deliverable, not an afterthought.
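
The re-indexing point can be sketched as follows. This is a toy in-memory version, not a real vector store: `Record`, `index`, and `reindex` are hypothetical names, and `embed_v1` and `embed_v2` are toy functions standing in for two different embedding models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Embedder = Callable[[str], List[float]]


@dataclass
class Record:
    doc_id: str
    text: str               # raw text kept next to the vector
    embedding: List[float]
    embedding_model: str    # which model produced the vector


def index(docs: Dict[str, str], embed: Embedder, model_name: str) -> List[Record]:
    """Build records that store both the raw text and its embedding."""
    return [Record(doc_id, text, embed(text), model_name) for doc_id, text in docs.items()]


def reindex(records: List[Record], embed: Embedder, model_name: str) -> List[Record]:
    """Rebuild every vector from the stored raw text with a different embedder."""
    return [Record(r.doc_id, r.text, embed(r.text), model_name) for r in records]


# Toy embedders with different output dimensions, standing in for two models.
def embed_v1(text: str) -> List[float]:
    return [float(len(text)), float(text.count(" "))]


def embed_v2(text: str) -> List[float]:
    return [float(len(text)), float(text.count(" ")),
            float(sum(c.isdigit() for c in text))]


if __name__ == "__main__":
    old = index({"policy": "refund policy 2024"}, embed_v1, "toy-v1")
    new = reindex(old, embed_v2, "toy-v2")
    print(new[0].embedding_model, len(new[0].embedding))
```

Because the text travels with the vector, `reindex` never needs the original source documents. Vectors from different embedding models are not comparable, so a re-index replaces the whole set rather than mixing old and new.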

Question 12

Do you work with US clients only?

Accepted Answer

No. Borah Labs is a US-registered LLC (Delaware) and our delivery team operates in US-friendly time zones. Most of our AI integration clients are US-based, but we've also shipped AI work for teams in Europe, the UK, and APAC. Contracting and invoicing are in USD by default; EUR and GBP are available on request.

Question 13

What does a typical engagement look like, end to end?

Accepted Answer

A typical AI integration engagement runs four to six weeks of focused work, billed against a fixed scope agreed in week one. Week one is scoping and evaluation; weeks two and three are the core build; week four is human-in-the-loop and monitoring; weeks five and six are gradual rollout, handoff, and the first post-launch eval refresh. You get a senior AI engineer leading the work, a project manager keeping the schedule honest, and access to the wider Borah Labs team for adjacent help — frontend, backend, data, DevOps. We work in your codebase, in PRs reviewed by your team, with a public Linear or Plane board so you can see exactly where things stand on any given day. Everything we ship is documented enough that your engineering team can take it over and evolve it without us.

AI That Works in Production

Who this is for

Operations leader at a 30-200 person services company

Product manager at a Series-A to Series-B SaaS

Founder or COO running an established SMB

Sound familiar?

How we solve it

What you get

LLM-Powered Features

Workflow Automation

Intelligent Search & RAG

AI Strategy & Audit

Tech stack

Process

Scoping & evaluation harness

Build the production pipeline

Human-in-the-loop & monitoring

Production rollout

Flagship Build: 6-8 weeks

Choosing a stack

OpenAI vs Anthropic

RAG vs fine-tuning

Pinecone vs pgvector

LangChain vs DIY

Self-hosting vs API

Real shipped work

Mabbly

Koppa AI

FAQ

Ready to ship AI that works in production?