shipping production AI · since 2026 · NAICS 541330 / 541511 / 541512 / 541519 · CMMC-aware
§00·The firm·est. operator-run·v2026.05
senior-only bench · operator-led · staffed-to-fit

We ship production AI systems.

For mid-market and federal clients. Operator-led. Senior-only bench. Fixed-fee, fixed-scope, 4–12 week sprints. Production systems with monitoring, runbooks, and the receipts to back them.

availability: high (monitored prod systems)
latency: sub-sec (interactive LLM routes)
endpoints: 66 (on a shipped platform)
MCP servers: 6 (in our own stack)
founder: 10+yr (hands-on AI eng)
engagement: 4–12wk (fixed-fee, fixed-scope)
§ Start here·what's broken?

Start here. What's broken?

Most teams come to us mid-problem, not shopping for a service. Pick the line that sounds like your week, and we'll point you at the fix, the proof it works, and something worth reading on the way.

fig 01·what we run, end-to-end·representative of how we ship
github.com/dsee ↗
~/dsee   cat /var/log/shipping.sample illustrative
[deploy] deploy ok privatestack/api tagged release · CI-gated
[eval] eval pass rag/v3 golden suite green · drift in band
[deploy] deploy ok fedgov/ingest sam.gov sync · batch loaded
[scan] scan bedrock-iam clean ✓
[release] release mcp/gemini-bridge tagged · github
[deploy] deploy ok dsealgo risk-circuit · zero downtime
[review] red-team client-X findings triaged · prioritized
[release] release mcp/perplexity-async tagged · github
[deploy] deploy ok foodee/payments stripe · PII-clean
an illustrative slice of the kinds of work we ship · github.com/dsee
What we run, end-to-end.
fig 01 · stack
Auth / Identity: Clerk · JWT · OIDC (least-priv IAM)
API Gateway: 66 routes (p95 < 200ms)
LLM Routing: LiteLLM · 5 providers (cost-routed)
Retrieval: pgvector · BM25+dense (hybrid)
Evals: CI gates · golden suite (drift-tracked)
Inference: Bedrock · vLLM (GPU-shared)
Observability: Traces · logs · cost (per-tenant)
AI Security: Red-team · STRIDE/AI (NIST AI RMF)
Governance: EU AI Act · ISO 42001 (CMMC-ready)
every cell = a thing we'll write, run, and document for you · AWS-native
release cadence: CI-gated (every deploy, every system)
eval discipline: golden suites block CI on regression
model providers: 5 (routed cost-optimal)
PII handling: scoped (least-privilege by design)
scale: prod (across managed services)
open source: active (maintained & upstreamed)
§01 Who we work with.
Two buyers, same bench.

A rare combo: commercial and federal, delivered by the same engineers.

Mid-market CTOs and federal contracting officers don't usually share a vendor. They share ours. The same team that builds your multi-tenant SaaS is the team your CO can clear.

same team · same posture
Track A · Commercial · mid-market → enterprise

Ship the AI system your CTO promised.

CTOs and Heads of AI who need production this quarter, not a strategy deck. We embed, ship, hand over IP, leave runbooks. No pyramid leverage — principals on your problem.

Move fast: Two-pizza teams, weekly demo cadence.
Senior-only: Senior-only bench. Surge from a vetted contractor network.
Fixed-fee: Scope quoted in 48h. No T&M.
Full IP transfer: Code, docs, runbook, 30-day support.
Track B · Federal & Public Sector · SAM.gov · CMMC · NIST AI RMF

Deliver AI under FAR, the RMF, and a clock.

SAM.gov registered. CMMC-aware. Cleared-staff capable. The same bench that ships commercial — under the controls federal buyers actually need. No subcontracting the engineering out.

Registrations: SAM.gov · CAGE · NAICS 541330 / 541511 / 541512 / 541519.
Frameworks: NIST AI RMF · EU AI Act · ISO 42001 fluent.
Posture: CMMC L2 narrative-ready. Clearance-capable.
Vehicles: Sub on GSA MAS, CIO-SP4, OASIS+, SeaPort.
§02 What we do.
Four services. One bench.

Every engagement ends in a named deliverable, not a status update.

We don't sell "AI transformation." We sell a service in production, a threat model, a roadmap, a strategy doc — and a runbook a third party can operate. Below is the menu.

01
AI Engineering

Production LLM systems, not pilots.

RAG pipelines, agentic workflows, multi-tenant SaaS. We ship the service, the eval harness, the observability, and the runbook that outlives the engagement.

  • LLM applications & agents
  • MCP server design & integration
  • Fine-tuning & eval harnesses
  • Vector retrieval (pgvector, Atlas, Pinecone)
  • Inference infra (Bedrock, vLLM, LiteLLM)
  • CI gates & cost-routed model selection
You receive: Production service · eval harness + CI gates · observability stack · 23-pg runbook · IP transfer.
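
As one illustration of what "eval harness + CI gates" can look like, here is a minimal sketch of a golden-suite regression gate. It is not our eval-harness library: `regression_failures`, `gate`, the case ids, and the 0.02 tolerance are hypothetical names and numbers chosen for the example.

```python
def regression_failures(baseline, current, tolerance=0.02):
    """Return ids of golden cases that are missing or scored below baseline - tolerance."""
    failures = []
    for case_id, base_score in baseline.items():
        score = current.get(case_id)
        if score is None or score < base_score - tolerance:
            failures.append(case_id)
    return sorted(failures)


def gate(baseline, current, tolerance=0.02):
    """Return a process exit code: 0 when the suite is green, 1 on any regression."""
    failures = regression_failures(baseline, current, tolerance)
    for case_id in failures:
        # One line per failing case so the CI log names what regressed.
        print(f"REGRESSION {case_id}: {baseline[case_id]:.2f} -> {current.get(case_id)}")
    return 1 if failures else 0


# Hypothetical scores: q2 dropped well past the tolerance, so the gate fails.
baseline = {"q1": 0.90, "q2": 0.80}
current = {"q1": 0.89, "q2": 0.70}
exit_code = gate(baseline, current)
```

In a CI job, the script would end with `sys.exit(exit_code)` so a non-zero code blocks the merge.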
02
AI Security

Security baked in, not bolted on.

AI security consulting that ladders up: from entry offers like phishing programs and vendor risk reviews to red-teaming, governance, and a virtual CISO for AI programs. Findings with remediation, not a slide deck.

  • AI red-teaming (prompt injection, tool abuse, exfil)
  • Threat modeling — STRIDE for AI, supply-chain
  • AI governance framework — NIST AI RMF, ISO 42001
  • Vendor risk reviews · tabletop drills · access reviews
  • IAM hardening & access control for data platforms
  • Fractional / virtual CISO & GRC retainers
You receive: Threat model · red-team report w/ remediation · AI use policy · governance charter · 30-day re-test.
03
AI Consulting

Engineering-adjacent advisory.

Readiness, architecture review, build-vs-buy, fractional CDO/CAIO. The person reviewing your stack is the person who'd build it — not a partner with a thesis to push.

  • Readiness assessment (data · infra · talent · gov)
  • Vendor + build-vs-buy memos, TCO modeling
  • Architecture review & cost passes
  • Embedded fractional CDO / CAIO (10–20 h/wk)
  • Federal AI advisory (CMMC, FedRAMP, FAR)
  • Audit, eval, board-level briefings
You receive: Maturity scorecard · 90-day roadmap · vendor memo · board deck · optional fractional retainer.
04
AI Strategy

Multi-quarter, executive-level.

Roadmaps, operating-model design, investment theses, M&A due diligence. We model the ROI in numbers your CFO will defend and your board will sign.

  • 12-month data + AI roadmap, phased
  • Operating model — centralized vs federated
  • Investment thesis & ROI/NPV modeling
  • M&A — tech DD on AI/data assets
  • Policy & acceptable-use frameworks
  • Customer-facing AI disclosures
You receive: Strategy doc · executive presentation · financial model · governance framework · quarterly review.
§03 One we'll show you.
The rest, on the call.

Eleven weeks, zero to production. The runbook outlived three engineers.

A representative engagement. Anonymized where we have to be — architecture, scope, and outcomes we can walk you through on a call.

case-01 · commercial multi-tenant LLM SaaS · AWS · Bedrock · Lambda

PrivateStack — a private LLM platform for regulated industries.

Auth, billing, retrieval, routing across five providers, observability — built end-to-end and handed off with a runbook a third party could operate.

route mix · illustrative
/chat: primary
/embed: high
/eval: steady
/admin: low
/billing: low
§ engagement at a glance

From a Notion doc to customer-zero in eleven weeks.

Clerk JWT auth, 66-endpoint API Gateway, Lambda backends. LiteLLM routing across Bedrock, OpenAI, Anthropic, Cohere, Mistral with per-tenant cost ceilings. pgvector retrieval, a golden-case eval harness, observability per tenant.

Two engineers, weekly demos, written decision log. Full IP transfer with a complete operator runbook — built to run without us.
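
The per-tenant cost ceilings mentioned above can be sketched in a few lines. This is a minimal illustration in plain Python, not PrivateStack's code and not the LiteLLM API: `Provider`, `TenantBudget`, `route`, the tiers, and the prices are all hypothetical, chosen only to show the idea of picking the cheapest eligible provider and refusing a call that would break the ceiling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    usd_per_1k_tokens: float  # hypothetical price, not a real quote
    tier: int                 # hypothetical quality tier; higher is stronger


@dataclass
class TenantBudget:
    ceiling_usd: float        # hard spend ceiling for the billing period
    spent_usd: float = 0.0


class BudgetExceeded(Exception):
    """The cheapest eligible provider would push the tenant past its ceiling."""


def route(providers, budget, est_tokens, min_tier=1):
    """Pick the cheapest provider at or above min_tier and charge the tenant."""
    eligible = [p for p in providers if p.tier >= min_tier]
    if not eligible:
        raise ValueError("no provider meets the requested tier")
    choice = min(eligible, key=lambda p: p.usd_per_1k_tokens)
    cost = choice.usd_per_1k_tokens * est_tokens / 1000
    if budget.spent_usd + cost > budget.ceiling_usd:
        raise BudgetExceeded(f"{choice.name} would cost ${cost:.2f}")
    budget.spent_usd += cost
    return choice.name, cost


providers = [Provider("small-model", 0.5, 1), Provider("large-model", 2.0, 2)]
budget = TenantBudget(ceiling_usd=1.0)
print(route(providers, budget, est_tokens=1000))  # cheapest eligible provider
```

Checking the ceiling before the call, rather than after, is the design choice that matters here: a tenant never sees a bill above the number they agreed to.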

Read the full case · Talk to the principal
delivery: 11 wks
endpoints: 66
latency: sub-sec
handoff: full IP
§04 How we work.
A standard engagement.

Two-pizza teams. Weekly demos. Full IP transfer.

Fixed-fee scope. Named deliverables. Written decision logs. CI/CD, observability, and a runbook ship with every engagement — they aren't a separate phase, they're how we work.

01
Week 0

Scope.

Discovery call. Written scope, deliverables, milestones, fee. Fixed-fee quote inside 48 hours of the call.

02
Week 1

Plan.

Architecture diagram, threat model, ADRs, eval plan. End-of-week demo of the scaffolding. No surprises after this.

03
Weeks 2 — N

Build.

Two-pizza team. Weekly demo. Written decision log. CI from day one. Observability before features. Done means deployable.

04
Final

Hand-off.

Full IP transfer. Runbook. Observability dashboards. 30-day support window. Optional fractional retainer.

§05 What we won't take on

Listing what we decline is the strongest signal we have standards.

Nobody publishes this. We do. If you're looking for one of the engagements below, we'll happily refer you elsewhere — and the rest of this site means more because of it.

If your problem isn't here, it's likely one of ours.
§06 What we publish.
Receipts you can read.

Engineering credibility is verifiable in ten seconds.

Most consultancies hide their code. Ours is on GitHub. The Refinery Report is where we work in the open — eval harnesses, MCP server internals, red-team field notes.

The Refinery Report · Substack · all posts ↗
Eval harnesses are the moat, not the model. · 12 min · Apr 30, 2026
Why we run six MCP servers in production. · 9 min · Apr 18, 2026
Red-teaming an agentic workflow — field notes. · 14 min · Apr 04, 2026
Cost-routing across five LLM providers without losing your evals. · 18 min · Mar 21, 2026
A fixed-fee playbook for AI engagements. · 7 min · Mar 09, 2026
github.com/dsee · OSS repos ↗
gemini-bridge · MCP server · maintained
perplexity-async · MCP server · maintained
eval-harness · Python lib · maintained
litellm-cost-router · middleware · maintained
dask-* contribs · upstream · upstreamed
rmf-checklist · NIST AI RMF · maintained
We commit to upstream weekly.
Every PR is signed by the engineer who wrote it.