Kayvan Mazaheri

What I take on

I work with startups and engineering teams on backend architecture and AI integration — the unsexy production work that decides whether a feature ships.

Services

Engagement Options

01 ·

Backend Architecture Consulting

Find what's holding your backend back. Design the fix. Hand it to your team in a shape they can execute.

I’ve built and run backend systems serving tens of millions of users — Cafebazaar’s ad platform at 51M-user scale, Divar’s Real Estate vertical from scratch — and migrated a live monolith to microservices without downtime. The work I’m best at is the kind where the headline problem looks like one thing (latency, downtime, slow team) and the actual problem is something one layer down (service boundaries, retry storms, deploy coupling).

What I look at:

  • Bottlenecks under real load — DB, cache, gateway, queue
  • Service boundaries — where to split, where to keep together, where the team shape is wrong for the system shape
  • Failure modes — what happens when one component is slow or down
  • Event-driven vs request-response trade-offs
  • Observability gaps — what you can’t see is what will bite you
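To make the failure-modes item concrete: when one component is slow, clients that retry immediately and in lockstep can turn a brief slowdown into a retry storm. One common mitigation is capped exponential backoff with random jitter. The following is a minimal sketch of that general technique, not code from any client engagement; the function name `call_with_retries` and its parameters are hypothetical.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    attempts: int = 4,
    base: float = 0.1,   # assumed starting delay in seconds
    cap: float = 5.0,    # assumed upper bound on any single delay
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Call fn, retrying on any exception with capped, jittered backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    rng = rng or random.Random()
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            # Full jitter: wait a random time up to the capped exponential
            # bound, so many clients do not all retry at the same instant.
            bound = min(cap, base * 2 ** attempt)
            sleep(rng.uniform(0, bound))
    raise AssertionError("unreachable")
```

The `sleep` and `rng` parameters are injectable only so the behaviour can be tested without real waiting. In a real system the retry policy would also need to distinguish retryable errors from permanent ones, which this sketch does not attempt.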

Typical engagement: 1 to 2 weeks of async work, 2 sync calls, written deliverables your team can act on.


Deliverables

  • Architecture review — bottlenecks, service boundaries, failure modes, observability gaps
  • Concrete next-steps roadmap, prioritised
  • Migration plan (if one is what you need)
  • One sync session with your engineering team
  • Written recommendations doc you can share internally

Ideal for: Startups hitting growth pain, teams that inherited a system they didn't design, CTOs who want a second opinion before a big architectural move.

project

02 ·

AI Integration Engineering

Ship AI features that hold up in production — not demos. Caching, fallbacks, evals, the parts that decide whether the feature survives contact with users.

I build AI features that ship, not impressive-looking demos that fall over in front of a real customer. The unsexy work is the work that matters: caching responses so cost doesn’t run away from you, falling back gracefully when the API is degraded, structuring prompts so the model can’t return something the system can’t handle, and building evals so you actually know whether a change made things better.
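As an illustration of the caching and fallback ideas above, here is a minimal sketch of a wrapper around a model call. It is an assumption-laden toy, not a production design: the class name `CachedCompletion`, the in-memory dict cache, and the `primary`/`fallback` callables are all hypothetical stand-ins for whatever provider client and cache store a real system would use.

```python
import hashlib
from typing import Callable, Dict


class CachedCompletion:
    """Wrap a model call with a response cache and a fallback path."""

    def __init__(
        self,
        primary: Callable[[str], str],   # hypothetical: calls the main model
        fallback: Callable[[str], str],  # hypothetical: cheaper or canned answer
    ) -> None:
        self.primary = primary
        self.fallback = fallback
        self.cache: Dict[str, str] = {}  # in-memory only, for illustration

    @staticmethod
    def _key(prompt: str) -> str:
        # Hash the prompt so the cache key has a fixed size.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def complete(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self.cache:
            return self.cache[key]  # cache hit: no model call, no cost
        try:
            answer = self.primary(prompt)
        except Exception:
            # Primary is degraded: serve the fallback, but do not cache it,
            # so the real answer can replace it once the primary recovers.
            return self.fallback(prompt)
        self.cache[key] = answer
        return answer
```

Not caching the fallback response is a deliberate choice in this sketch: a degraded answer should not outlive the outage that caused it. Exact-match caching like this only helps when identical prompts recur; whether that holds depends on the feature.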

Things I’ve shipped or am currently shipping:

  • RAG over PDF / DOCX / plain text, with citation-anchored answers
  • Embeddings + vector retrieval (Pinecone, self-hosted Elasticsearch)
  • Multi-tenant LLM systems where tenant isolation is non-negotiable
  • Agent platforms with predictability constraints: agents you can run twice on the same input and get the same answer
  • LLM integration with OpenAI, Anthropic, and open-source models

What I don’t do: train models from scratch or do data science. I take a trained model and make it a feature your customers can rely on.


Deliverables

  • Working integration (prototype or production, depending on engagement length)
  • Architecture decision record for the approach (model choice, retrieval strategy, evals, fallback path)
  • Caching and cost plan
  • Handover doc your team can run with

Ideal for: Product teams adding AI features, founders building AI-native products, engineering teams who know what they want to ship but not how to make it reliable.

project