Engineer · Dallas

Venkatesh Kavali.

Full-stack engineer. I work on backend systems that move billions of records a week.

Mostly identity systems, AI tooling, and the kind of code that has to keep working when nobody's watching.

Currently building
  • Identity Orbit
    An identity system I'm prototyping on the side, based on entity resolution work I've done at the day job. Mixes deterministic, probabilistic, and behavioral linking, with a proximity score that decays over time.
  • vibeCodeScan
    Open-source CLI that catches the bugs LLMs love to ship. Semgrep rules plus some behavioral probes. Drops into CI.
  • fetchlab
    Recording and replay for agent stacks. It saves every tool call so you can diff a new run against a known-good one.
At work

I work on the identity backend for United Airlines. It chews through a few billion records every week. Most of my time goes into keeping it from making bad merges.

I like shipping fast. I like hard problems. I like building things that probably shouldn't exist yet.

// 001

Building

Stuff I'm building on the side. Some of it works. Most of it's in progress. None of these have users. They're here because the problems are interesting and I needed somewhere to put the code.

Identity Orbit

In private beta

An identity-resolution prototype. Started after I got tired of watching real systems make bad merges because they only knew how to compare two strings.

What's in it
  • deterministic, probabilistic, and behavioral signals all feeding the same score
  • a proximity score that decays if a profile goes quiet and strengthens when it doesn't
  • an anomaly-review queue that holds uncertain merges for a human instead of silently introducing bad identity links
  • designed for billion-record weeks. streaming and batch run side by side
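
A rough sketch of how a decaying proximity score could behave, assuming exponential decay with a fixed half-life. The 30-day half-life, the function names, and the reinforcement rule are all illustrative assumptions, not Identity Orbit's actual code.

```python
import math

# Assumption for illustration: a link score halves after 30 quiet days.
HALF_LIFE_DAYS = 30.0


def decayed_score(score: float, days_quiet: float,
                  half_life: float = HALF_LIFE_DAYS) -> float:
    """Exponentially decay a link score for a profile that has gone quiet."""
    return score * math.pow(0.5, days_quiet / half_life)


def reinforced_score(score: float, signal_weight: float) -> float:
    """Strengthen a score on fresh activity while keeping it within [0, 1].

    signal_weight is a hypothetical 0..1 weight for the new signal; the
    score moves that fraction of the remaining distance toward 1.0.
    """
    return score + (1.0 - score) * signal_weight
```

With these assumptions, a 0.8 link that stays quiet for one half-life drops to 0.4, and a new signal with weight 0.5 pulls it back up to 0.7.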

vibeCodeScan

Open source · Building
vibecodescanner.dev

Open-source scanner for the bugs you keep finding in LLM-written code.

What's in it
  • Semgrep ruleset I keep adding to as I see new patterns. SSRF, missing authz, secrets in logs.
  • behavioral probes for prompt-injection holes and weird tool-call behavior in agent stacks
  • drops into a CI step. Exits non-zero if it finds something past the threshold you set.
  • open by default. Defenders need shared rules more than attackers do.
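
A minimal sketch of the CI gate idea. The severity levels and the `exit_code` helper are hypothetical, not vibeCodeScan's actual interface, and it assumes "past the threshold" means at or above it.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; higher means worse."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def exit_code(findings: list[Severity], threshold: Severity) -> int:
    """Return 1 if any finding is at or above the threshold, else 0.

    A CI step would pass this value to sys.exit() so the job fails
    only when something serious enough was found.
    """
    return 1 if any(f >= threshold for f in findings) else 0
```

Under this sketch, a run with one LOW and one HIGH finding fails a MEDIUM gate but passes a CRITICAL one.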

fetchlab

In private beta
fetchlab-production.up.railway.app

Recording, replay, and diffing for agent stacks. Because agent failures are difficult to reason about with traditional logging.

What's in it
  • every tool call, retry, and prompt mutation saved as a structured trace
  • diff a new run against a golden trace in CI. catches drift after a model bump.
  • replay is deterministic. tool responses, model versions, system prompts all pinned.
  • you see the exact step that diverged, not a transcript and a guess
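
A small sketch of what "the exact step that diverged" could look like in code. The `Step` shape and `first_divergence` are assumptions made for illustration, not fetchlab's real trace format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    """Hypothetical trace entry: one recorded tool call and its output."""
    tool: str
    args: tuple  # hashable stand-in for the call arguments
    output: str


def first_divergence(golden: list[Step], run: list[Step]) -> Optional[int]:
    """Return the index of the first step that differs, or None if identical."""
    for i, (expected, actual) in enumerate(zip(golden, run)):
        if expected != actual:
            return i
    # Every shared step matched, so any difference is a missing or extra step.
    if len(golden) != len(run):
        return min(len(golden), len(run))
    return None
```

A CI check built on this would fail when the result is not None and report the step at that index from both traces.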

theprintf

Building
theprintf.com

Interview prep that grades you against the bar at frontier-AI labs and FAANG-tier loops, instead of just being polite.

Notes
  • rubric-based feedback. not a pep talk. tells you when your answer was wrong.
  • mock panels that flag missed constraints and wrong primitives instead of handing out effort points
  • the unflattering reps most prep tools won't give you

Fliq

Building
fliq.co.in

Closed-circle micro-sharing. Built for the moments you wouldn't put on a public feed.

Notes
  • small groups. ephemeral by default. no follower math. no algorithm.
  • shaped to feel like the opposite of posting

ShipBack

Building
ship-back.net

One API for the boring half of commerce. Returns, refunds, RMA, dispute reasons, reverse logistics.

Notes
  • the reverse-flow stack as a primitive instead of bespoke glue at every merchant
  • ledger-grade correctness on returns and refunds. not best-effort.
  • the part of money movement everyone leaves for last

Semantic File Aggregator

Open source · Building

Flatten a messy photo archive without losing anything. Every duplicate gets isolated, byte-for-byte verified.

What's in it
  • hash-verified deduping. nothing gets silently dropped.
  • treats the source folder as read-only. it copies by default.
  • renames by capture time and groups into chronological windows in a single pass
  • you can cancel mid-run. it won't leave half a tree behind.
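
A minimal sketch of hash-verified duplicate detection, assuming SHA-256 for grouping and a byte-for-byte comparison as the final check. The function names are illustrative, and this only reads from the source folder. It is not the tool's actual code.

```python
import filecmp
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large photos never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under root whose contents are identical."""
    by_hash: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[sha256_of(path)].append(path)

    groups = []
    for paths in by_hash.values():
        if len(paths) < 2:
            continue
        first = paths[0]
        # Matching hashes are only a candidate; confirm byte for byte.
        same = [first] + [p for p in paths[1:]
                          if filecmp.cmp(first, p, shallow=False)]
        if len(same) > 1:
            groups.append(same)
    return groups
```

Each returned group lists the copies of one file, so a caller can keep the first and set the rest aside without deleting anything.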

TheShipboard

Building
theshipboard.com

A board for solo builders. Ideas, tasks, ship logs, all in one place.

Notes
  • one surface instead of five. Apple Notes, Linear, Notion, a changelog doc, TODOs in code.
  • shipping cadence is the unit. not roadmap theater.

quietposter

In private beta
backend-production-5973.up.railway.app

Scheduled cross-posting for people who want to show up online without playing the game.

Notes
  • vanity metrics hidden on purpose. no like or view counts in the dashboard.
  • write once, drip across surfaces, never see a leaderboard
  • shaped like a writer's tool, not a marketer's one

gurooschool

Building
front-end-production-1dd5.up.railway.app

A learning surface for teachers who want their craft to outlive a single classroom.

Notes
  • courses are living documents. students go at their own pace.
  • the teacher is the product. voice and judgment stay intact at scale.

subsplit-ai

Paused
subsplit-app.com

Group subscriptions, settled like a tab. The ledger and splitter for shared SaaS.

What's in it
  • an actual ledger and reconciliation logic for monthly per-usage splits
  • paused as a standalone product. the wedge wasn't sharp enough.
  • the primitives belong inside something bigger. they will end up there.
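
A sketch of one way a per-usage split can stay exact to the cent, using integer math and largest-remainder rounding. The `split_cents` name and the tie-break rule are assumptions for illustration, not the product's actual ledger logic.

```python
def split_cents(total_cents: int, usage: dict[str, int]) -> dict[str, int]:
    """Split a bill in proportion to usage so the shares sum to the total.

    Works in integer cents only. Leftover cents go to the members with the
    largest remainders; ties break by member name (an arbitrary choice).
    """
    total_usage = sum(usage.values())
    if total_usage <= 0:
        raise ValueError("usage must sum to a positive number")

    shares: dict[str, int] = {}
    remainders: list[tuple[int, str]] = []
    for member, units in usage.items():
        exact = total_cents * units
        shares[member] = exact // total_usage
        remainders.append((exact % total_usage, member))

    leftover = total_cents - sum(shares.values())
    ranked = sorted(remainders, key=lambda r: (-r[0], r[1]))
    for _, member in ranked[:leftover]:
        shares[member] += 1
    return shares
```

Reconciliation then reduces to one check: the shares must add up to the invoice total, every month.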

AgenticOps

Building
agenticops-production.up.railway.app

Agentic DevOps. The agent detects something off, diagnoses it, proposes IaC patches, opens a PR, and waits for approval before applying.

What's in it
  • real Terraform plan and apply, with drift detection and Claude doing the IaC remediation
  • Argo Rollouts for canary and blue-green. Chaos Mesh for fault injection. Kayenta-style canary analysis.
  • OIDC SSO, SCIM, AES-256-GCM secrets at rest, multi-tenancy, audit logs
  • actual AWS, GCP, Azure cost adapters. not mocked numbers.
// 002

Systems

Things I've actually shipped at work. Real users, real traffic. The last entry is the exception: side experiments that aren't load-bearing yet.

  1. United Airlines

    Identity Backend

    I work on the identity resolution backend. It does billions of records a week, with streaming and batch pipelines running side by side. The whole thing is shaped around uptime. If a queue stalls, it's a customer-impact event by Monday.

    • a few billion records weekly across streaming and batch
    • real-time linking with explainable, auditable merges
    • designed to fail gracefully. failures can be replayed.
  2. Kafka / AWS MSK, Spring Boot, Postgres

    Distributed Processing & Streaming

    High-volume event pipelines tuned for ordering, idempotency, and throughput. Not the whiteboard version. The version that has to keep working at 3am. I've also spent a lot of time tuning Postgres past the point where adding more indexes stops helping.

    • Kafka and AWS MSK consumer groups under sustained load
    • Spring Boot for most of the workhorse services. About seven years of muscle memory there.
    • Postgres tuning: query plans, partitioning, batched updates at 10M+ rows
  3. Identity intelligence, agent tooling, security

    AI and Tooling Experiments

    The side projects. Stuff that isn't load-bearing yet but probably should be. Identity intelligence on top of the resolution layer. Recording and replay for agents. A security scanner for the bugs LLMs ship. Most of it exists because what's out there is either closed off or just hasn't been written yet.

    • identity intelligence: orbit confidence, decay, anomaly review
    • agent recording and replay (fetchlab)
    • open-source scanner for vibe-coded apps (vibeCodeScan)
    • the workflow glue. the part of the codebase nobody wants to read.
// 003

Build log

Quick notes on the harder stuff. Things that broke. Things I had to think about for a while.

  1. Scaling Postgres updates to 15M+ records
    Batched, partitioned, and tuned past the point where adding indexes stops helping.
  2. Identity graph experiments
    Treating confidence as a real signal instead of a binary match flag.
  3. Anomaly-aware entity resolution
    A review layer that catches conflicts before they ever hit the warehouse.
  4. Tuning Kafka pipelines under load
    Throughput, ordering, and idempotency all at once. Usually you only get two and have to come back for the third.
  5. Security scanning for vibe-coded apps
    Semgrep rules plus behavioral probes. Caught the things the model never noticed.
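
On the idempotency part of the Kafka note above, a minimal sketch of a handler that tolerates redelivery. It assumes every event carries a unique `id` field and keeps seen ids in memory; a real pipeline would persist them. The class and field names are hypothetical.

```python
from typing import Callable


class IdempotentHandler:
    """Apply each event at most once, keyed by an assumed unique event id."""

    def __init__(self, apply: Callable[[dict], None]) -> None:
        self._apply = apply
        # In-memory only for illustration; this would not survive a restart.
        self._seen: set[str] = set()

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a repeat."""
        event_id = event["id"]
        if event_id in self._seen:
            return False
        self._apply(event)
        # Record the id only after apply succeeds, so a failure gets retried.
        self._seen.add(event_id)
        return True
```

With this in place, a redelivered message after a consumer rebalance is acknowledged without being applied a second time.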
// 004

Bio

I write software for the awkward parts. The places where identity meets messy data. Where AI tools land in front of people who don't care about the stack. Where security questions hide in the part of the codebase nobody wants to open.

Java and Spring Boot are home base. Seven years of it. TypeScript and Python show up too, depending on what the day looks like. Most of what I do is backend work, distributed systems, and the boring plumbing that keeps AI features from doing something embarrassing in production.

Based in Dallas.

// 005

Reach out

What I want most: a role where payments, billing, or ledgers sit right next to AI or agent work. The places where the money side has to be exactly right and the model side has to be trustworthy enough to bet a product on. Founding engineer or staff-shaped scope. Happy to advise on identity, agents, or AI-app security on the side. Not really looking for pure contract dev.

Email's the fastest way. I read everything. I reply to most of it.