3+ Weeks to 30 Minutes (async): AI-Powered client-go Upgrades with Usage-First Scanning #9

Description

@sachdevasachin434
  • Title: 3+ Weeks to 30 Minutes (async): AI-Powered client-go Upgrades with Usage-First Scanning
  • Speaker: Sachin Sachdeva, Site Reliability Engineer @ Booking.com
  • Type: (Presentation 30-45 mins | Lightning Talk 5-10 mins)
  • Level: standard
  • Tags: kubernetes, golang, platform-engineering, automation, ai, devops

Description
If you maintain Kubernetes controllers or operators, you've probably felt this: a new k8s version drops, client-go ships alongside it, and suddenly you're reading a changelog that lists every API change across 200+ packages, because the changelog has no way of knowing you only use a fraction of them. Then you do it again for co-versioned modules such as k8s.io/api and k8s.io/apimachinery. Then you write the fixes, untangle the dependency conflicts, and open the merge requests. All by hand.

At Booking.com, this cycle cost us three weeks per upgrade. Kubernetes ships three releases a year, so that's nine weeks of engineer time annually on work that isn't hard, just relentlessly manual and easy to get wrong.

The fix was to invert the problem. Instead of starting from the library, start from your code. Go's own type-checker gives you an exact usage fingerprint: which symbols your codebase actually calls, references, or embeds. Diff only those. The noise disappears: of the 200+ packages in client-go, BKS (our internal Kubernetes platform) uses 25, and only 6 breaking changes actually matter. These are the real figures from our client-go upgrade from v0.30.14 to v0.32.13.

This talk walks through the six-module pipeline we built on this insight:

  • Usage-first static analysis — exact symbol fingerprint of what your codebase calls, references, and embeds
  • Targeted symbol diffing between Git tags, scoped only to symbols in use
  • File-and-line impact mapping across multi-module repositories with severity classification
  • Deterministic dependency conflict resolution — no AI involved
  • Compiler-verified AI-generated source patches with build and test gates
  • Automated merge request creation with full context

We'll cover a real production run: v0.30.14 to v0.32.13, 13 repositories, 11 merge requests, all green. Three weeks reduced to 30 minutes of async review per upgrade cycle.

Attendees will leave with a reusable mental model — usage-first analysis over library-first noise — and a concrete understanding of where deterministic automation ends, where AI fits, and how to verify that it got it right.

Speaker Bio (optional)
I am Sachin Sachdeva, a Site Reliability Engineer at Booking.com. I work with the team behind BKS, the internal Kubernetes platform underpinning Booking's production infrastructure. My work focuses on platform reliability, developer tooling, and reducing the operational burden of running Kubernetes at scale. I built the upgrade automation described in this talk to eliminate the manual overhead of recurring client-go upgrade cycles across the BKS fleet.
