API Gateway Migration Patterns That Actually Work in Production
Last month I helped a logistics company migrate from an aging Apigee setup to Kong Gateway. On paper, it was a six-week project. In reality, it took fourteen weeks, three production incidents, and one very tense conversation with the CFO. The migration itself wasn’t the hard part. The hard part was everything nobody thought about before we started.
API gateways sit in the critical path of everything. Every microservice call, every third-party integration, every mobile app request flows through them. When you change them, you’re touching the nervous system of your entire platform. And yet I keep seeing organisations treat gateway migrations like swapping out a commodity component.
Pattern 1: The Strangler Fig (The One That Works)
Borrow the strangler fig pattern from application modernisation and apply it to your gateway layer. Run both gateways in parallel. Route new APIs through the new gateway from day one. Gradually migrate existing APIs one by one, starting with low-traffic, low-risk endpoints.
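The routing decision at the heart of this pattern can be sketched in a few lines. This is an illustrative sketch, not any real gateway's API: the upstream URLs and the `migrated_prefixes` set are hypothetical names standing in for whatever your edge proxy actually uses.

```python
# Illustrative strangler-fig routing at the edge (hypothetical names, not a
# real gateway API): keep a set of migrated path prefixes and send matching
# requests to the new gateway, everything else to the old one.

OLD_GATEWAY = "https://old-gw.internal"   # hypothetical upstream URLs
NEW_GATEWAY = "https://new-gw.internal"

# Start with low-traffic, low-risk endpoints; grow this set over time.
migrated_prefixes = {"/v1/health", "/v1/reports"}

def route(path: str) -> str:
    """Return the upstream that should serve this request path."""
    if any(path.startswith(prefix) for prefix in migrated_prefixes):
        return NEW_GATEWAY
    return OLD_GATEWAY
```

The point is that migration becomes a one-line config change per API: add a prefix to the set, watch the dashboards, and roll back by removing it.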
This is slow. It’s operationally messy—you’re running two gateways, two sets of monitoring, two configurations. Your ops team will complain. But it works because you never have a single moment where everything has to work perfectly on the new system.
I’ve used this pattern four times now. It’s never fast, but it’s never caused an outage either. That’s the trade-off worth making.
Pattern 2: The Big Bang (The One That Doesn’t)
“We’ll do a cutover weekend. Migrate everything Friday night, validate Saturday, fix issues Sunday, go live Monday.” I’ve seen this attempted twice. Both times it became a two-week incident with half the engineering team sleeping in the office.
The problem isn’t technical ambition—it’s that you cannot possibly test every API consumer’s behaviour against the new gateway in a staging environment. Some clients have retry logic that behaves differently. Some have hardcoded timeouts. Some are using undocumented features of the old gateway that don’t exist in the new one.
Big-bang gateway migrations are a bet that you know every caller and every edge case. You don’t. Nobody does.
Pattern 3: The Shadow Deploy
This one’s underrated. Deploy the new gateway alongside the old one, but mirror all traffic to both. The old gateway handles all real responses. The new gateway processes the same requests but you discard its responses—you only capture metrics and error rates.
After a few weeks, you have real production data showing how the new gateway handles your actual traffic patterns. No guessing, no load-test scenarios that don’t match reality. When the error rate on the shadow gateway drops to acceptable levels, you swap.
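The mirroring logic described above can be sketched as follows. This is a simplified, synchronous illustration with hypothetical call signatures; a production mirror would replay requests asynchronously (fire-and-forget) so the shadow gateway can never add latency to the real response path.

```python
# Sketch of a shadow deploy (illustrative, not a specific proxy's API):
# the old gateway serves the real response; the same request is replayed
# against the new gateway and only its status and latency are recorded.
import time

shadow_metrics = {"requests": 0, "errors": 0, "latencies_ms": []}

def handle(request, call_old, call_new):
    """Serve from the old gateway; mirror to the new one, discarding its response."""
    response = call_old(request)          # the real response the client sees
    start = time.monotonic()
    try:
        shadow = call_new(request)        # replayed request; response discarded
        if shadow["status"] >= 500:
            shadow_metrics["errors"] += 1
    except Exception:
        shadow_metrics["errors"] += 1     # a crash on the shadow side is data too
    shadow_metrics["requests"] += 1
    shadow_metrics["latencies_ms"].append((time.monotonic() - start) * 1000)
    return response
```

In practice you'd get this behaviour from your proxy layer rather than application code, but the shape is the same: one request in, one real response out, one row of shadow metrics recorded.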
I first saw this pattern at a payments company and it’s become my default recommendation. The overhead of running shadow traffic costs less than one production outage, and it gives you hard data instead of vendor promises.
What Vendors Won’t Tell You
Every gateway vendor’s migration guide assumes you’re running a clean, well-documented API estate. That’s fantasy for most enterprises I work with. In practice, you’ll discover:
Undocumented APIs. Someone created an endpoint three years ago for a one-off integration. It’s handling 50,000 requests a day and nobody knows it exists until the migration breaks it.
Custom plugins and middleware. Your team wrote custom authentication plugins for the old gateway. Those don’t port to the new one. Budget two to four weeks just for plugin rewrites.
Rate limiting differences. The way Kong handles rate limiting versus Apigee versus AWS API Gateway is subtly different. Your consumers built their retry logic around the specific behaviour of your current gateway. Change that behaviour and you’ll get thundering herd problems.
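One common consumer-side defence against those thundering herds is jittered exponential backoff, so a change in rate-limiting behaviour doesn't synchronise every client's retries into one spike. This is a generic sketch of the "full jitter" variant, not tied to any particular gateway or SDK:

```python
# Jittered exponential backoff: each retry waits a random amount up to an
# exponentially growing ceiling, which spreads retry storms out over time.
import random

def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0):
    """Yield a 'full jitter' delay (in seconds) for each retry attempt."""
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield random.uniform(0, ceiling)   # randomness prevents synchronised retries
```

If your consumers already retry, auditing whether they do something like this, or retry on a fixed interval, is worth doing before you change the gateway's rate-limiting behaviour underneath them.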
The Discovery Phase Matters Most
Before you write a single line of migration code, spend two weeks doing nothing but discovery. Catalogue every API. Map every consumer. Document the rate limiting, authentication, and transformation rules actually in use—not what’s in the documentation, but what’s running in production.
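Even a crude pass over gateway access logs gets discovery started. The log format below is illustrative, so the parsing is deliberately naive; the idea is simply to rank endpoints by actual traffic so you know which ones are critical and which are dead.

```python
# Rough discovery pass over access logs (format is illustrative): count
# requests per endpoint so you can rank which APIs actually carry traffic.
from collections import Counter

def endpoint_traffic(log_lines):
    """Tally request counts per 'METHOD path' pair from simple access-log lines."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) >= 2:                # expect at least: METHOD PATH ...
            method, path = parts[0], parts[1]
            counts[f"{method} {path}"] += 1
    return counts.most_common()            # busiest endpoints first
```

Run this over a few weeks of logs and the undocumented endpoint handling 50,000 requests a day shows up near the top of the list instead of in your incident channel.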
This is where external expertise pays for itself. Teams like team400.ai bring tooling and methodology for API discovery that most internal teams don’t have. When you’ve got 400 endpoints and you need to know which ones are actually critical, you want someone who’s done this mapping exercise before.
Monitoring Is Not Optional
Your new gateway needs better observability than the old one from day one. Not after migration—before migration. Set up dashboards that compare response times, error rates, and throughput between old and new gateways in real time.
I use a simple traffic light system: green means the new gateway is within 5% of the old one on every metric, amber means it's between 5% and 15% off, and red means it's further out than that. If an API isn't green, it doesn't get migrated yet.
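The traffic-light check is simple enough to sketch directly. This is a minimal illustration using relative drift on a single metric; in practice you'd run it per metric, per API, and take the worst colour.

```python
# Traffic-light comparison of one metric between old and new gateways:
# green within 5% of the old value, amber within 15%, red beyond that.
def traffic_light(old_value: float, new_value: float) -> str:
    """Classify how far the new gateway's metric has drifted from the old one's."""
    if old_value == 0:
        return "green" if new_value == 0 else "red"
    drift = abs(new_value - old_value) / old_value
    if drift <= 0.05:
        return "green"
    if drift <= 0.15:
        return "amber"
    return "red"
```

Feed it p95 latency, error rate, and throughput for each API, and the migration queue writes itself: only all-green APIs move.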
The Bottom Line
API gateway migrations are infrastructure surgery. They’re not impossible, but they require more planning, more monitoring, and more patience than most teams budget for. Use the strangler fig pattern. Run shadow traffic. Invest in discovery. And whatever you do, don’t attempt a big-bang cutover on a Friday night.
Your future self will thank you.