Global Food Delivery Platform
Stabilised high-volume order processing with automated testing, DR validation, and per-pull-request preview environments.
#ECS #PostgreSQL #Observability #Kubernetes #Cloudflare

- Peak Orders: 200,000+/day
- Restaurants: 30,000+
- Monthly Active Users: 1M+
- Average Page Load Time: 200ms
Challenges
- Manual deployments over SCP, with no audit trail or CI/CD
- Development teams testing code changes directly in production, editing live files with vim
- Expensive, non-scalable pet infrastructure
- Limited monitoring and alerting
- No verified disaster recovery process
What I Delivered
- Introduced Terraform to create entire estates from scratch, with DR validated through a monthly rollover
- Implemented red team/blue team security testing
- Added Datadog monitoring with on-call routing rules and threshold-based alerts
- Deployed per-pull-request environments
- Migrated non-production environments to Kubernetes and produced a roadmap for the production migration
- Implemented Cloudflare caching for edge delivery of assets
- Performed forensic analysis of security exploit attempts and managed vulnerabilities
Outcomes
- Enabled scaling to 200,000+ orders/day with improved SLA compliance
- Reduced infrastructure costs via automation and Kubernetes adoption
- Improved platform security posture through proactive testing
- Enhanced developer productivity with ephemeral preview environments
- Provided resilience and operational continuity during AWS outages
- Improved observability with comprehensive monitoring and alerting
- Automated security monitoring and incident response