OpenText
Senior Site Reliability Engineer
- Designed and implemented a highly available Amazon EKS Kubernetes cluster with Istio service mesh across multiple AWS Availability Zones, achieving 99.99% uptime and enabling elastic pod and node auto-scaling for optimal performance and cost efficiency.
- Implementing end-to-end observability and incident response with New Relic and PagerDuty, reducing production incident detection time by 40% and improving on-call responsiveness.
- Deploying Kubernetes clusters entirely as code using Terraform (IaC) and GitLab CI/CD pipelines.
- Developing and maintaining multi-cloud infrastructure across AWS and Azure using Terraform and GitLab CI/CD pipelines.
- Building intelligent New Relic dashboards to enable fast, data-driven troubleshooting and proactive system monitoring.
- Collaborating with multiple engineering teams to debug and resolve production issues across multi-layered services, ensuring high availability and reliability.
- Assisting with all aspects of the hiring and onboarding process.
- Mentoring and training engineers on Kubernetes, cloud infrastructure, and DevOps best practices, improving team capability and knowledge sharing.