From DevOps to Production Engineering: Navigating the Path with Vanita Mohite, Principal Production Engineer at Yahoo

Understanding Production Engineering, SRE, and Scaling Large-Scale Systems at Yahoo. SRE, DevOps, Incident Management, CI/CD, Cloud, Automation, Monitoring, Documentation, Agile

🚀 DevOps Career Talks | RealOps Podcast 🎙️

Guest: Vanita Mohite, Principal Production Engineer, Yahoo
Host: Gourav Shah, Founder of School of DevOps


🔹 Episode Summary:

In this insightful episode of DevOps Career Talks, we sit down with Vanita Mohite, Principal Production Engineer at Yahoo, to explore what it takes to transition from DevOps to SRE and Production Engineering. Vanita shares her 15+ years of experience at Yahoo, highlighting the evolution of infrastructure from on-premise data centers to AWS and Kubernetes (EKS), the critical role of observability, and the challenges of managing large-scale production systems.

Yahoo, despite no longer being at its peak, remains a powerhouse in Site Reliability Engineering (SRE) and Production Engineering. Vanita walks us through her career journey, from working with databases at IBM to becoming a Principal Production Engineer, leading major cloud migrations, automation projects, and resiliency strategies.


🔹 Key Takeaways from the Episode:

1️⃣ The Path from DevOps to SRE/Production Engineering

  • Vanita's early career in IBM’s Tivoli Backup team, scripting in Perl and Java for test automation.

  • Moving to Yahoo and starting with database administration, evolving into automation and infrastructure scaling.

  • SRE vs. DevOps: How production engineering at Yahoo involves designing, building, scaling, and maintaining critical systems.

2️⃣ Building & Managing Large-Scale Infrastructure at Yahoo

  • Yahoo’s hybrid infrastructure: On-prem data centers, AWS, GCP, Kubernetes (EKS).

  • Managing 1,000+ hosts, handling millions of user requests per hour.

  • Scaling Yahoo Help and internal applications with high availability and performance tuning.

3️⃣ Observability & Incident Management in Production Engineering

  • Role of monitoring & alerting: Custom Nagios-based monitoring, Splunk for logs, Datadog & New Relic for APM.

  • The importance of incident response & on-call: 30% of time spent on observability, monitoring, and post-mortem analysis.

  • SLA, SLI, SLO deep dive: Understanding availability contracts and how they affect business.
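To make the SLI/SLO relationship concrete, here is a minimal sketch of the usual availability arithmetic. It is not taken from the episode or from Yahoo's tooling; the function names, the 99.9% target, and the request counts are illustrative assumptions.

```python
# Illustrative sketch only: names and numbers are assumptions, not Yahoo's tooling.

def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # no traffic means nothing failed
    return good_requests / total_requests


def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime the SLO tolerates over the window, in minutes."""
    return (1.0 - slo_target) * window_days * 24 * 60


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if the SLO is breached)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    sli = availability_sli(good_requests=999_500, total_requests=1_000_000)
    print(f"SLI: {sli:.4%}")
    print(f"Downtime allowed at 99.9% over 30 days: {allowed_downtime_minutes(0.999):.1f} min")
    print(f"Error budget remaining: {error_budget_remaining(sli, 0.999):.0%}")
```

The SLA is the contract built on top of these numbers: the SLO is typically set tighter than the SLA so that the team gets a warning before the business commitment is at risk.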

4️⃣ The Transition to Cloud & Automation

  • Migrating legacy applications to AWS & Kubernetes while maintaining custom internal stacks.

  • Automating CI/CD pipelines with Jenkins, Maven, and Ant.

  • Evolution of DevOps practices at Yahoo and lessons from building an enterprise-scale CI/CD system.

5️⃣ Avoiding Production Outages: Lessons from Mistakes

  • Vanita shares a memorable production outage due to server restarts in the wrong environment. 😨

  • How color-coding terminals for Dev, Staging, and Prod became a game-changer in preventing future issues.

  • Why GitOps and modern DevOps tools reduce human errors.
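The color-coding idea can be sketched in a few lines. This is a hypothetical illustration, not the setup Vanita describes: the environment names, the color choices, and the extra typed-confirmation guard for production are all assumptions.

```python
# Hypothetical sketch: environment names, colors, and the confirmation guard
# are assumptions for illustration, not the setup described in the episode.

RESET = "\033[0m"
ENV_COLORS = {
    "dev": "\033[42m",      # green background
    "staging": "\033[43m",  # yellow background
    "prod": "\033[41m",     # red background
}
UNKNOWN_COLOR = "\033[45m"  # magenta background for unrecognized environments


def env_banner(env: str) -> str:
    """Return the environment name wrapped in an ANSI background color."""
    color = ENV_COLORS.get(env.lower(), UNKNOWN_COLOR)
    return f"{color} {env.upper()} {RESET}"


def confirmed(env: str, typed: str) -> bool:
    """Non-prod actions pass; prod requires typing the environment name exactly."""
    if env.lower() != "prod":
        return True
    return typed.strip().lower() == "prod"


if __name__ == "__main__":
    for name in ("dev", "staging", "prod"):
        print(env_banner(name), "restart requested")
    print("Proceed on prod after typing 'prod':", confirmed("prod", "prod"))
```

A red banner on every production prompt makes the environment hard to miss, and the typed confirmation adds a second check before a restart goes to the wrong place.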

6️⃣ The Rise of FinOps: Cost Optimization in Production Engineering

  • Understanding FinOps: Tracking AWS cost projections for infrastructure.

  • How Yahoo’s custom dashboards help teams optimize resources in real-time.

  • The analogy of FinOps as a fitness tracker – what you measure, you optimize.


🎧 Why Listen to This Episode?

Are you a DevOps Engineer exploring SRE or Production Engineering? This episode breaks down what you need to know.
Curious about managing large-scale production systems? Get behind-the-scenes insights into Yahoo’s global infrastructure.
Want to learn about career growth in DevOps, SRE, and Production Engineering? Vanita shares practical lessons, mistakes, and strategies.

👉 Tune in now and take your DevOps career to the next level!

📢 Follow the RealOps Podcast on YouTube & Substack!

Taking DevOps to the Next Level

If you are a DevOps professional and want to take your skills to the next level, check out our programs on DevSecOps, Advanced DevOps, MLOps, and more at campus.schoolofdevops.com


Chapters

  • 00:00 Vanita's Journey into Production Engineering / SRE

  • 01:49 Transition to Yahoo and SRE / ProdOps Role

  • 05:25 The Role of a Production Engineer

  • 17:02 Infrastructure Management and Scaling

  • 22:26 Monitoring and Availability Strategies

  • 24:02 Monitoring and Incident Management in DevOps

  • 28:01 Understanding SLAs, SLIs, and SLOs

  • 32:55 Learning from Human Errors in Production

  • 37:30 The Evolution of DevOps and Collaboration

  • 42:40 Cost Management and Optimization in Cloud

  • 47:34 The Impact of Cloud Transformation on Startups

  • 48:39 The Importance of Documentation in Tech Projects

  • 50:47 Learning from Incidents and Continuous Improvement

  • 53:13 The Role of Automation in Production Engineering

  • 56:19 Balancing Work and Family as a Woman in Tech

  • 59:09 Navigating a Career Path in SRE and Production Engineering

  • 01:04:26 Essential Skills and Tools for Aspiring SREs

  • 01:09:53 Future Trends: MLOps and AIOps

  • 01:12:07 Outro