Jake Herbst

Senior Site Reliability Engineer

Atlanta, GA
(256) 479-5000

About

Site Reliability and Cloud Infrastructure Engineer focused on observability, reliability, and the automation and management of related cloud infrastructure.

Work Experience

  • Senior Site Reliability Engineer, Workday

    Sep, 2023 - Present

    Site Reliability Engineer focused on developer improvements, observability, and cloud infrastructure.

    • Researched, proposed, and adopted Apache's DevLake tool to help teams better understand their development practices and identify areas for improvement

    • Introduced and integrated security tools (git-secrets, SonarQube) into the development pipeline, enabling early identification and resolution of security vulnerabilities, thereby reducing risk and improving code quality

    • Led the migration of containerized services from CentOS 7 to CentOS 9, enhancing system security, performance, and compatibility with modern applications

  • Senior Site Reliability Engineer, Cypress.io

    Mar, 2022 - Present

    Software and Cloud Infrastructure Engineer focused on site reliability, observability, and cloud infrastructure for our SaaS product.

    • Instrumented our services running in Heroku with Prometheus metrics and Grafana Agent 'sidecars' to enable better observability and alerting

    • Deployed and managed AWS infrastructure (RDS, ECS, Redshift, etc.), GitHub configuration, and Grafana configuration with Terraform

    • Improved our processes and tooling to speed up the overall release process and reduce developer toil

    • Migrated multiple services from Heroku to ECS to reduce operational spend and improve observability

  • Senior Site Reliability Engineer, Mailchimp | Intuit

    Apr, 2015 - Mar, 2022 (6 years 11 months)

    Designed and implemented a variety of tooling to improve deliverability, monitoring, performance, and efficiency.

    • Transitioned Mailchimp's manual deployment process, which shipped large releases every 5 weeks, to a continuous deployment and delivery pattern that deploys up to 150 times a day

    • Improved developer and database engineer experience around database migrations and scheduled jobs

    • Implemented observability and auto-remediation for our application's load balancers (Nginx) and HTTP servers (Apache)

    • Added site-wide external monitoring for our highest-level endpoints across our entire infrastructure

    • Established capacity and utilization metrics to better understand what our infrastructure utilization is at any given time and better estimate the need for additional infrastructure to support growth

    • Planned and led a major initiative to upgrade hundreds of our critical servers to the latest OS and hardware with zero downtime

    • Created a logging pipeline to centralize our orphaned AWS logs into our existing ELK stack, which streamlined the developer and support experience

    • Led a team of core site reliability engineers on cross-cutting engineering projects that helped the organization more easily identify and address hurdles preventing service migration to GCP

  • Systems Engineer, Tropo, Inc.

    Jun, 2013 - Apr, 2015 (1 year 10 months)

    Responsible for production infrastructure in AWS EC2, configuration management of that infrastructure, as well as building out tooling used by support and engineering teams to easily handle customer inquiries.

    • Developed tooling (Ruby + Sinatra) which enabled support and product engineering teams to easily administer customer accounts

    • Managed application infrastructure in AWS EC2

    • Automated configuration management processes with Chef cookbooks

  • Systems Engineer, Hewlett-Packard

    Jun, 2011 - Jun, 2013 (2 years)

    Responsible for automation of server configuration and build-out for HP internal IT projects.

    • Built and configured physical and virtual Linux and HP-UX servers for HP internal IT projects

    • Developed and managed build automation tools and configuration management tools

Skills

  • Languages

    TypeScript

    Python

    Go

    PHP

    Ruby

    Bash

  • Technologies

    AWS

    GCP

    Terraform

    OpenTelemetry

    Docker

    CI Tooling (GH Actions, CircleCI, Jenkins, Bamboo)

    Kubernetes

    Prometheus

    Grafana

    Kibana

    Elasticsearch

    Puppet

    Chef

    Linux

    Nginx

    Apache

    HAProxy

Education

  • Management Information Systems / Computer Science, Bachelor, University

    Jan, 2007 - May, 2011