Site Reliability Engineer

Indus Valley Consultants • Contract • Novi, MI, US • 3d ago

Client: Financial client (REQ-027369)

Title: Site Reliability Engineer

Work Location: Charlotte, NC (Hybrid - Prefer candidates local to Charlotte)

Duration: 6+ months

Top Skills

Hard Skills:

Experience with DevOps tools
Observability and monitoring tools
Experience with the public cloud platform AWS
Leading incident response for critical issues
Terraform experience

Soft Skills (Mandatory):

Ability to work across teams
Proactive approach to observability and monitoring
Vision to find issues and automate their fixes

Job Description:

Run the production environment by monitoring availability and taking a holistic view of system health
Support the applications through an on-call rotation
Provide stability to our applications and facilitate rapid feature development by proactively taking active control of the direction of the service
Automate and eliminate manual work, and look for further opportunities for automation
Implement and maintain SLOs, driving their adoption and automation
Production Readiness/Health Scoring & Error Budget Tracking
Runbook standards, maintenance, and updates

Required:

Experience using DevOps tools and technologies such as GitLab, and Infrastructure as Code tools such as Terraform

Strong troubleshooting skills, and experience building and enhancing observability using monitoring tools

Proactive approach to observability maturity: identifying problems, performance bottlenecks, and areas for improvement

Leading incident response and supporting application teams.

Running blameless postmortems

Giving developers feedback on enhanced logging, runbooks, and addressing technical debt

Promoting observability best practices

Experience with the monitoring tools Dynatrace and Splunk

Experience with public cloud platforms (preferably AWS) and API gateways

Experience developing APIs, microservices, or frontends is a plus

Experience using source version control (SVC) systems such as Git