We are looking for a Lead Azure Site Reliability Engineer (SRE) to enable efficient monitoring and observability of the CDC Azure infrastructure and applications.

The SRE will lead operations of the cloud environment using observability, infrastructure-as-code (IaC), and cloud-native best practices.

The engineer will be part of a larger effort to modernize the CDC DevOps enterprise framework, joining a team of 20 composed of data scientists, software engineers, product owners, and DevOps engineers.

Mechanicode is a remote-first company, and this role will be 100% remote.

W2-Salary: 140-160k

Required

  • Must be a U.S. citizen or green-card holder
  • 8+ years of professional experience
  • Proven leadership track record
  • Ability to pass a background check and obtain a public trust security clearance

Essential Skills, Experience, and Competencies:

  • Proficient with observability in the cloud, including building monitoring and alerting frameworks (Grafana, Datadog, New Relic, etc.)
  • Has built alert escalation plans and disaster recovery infrastructure, and has set up on-call rotations
  • Proficient with implementing cloud infrastructure on Azure.
  • Proficient with Terraform
  • Experience with Linux and Bash scripting.
  • Experience with Kubernetes (AKS)
  • Substantial experience with programming languages such as Python
  • Experience with containerization technologies (e.g., Docker, containerd)
  • Ability to design the architecture for continuous integration and deployment, as well as continuous monitoring
  • Experience supporting scalable and elastic applications on distributed architectures.
  • Strong understanding of, and ability in, securing systems at the application, network, and infrastructure layers.
  • Experience managing network/compute/database infrastructure with infrastructure-as-code.
  • Expert in basic Git operations such as cloning, creating branches, switching between branches, staging changes, committing, resetting, and merging.
  • Ability to mentor and support junior team members
  • Proven ability to work under pressure and in fast-paced environments.
  • Ability to operate and manage work, reason strategically, build relationships, and influence others.


Nice to Have

  • Azure Certifications

Interview Steps

  • Preliminary Screen
  • Technical Assessment & Review
  • Client Review

Why Mechanicode?

Mechanicode’s vision is to bring peace of mind with technology.

We do so by building self-healing cloud infrastructure, resilient enough to withstand failures and sufficiently predictable to resolve issues without human intervention.

We do that by making automation the cornerstone of our cloud solutions, significantly reducing workforce attrition, and introducing agile rapid-development conventions that improve the developer experience.


About Mechanicode

Mechanicode is a cloud digital services firm providing comprehensive DevSecOps, Cloud Native Engineering, IT Modernization, and Automation services.

Founded by a former USDS engineer, Mechanicode has 13 years of experience developing innovative automation solutions that improve the feedback loop in the developer experience and apply AWS- and Azure-certified best practices for clients.

Mechanicode has experience in both the public and private sectors, providing modernization services that apply Agile best practices, scalable cloud architectures, and continuous integration and deployment standards.