Checkr

Software Engineer - Reliability

2d

San Francisco, US · Full-time · $127,000 – $176,000

About this role

Checkr is building the data platform to power safe and fair decisions. Over 140,000 companies and millions of people rely on Checkr for AI verification in the moments that matter most. Customers include Uber, Airbnb, DoorDash, Amazon, and Anthropic.

As a Software Engineer on the Site Reliability Engineering team within Platform Engineering, you will identify reliability challenges affecting engineering teams and platforms, and develop solutions that balance standardization with tailored workflows.

You will design and maintain core observability libraries and tools used across all engineering teams. Daily work includes troubleshooting complex production issues around performance, availability, and data quality while participating in cross-organization incident response.

The role offers autonomy to handle complex support requests and contribute to architectural discussions. You will influence the reliability roadmap and help drive continuous improvement across the engineering organization.

Requirements

  • Bachelor’s degree in Computer Science or related field, or equivalent practical experience
  • 2+ years of software engineering experience, including 1+ years focused on reliability, scalability, and efficiency of distributed systems
  • Proficiency in Python (preferred), Go, or Ruby within Linux environments, and strong understanding of microservices, asynchronous systems, and remote APIs
  • Experience developing and operating production, customer-facing systems in AWS or Azure using Kubernetes, Docker, and Terraform
  • Experience with observability and incident response practices, using tools such as Datadog, Splunk, Grafana, Prometheus, and OpenTelemetry, with a focus on continuous improvement
  • Strong collaboration, documentation, and communication skills, with experience leading small projects, promoting platform adoption, and fostering a self-service, product-first mindset
  • An A-player mindset with a strong bias for action: raise the bar, move with urgency, stay resilient through ambiguity, and take ownership to deliver meaningful outcomes

Responsibilities

  • Design, build, ship, and maintain the core observability libraries, tools, and patterns used by all of Checkr’s engineering teams
  • Troubleshoot complex production issues across the stack involving performance, availability, and data quality
  • Participate in a cross-organization incident response team, driving continuous improvement
  • Contribute to architectural discussions within the SRE team and with cross-functional teams
  • Influence cross-team projects and the reliability roadmap to enable engineering and help Checkr customers
  • Provide consultation and feedback across teams to ensure we are building highly reliable, efficient, and scalable systems