Software Engineer, Site Reliability

Remote, Full Time

Who we are

Sensible is a climate and finance technology company. We’re using data and finance to confront the single largest problem facing the global economy and society: climate change.

Sensible was founded in 2019, and our first product is focused on consumer travelers and event-goers, allowing them to better understand, plan for, and mitigate weather that could negatively impact their experience. Our team combines atmospheric science, cloud architecture, product engineering, and user experience design to create useful and delightful products.

Sensible is a team built on trust, feedback, and communication. We recognize that diversity of background, skills, and experiences makes stronger teams, and we are therefore an equal opportunity employer.

What you’ll be working on

  • Coordinate with engineering and product leaders to maintain a working roadmap of improvements and projects for business systems reliability and developer experience
  • Document and maintain SRE best practices
  • Maintain existing cloud-based infrastructure, including AWS resources and Kubernetes clusters
  • Maintain and improve monitoring, logging, and instrumentation/tracing systems
  • Implement and improve observability, alerting, and on-call systems and procedures
  • Improve and implement CI/CD practices and pipelines for deploying containerized apps
  • Improve and implement monitoring for basic cloud security concerns, including AWS/Kubernetes access management, endpoint security, and obfuscation of sensitive information

Required qualifications

  • A bachelor's degree in a STEM-related field, or equivalent industry experience
  • Commitment to the spirit of continuous improvement
  • Flexibility around working hours in order to maintain high systems availability

Experience and comfort working with the following technologies or their equivalent:

  • AWS: IAM, VPC, EC2, Routing/Security, EKS, S3, ALB/NLB, RDS/Aurora
  • Kubernetes: Cluster management, deployments/services/pods, autoscaling, metrics, ingress, certificate management
  • CI/CD: GitHub Actions or another common CI system such as CircleCI, Travis CI, or AWS CodePipeline
  • Programming: an imperative language like Python, JavaScript (Node.js), Go, Java, and/or Rust
  • Tooling: Terraform, Docker, AWS CloudFormation, Git

Desired qualifications

  • Experience with developing custom event-based pipelines for CI/CD and/or systems automation/management
  • Experience with creating custom SlackOps integrations for systems notifications and administration
  • Demonstrated ability to create basic internal tool webapps to facilitate things like configuration management, deployments, security, and/or monitoring systems
  • Experience maintaining system reliability in high-traffic environments (10,000+ requests/minute)


To apply for this role email your resume to We will review and respond to all applications.