SPS Commerce is seeking a Site Reliability Engineer who will partner with development to deliver market-leading products and services. The Site Reliability Engineering (SRE) team is responsible for delivering highly available platform services and deployment automation that empower our product engineering teams with services that are secure, reliable, and cost-effective, and that foster high velocity.
Does this sound like you?
- You make a personal investment and take pride in the work you do as an engineer. The motivation to do better is rooted deep in your DNA and it needs no invitation to show up.
- You have a passion for DevOps and you approach technology operations problems like they're software problems - and you apply software engineering approaches to resolution.
- You work collaboratively - you know that success is seldom the result of one person or one team. It takes a village to craft a highly reliable, secure and fast platform.
Why Join SPS?
- We have proven that we know what it takes not only to be successful, but to be an industry leader. We can boast more than 17 years of consecutive quarter-over-quarter growth. That growth translates to continuous investment in our people, process and technology.
- Our tech teams are built from the ground up by tech enthusiasts; from leadership to individual contributor, our team members often have side projects and/or are highly involved in the local tech community.
- We're not afraid to take risks and break the status quo - we must continuously adapt, and are expected to lead, in an industry that is going through huge transformation and disruption.
What will I be doing?
- Maintain highly available, secure, and cost-effective container orchestration platforms such as Kubernetes and ECS
- Engineer Continuous Integration & Continuous Delivery (CI/CD) solutions that simplify and improve software deployments to enable high velocity for our Product Engineering partners.
- Develop robust monitoring and observability services and patterns to consistently improve the team’s ability to identify, react to, respond to, and recover from complex failures.
- Collaborate with Technology Engineering, Development, and Product Management to help develop, scale, and improve production systems and services
- Partner with service teams to provide appropriate documentation, cross-training, architecture planning, capacity management, and recommendations for future state
- Engineer technical solutions to prevent or reduce the frequency of failures, and ensure that we have effective coping strategies when failures do occur.
What experience and skills do I need?
- Bachelor's Degree or equivalent years of experience
- 2 or more years of professional software engineering experience
- Strong DevOps mindset
- Experience administering Linux
- Experience participating in Agile development methodology and task execution
- Experience with immutable and scalable infrastructure (infrastructure as code concepts)
- Demonstrated understanding of various identity and authorization systems