Site Reliability Engineer
2 weeks ago
Junior Site Reliability Engineer - SprintHive
About SprintHive
SprintHive is a South African fintech enabling seamless end-to-end customer onboarding that drives conversion rates, prevents fraud, and reduces risk. Our automated solutions fully onboard new customers in under two minutes.
The Role
We're seeking a Junior Site Reliability Engineer eager to launch their career in infrastructure and platform engineering. You'll work directly with our CTO, receiving hands-on mentorship while contributing to production systems from day one. This fully remote position offers exceptional learning opportunities in a modern cloud-native environment.
What You'll Do
Learn & Build
- Gain hands-on experience with infrastructure-as-code using Terraform
- Work with Kubernetes and containerised applications
- Learn cloud platforms (Google Cloud and AWS) through real projects
- Develop automation scripts in Golang, Bash, or Python
- Participate in code reviews to improve your skills
Operational Support
- Assist in maintaining and improving our monitoring stack
- Help debug production issues alongside senior team members
- Document procedures and infrastructure components
- Support change control processes for production deployments
- Participate in on-call rotation (with full support and gradual responsibility increase)
Growth Opportunities
- Own small to medium infrastructure projects with guidance
- Contribute to security initiatives and incident response
- Collaborate with development team on infrastructure needs
- Progress toward independent project ownership
Requirements
- Strong problem-solving mindset and eagerness to learn
- Basic programming ability in any language
- Fundamental understanding of Linux/Unix systems
- Interest in cloud infrastructure and DevOps practices
- Ability to work independently in a remote environment
Preferred Qualifications
- Computer Science degree or relevant coursework (bootcamps, self-study welcome)
- Personal projects demonstrating infrastructure or automation work
- Familiarity with Git, containers, and cloud services
- Contributions to open source projects
Our Tech Stack
- Infrastructure: Kubernetes (GKE), Terraform, Kong API Gateway
- Monitoring: Prometheus, Grafana, Elastic, Kibana, Mezmo, Falco
- Languages: Kotlin, Python, JavaScript, Golang
- Architecture: Microservices with Event Sourcing and CQRS
- Database: MongoDB Atlas
- CI/CD: Jenkins
Compensation & Benefits
- Salary: Competitive market rate
- 21 days paid leave
- Direct mentorship from experienced CTO
- Flexible working hours
- Fully remote position
- High-quality hardware (MacBook Pro, 34" Dell monitor)
- AI assistant subscriptions
- Clear growth path to Senior SRE role
Intermediate Site Reliability Engineer - SprintHive
The Role
We're seeking an Intermediate Site Reliability Engineer ready to take ownership of significant infrastructure projects. Working directly with our CTO, you'll have the autonomy to drive improvements while continuing to grow your expertise. This fully remote position offers the perfect balance of independence and support.
What You'll Do
Infrastructure Ownership
- Maintain and improve infrastructure using Terraform across GCP and AWS
- Deploy and optimise applications on Kubernetes (GKE)
- Ensure infrastructure automation and reproducibility
- Own medium to large infrastructure projects independently
- Debug and resolve complex production issues
Operational Excellence
- Build automation tooling to eliminate manual processes
- Improve monitoring and alerting systems
- Write and execute production change plans
- Maintain security best practices in all deployments
- Participate confidently in on-call rotation
Collaboration & Growth
- Contribute to infrastructure architecture discussions
- Document systems and share knowledge with team
- Partner with development team on infrastructure requirements
- Participate in security incident response
- Begin mentoring junior team members as we grow
Requirements
Must-Haves
- 2-4 years of infrastructure/DevOps/SRE experience
- Hands-on experience with cloud platforms (GCP, AWS, or Azure)
- Working knowledge of infrastructure-as-code tools
- Container and orchestration experience (Docker, Kubernetes)
- Proven ability to complete projects independently
- Solid programming skills in at least one language (Python, Go, Bash)
Preferred Qualifications
- Terraform experience
- GCP/AWS certification
- Experience with monitoring tools (Prometheus, Grafana, ELK)
- Exposure to microservices architectures
- Computer Science degree or equivalent
Our Tech Stack
- Infrastructure: Kubernetes (GKE), Terraform, Kong API Gateway
- Monitoring: Prometheus, Grafana, Elastic, Kibana, Mezmo, Falco
- Languages: Kotlin, Python, JavaScript, Golang
- Architecture: Microservices with Event Sourcing and CQRS
- Database: MongoDB Atlas
- CI/CD: Jenkins
Compensation & Benefits
- Salary: Competitive market rate
- 21 days paid leave
- Flexible working hours
- Fully remote position
- High-quality hardware (MacBook Pro, 34" Dell monitor)
- AI assistant subscriptions
- High project autonomy
- Clear growth path to Senior role
Why This Role?
Perfect for engineers ready to move beyond junior tasks but not yet requiring senior-level compensation. You'll own real projects, make meaningful decisions, and grow rapidly in a small team environment.
Senior Site Reliability Engineer - SprintHive
The Role
We're seeking a Senior Site Reliability Engineer to co-architect our infrastructure future. As the second senior SRE working directly with our CTO, you'll have exceptional influence over platform strategy and technical direction. This fully remote position is ideal for an expert seeking impact without bureaucracy.
What You'll Do
Strategic Leadership
- Co-design long-term infrastructure vision and roadmap
- Lead complex, multi-quarter infrastructure transformations
- Define SRE standards, practices, and tooling strategies
- Make architectural decisions balancing scale, cost, and complexity
- Evaluate and introduce new technologies
Technical Excellence
- Architect sophisticated infrastructure solutions using Terraform
- Design zero-downtime deployment strategies and disaster recovery plans
- Lead Kubernetes platform optimisation for scale and efficiency
- Drive security architecture including identity management (OAuth2)
- Build advanced automation and self-healing systems
Team & Culture Building
- Mentor team members and establish engineering excellence standards
- Lead incident response and drive blameless post-mortem culture
- Interface with executive team on infrastructure strategy
- Own vendor relationships and technology evaluations
- Help build and scale the SRE team as we grow
Requirements
Must-Haves
- 5+ years of SRE/DevOps experience in production environments
- Expert-level knowledge of at least one major cloud provider
- Advanced Terraform skills with large-scale infrastructure experience
- Deep Kubernetes expertise including performance tuning and troubleshooting
- Proven track record of leading infrastructure initiatives
- Strong programming skills (Go, Python) for tooling development
- Experience mentoring engineers and driving technical standards
Preferred Qualifications
- Fintech or high-compliance environment experience
- Multi-cloud architecture experience
- Security certifications or demonstrated security expertise
- Open source contributions to infrastructure tools
- Experience scaling startups through rapid growth
Our Tech Stack
- Infrastructure: Kubernetes (GKE), Terraform, Kong API Gateway
- Monitoring: Prometheus, Grafana, Elastic, Kibana, Mezmo, Falco
- Languages: Kotlin, Python, JavaScript, Golang
- Architecture: Microservices with Event Sourcing and CQRS
- Database: MongoDB Atlas
- CI/CD: Jenkins
Compensation & Benefits
- Salary: Competitive market rate
- 21 days paid leave
- Flexible working hours
- Fully remote position
- Premium hardware setup (MacBook Pro, 34" Dell monitor)
- AI assistant subscriptions
- Strategic influence at executive level
- Budget authority for infrastructure decisions
- Opportunity to build and lead growing SRE team
Why This Role?
Rare opportunity to join as a founding SRE member with CTO-level partnership. You'll have the authority of a Head of Infrastructure without the bureaucracy, the technical challenges of a scale-up without legacy constraints, and the influence to build the team and culture from scratch.
-
Senior Site Reliability Engineer
2 weeks ago
Cape Town, Western Cape, South Africa Sana Commerce Full time R120 000 - R180 000 per year
Company Description: What started in 2007 with a pizza and a plan has grown into a fast-moving SaaS company empowering manufacturers, distributors, and wholesalers to thrive in complex B2B commerce. Our mission is simple: help businesses build stronger relationships through seamless digital commerce. At Sana Commerce, you'll join a team that's bold,...
-
Senior Site Reliability Engineer
7 days ago
Cape Town, Western Cape, South Africa Sana Commerce Full time R1 500 000 - R2 500 000 per year
Company Description: What started in 2007 with a pizza and a plan has grown into a fast-moving SaaS company empowering manufacturers, distributors, and wholesalers to thrive in complex B2B commerce. Our mission is simple: help businesses build stronger relationships through seamless digital commerce. At Sana Commerce, you'll join a team that's bold,...
-
Site Reliability
4 days ago
Cape Town, Western Cape, South Africa Canonical - Jobs Full time R80 000 - R120 000 per year
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers,...
-
Site Reliability Engineer
4 days ago
Cape Town, Western Cape, South Africa Canonical - Jobs Full time R600 000 - R1 200 000 per year
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers, and...
-
Reliability Engineer
2 weeks ago
Cape Town, Western Cape, South Africa 1a018d35-7946-4b80-8f4b-8f4ab3a9b78a Full time R120 000 - R180 000 per year
RELIABILITY ENGINEER: We are looking for a BSc or BEng (Mechanical or Electrical) Engineer, GCC (Factories) with Pr.Eng (ECSA) with 5+ years experience in any physical asset intensive industry (such as oil & gas, chemical, utilities, power generation, manufacturing, facilities, nuclear etc.). The role will support our Client's business a Reliability Engineer on...
-
Site Reliability Engineer
4 days ago
Cape Town, Western Cape, South Africa LexisNexis Full time R1 000 000 - R2 500 000 per year
About Our Team: LexisNexis Legal & Professional, which serves customers in more than 150 countries with 11,800 employees worldwide, is part of RELX, a global provider of information based analytics and decision tools for professional and business customers. Our company has been a long-time leader in deploying AI and advanced technologies to the legal market to...
-
Senior Site Reliability Engineer
4 days ago
Cape Town, Western Cape, South Africa Canonical - Jobs Full time US$120 000 - US$240 000 per year
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers,...
-
Site Reliability Engineering Manager
4 days ago
Cape Town, Western Cape, South Africa Canonical - Jobs Full time R600 000 - R1 200 000 per year
Canonical is a leading provider of open-source software and operating systems for global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers, and...
-
Senior Site Reliability
4 days ago
Cape Town, Western Cape, South Africa Canonical - Jobs Full time R120 000 - R180 000 per year
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT. Our customers include the world's leading public cloud and silicon providers,...
-
Site Engineer
2 days ago
Cape Town, Western Cape, South Africa Hire Resolve Full time R250 000 - R500 000 per year
We are looking for a skilled Site Engineer to join our team in Cape Town. The Site Engineer will be responsible for overseeing and managing all construction activities on site, ensuring that the project is completed on time, within budget, and to the highest standard of quality.
Key Responsibilities:
- Oversee and manage all construction activities on site,...