SRE Engineer
Date Opened: 02/12/2025
Job Type: Full time
Industry: Systems Engineering
Work Experience: 5 years
Education Level: Degree/B-Tech
City: Cape Town
Province: Western Cape
Country: South Africa
Postal Code: 7405
Job Description
Established in 2001, RSAWEB is South Africa's fastest-growing internet service provider (ISP), with a focus on providing connectivity to home customers and a wide array of technology solutions to businesses. We are obsessed with ensuring all our customers receive the best possible digital experience and exceptional customer service. Thousands of customers have given RSAWEB a 5-star rating, with an average rating of 4.7 out of 5 on Google, making us the best-rated ISP in South Africa. We are extremely proud of winning Best ISP in KFM's Best of the Cape Awards in 2021 and 2022, of being one of the fastest streaming ISPs on Netflix, and of being a consistently top-rated ISP on MyBroadband. These accolades are not for nothing, as we constantly strive to improve our products, services, and solutions to enhance each customer's experience. Having invested heavily in infrastructure, RSAWEB has built a strong presence in South Africa with Data Centres in Johannesburg and Cape Town.
Our Products and Services:
- Fibre-to-the-Home (FTTH)
- Fibre-to-the-Business (FTTB)
- Enterprise connectivity
- Mobile connectivity and data management
- Cloud infrastructure and more
At RSAWEB, we are passionate about using our creativity to provide innovative solutions and services that allow our customers to succeed in all areas of life. We believe that we are in the business of connecting customers and businesses with each other and with a world of infinite possibility and opportunity through technology. Our mission transcends our values through every customer, every interaction, every connection, every day.
Our values:
- We Build Trust and Ownership
- We Honour & Respect People
- We Cultivate Passion & Creativity
- We Innovate Feverishly
- We Go the Extra Mile
- We Believe in Humility
- We Communicate Openly & Honestly
- We Make it Fun
- We Teach, Grow & Learn
- We Do More, With Less
Role Purpose:
The Site Reliability Engineer (SRE) is responsible for ensuring the reliability, performance, scalability, and availability of RSAWEB's platforms, network services, and customer-facing systems. This role blends software engineering, infrastructure automation, and operations to deliver highly reliable services and improve the efficiency of technical teams.
Key Responsibilities
1. Reliability & System Performance
- Maintain high availability and performance across platforms, services, and infrastructure.
- Define, measure, and improve SLIs/SLOs/SLAs for critical systems.
- Troubleshoot system and network reliability issues proactively.
- Build automation for deployments, monitoring, configuration, and operational tasks.
- Improve CI/CD pipelines and assist engineers with release engineering.
- Reduce manual work (toil) by implementing self-service tools and automation workflows.
- Deploy, manage, and optimise cloud and on-prem infrastructure (Linux servers, virtualisation, containers).
- Work with network teams to ensure resilient integration between systems and ISP network elements.
- Manage and scale containerised platforms (Docker, Kubernetes).
- Implement and maintain monitoring, alerting, and logging solutions (e.g., Prometheus, Grafana, ELK, Datadog).
- Ensure actionable, low-noise alerting and system dashboards.
- Use metrics to identify performance bottlenecks and reliability risks.
- Participate in incident response, including root cause analysis and corrective actions.
- Improve monitoring and automation to prevent repeated issues.
- Assist with on-call rotations to support critical services.
- Implement security best practices across systems and deployments.
- Support vulnerability scanning, patching, and secure configurations.
- Ensure compliance with internal and industry standards (ISO, POPIA, etc.).
- Work closely with Network Engineering, DevOps, Software Development, and NOC teams.
- Provide technical guidance in system design, scalability, and reliability improvements.
- Improve operational processes through documentation and automation.
Minimum Qualifications
- Diploma or degree in Computer Science, Engineering, Information Technology, or related field.
- Relevant certifications (AWS/Azure/GCP, Linux, Kubernetes, Terraform) are beneficial.
- 3–5+ years in SRE, DevOps, Systems Engineering, or Infrastructure roles.
- Experience supporting large-scale, mission-critical environments (preferably ISP or telecom).
- Strong background in Linux (CentOS, Ubuntu, Debian) administration.
- Experience with container orchestration and Infrastructure as Code.
Technical Skills
- Strong scripting skills (Python, Bash, Go preferred).
- CI/CD tools: GitHub Actions, GitLab CI, Jenkins, ArgoCD, etc.
- IaC: Terraform, Ansible, Pulumi, CloudFormation.
- Cloud platforms: AWS / Azure / GCP (or private cloud / OpenStack).
- Monitoring: Prometheus, Grafana, Zabbix, ELK, Datadog.
- Networking fundamentals: DNS, DHCP, firewalls, load balancing, routing.
- Databases: SQL and NoSQL basics.
- Knowledge of ISP infrastructure such as BNGs, RADIUS, DNS clusters (advantage).
Benefits
- Medical Aid (Discovery)
- Reduced Gap Cover Rates (Turnberry Premier)
- Retirement Annuity Contribution (Allan Gray)
- Medical Insurance (Momentum - Health4Me)
- Discounted Internet Connectivity
- Free Employee Wellness Programme (Lyra Wellbeing, formerly ICAS)
- Exposure to latest industry technologies and standards
- Lastly, a work environment that rivals the very best
If you have not heard from us within 2 weeks of submitting your application, please consider your application as unsuccessful.