Site Reliability Engineer
4 weeks ago
We are looking for a skilled Site Reliability Engineer (SRE) with expertise in Ansible and Linux to join our dynamic team. The successful candidate will play a critical role in maintaining the reliability, scalability, and performance of our infrastructure, driving automation, and collaborating with development teams to optimize system efficiency.

Key Responsibilities

Infrastructure Automation
Automate and maintain IT infrastructure using Ansible to streamline operations.

System Administration (Linux and Windows)
Manage virtual and physical Windows and Linux servers.
Automate server patching and updates to ensure systems remain current.
Implement automated security measures for all servers.
Monitor server performance and health.
Maintain comprehensive system documentation, including configuration and troubleshooting guides.
Conduct troubleshooting and root cause analysis as needed.
Ensure robust backup, disaster recovery, and business continuity plans are in place and followed.

Azure Cloud Management
Collaborate with DevOps to deploy, configure, and manage Azure virtual machines and resources.
Monitor cloud services for availability, performance, and security.
Work with the networking team to implement, monitor, and secure cloud networking infrastructure.
Ensure backup, disaster recovery, and business continuity plans are maintained for cloud systems.

System Monitoring and Optimization
Deploy and maintain monitoring tools for proactive system oversight and alerting.
Analyze performance data to identify and resolve bottlenecks.
Conduct capacity planning to support scalability and meet business needs.
Partner with development teams to enhance application performance on infrastructure.

Documentation and Collaboration
Create and update technical documentation, including system configurations and procedures.
Work with cross-functional teams to provide technical support and solutions.
Participate in on-call rotations and respond promptly to system emergencies.
Stay informed on industry trends, emerging technologies, and best practices in system administration, cloud computing, and virtualization.

Qualifications
Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience).
Relevant certifications (e.g., Linux Professional Institute (LPIC), Microsoft Certified: Azure Administrator Associate) are a plus.

Experience & Technical Skills
Minimum of 8 years in an enterprise IT environment, with at least 3 years in a DevOps or SRE role.
Strong expertise in Ansible for automation and configuration management.
Proficient in Linux system administration (installation, configuration, troubleshooting).
Hands-on experience with hypervisor technologies (e.g., VMware, Hyper-V, Proxmox).
Knowledge of containerization technologies (e.g., Docker, Kubernetes).
Experience managing Azure cloud services, including VMs, storage, networking, and security.
Proficiency in scripting languages (e.g., Bash, PowerShell, Python) for automation.

Skills & Competencies
Excellent problem-solving skills and ability to work independently or in a high-performance team.
Strong sense of ownership over tasks, projects, and issues.
Effective communication and interpersonal skills to collaborate with stakeholders at all levels.
-
Site Reliability Engineer
3 weeks ago
Johannesburg, South Africa SprintHive - Intelligent Customer Onboarding Full time
Company Description: SprintHive is a technology company that automates end-to-end customer onboarding processes. With a mission to reduce onboarding time from days to minutes, SprintHive’s solutions significantly boost conversion rates, reduce costs, and enhance customer experiences. The innovative platform includes necessary checks and risk management to...
-
Site Reliability Engineer
1 week ago
Johannesburg, South Africa Nedbank Full time
Nedbank, Johannesburg, Gauteng, South Africa
Site Reliability Engineer
Requisition ID: REQ
Recruiter: Keabetswe Modise
Closing Date: 05 December 2025
Job Family: Information Technology
Career Stream: Application Development
Leadership Pipeline: Manage Self: Professional
Job Purpose: To serve as an IT professional specialising in Site Reliability Engineering...
-
Site Reliability Engineer
1 week ago
Johannesburg, South Africa Nedbank Full time
Requisition: REQ
Recruiter: Keabetswe Modise
Closing Date: 05 December 2025
Job Family: Information Technology
Career Stream: Application Development
Leadership Pipeline: Manage Self: Professional
Job Purpose: To serve as an IT professional specialising in Site Reliability Engineering (SRE) at Nedbank, contributing to the strategic capability of the organisation as part of a dynamic team. The role is focused on...
-
Site Reliability Engineer
5 days ago
Johannesburg, South Africa Impronics Technologies Full time
We are seeking a seasoned Site Reliability Engineer (SRE) with a solid background in...
-
Site Reliability Engineer
4 days ago
Johannesburg Metropolitan Area, South Africa SprintHive - Intelligent Customer Onboarding Full time R500 000 - R1 200 000 per year
Company Description: SprintHive is a technology company that automates end-to-end customer onboarding processes. With a mission to reduce onboarding time from days to minutes, SprintHive's solutions significantly boost conversion rates, reduce costs, and enhance customer experiences. The innovative platform includes necessary checks and risk management to...
-
Site Reliability Engineer
1 week ago
Johannesburg, Gauteng, South Africa Nedbank Full time R1 800 000 - R2 500 000 per year
Requisition Details & Talent Acquisition Consultant: REQ, Keabetswe Modise
Closing Date: 05 December 2025
Job Family: Information Technology
Career Stream: Application Development
Leadership Pipeline: Manage Self: Professional
Job Purpose: To serve as an IT professional specialising in Site Reliability Engineering (SRE) at Nedbank, contributing to the strategic...
-
Johannesburg, South Africa Nedbank Full time
A leading financial services provider in Johannesburg seeks an experienced IT professional specializing in Site Reliability Engineering. The role involves ensuring the reliability and efficiency of technology solutions while mentoring junior engineers. Candidates should have a strong IT background with a minimum of 8 years' experience in relevant...
-
Site Reliability Engineer
2 weeks ago
Johannesburg, Gauteng, South Africa ExecutivePlacements - The JOB Portal Full time R900 000 - R1 200 000 per year
Site Reliability Engineer (Datadog)
Recruiter: Data Centrix
Job Ref: JHB006874/LD
Date posted: Tuesday, October 7, 2025
Location: Johannesburg, South Africa
SUMMARY: Are you a Site Reliability Engineer with solid Datadog experience? Our client in the Warehousing and Logistics sector is looking to employ an Engineer to support the design, implementation, and optimization...
-
Principal Site Reliability Engineer
4 days ago
Johannesburg, Gauteng, South Africa Deimos Full time R120 000 - R180 000 per year
Deimos is a Cloud-native Developer and Security Operations technology services company. We help companies of all sizes adopt the Cloud for improved service delivery to their clients. We're a fully remote African-based team of engineers who are passionate about implementing engineering best practices. We leverage the latest technologies while building...