Site Reliability Engineer

2 days ago


WorkFromHome, South Africa k0deHut Full time

Site Reliability Engineer (SRE II) (Kubernetes/Python)

About the job

Intermediate Site Reliability Engineer (SRE II)

Our Client is offering the right candidate a great opportunity to join a fast-growing South African fintech that enables seamless and innovative end-to-end customer onboarding services that drive conversion rates, prevent fraud, and reduce risk and costs. They provide automated, easy-to-implement solutions that fully onboard a new customer in under two minutes.

You'll work in a small, senior team that operates on trust and high collaboration. The team works remotely most of the time and occasionally comes into the office when more direct collaboration is required. You should be motivated to achieve operational excellence using automation tooling (e.g. Terraform) and enjoy keeping your technical skills current so that you can contribute to architectural discussions. Naturally, you'll be exposed to many aspects of our business from day one. They will ensure that you have the tools and support to do great work, but you'll also have the freedom to try new things and learn.

Infrastructure & Software Stack

  • CI/CD with Jenkins
  • Kong API Gateway
  • LogDNA
  • Falco
  • MongoDB Atlas
  • Microservice Architecture with Event Sourcing and CQRS

Your responsibilities will include:

  • Improving and maintaining our infrastructure using Terraform, which includes making effective use of public clouds (primarily Google Cloud and AWS) while considering:
      • Security
      • Maintainability
      • Scalability
  • Ensuring our infrastructure is automated and reproducible across environments
  • Leveraging Kubernetes in an effective manner to host our applications
  • Owning infrastructure projects from start to finish and driving them to completion within agreed timeframes
  • Documenting infrastructure design and how tooling should be used
  • Regularly considering the long-term vision for our infrastructure and our alignment to it
  • Making well-considered tradeoffs between short-term infrastructure requirements and long-term objectives
  • Identifying potential improvements that could enable us to deliver faster without compromising operational objectives
  • Managing our identity platform and enabling enterprise user and system authentication and authorization using OAuth2
  • Writing, testing and executing change control plans for production changes with an eye for detail to spot potential issues
  • Having a good working understanding of how our systems operate and being able to debug production issues
  • Being part of our on-call rotation. When on-call, you will work on repaying technical debt and deal with operational incidents as and when they occur. This will require you to have or acquire a good general knowledge of production operations for technical support.
  • Being part of our security incident response team
  • Writing operational tooling to automate otherwise manual processes (e.g. Golang, Bash)
  • Performing high-quality, ego-free code reviews for your colleagues, as well as submitting your code for review by others and accepting their feedback generously
  • Taking ownership of our operational metrics and driving visibility, testing and improvement initiatives
  • Working effectively with the development team to plan and deploy required infrastructure changes or new capabilities ahead of time, and unblocking the development team when unforeseen infrastructure blockers arise
  • Accepting feedback willingly and sharing your knowledge freely

Benefits:

  • Flexible working hours and leave (no clock watching)
  • Strong values that are practised
  • Remote work for most days of the week
  • Opportunity to learn and grow while surrounded by a strong technical team



  • WorkFromHome, South Africa Risingsun Softsol Full time

    Risingsun is hiring an SRE (Site Reliability Engineer). Work model: hybrid – 2/3 days at the office per week. Open to visa holders: No – only SA citizens or SA ID holders. Employment type: 12-month contract, renewable. Location: JHB. Client type: Banking. The candidate should be a skilled and proactive Site Reliability Engineer (SRE) with 5+ years' experience. The SRE will...


  • WorkFromHome, South Africa Robin AI Full time

    Robin AI, City of Cape Town, Western Cape, South Africa. About Robin: Robin is on a mission to rebuild the legal industry, starting with...


  • WorkFromHome, South Africa DuckDuckGo Full time

    1 week ago. Who We Are: Hi, we're DuckDuckGo, the online protection company and remote-first team of 300+ on a mission to raise the standard of trust online. Founded in 2008 and profitable since 2014, our annual revenue now exceeds $100 million USD. Millions use our browser on Mac, Windows, iOS, and Android, our search engine,...


  • WorkFromHome, South Africa Canonical Full time

    Overview: Site Reliability Engineer role at Canonical. Global remote location. Canonical is a leading provider of open source software and operating systems to the enterprise and technology markets, known for Ubuntu and open source infrastructure platforms. We deploy and run OpenStack, Kubernetes, storage solutions, and open source applications, applying...


  • WorkFromHome, South Africa Sana Commerce Full time

    Company Description: What started in 2007 with a pizza and a plan has grown into a fast-moving SaaS company empowering manufacturers, distributors, and wholesalers to thrive in complex B2B commerce. Our mission is simple: help businesses build stronger relationships through seamless digital commerce. At Sana Commerce, you’ll join a team that’s bold,...


  • WorkFromHome, South Africa DuckDuckGo Full time

    Who We Are: DuckDuckGo is an online protection company and remote‑first team dedicated to raising the standard of trust on the web. Your Team & Role: As part of the Site Reliability Team, you will build and maintain world‑class infrastructure that serves millions of users. Your work will involve high‑level languages such as Perl, Go, and Python, and...


  • WorkFromHome, South Africa Sana Commerce Full time

    6 days ago. Company Description: What started in 2007 with a pizza and a plan has grown into a fast-moving SaaS company empowering manufacturers, distributors, and wholesalers to thrive in complex B2B commerce. Our mission is simple: help businesses build stronger...


  • WorkFromHome, South Africa Luno Full time

    A leading cryptocurrency platform based in Cape Town is seeking a Site Reliability Engineer to build and scale infrastructure, manage containerized environments using Kubernetes, and apply Infrastructure as Code principles. Candidates should have experience in DevOps roles and managing large infrastructure projects. The role offers flexible working options...


  • WorkFromHome, South Africa Canonical Full time

    Overview: Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Our platform, Ubuntu, is widely used in enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. The company is a pioneer of global distributed collaboration, with 1200+ colleagues in...


  • WorkFromHome, South Africa Canonical Full time

    Canonical is a leading provider of open‑source software and operating systems for global enterprise and technology markets. Our platform, Ubuntu, is very widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation and IoT. Our customers include the world's leading public cloud and silicon providers, and...