Full-time (40 h/week), permanent, starting as soon as possible. Based in Berlin or remote (home office).
We’re seeking an experienced Site Reliability Engineer (SRE) with a solid foundation in Python, a passion for performance optimization, and a proactive approach to infrastructure management. In this role, you’ll work closely with development and operations teams to maintain, monitor, and improve the reliability of our systems, leveraging cutting-edge tools and methodologies to ensure peak performance.
Tasks
- Design, implement, and optimize systems to improve the reliability, performance, and scalability of our services.
- Build and maintain observability solutions using tools like Jaeger, Prometheus, and Grafana to enhance monitoring, tracing, and alerting across applications.
- Collaborate with development teams to build, manage, and scale Kubernetes environments, ensuring high availability and robust service delivery.
- Develop automation scripts and tools in Python to enhance system reliability and reduce manual intervention.
- Diagnose and resolve incidents, conduct root-cause analysis, and implement measures to prevent recurrence.
- Participate in on-call rotations, ensuring rapid response to system issues while continuously improving incident management processes.
Requirements
- Proficiency in Python for scripting and automation.
- Experience with tracing tools such as Jaeger or similar to troubleshoot and monitor complex distributed systems.
- Experience with monitoring tools such as Prometheus or similar for collecting and alerting on metrics.
- Experience with dashboarding tools such as Grafana or similar for creating visualizations that aid in system monitoring and diagnostics.
- Experience working in Kubernetes environments, with an understanding of container orchestration, scaling, and resource management.
Preferred Qualifications (Optional)
- Hands-on experience with CI/CD pipelines and DevOps practices.
- Familiarity with cloud platforms (AWS, GCP, Azure) and infrastructure-as-code tools like OpenTofu.
Benefits
- Competitive salary
- Flexible work hours and remote work opportunities
- A beautiful Gather remote office
- An ambitious and helpful team
- Opportunity to work with cutting-edge technologies and make a significant impact in a fast-growing startup environment
Are you interested?
Then apply right now by sending us your CV. If available, please include a GitHub link. A cover letter is not necessary.
If you have any questions, please contact us or just give us a call!