Our client is a leader in providing digital platforms to educational institutions worldwide. Their solutions are used by over 7,000 schools and educational organizations across more than 100 countries. They specialize in helping schools streamline communication, manage enrollments, recruit staff, and support fundraising efforts. The company’s products include a comprehensive content management system, mass communication tools, and advanced marketing and data integration services. Founded in 1999 and headquartered in the U.S., with a global team across North America, Europe, South America, and Asia, they provide tailored solutions to meet the complex needs of educational institutions.

 

 

As a Senior Site Reliability Engineer, you’ll be at the forefront of ensuring the stability, visibility, and operational efficiency of the production environment. This role is ideal for someone with a background in both software development and systems engineering, with a passion for reliability and automation. You’ll work closely with engineering teams to advocate for site reliability and drive the adoption of best practices across the organization.

 

 

Senior Site Reliability Engineer
Data Center | Krakow or hybrid | Employment Contract or B2B

 

 

Qualifications and Skills

 

 

  • Experience across Cloud, DevOps, SRE, or Systems Engineering roles.
  • Extensive experience with public cloud platforms, particularly Google Cloud Platform (GCP) (AWS and Azure are a plus).
  • Deep knowledge of containerization and orchestration technologies (Kubernetes, Docker).
  • Hands-on expertise in infrastructure as code, using tools like Terraform.
  • Proven ability to maintain critical production systems with minimal downtime.
  • Experience in CI/CD pipelines, using platforms such as GitLab or Jenkins.
  • Strong foundational knowledge of networking principles (e.g., routing, DHCP, DNS).
  • Willingness to join a 24/7 on-call rotation to support production environments.
  • Experience mentoring junior engineers and supporting their growth.
  • Familiarity with software development methodologies such as Scrum or SAFe.
  • Knowledge of email delivery protocols like SMTP, SPF, DKIM, and DMARC is a plus.
  • Nice to have: experience with Helm/Kapitan and Golang (particularly in the context of the Kubernetes API), and familiarity with Ruby on Rails.

 

 

Key Responsibilities

 

 

  • Advocacy and Leadership: Champion the importance of site reliability and articulate its value in both business and technical contexts. Influence key stakeholders to understand and adopt SLIs/SLOs in new systems and software.
  • Fault Tolerance Expertise: Bring in-depth knowledge of both software and system fault tolerance, providing guidance on architectural design patterns that enhance system resilience.
  • Automation Focus: Develop and implement automation solutions to reduce manual intervention and repetitive tasks. Infrastructure as code and other automation practices will be central to your approach.
  • Curiosity and Problem-Solving: Seek not only to resolve issues but to understand the “why” behind them. You’ll drive improvements by investigating root causes and system behaviors.

 

 

What You’ll Love Working Here

 

 

  • Executive Engagement: Receive direct attention and support from top executives, offering a unique opportunity for visibility and impact at an enterprise level.
  • Technological Innovation: Access to cutting-edge technologies, with the freedom to explore innovative solutions and propose your own methods for problem-solving.
  • Professional Development: An environment that encourages growth and learning, allowing you to nurture your individual talents and aspirations.

 

 

You will like this
Flexible working hours
Healthcare package
Home office
Personal development
Remote work
Did you like this offer?
Send us your resume.
Edge Team
We will contact only selected candidates.