SITE RELIABILITY ENGINEER

We're seeking a Site Reliability Engineer (SRE) with on-call duties to manage and oversee our SAFEQ Cloud print services hosted on AWS. This new colleague will play a critical role in ensuring system reliability, embedding SRE principles in the company's culture and processes, and responding to system emergencies in a 24/7 setup.

What will be your key responsibilities:

  • Monitor and manage the SAFEQ Cloud print service, ensuring high availability and reliability within the AWS environment.
  • Develop and implement tools and practices for automating routine tasks to improve system scalability and resilience.
  • Set up alerts and monitoring metrics for proactive identification and mitigation of system issues.
  • Participate in capacity planning and performance tuning to enhance system performance.
  • Collaborate with software engineering teams to ensure seamless deployment, efficient troubleshooting, and effective crisis management.
  • Conduct root cause analysis and post-mortems following system incidents; define corrective actions and preventive measures.
  • Education and Training: Act as an educator and advocate for SRE best practices. Train and mentor cross-functional teams in SRE principles.

What experience should you have:

  • Fluent English, good communication skills.
  • Experience in an SRE role.
  • Proficiency with AWS and its various services and resources.
  • Solid understanding of the software development life cycle and CI/CD pipelines.
  • Problem-solving skills, with the ability to think systematically.
  • Knowledge of networking, security, and database systems.
  • Availability for on-call duties in a 24/7 setup.
