Job Summary
InfoTech Solutions is seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our global remote team. In this role, you will be responsible for ensuring the reliability, scalability, performance, and availability of our mission-critical platforms used by customers worldwide. As an SRE, you will bridge the gap between software engineering and IT operations, applying engineering principles to automate systems, improve infrastructure resilience, and proactively prevent service disruptions.
You will collaborate closely with development, DevOps, cloud, and security teams to design and maintain highly available distributed systems, optimize system performance, and drive continuous improvement in system reliability. This is an exciting opportunity to work on large-scale systems and play a key role in shaping the future of our digital infrastructure.
Key Responsibilities
- Design, implement, and maintain reliable, scalable, and highly available cloud infrastructure.
- Monitor system performance, availability, and capacity using advanced observability tools.
- Automate operational tasks, deployments, and incident response processes.
- Participate in on-call rotations to respond to incidents and perform root cause analysis (RCA).
- Develop and maintain SLIs, SLOs, and SLAs to ensure service quality.
- Collaborate with engineering teams to improve application reliability and performance.
- Conduct post-incident reviews and implement preventive measures.
- Optimize CI/CD pipelines and infrastructure as code (IaC) practices.
- Ensure security, compliance, and best practices in system operations.
- Continuously improve system efficiency, reducing manual intervention and operational risk.
Required Skills and Qualifications
- Strong experience with cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP).
- Proficiency in at least one programming or scripting language (Python, Go, Java, Bash, or similar).
- Hands-on experience with containerization and orchestration tools (Docker, Kubernetes).
- Solid understanding of Linux/Unix system administration.
- Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, New Relic).
- Knowledge of Infrastructure as Code tools (Terraform, Ansible, CloudFormation).
- Familiarity with CI/CD tools (Jenkins, GitHub Actions, GitLab CI, CircleCI).
- Strong troubleshooting, debugging, and problem-solving skills.
- Excellent communication and collaboration abilities in a remote environment.
Experience
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- 3+ years of experience in Site Reliability Engineering, DevOps, Systems Engineering, or a similar role.
- Proven experience managing production systems in a high-availability environment.
- Experience working with distributed systems and microservices architectures.
Working Hours
- Full-time, remote position.
- Flexible working hours with overlapping availability for global team collaboration.
- Participation in on-call rotation may be required, with compensatory time off.
Knowledge, Skills and Abilities
- Deep understanding of system reliability, scalability, and performance principles.
- Ability to design fault-tolerant and resilient systems.
- Strong analytical mindset with attention to detail.
- Capability to work independently and manage priorities in a remote setup.
- Proactive attitude toward continuous learning and improvement.
- Ability to document processes and share knowledge across teams.
Benefits
- Competitive salary and performance-based incentives.
- Fully remote work with flexible schedules.
- Health, wellness, and insurance benefits.
- Learning and development programs, certifications, and training support.
- Paid time off, holidays, and remote work allowances.
- Opportunity to work with cutting-edge technologies at global scale.
Why Join InfoTech Solutions?
At InfoTech Solutions, we believe in building reliable systems and empowering our people to innovate. You will be part of a diverse, collaborative, and forward-thinking team that values engineering excellence and continuous improvement. We offer a culture of trust, flexibility, and growth, where your ideas matter and your impact is visible across global platforms.
Joining InfoTech Solutions means contributing to world-class digital solutions while enjoying the freedom and balance of a remote-first environment.
How to Apply
Interested candidates are encouraged to apply by submitting an updated resume and a brief cover letter highlighting their relevant experience. Shortlisted candidates will be contacted to complete a technical assessment, followed by interviews.