Job Summary
IT sight Technologies is seeking a highly skilled and detail-oriented SRE Observability Engineer to join our remote engineering team. This role is critical in designing, implementing, and optimizing observability solutions that ensure the reliability, performance, and scalability of our production systems. The ideal candidate will have strong hands-on experience with modern monitoring platforms such as Splunk and Datadog, along with a solid foundation in Site Reliability Engineering (SRE) principles. You will collaborate closely with DevOps, platform, and application teams to drive proactive monitoring, incident reduction, and continuous service improvement.
Key Responsibilities
- Design, implement, and maintain end-to-end observability solutions across distributed systems and cloud environments.
- Configure and optimize Splunk and Datadog dashboards, alerts, and log pipelines.
- Establish and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets.
- Partner with engineering teams to instrument applications for metrics, logs, and traces.
- Perform root cause analysis for incidents and implement preventive measures.
- Automate monitoring workflows and improve alert quality to reduce noise and fatigue.
- Support incident response processes and participate in on-call rotations when required.
- Continuously evaluate and recommend improvements to observability architecture and tooling.
- Document monitoring standards, runbooks, and best practices.
Required Skills and Qualifications
- Strong hands-on experience with Splunk and/or Datadog in enterprise environments.
- Solid understanding of SRE principles, reliability engineering, and production operations.
- Experience with cloud platforms such as AWS, Azure, or GCP.
- Proficiency in scripting or programming (Python, Bash, or similar).
- Familiarity with distributed tracing, log aggregation, and metrics collection.
- Experience with containerized environments (Docker, Kubernetes).
- Strong analytical and troubleshooting skills.
- Excellent written and verbal communication skills in English.
Experience
- 3–6+ years of experience in Site Reliability Engineering, DevOps, or Observability roles.
- Proven track record of implementing monitoring solutions in production environments.
- Experience supporting high-availability, customer-facing systems is preferred.
- Prior remote work experience is an advantage.
Working Hours
- Fully remote role with flexible working arrangements.
- Expected overlap with core business hours for collaboration.
- Participation in an on-call rotation may be required based on team needs.
Knowledge, Skills and Abilities
- Deep understanding of system performance, scalability, and reliability concepts.
- Ability to correlate logs, metrics, and traces to diagnose complex issues.
- Strong problem-solving mindset with attention to detail.
- Ability to work independently in a remote, fast-paced environment.
- Collaborative team player with a continuous improvement mindset.
- Capacity to prioritize multiple tasks and manage time effectively.
Benefits
- Competitive salary and performance-based incentives.
- Fully remote work environment.
- Professional development and certification support.
- Health and wellness benefits (as per company policy).
- Exposure to modern cloud-native and observability technologies.
- Collaborative and innovation-driven culture.
Why Join
Joining IT sight Technologies means becoming part of a forward-thinking technology team that values reliability, automation, and engineering excellence. You will work on modern cloud platforms, influence observability strategy, and help build resilient systems that support mission-critical applications. This role offers strong growth potential, flexibility, and the opportunity to make a measurable impact in a rapidly evolving technology landscape.
How to Apply
Interested candidates should submit an updated resume along with a brief cover letter highlighting relevant observability and SRE experience. Applications will be reviewed on a rolling basis, and shortlisted candidates will be contacted about the next steps in the selection process.