Full Time

Site Reliability Engineer

Team Remotely Inc
Remote!
$100,000 - $150,000* / year

Job Description

Job Overview

The Site Reliability Engineer (SRE) at Team Remotely Inc. will play a crucial role in ensuring the reliability, availability, and performance of our systems and services. This position demands a proactive approach toward system monitoring, capacity planning, incident response, and driving improvements through automation. The ideal candidate will apply software engineering principles to system administration tasks, enabling the organization to maintain high standards of service and operational excellence.



Job Responsibilities

  • Design, maintain, and improve system architecture to ensure optimal performance and scalability.
  • Utilize monitoring tools to proactively identify and resolve issues in production environments.
  • Implement automation tools to enhance efficiency and minimize human error.
  • Participate in on-call rotations to provide 24/7 support for critical systems.
  • Collaborate with development teams to ensure the reliability of new features and services.
  • Conduct post-mortem analyses on incidents to identify root causes and propose preventative measures.
  • Create and update documentation for systems, processes, and incident responses.


Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • 3+ years of experience in system administration, DevOps, or site reliability engineering.
  • Strong programming skills in languages such as Python, Go, or Ruby.
  • Experience with containerization technologies such as Docker and orchestration frameworks like Kubernetes.
  • Proficiency with cloud computing platforms such as AWS, Azure, or Google Cloud.
  • Familiarity with CI/CD pipelines and version control systems like Git.
  • Strong analytical and problem-solving skills, with the ability to remain calm under pressure.


Benefits

  • Competitive salary and performance-based bonuses.
  • Comprehensive health, dental, and vision insurance.
  • Flexible work schedule and remote work opportunities.
  • Generous paid time off and holiday leave.
  • Professional development and continuous learning stipends.
  • Retirement savings plan with company matching.


Technologies & Tools

The Site Reliability Engineer will work with a variety of technologies and tools, including cloud infrastructure (AWS, Azure), containerization platforms (Docker, Kubernetes), monitoring solutions (Prometheus, Grafana), and configuration management and infrastructure-as-code tools (Ansible, Terraform). Familiarity with logging systems (ELK Stack, Splunk) and CI/CD tools (Jenkins, CircleCI) is also highly beneficial.



Ideal Candidates

The ideal candidate for the Site Reliability Engineer position will combine strong technical abilities with excellent communication skills. They should demonstrate a clear sense of ownership and a collaborative mindset that lets them work effectively within diverse teams. A commitment to continual improvement, a willingness to share knowledge, and the ability to adapt in a rapidly changing environment will strengthen the SRE's contribution to Team Remotely Inc.
