Site Reliability Engineer
Job Description
Overview
The Site Reliability Engineer (SRE) at UST will play a critical role in ensuring the stability, reliability, and performance of our cloud infrastructure and services. Collaborating closely with development and operations teams, the SRE will implement best practices in system automation, incident management, and optimization of our cloud services. This position requires a balance of software engineering skills and systems administration expertise to support and improve our production environment while adopting a mindset of continual improvement and innovation.
Job Responsibilities
- Design, maintain, and optimize scalable and resilient cloud infrastructures.
- Implement monitoring, alerting, and incident response to ensure uptime and performance.
- Collaborate with development teams to improve product reliability and performance through automation.
- Troubleshoot and resolve production issues in a timely manner.
- Develop and maintain configuration management and deployment processes.
- Engage in capacity planning and performance tuning for production systems.
- Document procedures, processes, and incident reports for continuous improvement.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Minimum of 3 years of experience in Site Reliability Engineering, DevOps, or a similar role.
- Proficiency in cloud platforms such as AWS, Google Cloud, or Microsoft Azure.
- Strong programming skills in languages such as Python, Go, or Java.
- Experience with containerization and container orchestration tools such as Docker and Kubernetes.
- Solid understanding of Linux systems and server administration.
- Familiarity with CI/CD pipelines, infrastructure-as-code tools such as Terraform, and configuration management tools such as Ansible.
Benefits
- Competitive salary and performance-based incentives.
- Comprehensive health, dental, and vision insurance.
- Flexible work arrangements and remote work options.
- Generous paid time off and vacation policies.
- 401(k) retirement plan with company match.
- Employee development and professional training opportunities.
- Wellness programs and employee support resources.
Technologies & Tools
In this role, the Site Reliability Engineer will work extensively with cloud service platforms such as Amazon Web Services (AWS) and Google Cloud. Proficiency in Kubernetes for container orchestration and in monitoring tools such as Prometheus or Grafana is essential. Familiarity with infrastructure-as-code tools such as Terraform and configuration management systems such as Ansible is also critical for automating processes and ensuring system reliability.
Ideal Candidates
Ideal candidates for the Site Reliability Engineer position at UST will exhibit a combination of technical expertise and a proactive attitude towards problem-solving. They should possess strong analytical skills, the ability to work collaboratively in a team setting, and a passion for implementing automated solutions. Candidates must be adaptable, willing to learn new technologies, and committed to maintaining high service availability while optimizing performance and infrastructure costs.