Salary is 180k to 210k + bonus
Hybrid position
We are seeking a highly skilled and experienced Senior Site Reliability Engineer (SRE) / DevOps Engineer with a strong background in AWS, microservices, and cloud-based technologies. This role is ideal for someone who thrives in a dynamic, fast-paced environment and has a passion for building scalable, secure, and efficient systems. If you have 10+ years of DevOps, TechOps, or SRE experience and a deep understanding of cloud technologies, we want to hear from you.
Key Responsibilities:
- Architect, implement, and manage cloud infrastructure in AWS, ensuring scalability, reliability, and security.
- Work with microservices-based architectures using Docker and Kubernetes in production environments.
- Develop and manage automation processes for provisioning, configuration management, and deployment.
- Use Terraform and/or CloudFormation for infrastructure provisioning on cloud platforms.
- Implement and maintain system monitoring with tools such as Prometheus, Grafana, CloudWatch, Splunk, New Relic, and ELK.
- Collaborate with development teams to streamline CI/CD pipelines and release processes.
- Maintain and optimize databases such as MongoDB, Postgres, and DynamoDB.
- Write maintainable and reusable code, with a strong focus on Python, Ruby, or Java for automation tasks.
- Troubleshoot and resolve complex issues related to infrastructure, networks, and systems.
- Continuously improve operational processes, monitoring, and alerting systems.
- Provide mentorship and technical leadership to junior engineers.
Qualifications:
- 10+ years of experience in DevOps, TechOps, or Site Reliability Engineering (SRE).
- 5+ years of hands-on experience with AWS, including EC2, S3, Lambda, VPCs, RDS, and other AWS services.
- Strong experience with microservices architectures and containerization technologies such as Docker and Kubernetes.
- Proficiency in Linux OS-level operations, command-line interfaces, and scripting.
- In-depth knowledge of configuration management principles.
- Familiarity with databases like MongoDB, Postgres, and DynamoDB.
- Experience with monitoring tools such as Prometheus, Grafana, CloudWatch, Splunk, New Relic, and ELK.
- Strong coding skills with an emphasis on maintainable and reusable code in Python, Ruby, or Java.
- Experience with infrastructure provisioning using Terraform and/or CloudFormation.
- Excellent problem-solving skills, with the ability to quickly diagnose and resolve issues in a production environment.
- Familiarity with Agile development methodologies.
- Strong communication skills and the ability to work well with cross-functional teams.
Preferred Skills:
- AWS certifications (e.g., AWS Certified Solutions Architect or AWS Certified DevOps Engineer).
- Knowledge of security best practices in cloud environments.
- Experience with serverless architecture.