Job Title: Solution Architect - SRE/DevOps/Platform Engineering
Location: Atlanta, GA (Onsite)
Duration: Long Term Contract
Job Description:
Responsible for designing, implementing, and optimizing our technology infrastructure to ensure scalability, reliability, and efficiency across our systems.
## Responsibilities
### Architecture Design and Implementation
- Design and implement scalable, resilient, and highly available infrastructure solutions using cloud technologies and best practices, leveraging Terraform and GitHub Actions.
- Develop comprehensive technical strategies that align with business objectives and drive operational excellence, both overall and specifically for high availability/disaster recovery (HA/DR) cloud systems.
- Create architectural blueprints and detailed technical specifications for complex systems and applications.
- Develop expertise in event-driven architectures and related technologies (e.g., Apache Kafka/Event Hubs, Redis, MongoDB Atlas, IoT Hub).
### DevOps and SRE Practices
- Establish and promote DevOps and Site Reliability Engineering (SRE) practices throughout the HON IA-PSS Group.
- Implement continuous integration and delivery (CI/CD) workflows to streamline software development and deployment processes.
- Design and implement monitoring, alerting, and observability solutions to ensure system reliability and performance.
- Create self-healing systems and automate routine operational tasks to reduce manual intervention.
### Automation and Optimization
- Identify opportunities for automation and develop strategies to implement them across the infrastructure.
- Optimize existing systems and processes to improve efficiency, reduce costs, and enhance performance.
- Implement Infrastructure as Code (IaC) practices using tools like Terraform (preferred), Ansible, or ARM templates.
### Collaboration and Leadership
- Work closely with development teams, operations, and other stakeholders to ensure alignment between technical solutions and business needs.
- Provide technical guidance and mentorship to junior engineers and team members.
- Act as a liaison between technical and non-technical stakeholders, translating complex concepts into understandable terms.
### Risk Management and Security
- Assess and mitigate technical risks associated with infrastructure and application architectures.
- Ensure that security best practices are integrated into all aspects of the infrastructure and application design.
## Requirements
- Strong knowledge of cloud platforms, specifically Azure and its associated services.
- Expertise in containerization technologies such as Docker and Kubernetes.
- Understanding of event-driven architecture and database technologies (MongoDB Atlas, Azure SQL, PostgreSQL).
- Proficiency in IaC and CI/CD tools such as Terraform and GitHub Actions.
- Proficiency in one or more programming languages: Python, .NET, or Java.
- Strong understanding of networking concepts, load balancing, and security practices.
Regards,
Vignesh