This is a contract role with the possibility of conversion to full-time.
Remote is okay, but we prefer someone who can work onsite initially.
NO OPT/H1/VENDOR CANDIDATES
Position Summary
We are seeking an experienced Senior Data Engineer to spearhead the implementation of Airflow (Astronomer) on Azure in a highly scalable, performance-driven environment. This role requires a hands-on expert with proven experience designing, building, and optimizing data workflows, as well as managing cloud-based DevOps systems, specifically with Airflow, Databricks, and Azure.
The ideal candidate will possess deep technical knowledge in data engineering, demonstrate leadership in designing and implementing large-scale data systems, and excel in documentation and process standardization. This is a pivotal role in ensuring the successful migration and optimization of existing workflows while creating a robust foundation for future data engineering projects.
Key Responsibilities
Airflow Implementation & Optimization
- Architect, design, and implement Airflow (Astronomer) ecosystems from the ground up in Azure.
- Lead efforts to establish best practices for Airflow, including branching strategies, deployment pipelines, and repository configuration.
- Migrate existing orchestration workflows from Databricks to Airflow, ensuring seamless integration and enhanced performance; a minimal sketch of one such migrated task follows this list.
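For illustration only, here is a minimal sketch of what a single migrated orchestration task might look like, using the community `apache-airflow-providers-databricks` package. The DAG name, connection ID, cluster settings, and notebook path are hypothetical placeholders, not details of this role:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksSubmitRunOperator,
)

# Hypothetical example: wrapping an existing Databricks notebook in an
# Airflow task as an incremental step in a Databricks-to-Airflow migration.
with DAG(
    dag_id="example_databricks_migration",  # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_notebook = DatabricksSubmitRunOperator(
        task_id="run_legacy_notebook",
        databricks_conn_id="databricks_default",  # assumed Airflow connection
        new_cluster={
            "spark_version": "13.3.x-scala2.12",  # placeholder runtime version
            "node_type_id": "Standard_DS3_v2",    # placeholder Azure VM type
            "num_workers": 2,
        },
        notebook_task={"notebook_path": "/Workspace/etl/daily_load"},  # placeholder
    )
```

Wrapping existing notebooks this way lets scheduling move to Airflow first, so the notebook logic itself can be refactored later without a big-bang cutover.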
Data Workflow Development & Optimization
- Build, manage, and optimize scalable data workflows using Databricks, PySpark, and SQL to handle large-scale datasets efficiently (see the sketch after this list).
- Collaborate with stakeholders to design and maintain operational data systems that meet performance, scalability, and availability standards.
- Continuously monitor and fine-tune data processes to improve resource utilization and minimize workflow delays.
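As a hedged example of the kind of tuning this work involves, the sketch below broadcasts a small dimension table to avoid shuffling a large fact table, then writes date-partitioned Delta output. All table names, columns, and paths are assumed placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("example_optimized_join").getOrCreate()

# Hypothetical inputs: a large fact table and a small dimension table.
facts = spark.read.table("example_db.sales_facts")  # placeholder name
dims = spark.read.table("example_db.store_dims")    # placeholder name

# Broadcasting the small side avoids a shuffle of the large fact table.
enriched = facts.join(F.broadcast(dims), on="store_id", how="left")

daily = (
    enriched
    .groupBy("store_region", F.to_date("sold_at").alias("sale_date"))
    .agg(F.sum("amount").alias("total_amount"))
)

# Date-partitioned output keeps downstream reads selective.
(
    daily.write.mode("overwrite")
    .partitionBy("sale_date")
    .format("delta")
    .save("/mnt/example/curated/daily_sales")  # placeholder path
)
```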
DevOps & Cloud Environment Management
- Work directly with environment managers to configure, monitor, and optimize development, testing, and production environments in Azure.
- Design cost-efficient, high-performing Azure cloud infrastructure tailored to project requirements.
- Ensure alignment with CI/CD best practices and version control strategies for a seamless development lifecycle.
Documentation & Best Practices
- Develop comprehensive design documents for review and approval by key stakeholders.
- Standardize coding, testing, deployment, and documentation practices to ensure high-quality deliverables.
- Mentor team members on implementing and maintaining robust data engineering standards.
Minimum Qualifications
Experience
- 5+ years of experience in data engineering with a focus on large-scale environments and complex workflows.
- Proven expertise in implementing Airflow and migrating workflows from Databricks.
Technical Proficiencies
- Databricks: Extensive hands-on experience building and managing pipelines.
- Airflow: Demonstrated ability to set up and optimize Airflow environments, job scheduling, and automation.
- Azure: Strong experience configuring and managing cloud environments on Azure.
- Snowflake: Expertise in data warehousing with Snowflake.
- PySpark & Python: Advanced proficiency in building and optimizing distributed data operations.
- SQL: Mastery of querying, managing, and transforming data.
- DevOps: Experience with environment configuration, cost optimization, and cloud-based deployment pipelines.
Education
- Bachelor’s degree in Computer Science, Information Systems, or a related field, or equivalent experience.
Preferred Skills
- Experience with AWS or GCP alongside Azure.
- Hands-on experience with Data Lakes/Delta Lake for large-scale data storage.
- Familiarity with serverless technologies and event-driven data workflows.
- Proficiency in CI/CD pipelines and tools specific to data engineering.
- Exposure to additional tools in the big data ecosystem.