About the Role:
We are seeking a highly motivated and skilled Data Engineer to join our growing data team. In this role, you will play a critical part in designing, developing, and maintaining our data pipelines and infrastructure. You will work closely with data scientists, analysts, and business stakeholders to ensure data quality, accuracy, and timely delivery.
Responsibilities:
- Design, develop, and maintain scalable and efficient data pipelines using Apache Spark, Scala, and other relevant technologies.
- Extract, transform, and load (ETL) data from various sources, including databases, APIs, and cloud storage (e.g., AWS S3, Azure Blob Storage, Google Cloud Storage).
- Build and maintain data warehouses and data lakes using Hadoop, Hive, and related big data tools.
- Develop and implement data quality checks and monitoring systems.
- Troubleshoot data quality issues and performance bottlenecks.
- Collaborate with data scientists and analysts to understand their data needs and translate them into technical solutions.
- Participate in all phases of the software development lifecycle, including design, development, testing, and deployment.
- Stay up to date with the latest advancements in data engineering technologies and best practices.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 3+ years of experience as a Data Engineer or in a similar role.
- Strong proficiency in Apache Spark, Scala, and Python.
- Experience with the Hadoop ecosystem (HDFS, Hive, HBase) or similar big data technologies.
- Experience with data warehousing concepts and methodologies.
- Experience with cloud platforms (AWS, Azure, GCP) is a plus.
- Experience with Agile development methodologies.
- Excellent communication and collaboration skills.
- Strong problem-solving and analytical skills.
- Passion for data and a desire to learn new technologies.