Job Title: Data Engineer with Data Analyst Experience
Location: Pittsburgh, PA
Duration: Full-time
Domain: Banking
Job Summary:
We are seeking a talented Data Engineer with Data Analyst experience to join our team, supporting projects in the banking domain for Incedo. The ideal candidate will possess hands-on experience with the Hadoop ecosystem, including technologies such as Spark, HBase, Hive, Pig, Sqoop, Scala, Flume, HDFS, and MapReduce. The candidate will play a key role in designing and implementing robust data pipelines and analytics solutions that support critical banking operations.
Key Responsibilities:
- Design, develop, and maintain scalable data pipelines and ETL workflows within the Hadoop ecosystem.
- Work with banking domain stakeholders to gather requirements and deliver data-driven insights.
- Develop and optimize big data solutions using technologies such as Hadoop, Spark, and Hive.
- Perform data analysis and build dashboards to provide actionable insights for banking operations.
- Integrate various data sources, ensuring data accuracy, quality, and consistency.
- Collaborate with business analysts, data scientists, and development teams to deliver end-to-end solutions.
- Implement data governance and security best practices in line with regulatory compliance for the banking domain.
- Monitor and troubleshoot data workflows, ensuring system performance and reliability.
- Use tools like Sqoop and Flume to transfer and process data between structured and unstructured sources.
- Leverage HBase for real-time data storage and querying as required by banking systems.
Qualifications:
Experience: 10+ years of experience in data engineering and analytics, preferably in the banking/financial domain.
Technical Skills:
- Proficiency with Hadoop technologies (HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume).
- Expertise in Spark for distributed data processing and in Scala programming.
- Hands-on experience with ETL/ELT workflows and data pipeline development.
- Knowledge of data storage solutions such as HBase, and of tools for real-time data ingestion and processing.