Principal Data Engineer (GCP)
Location: 100% Remote
6-month contract to hire

Our client is seeking a Principal Data Engineer who is passionate about data in all its forms, whether stored in relational databases, data warehouses, data lakes, and lakehouses, or in transit through ETL pipelines.
The role involves architecting and implementing data solutions to deliver insights, visualizations, or better predictions for their clients.
The Principal Data Engineer will support software development teams, data analysts, and data scientists using market-relevant products and services.
Key Responsibilities
- Oversee the entire technical lifecycle of a cloud data platform, including framework decisions, feature breakdowns, technical requirements, and production readiness
- Design and implement a robust, secure data platform in GCP using industry best practices, native security tools, and integrated data governance controls
- Translate defined data governance strategies into technical requirements, implementing controls, documenting processes, and fostering a data-driven culture
- Apply advanced SQL skills across relational databases, BigQuery, and a variety of other database systems
- Build analytics tools that leverage the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other business performance metrics
- Design and implement scalable and reliable data pipelines on GCP
- Implement Change Data Capture (CDC) techniques and manage Delta Live Tables for real-time data integration and analytics
- Configure and manage Data Lakes in GCP to support diverse data types and formats for scalable storage, processing, and analytics
- Design API architecture, including RESTful services and microservices, integrating Machine Learning models into production systems
- Build the infrastructure required for ETL of data from a wide variety of sources using SQL and GCP services
- Migrate data pipelines and infrastructure from AWS or Azure to GCP, and create new pipelines and infrastructure in GCP
- Write and maintain robust, efficient, scalable Python scripts for data processing and automation
- Apply a strong understanding of data pipeline design patterns to select the best solution for each use case
- Work with unstructured datasets and build processes supporting data transformation, data structures, metadata, dependency management, and workload management
- Collaborate with stakeholders to assist with data-related technical issues and support their data infrastructure needs
- Ensure the stability and security of data in transit and at rest
- Build internal processes, frameworks, and best practices for the data engineering domain
- Foster cross-functional collaboration between engineering and other project disciplines
- Mentor and support the growth of other data engineers
- Participate in the internal leadership of the data engineering domain and provide technical feedback and recommendations
- Assess the technical skills of prospective candidates and provide recommendations to hiring managers
- Assist with sales requests by providing technical recommendations and estimates to prospective clients
Skills & Qualifications
- Bachelor's degree in Computer Science or a related field, or equivalent experience, required
- At least 15 years of overall technical experience
- At least 3 years leading large GCP projects
- Extensive knowledge of GCP data services such as BigQuery, Dataflow, Dataproc, and Pub/Sub
- Experience designing and implementing data governance and compliance policies
- Proficiency in Python and SQL
- Experience with migrating data pipelines and infrastructure to GCP
- Deep understanding of data modeling, ETL processes, and data warehousing principles
- Familiarity with data pipeline orchestration tools and practices
- Excellent problem-solving and analytical skills
- Strong communication skills with the ability to convey technical information to non-technical stakeholders
- Proactive collaborator with a history of mentoring colleagues
- Experience building and optimizing big data pipelines and data sets
- Experience with APIs and additional database management systems is a plus