About The Role
Data drives everything at Grubhub: it trains our state-of-the-art search and recommendation systems, powers our real-time monitoring, underpins experimentation and readouts, and feeds our data lakes for uses well beyond engineering, to name a few. We are looking for an innately curious, business-minded, results-oriented Data Engineer to join our Discovery and Ads teams. As a member of this highly collaborative team, you will partner with other data engineers, data scientists, engineering, and product to deliver new features and build data pipelines not only in Ads, but across the Top of the Diner funnel as a whole.
You will be responsible for creating metrics that validate the performance and efficiency of data pipelines, proposing techniques to enhance our current systems, and designing data validation and quality checks to ensure the accuracy and reliability of our data. You will also drive best practices in data engineering, guiding the evolution of scalable and resilient data systems. Your responsibilities will further include writing documentation accessible to both technical and non-technical audiences, mentoring junior data engineers, automating data pipeline monitoring and alerting, and advising on the data processing strategies that best align with business objectives.
Our team practices end-to-end project ownership, and our work focuses heavily on personalized recommendation and classification from content and clickstream data. Deep neural networks, transfer learning from pretrained large-scale models, classic regressions, fine-tuning, and large language models all have a place in our daily lexicon.
The Impact You Will Make
- Design, build, and maintain data pipelines that efficiently move and transform data between production systems and data lakes, ensuring data integrity and availability.
- Collaborate closely with the Ads team to develop tailored data solutions that meet the specific needs of ad-related analytics and reporting, with an eye toward expanding these solutions to the broader platform.
- Optimize data storage and processing with columnar formats such as Parquet, ensuring that our data architecture is both performant and cost-effective.
- Leverage your expertise in Scala Spark, PySpark, or similar technologies to write scalable data processing jobs that handle large volumes of data with high efficiency (see the sketch after this list for a flavor of this work).
- Work with AWS cloud services to deploy, manage, and monitor data pipelines, ensuring that they are resilient, secure, and scalable to meet the growing demands of our business.
- Collaborate with data scientists, analysts, and other engineers to understand data requirements, troubleshoot issues, and continuously improve our data infrastructure.
- Provide technical leadership and mentorship to junior engineers, promoting best practices in data engineering and helping to build a culture of continuous learning and improvement.
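To give a concrete flavor of the pipeline work above, here is a minimal PySpark sketch of the kind of job this role owns: reading raw clickstream events, filtering to ad impressions, and writing date-partitioned Parquet back to the data lake. The bucket paths, event schema, and column names are illustrative assumptions, not Grubhub's actual systems.

```python
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("ads-clickstream-daily")  # hypothetical job name
    .getOrCreate()
)

# Hypothetical S3 locations; a real pipeline would take these as job arguments.
RAW_EVENTS = "s3://example-bucket/raw/clickstream/ds=2024-01-01/"
CURATED_OUT = "s3://example-bucket/curated/ad_impressions/"

events = spark.read.json(RAW_EVENTS)

# Keep only ad impressions, normalize the timestamp, and derive a date
# column so downstream queries can prune partitions.
impressions = (
    events
    .filter(F.col("event_type") == "ad_impression")
    .withColumn("event_ts", F.to_timestamp("event_time"))
    .withColumn("ds", F.to_date("event_ts"))
    .select("ds", "event_ts", "diner_id", "ad_id", "restaurant_id")
)

# Columnar, date-partitioned output: Parquet plus partition pruning keeps
# scans cheap for the ad-analytics and reporting use cases described above.
(
    impressions
    .repartition("ds")
    .write
    .mode("overwrite")
    .partitionBy("ds")
    .parquet(CURATED_OUT)
)

spark.stop()
```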
What You Bring To The Table
- Bachelor’s or Master’s Degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in data engineering, with a strong focus on building and optimizing data pipelines in a cloud-based environment, preferably AWS.
- Proficiency in Scala Spark, PySpark, or similar data processing frameworks, with a proven ability to write efficient, scalable data processing jobs.
- Familiarity with data storage formats like Parquet, and experience with columnar storage solutions and optimization techniques.
- Hands-on experience with AWS services, such as EMR, Redshift, or similar tools, to build and manage large-scale data processing workflows.
- Strong understanding of data warehousing concepts, ETL/ELT processes, and data modeling techniques.
- Excellent problem-solving skills, with the ability to troubleshoot complex data pipeline issues and ensure data quality across the system.
- Strong communication and collaboration skills, with experience working in cross-functional teams to deliver data solutions that meet business needs.
- A commitment to staying up to date with the latest trends and best practices in data engineering, with a passion for continuous improvement and innovation.