About Voltai
Voltai’s mission is to rebuild the physical world by developing superintelligence that accelerates the pace of hardware innovation. Our focus is to build frontier models that can understand some of the world’s most complex technologies: semiconductors and electronics.
About the Role
We’re looking for a Software Engineer - AI Training Data who thrives on tackling complex challenges and is passionate about building innovative systems for data management. As a Software Engineer at Voltai, your primary responsibility will be to build and optimize the world’s largest semiconductor dataset, drawing on your expertise in software engineering and scalable infrastructure.
You will play a critical role in preparing data for our Machine Learning team and developing systems that manage vast amounts of information across various modalities such as text, images, circuits, and more.
Key Responsibilities:
- Build and manage the world’s largest semiconductor dataset.
- Develop software solutions to efficiently scrape and handle data at internet scale.
- Extract and clean information from diverse modalities, including text, images, circuits, simulations, and signals.
- Prepare and preprocess data for the Machine Learning team.
- Build systems to handle the transfer of customer data and feedback.
- Parse documents across various formats and structures.
- Develop software pipelines for data labelers and manage workloads across large cloud compute clusters.
- Implement and maintain systems for pre-processing datasets for AI training.
Required Skillsets:
- Proven experience in building scalable software solutions for data pipelines.
- Expertise in PDF parsing and data extraction.
- Strong software engineering skills with a passion for improving data and model performance.
- Experience working with modalities beyond text, with a record of exceptional work in those areas.
- Ability to build custom data processing libraries from scratch.
- Ability to keep up with state-of-the-art techniques for preparing AI training data.
- Proficiency in organizing and meticulously managing data across multiple clouds, modalities, and sources.
Bonus Points:
- Background in Electrical Engineering.
- Experience connecting machine learning model behavior to data distribution and data quality.
- Experience in fine-tuning large language models.
- Experience at a hyper-growth startup.
- Experience building software systems for training foundation models.
Compensation Philosophy
At Voltai, we believe that exceptional work deserves exceptional rewards. Our compensation structure is designed to reflect the value each team member brings to our pioneering efforts in the semiconductor and AI industries. For this role, we offer a highly competitive market-rate salary, tailored to the candidate's experience, expertise, and potential impact. Our approach ensures that compensation aligns with individual contributions and the long-term success of our mission.
Our Benefits
At Voltai, we believe in taking care of our team so they can focus on pushing the boundaries of innovation. Our benefits package is designed to support your well-being and fuel your professional growth.
- Unlimited PTO: We trust you to manage your time and know when you need a break. Recharge when you need it, no questions asked.
- Comprehensive Health Coverage: Your health matters. We offer top-tier medical and dental insurance to keep you and your loved ones covered.
- Commitment to Your Growth: At Voltai, we’re dedicated to your continuous learning and development. Whether it’s through challenging projects or opportunities for professional advancement, we invest in your journey to becoming a leader in your field.
- Visa Sponsorship: We support international talent and offer visa sponsorship to help you join our team, no matter where you are in the world.