26 Clark Rd, Mountain View, CA 94043, USA Req #1012
Thursday, February 13, 2025
ASRC Federal is a leading government contractor furthering missions in space, public health and defense. As an Alaska Native owned corporation, our work helps secure an enduring future for our shareholders. Join our team and discover why we are a top veteran employer and Certified Great Place to Work.
ASRC Federal, InuTeq is searching for a Junior HPC Engineer to support InuTeq LLC at NASA Ames Research Center, CA.
ASRC Federal, InuTeq proudly supports NASA's High-Performance Computing Services program with our site in Mountain View, CA at the Ames Research Center. Make a difference on a program that supports four on-site supercomputers totaling ~17,000 nodes and 25+ combined petaflops. This program provides high-performance computing services to our NASA customer throughout the HPC lifecycle, including computational requirements, architecture, and acquisition. Our employees embrace innovation and are committed to a culture of continuous, standards-driven process improvement and the assimilation of industry best practices.
Summary: The successful candidate will be an active supporting member of the ASRC Federal team reporting directly to the Manager of the HPC Computer Systems and Storage (CSS) group.
An individual at this skill level should have demonstrated administration experience working with HPC clusters in small to medium size environments. Under the direction of senior members of the team, this individual will be engaged in day-to-day HPC operations and support of the infrastructure resources. Activities may include system patching, OS upgrades, deploying new systems, writing scripts, and troubleshooting system issues. The ability to interact with users to determine symptoms and then reproduce their issues to isolate the causes is a critical skill for this work. There will also be activities supporting testing, benchmarking, user tool scripting, and analyzing trouble tickets to find patterns indicating system or user education issues.
Duties and Responsibilities:
- Supercomputing and Infrastructure System Administration that contributes to:
- Installation, provisioning, and/or rebuilding systems in both HPC and infrastructure environments
- Patching of assigned systems to NASA requirements
- Maintaining, extending, and developing customized scripts to support users, monitoring, and general system administration
- Day-to-day operational escalation of the Linux HPC clusters and storage systems
- Proactive monitoring, analysis, and correction of system issues
- Development of scripts to automate repetitive tasks or tools to enhance support of the HPC and infrastructure systems
- System performance analysis and tuning
- Building, installing, and supporting user-requested software
- Supporting evaluation and assessment of new HPC technology
- Resolving user-reported issues and managing support ticket requests via the Remedy ticketing system
- Staff support resource for a myriad of HPC and Infrastructure Systems
- Operationalize completed projects with appropriate knowledge transfer to tiered support groups
- Work extensively with HPC vendors and architects on bug fixes, kernel updates, and feature releases
- Apply best practices in systems engineering, delivering projects on time, on budget, with excellent quality
- After hours/weekend support as required
Requirements:
- Bachelor’s degree in computer science or related field
- Strong computer science background with in-depth systems-level knowledge in operating systems and networking
- A minimum of 3 years of Systems Engineering and Integration experience in heterogeneous, multi-platform HPC and Linux environments
- Solid understanding of the systems engineering process, including requirements, use cases, design, documentation, and testing in a Linux environment
- Demonstrated equivalent of 3 years of Linux/UNIX user support and hands-on experience with administration of Linux systems
- Superior scripting skills and excellent attention to detail; proficiency in at least one of Python, Perl, or Bash
- Excellent communication and people skills; excellent time management and organizational skills
- Experience with system configuration management tools (e.g., Puppet, Chef, Ansible)
- Experience with revision control software (e.g., CVS, SVN, Git)
- Proficiency at technical writing
Preferred Skills:
- Familiarity/proficiency with OpenMP and Message Passing Interface (MPI) programming
- Experience with Lustre and InfiniBand
- Administration experience with HPC schedulers (PBS, Slurm, Moab/Torque, SGE)
- Experience with cloud technologies (AWS, Azure, GCP), OpenStack or Kubernetes is a plus
We invest in the lives of our employees, both in and out of the workplace, by providing competitive pay and benefits packages. Benefits offered may include health care, dental, vision, life insurance; 401(k); education assistance; paid time off including PTO, holidays, and any other paid leave required by law.
EEO Statement
ASRC Federal and its Subsidiaries are Equal Opportunity employers. All qualified applicants will receive consideration for employment without regard to race, gender, color, age, sexual orientation, gender identification, national origin, religion, marital status, ancestry, citizenship, disability, protected veteran status, or any other factor prohibited by applicable law.