GenAI Ops Engineer

  • iO Associates - US
Job Summary
Location
Washington, DC 20022
Job Type
Contract
Visa
Any Valid Visa
Salary
PayRate
Qualification
BCA
Experience
2 Years - 10 Years
Posted
20 Jan 2025
Job Description

Location: Remote
Contract Duration: 3-month contract (with possibility of extension)


Job Overview:
We are seeking a skilled GenAI Ops Engineer to join a 3+ month platform support project. The ideal candidate will play a critical role in ensuring the smooth operation of AI/ML models and APIs, providing both user-level and project-level support. You will work in an agile environment alongside a small, collaborative team to maintain and optimize platform operations on AWS SageMaker and Kubernetes.

Key Responsibilities:

  • User-Level Support:
    • Provide user support, including troubleshooting access issues, responding to user inquiries, and offering education and documentation to ensure effective usage of GenAI tools and platforms.
  • Project-Level Support:
    • Handle new requests and escalations related to GenAI models and APIs.
    • Provide hands-on maintenance of deployed AI/ML models and ensure the platform is functioning optimally.
  • Platform Maintenance and Engineering:
    • Oversee infrastructure, particularly AWS SageMaker, to ensure model deployments are efficient and reliable.
    • Collaborate with platform engineering teams to support SageMaker Inference and Kubernetes services, and troubleshoot any issues that arise.
  • Model and API Management:
    • Maintain and optimize the API layer, ensuring fast and reliable access to deployed models.
    • Work with TensorRT, TGI, and similar frameworks to manage inference for Large Language Models (LLMs).

Required Skills and Experience:

  • AWS SageMaker Inference:
    Experience in deploying and managing AI models on AWS SageMaker or a similar ML platform.
  • Kubernetes Service Layer:
    Hands-on experience with Kubernetes, particularly in managing service layers implemented in Golang.
  • TGI or LLM Frameworks:
    Exposure to TensorRT, TGI, or similar LLM inference frameworks is essential, especially for troubleshooting and optimizing model performance.
  • Golang:
    Experience with Golang is a plus, particularly if you've worked with proxies or backend services in Golang.

Nice to Have:

  • Experience working in an agile team environment.
  • Experience troubleshooting issues at both the platform and application levels.