Technical/Functional Skills:
LLM inference, optimization, and deployment at scale
Rust/C++, Python, PyTorch, ONNX, backend API development
Experience Required:
Experience with backend development, including RESTful APIs, microservices, and database management; experience with LLM inferencing.
Roles & Responsibilities:
Collaborate with teams to optimize LLM inference performance and resource utilization in production systems.
Develop high-performance, low-latency code using Rust/C++ to support machine learning applications and inference tasks.
Ensure the reliability and scalability of backend services that leverage ML models, including implementing monitoring, logging, and debugging tools.
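As an illustration of the inference-optimization work described above, the sketch below shows dynamic request batching, a common technique for raising LLM inference throughput while bounding added latency. It is a minimal, hypothetical example (the names `BatchingQueue` and `run_model` are invented for illustration), not part of the role description.

```python
import queue
import threading
import time

def run_model(batch):
    # Placeholder for a real model forward pass over a batch of prompts;
    # here it just returns each prompt's length.
    return [len(prompt) for prompt in batch]

class BatchingQueue:
    """Groups incoming requests into batches to amortize per-call model cost."""

    def __init__(self, max_batch_size=8, max_wait_s=0.01):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s  # latency budget for filling a batch
        self.requests = queue.Queue()

    def submit(self, prompt):
        # Each request carries an Event so the caller can wait for its result.
        item = {"prompt": prompt, "done": threading.Event(), "result": None}
        self.requests.put(item)
        return item

    def serve_once(self):
        # Block for the first request, then collect more until the batch is
        # full or the latency budget is spent.
        batch = [self.requests.get()]
        deadline = time.monotonic() + self.max_wait_s
        while len(batch) < self.max_batch_size:
            timeout = deadline - time.monotonic()
            if timeout <= 0:
                break
            try:
                batch.append(self.requests.get(timeout=timeout))
            except queue.Empty:
                break
        results = run_model([item["prompt"] for item in batch])
        for item, result in zip(batch, results):
            item["result"] = result
            item["done"].set()

q = BatchingQueue(max_batch_size=4)
items = [q.submit(p) for p in ["hello", "hi", "batching"]]
q.serve_once()
for item in items:
    item["done"].wait()
print([item["result"] for item in items])  # [5, 2, 8]
```

The same pattern appears in production LLM servers, where batching amortizes GPU kernel-launch and weight-loading costs across concurrent requests; the trade-off is tuned via the batch size and wait-time parameters.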