
Staff Site Reliability Engineer

Company: VirtualVocations
Location: Eugene
Posted on: April 1, 2025

Job Description:

A company is looking for a Staff Site Reliability Engineer focused on Machine Learning Infrastructure.

Key Responsibilities

Design and implement robust ML infrastructure for training, deployment, monitoring, and scaling of machine learning models
Improve reliability, availability, and scalability of ML infrastructure while ensuring efficient workflows
Collaborate with various teams to identify infrastructure requirements and streamline the ML lifecycle

Required Qualifications

7+ years of experience in Site Reliability Engineering, DevOps, or infrastructure engineering roles
Expertise with on-premises infrastructure for machine learning workloads (e.g., Kubernetes, Docker)
Proficiency with infrastructure automation and configuration management tools (e.g., Terraform, Ansible)
Experience with observability, monitoring, and logging for ML systems (e.g., Prometheus, Grafana)
Familiarity with popular Python-based ML frameworks (e.g., PyTorch, TensorFlow)

Keywords: VirtualVocations, Eugene, Staff Site Reliability Engineer, Professions, Eugene, Oregon

