Engineering Hybrid Mid

Software Engineer, GPU Infrastructure

OpenAI San Francisco, CA Full-time

As an engineer within Fleet infrastructure, you will design, write, deploy, and operate infrastructure systems for model deployment and training on one of the world’s largest GPU fleets. You will work on job scheduling, cluster management, snapshot delivery, and CI/CD systems at unprecedented scale.

Responsibilities

  • Design, implement, and operate compute fleet components, including job scheduling and cluster management
  • Build and maintain CI/CD systems for GPU infrastructure
  • Ensure reliability and uptime of model training and inference infrastructure
  • Optimize fleet utilization and resource allocation
  • Collaborate with research teams on infrastructure requirements

Qualifications

  • 3+ years in infrastructure engineering, distributed systems, or SRE
  • Proficiency in Python, Go, or similar systems languages
  • Strong Linux, networking, and server hardware knowledge
  • Experience with Kubernetes and container orchestration at scale
  • Understanding of GPU computing and accelerator architectures

Posted: Feb 10, 2026
Experience: Mid
Type: Full-time
Work model: Hybrid

What this hire signals

Hiring a fleet infrastructure engineer for “one of the world’s largest GPU fleets” indicates that OpenAI’s in-house compute is now at a scale that requires dedicated fleet management software.
