Software Engineer, GPU Infrastructure
As an engineer within Fleet infrastructure, you will design, write, deploy, and operate infrastructure systems for model deployment and training on one of the world’s largest GPU fleets. You will work on job scheduling, cluster management, snapshot delivery, and CI/CD systems at unprecedented scale.
Responsibilities
- Design, implement, and operate compute fleet components, including job scheduling and cluster management
- Build and maintain CI/CD systems for GPU infrastructure
- Ensure reliability and uptime of model training and inference infrastructure
- Optimize fleet utilization and resource allocation
- Collaborate with research teams on infrastructure requirements
Qualifications
- 3+ years in infrastructure engineering, distributed systems, or SRE
- Proficiency in Python, Go, or similar systems languages
- Strong Linux, networking, and server hardware knowledge
- Experience with Kubernetes and container orchestration at scale
- Understanding of GPU computing and accelerator architectures
Listing aggregated from a public source. GreenCIO is not affiliated with, endorsed by, or recruiting on behalf of OpenAI. You will apply directly with the employer.
What this hire signals
A fleet infrastructure engineer role for “one of the world’s largest GPU fleets” suggests that OpenAI’s in-house compute has reached a scale that requires dedicated fleet management software.