X Corp.

Site Reliability Engineer - High Performance Computing / AI-ML

Posted 23 Days Ago
5 Locations
120K-297K Annually
Junior
As a Site Reliability Engineer, you will manage large-scale HPC systems, ensure stability and performance, and automate deployment processes.
The summary above was generated by AI

Role: Site Reliability Engineer - HPC / AI-ML (All Levels)
Location: Palo Alto, New York, Seattle or Austin
Base Salary Range: $120,000 to $297,000 + Equity

Who We Are:

At X, we’re pioneering the frontier of technology with our innovative Everything App. Our mission is to revolutionize how people connect, share ideas, and engage in meaningful conversations. We champion freedom of speech and strive to create a platform that embraces diverse perspectives. Our commitment is to foster open dialogue and empower individuals to express themselves freely.

What You’ll Do:

As a Site Reliability Engineer (SRE) supporting HPC (High Performance Computing) + AI/ML initiatives at X, you will play a crucial role in maintaining and enhancing the reliability, availability, and performance of our large-scale systems. Your responsibilities will include:

  • Managing and troubleshooting large-scale clusters (primarily Linux + Kubernetes) to ensure the stability and efficiency of our platform

  • Collaborating with cross-functional teams, including hardware engineers and software developers, to support and improve our infrastructure

  • Automating the provisioning and deployment of systems to enhance long-term health and scalability

  • Ensuring the robustness of our HPC environments and storage clusters

  • Writing and maintaining scripts and tools for automation and monitoring

  • Addressing system failures and performance issues, identifying root causes, and implementing preventive measures

  • Working closely with end users to understand changing needs as our environment evolves

Who You Are:

We're looking for exceptional engineers who are passionate about our mission and have a strong desire to make a meaningful impact. The ideal candidate will have:

  • 2+ years of professional software development experience 

  • Extensive experience with Kubernetes and container orchestration

  • Proficiency in one or more object-oriented programming languages (e.g., Python, Java, C++, Scala)

  • Proficiency in scripting languages (Python, Bash, etc.)

  • Strong experience in configuration management (e.g., Puppet, Ansible, Chef)

  • Familiarity with Ethernet networking at scale and distributed systems

  • Strong troubleshooting skills and experience with HPC environments

  • Experience managing large-scale systems, ideally supporting thousands of machines

  • Working understanding of the storage systems required to support such environments

  • Experience with various GPU / accelerator architectures and ability to optimize performance on such platforms

  • Ability to think outside the box and come up with innovative solutions to complicated problems

  • Strong commitment and willingness to work in a fast-paced environment

  • Excellent communication and interpersonal skills

At X, our small but fast-paced team values innovation, creativity, and a strong commitment to our mission. As a Site Reliability Engineer, you'll have the opportunity to make a significant impact on the future of X and our aspiration to build the Everything App.

Top Skills

Ansible
Bash
C++
Chef
Java
Kubernetes
Linux
Puppet
Python
Scala

X Corp. Seattle, Washington, USA Office

1501 4th Ave, Seattle, Washington, United States, 98101

What you need to know about the Seattle Tech Scene

Home to tech titans like Microsoft and Amazon, Seattle punches far above its weight in innovation. But its surrounding mountains, sprinkled with world-famous hiking trails and climbing routes, make the city a destination for outdoorsy types as well. Established as a logging town before shifting to shipbuilding and logistics, the Emerald City is now known for its contributions to aerospace, software, biotech and cloud computing. And its status as a thriving tech ecosystem is attracting out-of-town companies looking to establish new tech and engineering hubs.

Key Facts About Seattle Tech

  • Number of Tech Workers: 287,000; 13% of overall workforce (2024 CompTIA survey)
  • Major Tech Employers: Amazon, Microsoft, Meta, Google
  • Key Industries: Artificial intelligence, cloud computing, software, biotechnology, game development
  • Funding Landscape: $3.1 billion in venture capital funding in 2024 (Pitchbook)
  • Notable Investors: Madrona, Fuse, Tola, Maveron
  • Research Centers and Universities: University of Washington, Seattle University, Seattle Pacific University, Allen Institute for Brain Science, Bill & Melinda Gates Foundation, Seattle Children’s Research Institute