Flip

Technical Lead - Senior ML Infrastructure Software Engineer

Sorry, this job was removed at 06:04 p.m. (PST) on Tuesday, Apr 08, 2025
Remote

Technical Lead - Senior Machine Learning Infrastructure Software Engineer

Location: Hybrid in New York City, or US remote.


About Flip.shop:

Welcome to Flip.shop, where innovation meets the social commerce revolution! Fresh off our Series C funding round, we've raised $144 million, propelling our valuation to an impressive $1.05 billion. We’re redefining the shopping experience by giving consumers a voice in a space dominated by tech giants. Join us on this exhilarating journey where your technical skills will play a pivotal role in shaping the future of social commerce!


Why Join Us?

At Flip.shop, you’ll be at the forefront of innovation in social commerce. This isn’t just a job; it’s a chance to build infrastructure that empowers our AI-driven platform to scale and deliver personalized shopping experiences. You will have the opportunity to partner directly with, and learn from, the very best engineers and scientists, who joined us from some of the leading big-tech companies!

If you thrive in a fast-paced, collaborative environment where you can develop high-performance systems, we want to hear from you!


Role Overview:

We are seeking an experienced ML Infrastructure Lead to design, build, and optimize the infrastructure that powers our machine learning systems. You’ll drive the scalability, reliability, and performance of our recommendation and ads systems. This role involves leading the design, implementation, and optimization of our serving infrastructure to support high-throughput, low-latency workloads.

Furthermore, you'll ensure the efficient deployment, scaling, and monitoring of machine learning models, and will help streamline the development lifecycle. This role offers the opportunity to create scalable, production-level systems that support real-time recommendations and drive business growth.


You will work closely with our engineering and machine learning leaders to ensure our platform can scale efficiently and reliably as we grow.


Key Responsibilities:

  • Infrastructure Development: Design and implement scalable ML infrastructure for deploying, monitoring, and maintaining machine learning models in production environments. Ensure high availability, reliability, and performance of serving and infrastructure systems.
  • Tooling & Automation: Build tools to automate workflows for model training, testing, and deployment, ensuring that machine learning models can move quickly from development to production.
  • Cloud Infrastructure: Leverage cloud platforms to create efficient, scalable systems for large-scale machine learning workloads.
  • Performance Optimization: Ensure the infrastructure supports high-performance model inference at scale, with a focus on minimizing latency and maximizing throughput.
  • Collaboration: Work closely with data scientists, machine learning engineers, and DevOps teams to create seamless integration between development and production environments.
  • Monitoring & Maintenance: Build robust monitoring systems to track model performance and infrastructure health, ensuring reliability and uptime of machine learning services.
  • Security & Compliance: Implement best practices in infrastructure security, data privacy, and compliance, particularly when handling sensitive user data.

Requirements:

  • Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
  • Experience: 7+ years of experience in infrastructure engineering, DevOps, or similar domains, with a focus on supporting machine learning workflows in production.
  • Technical Skills: Strong proficiency in cloud platforms (AWS, GCP, or Azure), containerization (Docker, Kubernetes), CI/CD pipelines, and infrastructure-as-code tools (Terraform, Ansible). Experience with SageMaker is a bonus. 
  • ML Workflow Knowledge: Experience working with machine learning frameworks (TensorFlow, PyTorch, or similar) and expertise with MLOps practices.
  • Performance & Scalability: Proven track record of optimizing infrastructure for performance, scalability, and reliability in production environments.
  • Collaboration: Strong teamwork skills, with the ability to partner with ML engineers and data scientists to streamline workflows.
  • Communication: Ability to communicate complex infrastructure solutions to technical and non-technical stakeholders.
  • Problem-Solving: Passion for solving infrastructure challenges that support real-time machine learning at scale.

Preferred Qualifications:

  • Experience using Node.js for backend development.
  • Experience with AWS infrastructure and tools.
  • Experience with message queues such as RabbitMQ.

Why You’ll Love Working Here:

At Flip.shop, you’ll have the opportunity to build the backbone of our AI-driven platform, working on cutting-edge infrastructure that powers personalized shopping experiences for millions of users. Your work will directly contribute to scaling our machine learning systems, ensuring they run efficiently in a high-performance production environment. This is your chance to have a lasting impact and help Flip.shop shape the future of social commerce.


Ready to Build the Future?

If you're passionate about building scalable infrastructure and driving innovation in machine learning at scale, join us at Flip.shop! Let’s redefine the future of online shopping together.


Compensation & Benefits:

Base salary and total compensation will vary based on factors including but not limited to location, experience, and performance. Please note the base salary is just one component of the company’s total rewards package for exempt employees. Other rewards may include equity, bonuses, long-term incentives, a PTO policy, and other progressive benefits.

