Many of Cloudflare's critical internal services run on Kubernetes. These services include those responsible for Cloudflare's control plane and APIs, data analytics and other internal tools used to manage our global network. These Kubernetes platforms are purpose-built from the ground up and run on bare metal Linux in different regions around the world. The scale we work at involves tens of thousands of containers and terabits per second of network traffic. The team takes pride in knowing this platform helps run parts of the global Internet.
As an engineer on the Kubernetes platform team, you'll be building the tools to help engineers deploy and operate the services that make Cloudflare work. Our mission is to provide a reliable, yet flexible, platform to help product teams release new software efficiently and safely. The role includes both software engineering and DevOps operational responsibilities.
What You'll Do
- Improve Kubernetes, Ceph and Prometheus automation, configuration management and other tooling
- Design scalable and resilient systems that can keep up with company growth
- Improve the efficiency of managing resources such as CPU, bandwidth and storage
- Harden the platform against security threats and resource contention issues
- Improve our GitOps systems and practices
- Work with app teams to understand their potential challenges and help them choose the best way to architect their systems on Kubernetes
- Contribute back to the open source community
- Some of our favorite open source projects include: Prometheus, Rook.io, KubeVirt, Contour, Envoy, Consul, cdk8s, Vault, Ceph, Cloudprober, etcd, Calico, Terraform
- Help respond to and prevent incidents impacting core platforms
What You'll Need
- Experience managing production Kubernetes or similar orchestration platforms
- Recent experience with configuration management frameworks such as SaltStack or Ansible
- Knowledge of how container runtimes work inside of Linux (isolation, storage, and networking)
- Ability to work with codebases in Bash, TypeScript and Go
- A firm grasp of IP networking including routing and iptables
- Excellent debugging skills in a Linux environment
- Source control experience including branching, merging and rebasing
- The ability to break down complex problems into smaller pieces, provide options, talk through trade-offs and drive the effort to solve the problem
Bonus Points
- Experience operating Kubernetes on-premises at scale in capacities including SRE, systems design or architecture
- Providing guidance and building platforms across multiple zones and regions as a foundation for other teams to build distributed, highly available applications
- Operational experience with etcd, Prometheus, Ceph, Rook, SaltStack, Vault, Calico, and other common CNIs such as Cilium
Compensation
Compensation may be adjusted depending on work location.
- For Colorado-based hires: Estimated annual salary of $137,000 - $167,000
- For hires based in New York City, Washington, and California (excluding the Bay Area): Estimated annual salary of $154,000 - $188,000
- For Bay Area-based hires: Estimated annual salary of $162,000 - $198,000
Equity
This role is eligible to participate in Cloudflare's equity plan.
Benefits
Cloudflare offers a complete package of benefits and programs to support you and your family. Our benefits programs can help you pay health care expenses, support caregiving, build capital for the future and make life a little easier and more fun! The following describes our benefits for employees in the United States; benefits may vary for employees based outside the U.S.
Health & Welfare Benefits
- Medical/Rx Insurance
- Dental Insurance
- Vision Insurance
- Flexible Spending Accounts
- Commuter Spending Accounts
- Fertility & Family Forming Benefits
- On-demand mental health support and Employee Assistance Program
- Global Travel Medical Insurance
Financial Benefits
- Short and Long Term Disability Insurance
- Life & Accident Insurance
- 401(k) Retirement Savings Plan
- Employee Stock Participation Plan
Time Off
- Flexible paid time off covering vacation and sick leave
- Leave programs, including parental, pregnancy health, medical, and bereavement leave
What We Do
Cloudflare, Inc. is on a mission to help build a better Internet. Cloudflare’s suite of products protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare have all web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was recognized by Reuters Events for Global Responsible Business in 2020, named to Fast Company's Most Innovative Companies in 2021, and ranked among Newsweek's Top 100 Most Loved Workplaces in 2022.
Hybrid Workspace
Employees engage in a combination of remote and on-site work.
We are committed to developing a global team that is distributed with a flexible working approach. Doing this equitably and inclusively is essential to our success. Visit our careers site for more on 'How & Where We Work.'