Writer

Site reliability engineer

Posted Yesterday
Remote
2 Locations
Senior level
As a foundational member of the Cloud Infrastructure team, you will lead the design and maintenance of Writer's cloud systems, focusing on reliability, scalability, and performance. This includes automating infrastructure management, developing monitoring systems, conducting post-mortem analyses, and mentoring junior engineers.
The summary above was generated by AI

✍🏽 About Writer

Writer is the full-stack generative AI platform delivering transformative ROI for the world’s leading enterprises. Named one of the top 50 companies in AI by Forbes and one of the best places to work by Inc. Magazine, Writer empowers hundreds of customers like Accenture, Intuit, L’Oreal, Mars, Salesforce, and Vanguard to transform the way they work.

Writer’s fully integrated solution makes it easy to deploy secure and reliable AI applications and agents that solve mission-critical business challenges. Our suite of development tools is powered by Palmyra, Writer’s state-of-the-art family of LLMs, alongside our industry-leading graph-based RAG and customizable AI guardrails.

Founded in 2020 with office hubs in San Francisco, New York City, Austin, Chicago, and London, our team of over 250 employees thinks big and moves fast, and we’re looking for smart, hardworking builders and scalers to join us on our journey to create a better future of work.

📐 About this role 

We are looking for a foundational member of the Cloud Infrastructure team at Writer. This role will involve contributing to the development and implementation of our Site Reliability Engineering (SRE) program. The ideal candidate will ensure the reliability, scalability, performance, and security of Writer’s critical systems, taking a proactive approach to guarantee that our high-ROI products reach our customers seamlessly.
🦸🏻‍♀️ Your responsibilities:

  • Lead the design, implementation, and maintenance of Writer, Inc.’s cloud infrastructure to ensure high availability and performance

  • Design and implement scalable cloud automation to support seamless deployment for our largest enterprise customers

  • Automate infrastructure provisioning and management using Terraform & Python

  • Collaborate with development teams to optimize cloud resources and enhance system reliability

  • Develop and maintain monitoring and alerting systems to proactively identify and resolve issues affecting the reliability of our writing solutions

  • Conduct post-mortem analyses of system failures to identify root causes and implement preventive measures

  • Optimize and scale our cloud infrastructure to support growing user demand and ensure cost efficiency

  • Ensure the security and compliance of our systems, adhering to industry standards and regulations

  • Provide mentorship and technical guidance to junior engineers, fostering a culture of reliability and continuous improvement

  • Stay current with emerging technologies and industry trends to continuously improve our site reliability practices

⭐ Is this you? 

  • Proven expertise in Site Reliability Engineering with a minimum of 7 years of hands-on experience

  • Deep understanding of system architecture and infrastructure design to ensure high availability and performance

  • Bachelor’s degree in Computer Science, Engineering, or a related technical field

  • Strong proficiency in programming languages such as Python, Java, or Go for automation and monitoring

  • Experience with cloud platforms like AWS, Azure, or GCP, and their respective services for scalable and resilient systems

  • Expertise in containerization technologies (e.g., Docker, Kubernetes) and orchestration tools

  • Knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) to maintain system health and performance

  • Ability to lead and mentor junior engineers in best practices for reliability and system optimization

  • Excellent communication skills to collaborate effectively with cross-functional teams and stakeholders

  • Proactive approach to identifying and mitigating potential system failures and performance bottlenecks

  • Preferred Skills & Experience:

    • Software engineering expertise

    • Terraform

    • Python

    • Kubernetes

    • Scala

    • AWS/GCP

Curious to learn more about who we are and how we operate? Visit us here

🍩 Benefits & perks (US Full-time employees)

  • Generous PTO, plus company holidays

  • Medical, dental, and vision coverage for you and your family

  • Paid parental leave for all parents (12 weeks)

  • Fertility and family planning support

  • Early-detection cancer testing through Galleri

  • Flexible spending account and dependent FSA options

  • Health savings account for eligible plans with company contribution

  • Annual work-life stipends for:

    • Home office setup, cell phone, internet

    • Wellness stipend for gym, massage/chiropractor, personal training, etc.

    • Learning and development stipend

  • Company-wide off-sites and team off-sites

  • Competitive compensation, company stock options and 401k

Writer is an equal-opportunity employer and is committed to diversity. We don't make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

By submitting your application on the application page, you acknowledge and agree to Writer's Global Candidate Privacy Notice.


