JPMorganChase

Lead Site Reliability Engineer - High Performance Compute

Posted 24 Days Ago
Hybrid
Jersey City, NJ
Senior level
As a Lead Site Reliability Engineer, you will manage complex applications and infrastructures, ensuring reliability, availability, and scalability through code and cloud resources. Responsibilities include guiding designs, automating deployment, and responding to incidents. You will collaborate within a team to adopt and promote best practices in site reliability engineering.
The summary above was generated by AI

Job Description
As a Lead Site Reliability Engineer at JPMorgan Chase within the Markets Engineering & Architecture team, you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure, independently decomposing and iteratively improving on existing solutions. You will be a significant contributor to your team, sharing your knowledge of the end-to-end operations, availability, reliability, and scalability of your application or platform.
Job responsibilities

  • Guide and assist others in building appropriate level designs and gaining consensus from peers where appropriate.
  • Collaborate with software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines.
  • Design, develop, test, and implement availability, reliability, and scalability solutions in applications in collaboration with other software engineers and teams.
  • Implement infrastructure, configuration, and network as code for applications and platforms within your remit.
  • Collaborate with technical experts, key stakeholders, and team members to resolve complex problems.
  • Understand service level indicators and utilize service level objectives to proactively resolve issues before they impact customers.
  • Support the adoption of site reliability engineering best practices within your team.
  • Lead incident response efforts as a subject matter expert on the High Performance Computing platform, including restoration of service, root cause analysis, and engineering preventative measures.
  • Contribute to client teams by providing resilient architecture implementations and running chaos simulations to validate platform resiliency.
  • Build and maintain standard infrastructure as code modules for reuse by development teams and other business units.
  • Participate in architecture resiliency reviews to provide guidance on cloud design decisions, standards, and operational practices, while developing skills to attain Subject Matter Expertise in at least one technical implementation within a technical domain.


Required qualifications, capabilities, and skills

  • Formal training or certification in site reliability engineering concepts and 5+ years of applied experience.
  • Demonstrate applied experience in contributing to the reliability of production applications, with proficiency in site reliability culture and principles, and familiarity with implementing site reliability within applications or platforms.
  • Possess 5+ years of experience in at least one programming language such as Python, Java/Spring Boot, or Golang, along with proficient knowledge of software applications and technical processes within a given technical discipline, including supporting and delivering public cloud applications and monitoring technologies like Graphics Processing Units and IBM Symphony.
  • Have 5+ years of experience in observability practices, including white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
  • Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, and Spinnaker, as well as hands-on experience with container and container orchestration technologies such as ECS, Kubernetes, and Docker, and troubleshooting common networking technologies and issues.
  • Contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision, while proactively recognizing roadblocks and demonstrating an interest in learning technologies that facilitate innovation.
  • Experience working with Arm-based servers and Amazon Machine Images, and with setting up and configuring OpenTelemetry agents and collectors, along with proficiency in AWS and cloud automation tools and technologies like Lambda, CodePipeline, Ansible, and Terraform.


Preferred qualifications, capabilities, and skills

  • Ability to contribute to large and collaborative teams by presenting information in a logical and timely manner with compelling language and limited supervision.
  • Ability to identify new technologies and relevant solutions to ensure design constraints are met by the software team.


About Us
JPMorganChase, one of the oldest financial institutions, offers innovative financial solutions to millions of consumers, small businesses and many of the world's most prominent corporate, institutional and government clients under the J.P. Morgan and Chase brands. Our history spans over 200 years and today we are a leader in investment banking, consumer and small business banking, commercial banking, financial transaction processing and asset management.
We offer a competitive total rewards package including base salary determined based on the role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.
JPMorgan Chase & Co. is an Equal Opportunity Employer, including Disability/Veterans
About the Team
J.P. Morgan's Commercial & Investment Bank is a global leader across banking, markets, securities services and payments. Corporations, governments and institutions throughout the world entrust us with their business in more than 100 countries. The Commercial & Investment Bank provides strategic advice, raises capital, manages risk and extends liquidity in markets around the world.
