Research Scientist - Applied AI/LLM

Posted 23 Days Ago
Bellevue, WA
143K-181K Annually
1-3 Years Experience
Big Data • Software
The Role
Conduct foundational research into solutions for analyzing data using natural language and integrate them into products. Develop and deploy AI models, implement ML infrastructure, and contribute to the AI community through research and open-source projects.

P-1131

At Databricks, we are obsessed with enabling data teams to solve the world's toughest problems. We do this by building and running the world's best data and AI infrastructure platform, so our customers can focus on the high-value challenges that are central to their own missions. Founded in 2013 by the original creators of Apache Spark™, Databricks has grown from a tiny corner office in Berkeley, California to a global organization with over 1000 employees. Thousands of organizations, from small to Fortune 100, trust Databricks with their mission-critical workloads, making us one of the fastest-growing SaaS companies in the world.

You’ll work with teams across Databricks to conduct foundational research into the feasibility and effectiveness of solutions that help customers analyze data using natural language, and then bring those solutions into our products to make data analysis easier and more approachable for all of our customers. More broadly, our teams work on some of the hardest, most interesting problems facing the business, ranging from designing large-scale distributed AI/ML systems, to optimizing distributed GPU model serving, to developing novel modeling methodologies that scale to production use cases.

The impact you will have:

  • Shape the direction of our applied ML areas and intelligence features in our products, helping customers translate unstructured text into structured code, queries and data.
  • Drive the development and deployment of state-of-the-art AI models and systems that directly impact the capabilities and performance of Databricks' products and services.
  • Architect and implement robust, scalable ML infrastructure, including data storage, processing, and model serving components, to support seamless integration of AI/ML models into production environments.
  • Develop novel data collection, fine-tuning, and pre-training strategies that achieve optimal performance on specific tasks and domains.
  • Design and implement automated ML pipelines for data preprocessing, feature engineering, model training, hyperparameter tuning, and model evaluation, enabling rapid experimentation and iteration.
  • Implement advanced model compression and optimization techniques to reduce the resource footprint of language models while preserving their performance.
  • Contribute to the broader AI community by publishing research, presenting at conferences, and actively participating in open-source projects, enhancing Databricks' reputation as an industry leader.

What we look for:

  • PhD in Computer Science or a related field strongly preferred, or equivalent practical experience.
  • 2+ years of machine learning engineering experience in high-velocity, high-growth companies.  Alternatively, a strong background in relevant ML research in academia will be considered as an equivalent qualification.
  • Experience developing AI/ML systems at scale in production or in high-impact research environments.
  • Strong track record of working with language modeling technologies. This could include developing generative and embedding techniques, modern model architectures, fine-tuning and pre-training datasets, and evaluation benchmarks.
  • Strong coding and software engineering skills, and familiarity with software engineering principles around testing, code reviews and deployment.
  • Experience deploying and scaling language models in production; deep understanding of the unique infrastructure challenges posed by training and serving LLMs.
  • Strong understanding of computer science fundamentals.
  • Prior experience with Natural Language Processing and transforming unstructured text into structured code, queries and data is a plus.
  • Contributions to well-used open-source projects.

Databricks is committed to fair and equitable compensation practices. The pay range for this role is listed below and represents the base salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to job-related skills, depth of experience, relevant certifications and training, and specific work location. Based on the factors above, Databricks utilizes the full width of the range. The total compensation package for this position may also include eligibility for annual performance bonus, equity, and the benefits listed above. For more information regarding which range your location is in visit our page here.


Local Pay Range

$142,500-$180,500 USD

Databricks is the data and AI company. More than 10,000 organizations worldwide — including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500 — rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook.
At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks. 

At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.

Top Skills

Python
The Company
Seattle, WA
Hybrid Workplace
Year Founded: 2013

What We Do

As the leader in Unified Data Analytics, Databricks helps organizations make all their data ready for analytics, empower data science and data-driven decisions across the organization, and rapidly adopt machine learning to outpace the competition.
