Scientific Lead, Applied Intelligence for Discovery
Company: Eli Lilly and Company
Location: San Francisco
Posted on: March 2, 2026
Job Description:
At Lilly, we unite caring with discovery to make life better for
people around the world. We are a global healthcare leader
headquartered in Indianapolis, Indiana. Our employees around the
world work to discover and bring life-changing medicines to those
who need them, improve the understanding and management of disease,
and give back to our communities through philanthropy and
volunteerism. We give our best effort to our work, and we put
people first. We’re looking for people who are determined to make
life better for people around the world.

The Opportunity:
We are building something unprecedented: an AI foundation that will
fundamentally change how drug discovery research is conducted. The
Applied Intelligence for Discovery (AI4D) team is a newly formed
group within Lilly Research Laboratories that operates at the
intersection of scientific delivery and core platform development.
AI4D’s mission is to connect scientists to petabyte-scale data
through natural language interfaces, automated analysis workflows,
and intelligent search — and to convert early deployments into
repeatable system standards and evaluation practices that scale
across therapeutic areas. As a Generative AI Engineer, you will
design, build, and operate the core AI systems that power this
transformation: retrieval-augmented generation over internal
scientific documents, text-to-SQL over complex omics databases,
agentic workflows that automate multi-step analyses, and the
evaluation infrastructure that enables the next generation of
medicines for patients.

Key Responsibilities:
- Design, build, and optimize RAG pipelines over internal
  publications, study reports, electronic lab notebooks, and other
  scientific documents
- Build hybrid retrieval systems combining vector search with
  structured metadata, knowledge graphs, and ontology-aware filtering
- Build and optimize text-to-SQL systems over Lilly’s databases,
  enabling scientists to query gene expression, proteomics, pathway,
  and variant data through natural language
- Develop schema documentation, semantic annotations, and
  gold-standard question/SQL pairs that bridge how scientists think
  about data and how it is stored
- Implement multi-step reasoning approaches (chain-of-thought,
  self-correction, Reflexion loops) to improve accuracy on complex
  scientific queries
- Design agentic AI workflows that chain database queries,
  bioinformatics tools, literature search, and visualization into
  automated multi-step scientific analyses
- Evaluate and integrate emerging orchestration frameworks (LangGraph,
  CrewAI, custom architectures) for scientific use cases
- Build evaluation frameworks measuring accuracy, reliability, and
  scientific validity of AI outputs

Basic Qualifications:
- PhD in Computer Science, Data Science, or a related technical field
  with 0-3 years of experience; or equivalent experience building
  production LLM systems
- MS in Computer Science, Data Science, or a related technical field
  with 5 years of experience; or equivalent experience building
  production LLM systems

Additional Skills/Preferences:
- Experience building LLM-powered applications, including at least two
  of: RAG systems, text-to-SQL, agentic workflows, or fine-tuning
  pipelines
- Strong software engineering skills in Python with experience
  building production-grade systems
- Deep familiarity with the modern LLM ecosystem: embedding models,
  vector databases, and orchestration frameworks
- Experience designing evaluation frameworks for LLM systems:
  systematic approaches to measuring accuracy, detecting
  hallucinations, and tracking regressions
- Comfort working with complex, heterogeneous data: databases with
  hundreds of tables, specialized schemas, or domain-specific
  vocabularies
- Familiarity with cloud computing environments (AWS preferred),
  containerization (Docker), and CI/CD practices
- Experience in pharmaceutical, biotech, or life sciences environments
- Familiarity with biomedical data types (omics, clinical, molecular)
  or scientific databases
- Experience with MLOps/LLMOps tooling: experiment tracking, model
  registries, prompt versioning, A/B testing for AI systems
- Knowledge of biomedical ontologies (Gene Ontology, MeSH, ChEBI) or
  experience integrating domain-specific knowledge into LLM systems
- Experience building for regulated environments where auditability,
  reproducibility, and explainability are requirements

Lilly is
dedicated to helping individuals with disabilities to actively
engage in the workforce, ensuring equal opportunities when vying
for positions. If you require accommodation to submit a resume for
a position at Lilly, please complete the accommodation request form
(https://careers.lilly.com/us/en/workplace-accommodation) for
further assistance. Please note this is for individuals to request
an accommodation as part of the application process and any other
correspondence will not receive a response.

Lilly is proud to be an
EEO Employer and does not discriminate on the basis of age, race,
color, religion, gender identity, sex, gender expression, sexual
orientation, genetic information, ancestry, national origin,
protected veteran status, disability, or any other legally
protected status.

Our employee resource groups (ERGs) offer strong
support networks for their members and are open to all employees.
Our current groups include: Africa, Middle East, Central Asia
Network, Black Employees at Lilly, Chinese Culture Network,
Japanese International Leadership Network (JILN), Lilly India
Network, Organization of Latinx at Lilly (OLA), PRIDE (LGBTQ
Allies), Veterans Leadership Network (VLN), Women’s Initiative for
Leading at Lilly (WILL), enAble (for people with disabilities).

Actual compensation will depend
on a candidate’s education, experience, skills, and geographic
location. The anticipated wage for this position is $166,500 -
$266,200. Full-time equivalent employees will also be eligible for a
company bonus (depending, in part, on company and individual
performance). In addition, Lilly offers a comprehensive benefit
program to eligible employees, including eligibility to participate
in a company-sponsored 401(k); pension; vacation benefits;
eligibility for medical, dental, vision and prescription drug
benefits; flexible benefits (e.g., healthcare and/or dependent day
care flexible spending accounts); life insurance and death
benefits; certain time off and leave of absence benefits; and
well-being benefits (e.g., employee assistance program, fitness
benefits, and employee clubs and activities). Lilly reserves the
right to amend, modify, or terminate its compensation and benefit
programs in its sole discretion and Lilly’s compensation practices
and guidelines will apply regarding the details of any promotion or
transfer of Lilly employees.

#WeAreLilly
Keywords: Eli Lilly and Company, San Francisco, Scientific Lead, Applied Intelligence for Discovery, Science, Research & Development, San Francisco, California