Venture Capital

Hippocratic AI Debuts With $50 Million Seed Round

PALO ALTO — Hippocratic AI launched out of stealth this month to announce the industry’s first safety-focused Large Language Model (LLM) designed specifically for healthcare, as well as a $50M seed round co-led by General Catalyst and Andreessen Horowitz.

Large language models (LLMs) and foundation models (FMs) such as ChatGPT and GPT-4 have surprised the world with their abilities. While researchers have shown that these AI models can pass the United States Medical Licensing Examination (USMLE), no company has built a commercial model specifically tuned for healthcare applications. Hippocratic AI is building the first LLM for healthcare, with an initial focus on non-diagnostic, patient-facing applications. The company says this focus will allow it to ensure patient safety while improving healthcare access and outcomes.

“The healthcare industry needs its own AI platform, one that is focused on empowering the workforce, reducing burnout, and improving patient safety and experiences with the healthcare system. We joined forces with the Hippocratic AI team, our health assurance ecosystem, and the a16z team to build this platform. Our goal is to fundamentally increase the supply and scalability of healthcare professionals. This is the key to achieving the health assurance vision: a more proactive, more affordable, and equitable system of care for all,” said Hemant Taneja, CEO and Managing Director at General Catalyst.

Hippocratic AI was founded by a group of physicians, hospital administrators, Medicare professionals, and artificial intelligence researchers from El Camino Health, Johns Hopkins, Washington University in St. Louis, Stanford, UPenn, Google, and Nvidia.

“After working with Munjal and team for years in his prior company, we know that his lived experience as a healthcare and tech operator gives him an edge in understanding what it takes to bring high-ROI products to market – especially at a time when existing industry players are in such dire need of better operating leverage and financial sustainability. We believe Hippocratic AI’s cross-disciplinary, safety-first approach is what the healthcare industry needs to be able to maintain trust in the power of responsible deployment of generative AI solutions,” said Julie Yoo, General Partner at Andreessen Horowitz.

To build a safer large language model, the company has focused on three areas: certification, reinforcement learning from human feedback (RLHF) with healthcare professionals, and bedside manner.

Certification

Passing the USMLE is not enough to ensure a model is ready for the wide variety of healthcare roles that exist in care and payor settings. Hippocratic AI therefore tested its model on 114 healthcare certifications and roles. The company aimed not merely to earn a passing score but to outperform existing state-of-the-art language models such as GPT-4 and other commercially available models. It outperformed GPT-4 on 105 of the 114 tests and certifications, by 5% or more on 74 of them, and by 10% or more on 43 of them. Below are some sample results. Full results are available at www.HippocraticAI.com/benchmarks.

Name                      Full title                                                   Commercial LLM #1  Commercial LLM #2  GPT-4   Hippocratic  Δ Improvement vs Best Competitor
NAPLEX                    North American Pharmacist Licensure Examination              51.0%              0.0%               70.9%   91.1%        20.2%
NCLEX-RN                  Registered Nurse                                             58.8%              25.8%              76.2%   88.6%        12.4%
CPNP-AC                   Acute Care Certified Pediatric NP                            64.0%              22.0%              86.7%   96.0%        9.3%
CPC                       Certified Professional Coder                                 54.7%              50.0%              65.3%   71.0%        5.7%
ABOG                      American Board of Obstetrics and Gynecology Licensing Exam  44.00%             24.00%             80.30%  92.33%       12.03%
ABU                       American Board of Urology – Licensing Exam                   42.09%             24.24%             67.30%  77.10%       9.80%
Hospital Safety Training  Hospital Safety Training Compliance Quiz                     39.4%              27.3%              48.5%   72.7%        24.2%
RD                        Registered Dietician                                         57.1%              46.9%              71.4%   83.7%        12.3%
CLC                       Certified Lactation Consultant                               60.9%              51.7%              79.3%   98.9%        19.6%
CPCO                      Certified Professional Compliance Officer                    60.7%              54.0%              67.3%   86.0%        18.7%
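
For readers who want to check the last column, the published rows are consistent with reading "Δ Improvement vs Best Competitor" as a percentage-point difference: the Hippocratic score minus the highest of the three competing scores. The short Python sketch below reproduces three of the sample rows under that assumption. The function name delta_vs_best is hypothetical and is not taken from the company's materials.

```python
def delta_vs_best(hippocratic: float, competitors: list[float]) -> float:
    """Percentage-point gap between one score and the best competing score.

    Assumption: the table's delta column is a simple difference in
    percentage points, rounded to one decimal place.
    """
    return round(hippocratic - max(competitors), 1)


# Sample rows copied from the table above:
# ([Commercial LLM #1, Commercial LLM #2, GPT-4], Hippocratic)
rows = {
    "NAPLEX": ([51.0, 0.0, 70.9], 91.1),
    "NCLEX-RN": ([58.8, 25.8, 76.2], 88.6),
    "CPC": ([54.7, 50.0, 65.3], 71.0),
}

deltas = {name: delta_vs_best(score, comps) for name, (comps, score) in rows.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta} percentage points ahead of the best competitor")
```

In every sample row the best competitor is GPT-4, so the column is effectively the gap between the Hippocratic score and the GPT-4 score.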


RLHF with Healthcare Professionals

Hippocratic AI has decided that the people best placed to judge whether an LLM is ready for deployment in a given healthcare role are the experts who perform that role in today’s system. Large language models can be shaped using human ratings of their output, a technique known as reinforcement learning from human feedback (RLHF). Many believe this technique is what led to the remarkable performance of ChatGPT compared with prior versions of OpenAI’s language models.

In building Hippocratic AI, the company has engaged healthcare professionals to help guide and train the LLM by rating its responses.

“RLHF with healthcare professionals isn’t just a feature but is really our commitment to partner deeply with the industry,” said Munjal Shah, Co-Founder and CEO of Hippocratic AI. “We aren’t just saying these professions will help us evaluate our system. We are saying we won’t launch each unique role for the LLM unless the professionals who do that exact task today agree the system is ready and safe.”

The roles and tasks the company is exploring include patient navigator, dietician, genetic counselor, enrollment specialist, and medication reminders, among others.

Bedside Manner

“In healthcare settings, it isn’t just important to answer the patient accurately. It is equally important that it is done with great bedside manner. Many studies have shown that bedside manner impacts emotional well-being and quality of outcomes. This isn’t just true for doctors but also true for everyone interacting with patients: billing agents, schedulers, and more,” said Meenesh Bhimani MD, Co-Founder and Chief Medical Officer of Hippocratic AI.

To date, there are no benchmarks for evaluating the bedside manner of a language model when it interacts with patients. Hippocratic AI plans to release the first of many bedside manner benchmarks for the entire community to use. Below are the initial results the company has achieved against these benchmarks.

Name                                          Commercial LLM #1  GPT-4   Hippocratic  Δ Improvement vs Best Competitor
Shows Empathy                                 30.0%              68.3%   75.0%        6.7%
Shows care and compassion                     43.3%              75.0%   85.0%        10.0%
Making Patient feel at ease                   5.0%               29.2%   57.5%        28.3%
Taking a personal interest in patient’s life  33.3%              63.3%   70.0%        6.7%
Helps patient take control                    35.0%              61.7%   65.0%        3.3%