Data4Life – We make health data ready for research

How an Open-Source Chatbot Is Accelerating Research at Mount Sinai

In collaboration with the Mount Sinai Health System and the Hasso Plattner Institute, we are building a secure, scalable and transparent LLM infrastructure for clinical research.

Health data holds enormous potential – but only when it’s accessible, understandable, and actionable. That’s where our latest AI project with Mount Sinai comes in: An open-source, locally hosted chatbot designed to support researchers with data exploration, literature search, and documentation tasks – faster, easier, and always secure.

Mount Sinai Health System is one of the most renowned research hospitals in the world. In collaboration with the Hasso Plattner Institute, Data4Life is helping build and extend AIR·MS - a secure research environment for structured access to anonymized health data. Our latest addition: a Large Language Model (LLM)-powered chatbot that lives inside the system and supports researchers and clinicians directly in their workflows.

Designed for research, not for profit

Instead of relying on commercial services, we developed a chatbot based entirely on open-source tools and open-weight language models. This ensures full transparency, local data sovereignty, and maximum compliance with hospital security standards. The chatbot uses a combination of Ollama, Open WebUI, and models such as LLaMa, Mistral, and gpt-oss, integrated into Mount Sinai’s own Azure-based infrastructure.

This architecture isn’t just technically robust – it’s also purpose-built for the specific challenges of health research.

From manual search to instant insight

One core use case is literature summarization. Researchers often need to read dozens of studies or clinical guidelines before drawing a conclusion. With the chatbot, they can paste long documents or search queries and receive precise, structured summaries in seconds – saving time and cognitive load.

Another practical scenario is question answering on clinical data models. Many datasets follow the OMOP common data model, which can be complex to navigate. The chatbot helps users understand how a specific concept (e.g., “heart failure hospitalization”) is defined, how it maps to codes, and where it appears in the data.

Supporting structured data analysis

The LLM assistant is also being integrated into Data2Evidence, our analytics tooling within AIR·MS. Here, the chatbot can support tasks such as building cohort definitions, generating SQL queries, and explaining ETL processes. This makes the platform more accessible to clinical researchers who may not have a technical background – and reduces the dependency on engineering teams.

The goal: Lower the barrier to data-driven research and allow more people to ask meaningful questions – and get meaningful answers.

Built-in privacy, built for scale

Security was a top priority from day one. The chatbot is integrated into the AIR·MS Application Tier, which adds a reliable application environment using containers, and uses Single Sign-On (with Microsoft Entra). Through an independent penetration testing exercise, no critical vulnerabilities were found – validating the architecture’s robustness.

So far, ~100 early users from various departments across Mount Sinai have been testing the chatbot in their daily work. Their use cases range from reviewing documentation to preparing research protocols and identifying relevant publications in PubMed. Their feedback is actively shaping the next development steps – including more medical knowledge integration, user interface refinement, and the ability to interact with structured datasets (like lab results or diagnoses) in natural language.

From pilot to research companion

The AIR·MS chatbot is more than a feature – it’s the foundation of a broader AI strategy. In the future, we plan to extend its capabilities toward:

automated image annotation workflows using the Visian platform,
machine learning experiment tracking with MLflow,
and deeper integration with translational use cases in oncology, cardiology, and rare disease research.

Our shared vision with Mount Sinai is clear: AI should support science without compromising trust. With a secure, locally hosted, open-source infrastructure, we’re building exactly that.

You are a researcher and want to learn more about AIR·MS?

AIR·MS is open to researchers who want to make a difference. If you're working with clinical data, exploring real-world evidence, or building models for better health outcomes, AIR·MS gives you secure, structured access to one of the richest anonymized hospital datasets in the U.S. – with direct support, modern tools, and room to grow.

How an Open-Source Chatbot Is Accelerating Research at Mount Sinai

You are a researcher and want to learn more about AIR·MS?

Digital solutions for a healthier world.

Our work

Knowledge

Contact

How an Open-Source Chatbot Is Accelerating Research at Mount Sinai

You are a researcher and want to learn more about AIR·MS?

Digital solutions for a healthier world.

You can also find us on:

Our work

Knowledge

Contact