Instructions to use Sunanhe/MedDr_0401 with libraries and local apps.

Libraries

How to use Sunanhe/MedDr_0401 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Sunanhe/MedDr_0401", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Keep the result so it can be inspected when run as a script
result = pipe(text=messages)
print(result)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("Sunanhe/MedDr_0401", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use Sunanhe/MedDr_0401 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Sunanhe/MedDr_0401"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Sunanhe/MedDr_0401",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
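
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `chat` are illustrative and not part of vLLM, and the response layout (`choices[0].message.content`) is assumed to follow the OpenAI chat completions format that the server advertises.

```python
import json
import urllib.request


def build_chat_payload(model, prompt, image_url):
    """Build an OpenAI-style chat body with one text part and one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(base_url, payload, timeout=120):
    """POST the payload to <base_url>/v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed OpenAI-compatible layout: the reply is in choices[0].message.content.
    return body["choices"][0]["message"]["content"]


payload = build_chat_payload(
    "Sunanhe/MedDr_0401",
    "Describe this image in one sentence.",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
# Requires a running server, so the call is left commented out:
# print(chat("http://localhost:8000", payload))
```

To target a server listening on a different port, such as the SGLang examples on port 30000, pass a different `base_url`.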

Use Docker

docker model run hf.co/Sunanhe/MedDr_0401

SGLang

How to use Sunanhe/MedDr_0401 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Sunanhe/MedDr_0401" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Sunanhe/MedDr_0401",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Sunanhe/MedDr_0401" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request as in the pip-based setup above (OpenAI-compatible API, port 30000).

Docker Model Runner

How to use Sunanhe/MedDr_0401 with Docker Model Runner:

```
docker model run hf.co/Sunanhe/MedDr_0401
```

MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning

A generalist foundation model for healthcare capable of handling diverse medical data modalities.

Authors: Sunan He*, Yuxiang Nie*, Zhixuan Chen, Zhiyuan Cai, Hongmei Wang, Shu Yang, Hao Chen**
(*Equal Contribution, **Corresponding Author)
Institution: SMART Lab, Hong Kong University of Science and Technology

Model Summary

MedDr is a large-scale generalist vision-language model for healthcare. It is built upon InternVL and trained using a diagnosis-guided bootstrapping strategy that leverages both image and label information to construct high-quality vision-language datasets.

MedDr supports diverse medical imaging modalities:

🫁 Radiology (X-ray, CT, MRI)
🔬 Pathology
🧴 Dermatology
👁️ Retinography
🔭 Endoscopy

During inference, MedDr uses a retrieval-augmented medical diagnosis strategy to improve generalization.
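
To make the idea of retrieval-augmented prompting concrete, here is a generic sketch. It is not MedDr's actual implementation: the toy embeddings, the labels, the prompt wording, and the helper names `retrieve_labels` and `build_prompt` are all hypothetical. It only illustrates the general pattern of retrieving labeled reference cases similar to the query and passing their labels to the model as context.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_labels(query, bank, k=3):
    """Return the labels of the k bank entries most similar to the query."""
    ranked = sorted(bank, key=lambda item: cosine(query, item[0]), reverse=True)
    return [label for _, label in ranked[:k]]


def build_prompt(question, labels):
    """Prepend a summary of the retrieved labels to the question."""
    counts = Counter(labels)
    hints = ", ".join(f"{label} ({n})" for label, n in counts.most_common())
    return f"Similar reference cases were labeled: {hints}.\n{question}"


# Hypothetical toy embeddings; a real system would use an image encoder.
bank = [
    ([1.0, 0.0], "pneumonia"),
    ([0.9, 0.1], "pneumonia"),
    ([0.0, 1.0], "normal"),
    ([0.1, 0.9], "normal"),
]
labels = retrieve_labels([0.8, 0.2], bank, k=3)
prompt = build_prompt("What is the diagnosis for this chest X-ray?", labels)
print(prompt)
```

The resulting prompt would then be sent to the model together with the query image.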

Capabilities

Visual Question Answering (VQA) for medical images
Medical report generation
Medical image diagnosis across multiple modalities

Usage

Environment Setup

This model is built on InternVL. Follow INSTALLATION.md to set up the environment.

Quick Demo

# Clone the GitHub repository and enter it
git clone https://github.com/sunanhe/MedDr.git
cd MedDr

# Edit demo.py and set model_path to your local checkpoint directory, then run:
python3 demo.py

See demo.py in the GitHub repository for a full example.

Citation

If you find MedDr useful in your research, please consider citing:

@article{he2024meddr,
  title={MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning},
  author={He, Sunan and Nie, Yuxiang and Chen, Zhixuan and Cai, Zhiyuan and Wang, Hongmei and Yang, Shu and Chen, Hao},
  journal={arXiv preprint arXiv:2404.15127},
  year={2024}
}

Acknowledgements

This work builds upon InternVL. We thank the InternVL team for their outstanding contributions to the open-source VLM community.

Downloads last month: 119

Safetensors

Model size: 40B params
Tensor type: BF16
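
As a rough guide to hardware requirements, the weight footprint can be estimated from the two figures above. This is a back-of-the-envelope sketch: "40B" is a rounded count, and the estimate covers the weights only, not activations, the KV cache, or framework overhead. The helper name `weight_memory_gib` is illustrative.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Return the memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


# 40B parameters and BF16 are the figures listed for this checkpoint.
# BF16 stores each parameter in 2 bytes.
bf16_gib = weight_memory_gib(40e9, 2)
print(f"BF16 weights: about {bf16_gib:.1f} GiB")  # prints "BF16 weights: about 74.5 GiB"
```

In practice this means the checkpoint needs a large-memory accelerator or has to be sharded across several devices.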

Model tree for Sunanhe/MedDr_0401

Base model: OpenGVLab/InternVL-Chat-V1-2
Finetuned (2): this model

Paper for Sunanhe/MedDr_0401

MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning (arXiv:2404.15127, published Apr 23, 2024)