This repository contains the machine learning model weights and artifacts required to run the RCT-Reviewer application. It includes models for Risk of Bias assessment, PICO extraction, and RCT classification.
Acknowledgment
This repository contains joblib-converted model artifacts originally developed in RobotReviewer. The models are redistributed here for ease of deployment in the RCT-Reviewer Streamlit application.
Related
- RCT-Reviewer: https://github.com/aurumz-rgb/RCT-Reviewer
- RCT-Reviewer web: https://rct-reviewer.github.io
- RCT-Reviewer Zenodo: https://zenodo.org/records/20618339
- Original RobotReviewer: https://github.com/ijmarshall/robotreviewer
- RobotReviewer Zenodo: https://zenodo.org/records/6855718
Model Formats
This repository strictly provides models in modern, Python-native formats optimized for the RCT-Reviewer application:
- .joblib: Compressed serialized models (typically classifiers or vectorizers).
- .npz: NumPy arrays or SciPy sparse matrices (typically model weights or embeddings).
Note: Legacy .pickle, .pck, and TensorFlow/CNN files are not included in this repository. You may find those here: https://huggingface.co/Aurumz/RCT-Reviewer-pickle
How to Use
You can load these models directly in Python using joblib and scipy/numpy.
Prerequisites
Ensure you have the necessary libraries installed:
pip install joblib scikit-learn scipy numpy huggingface_hub
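To confirm that the installation succeeded before running the snippets in the following sections, a quick import check can help. This is a minimal sketch using only the standard library; the helper name `missing_packages` is illustrative and not part of RCT-Reviewer. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names of the packages listed in the pip command
# (scikit-learn is imported as "sklearn").
REQUIRED = ["joblib", "sklearn", "scipy", "numpy", "huggingface_hub"]


def missing_packages(names=REQUIRED):
    """Return the names from `names` that the import system cannot locate."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

An empty list means every package was found; any names returned still need to be installed.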
1. Downloading Models (Automated)
The RCT-Reviewer application uses the huggingface_hub library to download the models to a local cache directory. You can replicate this behavior using the following snippet:
from huggingface_hub import snapshot_download
from pathlib import Path
# Define cache directory
models_dir = Path.home() / ".cache" / "rct_reviewer" / "models"
# Download the entire repository
snapshot_download(
    repo_id="Aurumz/RCT-Reviewer",
    repo_type="model",
    local_dir=models_dir,
    max_workers=1,  # download files one at a time
)
print(f"Models downloaded to: {models_dir}")
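After the download finishes, it may be worth checking that the expected artifact types actually arrived. The following is a rough sketch, not part of the RCT-Reviewer application; `count_artifacts` is a hypothetical helper that counts only `.joblib` and `.npz` files under a directory and ignores everything else.

```python
from collections import Counter
from pathlib import Path

# The two artifact formats this repository provides.
ARTIFACT_SUFFIXES = {".joblib", ".npz"}


def count_artifacts(root):
    """Count files under `root` by extension, keeping only model artifact types."""
    counts = Counter(
        path.suffix
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix in ARTIFACT_SUFFIXES
    )
    return dict(counts)
```

Calling `count_artifacts(models_dir)` returns a dictionary such as `{".joblib": ..., ".npz": ...}`. An empty dictionary suggests the download did not complete or the directory is wrong.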
2. Loading .joblib Models
Use the joblib library to load classifier artifacts directly.
import joblib
from huggingface_hub import hf_hub_download
# Example: Downloading a specific joblib model
model_path = hf_hub_download(
    repo_id="Aurumz/RCT-Reviewer",
    filename="data/bias/bias_classifier.joblib",
)
# Load the model
model = joblib.load(model_path)
print(f"Model loaded successfully: {type(model)}")
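Because joblib files are pickle-based, loading one can execute arbitrary code, so only load files from a source you trust. One optional precaution is to compare a file's SHA-256 digest against a value recorded from a trusted copy before calling `joblib.load`. The sketch below assumes you have recorded such a digest yourself; the helper names `sha256_of` and `matches_checksum` are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True if the file's digest equals `expected_hex` (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A typical use would be to call `matches_checksum(model_path, expected_hex)` and load the model only when it returns `True`.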
3. Loading .npz Weight Files
The .npz files typically contain sparse matrices used for Linear SVM weights or TF-IDF vectors.
import numpy as np
from scipy.sparse import load_npz
from huggingface_hub import hf_hub_download
# Example: Downloading sparse weights
weights_path = hf_hub_download(
    repo_id="Aurumz/RCT-Reviewer",
    filename="data/rct/rct_svm_weights.npz",
)
# Load the sparse matrix
weights = load_npz(weights_path)
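# If the file is a plain NumPy archive instead, np.load returns an NpzFile;
# index it by array name to get an ndarray, for example:
# archive = np.load(weights_path)
# weights = archive[archive.files[0]]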
print(f"Weights shape: {weights.shape}")
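When it is not known in advance whether a given .npz file holds a SciPy sparse matrix or plain NumPy arrays, the archive can be inspected first, since an .npz file is a zip archive of `.npy` members. The sketch below assumes the layout written by `scipy.sparse.save_npz`, which stores members named `format`, `shape`, and `data`. The helpers `npz_kind` and `load_npz_auto` are hypothetical names, not part of RCT-Reviewer.

```python
import zipfile

# Members that scipy.sparse.save_npz writes for every sparse format.
SPARSE_MEMBERS = {"format.npy", "shape.npy", "data.npy"}


def npz_kind(path):
    """Return "sparse" if the archive looks like a SciPy sparse matrix, else "dense"."""
    with zipfile.ZipFile(path) as archive:
        names = set(archive.namelist())
    return "sparse" if SPARSE_MEMBERS <= names else "dense"


def load_npz_auto(path):
    """Load a sparse matrix, or a dict of named dense arrays, depending on the file."""
    # Imported lazily so npz_kind works without NumPy or SciPy installed.
    if npz_kind(path) == "sparse":
        from scipy.sparse import load_npz
        return load_npz(path)
    import numpy as np
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}
```

For a sparse file, `load_npz_auto` returns a matrix with a `.shape` attribute. For a dense archive it returns a dictionary, so the shape is read from the individual arrays.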
File Structure
The artifacts are organized by task within the data directory:
data/
├── bias/   # Risk of Bias models (.npz, .joblib)
├── pico/   # PICO extraction models (.npz)
├── rct/    # RCT classification weights (.npz)
└── vocab/  # Vocabulary and embedding files (.npz)
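Given this layout, a downloaded copy can be indexed by task folder. The following is a minimal sketch that assumes the repository was downloaded to a local directory containing the `data/` folder shown above; `group_by_task` is an illustrative helper, not part of the application.

```python
from collections import defaultdict
from pathlib import Path


def group_by_task(root):
    """Map each task folder under <root>/data to the sorted files it contains."""
    data_dir = Path(root) / "data"
    groups = defaultdict(list)
    for path in data_dir.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(data_dir)
        if len(rel.parts) < 2:
            continue  # skip files placed directly in data/
        # The first path component is the task folder (bias, pico, rct, vocab).
        groups[rel.parts[0]].append(Path(*rel.parts[1:]).as_posix())
    return {task: sorted(files) for task, files in sorted(groups.items())}
```

The result maps names such as `bias` or `rct` to the files found in each folder, which makes it easy to see which tasks have artifacts available locally.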