---
language:
- en
license: mit
library_name: transformers
tags:
- reranking
- information-retrieval
- pointwise
- ranknet
- llama
base_model: meta-llama/Llama-3.1-8B
datasets:
- Tevatron/msmarco-passage
- abdoelsayed/DeAR-COT
pipeline_tag: text-classification
---

# DeAR-8B-Reranker-RankNet-v1

## Model Description

**DeAR-8B-Reranker-RankNet-v1** is an 8B-parameter neural reranker trained with RankNet loss and knowledge distillation. It is part of the [DeAR framework](https://github.com/DataScienceUIBK/DeAR-Reranking) family. It reaches 74.5 NDCG@10 on TREC DL19 and runs faster than its 13B teacher (2.2 s vs. 5.8 s in the comparison table below).

## Model Details

- **Model Type:** Pointwise Reranker (Sequence Classification)
- **Base Model:** LLaMA-3.1-8B
- **Parameters:** 8 billion
- **Training Method:** Knowledge Distillation + RankNet Loss
- **Teacher Model:** [LLaMA2-13B-RankLLaMA](https://huggingface.co/abdoelsayed/llama2-13b-rankllama-teacher)
- **Training Data:** MS MARCO
- **Precision:** BFloat16

## Key Features

- ✅ **High Performance:** Competitive with the 13B teacher on BEIR benchmarks
- ✅ **Fast Inference:** About 2.2 s to rerank 100 documents (see Efficiency)
- ✅ **Memory Efficient:** Fits on a single 24GB GPU (about 18GB at inference)
- ✅ **Knowledge Distillation:** Enhanced with Chain-of-Thought reasoning

## Performance

| Benchmark | NDCG@10 |
|-----------|---------|
| TREC DL19 | 74.5 |
| TREC DL20 | 72.8 |
| BEIR (Avg) | 45.2 |
| MS MARCO Dev | 68.9 |

## Usage

### Quick Start

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model
model_path = "abdoelsayed/dear-8b-reranker-ranknet-v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16
)
model.eval().cuda()

# Llama tokenizers often ship without a pad token. If this checkpoint does not
# define one, fall back to EOS so the padded calls below work (no-op otherwise).
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
if model.config.pad_token_id is None:
    model.config.pad_token_id = tokenizer.pad_token_id

# Score a query-document pair
query = "What is machine learning?"
document = "Machine learning is a subset of artificial intelligence..."
inputs = tokenizer(
    f"query: {query}",
    f"document: {document}",
    return_tensors="pt",
    truncation=True,
    max_length=228,  # q_max_len(32) + p_max_len(196)
    padding="max_length"
)
inputs = {k: v.cuda() for k, v in inputs.items()}

with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()

print(f"Relevance score: {score}")
```

### Batch Reranking

```python
def rerank_documents(query, documents, model, tokenizer, batch_size=64):
    """
    Rerank a list of documents for a query.

    Args:
        query: Search query string
        documents: List of (title, text) tuples
        model: Loaded reranker model
        tokenizer: Loaded tokenizer
        batch_size: Batch size for inference

    Returns:
        List of (index, score) tuples sorted by relevance
    """
    scores = []

    for i in range(0, len(documents), batch_size):
        batch_docs = documents[i:i + batch_size]

        # Prepare inputs
        queries = [f"query: {query}"] * len(batch_docs)
        docs = [f"document: {title} {text}" for title, text in batch_docs]

        inputs = tokenizer(
            queries,
            docs,
            return_tensors="pt",
            truncation=True,
            max_length=228,
            padding=True
        )
        inputs = {k: v.to(model.device) for k, v in inputs.items()}

        # Get scores
        with torch.no_grad():
            logits = model(**inputs).logits.squeeze(-1)
            scores.extend(logits.cpu().tolist())

    # Sort by score (descending)
    ranked = sorted(enumerate(scores), key=lambda x: x[1], reverse=True)
    return ranked

# Example usage
query = "When was the Eiffel Tower built?"
documents = [
    ("Eiffel Tower", "The Eiffel Tower was built in 1889 for the World's Fair."),
    ("Paris", "Paris is the capital of France."),
    ("Architecture", "Modern architecture has evolved significantly."),
]

ranking = rerank_documents(query, documents, model, tokenizer)
print(ranking)
# Example output: [(0, 8.23), (1, 2.45), (2, -1.87)]
```

## Training Details

### Training Data

- **Primary Dataset:** MS MARCO Passage Ranking

### Hardware

- **GPUs:** 4x NVIDIA A100 (40GB)
- **Training Time:** ~36 hours
- **DeepSpeed:** ZeRO Stage 2

### Loss Function

**RankNet Loss** with Knowledge Distillation:

```
L_total = (1 - α) * L_RankNet + α * L_KD

where:
- L_RankNet: Pairwise ranking loss
- L_KD: KL divergence with teacher (temperature=2)
- α: 0.1 (distillation weight)
```

## Evaluation Results

### TREC Deep Learning

| Dataset | NDCG@10 | NDCG@20 | MAP |
|---------|---------|---------|-----|
| DL19 | 74.50 | 70.23 | 45.67 |
| DL20 | 72.80 | 69.15 | 43.21 |

### BEIR Benchmark

| Dataset | NDCG@10 |
|---------|---------|
| MS MARCO | 68.9 |
| NQ | 52.3 |
| HotpotQA | 61.8 |
| FiQA | 47.2 |
| ArguAna | 59.4 |
| SciFact | 73.6 |
| TREC-COVID | 85.2 |
| NFCorpus | 39.8 |

### Efficiency

| Metric | Value |
|--------|-------|
| Inference Time (100 docs) | 2.2s |
| GPU Memory (inference) | 18GB |
| Throughput | ~45 docs/sec |

## Comparison with Other Models

| Model | Size | TREC DL19 (NDCG@10) | BEIR Avg (NDCG@10) | Inference (s) |
|-------|------|---------------------|--------------------|---------------|
| MonoT5-3B | 3B | 71.8 | 43.5 | 3.5 |
| **DeAR-P-8B-RL** (this model) | 8B | **74.5** | **45.2** | **2.2** |
| Teacher (13B) | 13B | 73.8 | 44.8 | 5.8 |

## Model Architecture

```
Input: "query: [Q]" + "document: [D]" (encoded as a text pair)
    ↓
LLaMA-3.1-8B Decoder Backbone
    ↓
Last-Token Representation
    ↓
Linear Classification Head
    ↓
Relevance Score (scalar)
```

## Limitations

- **Domain Adaptation:** Trained primarily on MS MARCO; may require fine-tuning for specialized domains
- **Query Length:** Optimized for queries up to 32 tokens
- **Document Length:** Truncated to 196
tokens; longer documents lose information
- **Language:** English only
- **Numerical Reasoning:** Limited capability for queries requiring calculations

## Bias and Fairness

This model inherits biases present in:

- Base LLaMA-3.1-8B model
- MS MARCO training data
- Teacher model annotations

Users should evaluate fairness for their specific use cases.

## Ethical Considerations

- **Search Ranking:** Can influence information access and visibility
- **Training Data:** May contain biased or sensitive content
- **Misuse Potential:** Should not be used for surveillance or discriminatory ranking

## Related Models

**DeAR Family:**

- [DeAR-8B-CE](https://huggingface.co/abdoelsayed/dear-8b-reranker-ce-v1) - Binary Cross-Entropy variant
- [DeAR-8B-Listwise](https://huggingface.co/abdoelsayed/dear-8b-reranker-listwise-v1) - Listwise reranking
- [DeAR-8B-RankNet-LoRA](https://huggingface.co/abdoelsayed/dear-8b-reranker-ranknet-lora-v1) - LoRA adapter

**Teacher:**

- [LLaMA2-13B-RankLLaMA-Teacher](https://huggingface.co/abdoelsayed/llama2-13b-rankllama-teacher)

**Dataset:**

- [DeAR-COT](https://huggingface.co/datasets/abdoelsayed/DeAR-COT)

## Citation

```bibtex
@article{abdallah2025dear,
  title={DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation},
  author={Abdallah, Abdelrahman and Mozafari, Jamshid and Piryani, Bhawna and Jatowt, Adam},
  journal={arXiv preprint arXiv:2508.16998},
  year={2025}
}
```

## License

MIT License

## Contact

- **GitHub:** [DataScienceUIBK/DeAR-Reranking](https://github.com/DataScienceUIBK/DeAR-Reranking)
- **Paper:** [arXiv:2508.16998](https://arxiv.org/abs/2508.16998)
- **Collection:** [DeAR Models](https://huggingface.co/collections/abdoelsayed/dear-reranking)
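
## Appendix: Loss Function Sketch

The combined objective from the Loss Function section, `L_total = (1 - α) * L_RankNet + α * L_KD`, can be illustrated with a small framework-free sketch. This is not the DeAR training code. It assumes that RankNet is applied to every pair of candidates where one has a higher relevance label than the other, and that the KD term is a KL divergence between temperature-scaled softmax distributions over the teacher and student scores for one query. The function names (`ranknet_loss`, `kd_loss`, `total_loss`) and the example scores are hypothetical.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_softmax(scores, temperature=1.0):
    """Log-probabilities of a temperature-scaled softmax over a score list."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_z for s in scaled]


def ranknet_loss(student_scores, labels):
    """Mean pairwise logistic loss over pairs (i, j) with labels[i] > labels[j]."""
    losses = []
    for s_i, y_i in zip(student_scores, labels):
        for s_j, y_j in zip(student_scores, labels):
            if y_i > y_j:
                # log(1 + exp(-(s_i - s_j))): small when i outranks j
                losses.append(softplus(s_j - s_i))
    return sum(losses) / len(losses) if losses else 0.0


def kd_loss(student_scores, teacher_scores, temperature=2.0):
    """KL(teacher || student) over the candidate list (assumed KD form)."""
    log_p = log_softmax(teacher_scores, temperature)
    log_q = log_softmax(student_scores, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def total_loss(student_scores, teacher_scores, labels, alpha=0.1, temperature=2.0):
    """(1 - alpha) * L_RankNet + alpha * L_KD, as written in the model card."""
    rank_term = ranknet_loss(student_scores, labels)
    kd_term = kd_loss(student_scores, teacher_scores, temperature)
    return (1 - alpha) * rank_term + alpha * kd_term


# Hypothetical scores for one query with three candidates (first is relevant)
student = [8.2, 2.4, -1.9]
teacher = [9.0, 1.0, -2.0]
labels = [1, 0, 0]
print(f"Total loss: {total_loss(student, teacher, labels):.4f}")
```

With `α = 0.1`, the ranking term dominates and the teacher signal acts as a mild regularizer. The actual pair sampling and distillation targets used in training may differ from this sketch.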