Fast, interpretable document relevance prediction in Python.
50K+ inferences per second. No GPU required.
pip install markrel
Unlike fixed thresholds, markrel learns P(relevance) from your specific domain: each similarity bin gets its own empirical probability, estimated from your training data.
50,000+ predictions per second on CPU. Perfect for real-time applications and high-throughput pipelines.
See exactly why each document was selected. Inspect P(relevance) for every similarity bin.
Works with BERT, OpenAI, sentence-transformers, or TF-IDF. Use the embeddings that work best for you.
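The per-bin probability idea behind these features is easy to sketch: discretize a similarity (or distance) score into bins, then estimate the empirical probability of relevance inside each bin from training labels. The NumPy snippet below illustrates that idea on synthetic data; it is not markrel's internal implementation, and the variable names (`distances`, `labels`, `n_bins`) are purely illustrative.

```python
import numpy as np

# Toy illustration of per-bin P(relevance); not markrel's internals.
# `distances` are query-document distances, `labels` are 0/1 relevance.
rng = np.random.default_rng(0)
distances = rng.uniform(0.0, 2.0, size=1000)
labels = (distances + rng.normal(0.0, 0.3, size=1000) < 0.8).astype(int)

n_bins = 35
edges = np.linspace(distances.min(), distances.max(), n_bins + 1)
bin_idx = np.clip(np.digitize(distances, edges) - 1, 0, n_bins - 1)

# Empirical P(relevance) per bin: mean label among pairs falling in that bin
p_relevance = np.array([
    labels[bin_idx == b].mean() if np.any(bin_idx == b) else 0.0
    for b in range(n_bins)
])

# Scoring a new pair = looking up its bin's probability
new_distance = 0.5
new_bin = int(np.clip(np.digitize(new_distance, edges) - 1, 0, n_bins - 1))
print(p_relevance[new_bin])
```

markrel learns these per-bin probabilities when you call `fit`, which is what makes every prediction inspectable rather than a black-box score.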
Tested on the WikiQA dataset (6,165 question-answer pairs)
| Embedding Model | Dimensions | F1 Score | AUC | Speed |
|---|---|---|---|---|
| BGE-M3 ⭐ | 1024 | 0.343 | 0.815 | 51K/s |
| RoBERTa-large | 1024 | 0.323 | 0.828 | 54K/s |
| MiniLM-L6 | 384 | 0.322 | 0.799 | 61K/s |
# Install
pip install markrel
# Import and train
from markrel import MarkovRelevanceModel
model = MarkovRelevanceModel(
    metrics=["euclidean"],  # Best single metric
    n_bins=35               # Optimized for F1
)
model.fit(queries, documents, labels)
# Predict relevance
probs = model.predict_proba(new_queries, new_documents)
# [0.82, 0.15, 0.91, ...]
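The probabilities can be thresholded into binary relevance decisions and scored the same way as the benchmark table above. The snippet below is a sketch that assumes you hold out 0/1 ground-truth `test_labels` for the new pairs; the 0.5 cutoff is illustrative and would typically be tuned for F1.

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

# Assumes `probs` from predict_proba above and 0/1 ground-truth `test_labels`
probs = np.asarray(probs)
preds = (probs >= 0.5).astype(int)  # illustrative threshold; tune for F1

print("F1: ", f1_score(test_labels, preds))
print("AUC:", roc_auc_score(test_labels, probs))
```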
from sentence_transformers import SentenceTransformer
# Load BGE-M3 (best per benchmarks)
encoder = SentenceTransformer('BAAI/bge-m3')
# Encode your texts
query_emb = encoder.encode(["what is ML?"])
doc_emb = encoder.encode(["machine learning is..."])
# Train with embeddings
model = MarkovRelevanceModel(
    metrics=["euclidean"],
    use_text_vectorizer=False  # embeddings are supplied directly, not raw text
)
model.fit(query_emb, doc_emb, [1])  # single (query, doc) pair labeled relevant
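From here, ranking a small corpus for a query is just a matter of scoring every pair. This is a sketch under the assumption that `predict_proba` accepts embedding arrays the same way `fit` does above; the `corpus` texts are made up for illustration.

```python
import numpy as np

# Sketch: rank a small corpus for one query, reusing `encoder` and `model`
# from above. Assumes predict_proba accepts embedding arrays like fit() does.
corpus = [
    "machine learning is a branch of artificial intelligence",
    "the recipe calls for two cups of flour",
    "neural networks learn representations from data",
]
corpus_emb = encoder.encode(corpus)
query_emb = encoder.encode(["what is ML?"])

# Score every (query, document) pair, then sort by predicted relevance
scores = model.predict_proba(np.repeat(query_emb, len(corpus), axis=0), corpus_emb)
for prob, text in sorted(zip(scores, corpus), reverse=True):
    print(f"{prob:.2f}  {text}")
```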