Scoring (Similarity Scoring)

The LLM.score() method calculates similarity scores between text pairs. It applies to binary classification models, including Qwen3-Reranker models and models converted using as_binary_seq_cls_model.

Python API Example

The following example demonstrates 1-to-1, 1-to-N, and N-to-N scoring, followed by prompt truncation with PoolingParams:

python

from furiosa_llm import LLM, PoolingParams

# Load a reranker or binary classification model
with LLM("furiosa-ai/Qwen3-Reranker-8B") as llm:
    # ============================================================
    # Example 1: 1-to-1 scoring (single query, single document)
    # ============================================================
    query = "What is machine learning?"
    document = "Machine learning is a subset of artificial intelligence."
    outputs = llm.score(query, document)
    print(f"Similarity score: {outputs[0].outputs.score}")
    print("-" * 80)

    # ============================================================
    # Example 2: 1-to-N scoring (single query, multiple documents)
    # ============================================================
    query = "What is deep learning?"
    documents = [
        "Deep learning uses neural networks with multiple layers.",
        "Python is a popular programming language.",
        "Machine learning is a field of artificial intelligence.",
        "Neural networks are inspired by the human brain.",
    ]
    outputs = llm.score(query, documents)
    for i, output in enumerate(outputs):
        print(f"Document {i}: score = {output.outputs.score:.4f}")
        print(f"  Text: {documents[i][:50]}...")
    print("-" * 80)

    # ============================================================
    # Example 3: N-to-N scoring (multiple queries, paired documents)
    # ============================================================
    queries = [
        "What is Python?",
        "What is JavaScript?",
        "What is SQL?",
    ]
    documents = [
        "Python is a programming language.",
        "JavaScript is used for web development.",
        "SQL is a database query language.",
    ]
    outputs = llm.score(queries, documents)
    for i, (q, d, output) in enumerate(zip(queries, documents, outputs)):
        print(f"Pair {i}: score = {output.outputs.score:.4f}")
        print(f"  Query: {q}")
        print(f"  Document: {d}")
    print("-" * 80)

    # ============================================================
    # Example 4: Using PoolingParams for truncation
    # ============================================================
    # Truncate long documents to fit within model limits
    pooling_params = PoolingParams(truncate_prompt_tokens=512)
    query = "What is the capital of France?"
    long_documents = [
        "Paris is the capital and most populous city of France. " * 50,  # Long document
        "London is the capital of the United Kingdom. " * 50,
    ]
    outputs = llm.score(query, long_documents, pooling_params=pooling_params)
    for i, output in enumerate(outputs):
        print(f"Document {i} score: {output.outputs.score:.4f}")
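
The three input shapes used above differ only in how queries and documents are paired. The sketch below illustrates that pairing in plain Python. It is not the library's implementation: expand_pairs is a hypothetical helper, and the length check is an assumption about how mismatched lists would reasonably be handled.

```python
from typing import List, Tuple, Union

TextInput = Union[str, List[str]]


def expand_pairs(text_1: TextInput, text_2: TextInput) -> List[Tuple[str, str]]:
    """Illustrate how scoring inputs line up as (query, document) pairs.

    Hypothetical helper for explanation only; not part of furiosa_llm.
    """
    left = [text_1] if isinstance(text_1, str) else list(text_1)
    right = [text_2] if isinstance(text_2, str) else list(text_2)

    if len(left) == 1:
        # 1-to-1 and 1-to-N: the single query is paired with every document.
        return [(left[0], doc) for doc in right]

    if len(left) != len(right):
        # Assumed behavior: N-to-N needs one document per query.
        raise ValueError("N-to-N scoring needs lists of equal length")

    # N-to-N: pairs are formed element-wise, not as a full cross product.
    return list(zip(left, right))


for pair in expand_pairs("What is deep learning?", ["doc A", "doc B"]):
    print(pair)
for pair in expand_pairs(["q1", "q2"], ["d1", "d2"]):
    print(pair)
```

Note that N-to-N yields N scores (one per pair), not N x N. To score every query against every document, one option is to issue a 1-to-N call per query.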

Use Cases

The LLM.score() method is useful for:

Document Retrieval: Finding the most relevant documents for a query
Semantic Similarity: Measuring how similar two pieces of text are
Question Answering: Identifying which document best answers a question
Duplicate Detection: Finding similar or duplicate content
Content Recommendation: Suggesting related articles or documents

For ranking multiple documents by relevance, see the Rerank API example.
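
When only a local ordering of 1-to-N scores is needed, the scores can also be sorted directly. The following is a minimal sketch, assuming the scores have already been collected as floats (for instance from output.outputs.score in the Python API example). rank_by_score and top_k are hypothetical names, and the score values are placeholders rather than real model output.

```python
from typing import List, Optional, Tuple


def rank_by_score(
    documents: List[str],
    scores: List[float],
    top_k: Optional[int] = None,
) -> List[Tuple[int, str, float]]:
    """Return (original_index, document, score) tuples, highest score first."""
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")
    ranked = sorted(
        zip(range(len(documents)), documents, scores),
        key=lambda item: item[2],
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]


documents = [
    "Deep learning uses neural networks with multiple layers.",
    "Python is a popular programming language.",
    "Machine learning is a field of artificial intelligence.",
    "Neural networks are inspired by the human brain.",
]
# Placeholder scores for illustration only, not produced by a model.
scores = [0.93, 0.02, 0.41, 0.67]

top2 = rank_by_score(documents, scores, top_k=2)
for index, doc, score in top2:
    print(f"Document {index}: score = {score:.4f} | {doc}")
```

Keeping the original index in each tuple makes it easy to map ranked results back to the input list.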

Server API Example

You can also use the scoring functionality through the OpenAI-compatible server:

python

import os

import requests

# Start server with: furiosa-llm serve path/to/reranker-model

base_url = os.getenv("OPENAI_BASE_URL", "http://localhost:8000/v1")

# 1-to-N scoring via HTTP API
response = requests.post(
    f"{base_url}/score",
    json={
        "model": "reranker",
        "text_1": "What is machine learning?",
        "text_2": [
            "Machine learning is a subset of AI.",
            "Python is a programming language.",
            "Deep learning uses neural networks.",
        ],
    },
)

# Fail early on HTTP errors instead of parsing an error body
response.raise_for_status()
data = response.json()
for item in data["data"]:
    print(f"Index {item['index']}: score = {item['score']:.4f}")

See the Score API Reference for complete server API documentation.
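
Client code often needs the scores as a plain list aligned with the order of text_2. The sketch below is one way to do that, assuming the response body has the shape used in the example above (a "data" list whose items carry "index" and "score"). extract_scores is a hypothetical helper, and the sample payload is hand-written rather than captured from a server.

```python
from typing import Any, Dict, List


def extract_scores(payload: Dict[str, Any]) -> List[float]:
    """Return scores ordered by each item's 'index' field."""
    items = payload.get("data")
    if not isinstance(items, list):
        # Assumption: a body without a "data" list signals an error response.
        raise ValueError(f"unexpected response body: {payload!r}")
    ordered = sorted(items, key=lambda item: item["index"])
    return [float(item["score"]) for item in ordered]


# Hand-written sample payload for illustration, deliberately out of order.
sample = {
    "data": [
        {"index": 1, "score": 0.12},
        {"index": 0, "score": 0.98},
        {"index": 2, "score": 0.55},
    ]
}
scores = extract_scores(sample)
print(scores)
```

Sorting by index guards against a server returning items in a different order than the inputs. Whether that can happen in practice is not stated here, so treat it as a defensive choice.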
