ComplexityScorer¶

Score instructions based on their complexity using an LLM.

ComplexityScorer is a pre-defined task used to rank a list of instructions based in their complexity. It's an implementation of the complexity score task from the paper 'What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning'.

Attributes¶

_template: a Jinja2 template used to format the input for the LLM.

Input & Output Columns¶

graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[instructions]
        end
        subgraph New columns
            OCOL0[scores]
            OCOL1[model_name]
        end
    end

    subgraph ComplexityScorer
        StepInput[Input Columns: instructions]
        StepOutput[Output Columns: scores, model_name]
    end

    ICOL0 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepInput --> StepOutput

Inputs¶

instructions (List[str]): The list of instructions to be scored.

Outputs¶

scores (List[float]): The score for each instruction.
model_name (str): The model name used to generate the scores.

Examples¶

Evaluate the complexity of your instructions¶

from distilabel.steps.tasks import ComplexityScorer
from distilabel.llms.huggingface import InferenceEndpointsLLM

# Consider this as a placeholder for your actual LLM.
scorer = ComplexityScorer(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    )
)

scorer.load()

result = next(
    scorer.process(
        [{"instructions": ["plain instruction", "highly complex instruction"]}]
    )
)
# result
# [{'instructions': ['plain instruction', 'highly complex instruction'], 'model_name': 'test', 'scores': [1, 5], 'distilabel_metadata': {'raw_output_complexity_scorer_0': 'output'}}]

References¶

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning