complexity_scorer

`ComplexityScorerTask` `dataclass`

Bases: PreferenceTaskNoRationale

A PreferenceTask following the Complexity Scorer specification for rating instructions in terms of complexity.

This task is inspired by the Evol Complexity Scorer in the Deita framework: Deita is an open-sourced project designed to facilitate Automatic Data Selection for instruction tuning in Large Language Models (LLMs).

The task is defined as follows: ask an LLM (the original paper used ChatGPT) to rate a set of instructions and obtain a complexity score c for each one. The number of instructions is dynamic: any number can be compared, and Deita used 6.

This task only needs the list of generations in a dataset to generate the scores.

Parameters:

Name	Type	Description	Default
`system_prompt`	`str`	the system prompt to be used. Not defined for this task.	`''`

References

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

Source code in src/distilabel/tasks/preference/complexity_scorer.py

@dataclass
class ComplexityScorerTask(PreferenceTaskNoRationale):
    """A `PreferenceTask` following the `Complexity Scorer` specification for rating instructions
    in terms of complexity.

    This task is inspired by the Evol Complexity Scorer in the Deita framework: *Deita is an open-sourced project
    designed to facilitate Automatic Data Selection for instruction tuning in Large Language Models (LLMs).*

    The task is defined as follows:
    Ask an LLM (the original paper used ChatGPT) to rate a set of instructions and obtain a complexity
    score *c* for each one. The number of instructions is dynamic: any number can be compared, and
    *Deita* used 6.

    This task will only need to receive the list of `generations` in a dataset to generate the scores.

    Args:
        system_prompt (str, optional): the system prompt to be used. Not defined for this task.

    References:
        - [`What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning`](https://arxiv.org/abs/2312.15685)
    """

    system_prompt: str = ""

    __jinja2_template__: str = _COMPLEXITY_SCORER_TEMPLATE

    def generate_prompt(self, generations: List[str], **_: Any) -> Prompt:
        """Generates a prompt following the *Evol Complexity* specification in *Deita*.

        Args:
            generations (List[str]): the generations to be used for the prompt.

        Returns:
            Prompt: the generated prompt.

        Examples:
            >>> from distilabel.tasks import ComplexityScorerTask
            >>> task = ComplexityScorerTask()
            >>> task.generate_prompt(["instruction 1", "instruction 2"])
            Prompt(system_prompt="", formatted_prompt="Ranking the following questions...")
        """
        render_kwargs = {"instructions": generations}
        return Prompt(
            system_prompt=self.system_prompt,
            formatted_prompt=self.template.render(**render_kwargs),
        )

    @property
    def input_args_names(self) -> List[str]:
        """Returns the names of the input arguments of the task."""
        return ["generations"]

    def parse_output(self, output: str) -> Dict[str, List[float]]:
        """Parses the output of the task, returning a list with the rank/score of each instruction.

        Args:
            output (str): The raw output of the LLM.

        Returns:
            Dict[str, List[float]]: A dict containing the ranks/scores of each instruction.
        """
        output = output.lower().split("\n")
        scores = [float(re.sub(r"\[\d+\] score:", "", o).strip()) for o in output]
        return {self.output_args_names[0]: scores}

`input_args_names: List[str]` `property`

Returns the names of the input arguments of the task.

`generate_prompt(generations, **_)`

Generates a prompt following the Evol Complexity specification in Deita.

Parameters:

Name	Type	Description	Default
`generations`	`List[str]`	the generations to be used for the prompt.	required

Returns:

Name	Type	Description
`Prompt`	`Prompt`	the generated prompt.

Examples:

>>> from distilabel.tasks import ComplexityScorerTask
>>> task = ComplexityScorerTask()
>>> task.generate_prompt(["instruction 1", "instruction 2"])
Prompt(system_prompt="", formatted_prompt="Ranking the following questions...")
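The prompt lists every instruction in a single request, so the number of instructions can vary from call to call. The following is a minimal sketch of that idea in plain Python, not the library's implementation: `build_prompt` and `HEADER` are hypothetical names, the header text is invented for illustration, and the real task renders a Jinja2 template whose wording is not reproduced here.

```python
from typing import List

# Hypothetical header; the library's real template text is not reproduced here.
HEADER = "Rate the following instructions by complexity."


def build_prompt(instructions: List[str]) -> str:
    """Number each instruction so the scores can be matched back by index."""
    lines = [HEADER]
    for i, instruction in enumerate(instructions, start=1):
        lines.append(f"[{i}] {instruction}")
    return "\n".join(lines)


print(build_prompt(["instruction 1", "instruction 2"]))
```

Numbering the instructions is what makes it possible to map each returned score back to the instruction it rates.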

Source code in src/distilabel/tasks/preference/complexity_scorer.py

def generate_prompt(self, generations: List[str], **_: Any) -> Prompt:
    """Generates a prompt following the *Evol Complexity* specification in *Deita*.

    Args:
        generations (List[str]): the generations to be used for the prompt.

    Returns:
        Prompt: the generated prompt.

    Examples:
        >>> from distilabel.tasks import ComplexityScorerTask
        >>> task = ComplexityScorerTask()
        >>> task.generate_prompt(["instruction 1", "instruction 2"])
        Prompt(system_prompt="", formatted_prompt="Ranking the following questions...")
    """
    render_kwargs = {"instructions": generations}
    return Prompt(
        system_prompt=self.system_prompt,
        formatted_prompt=self.template.render(**render_kwargs),
    )

`parse_output(output)`

Parses the output of the task, returning a list with the rank/score of each instruction.

Parameters:

Name	Type	Description	Default
`output`	`str`	The raw output of the LLM.	required

Returns:

Type	Description
`Dict[str, List[float]]`	A dict containing the ranks/scores of each instruction.
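The parsing step can be tried in isolation. The sketch below mirrors the regular expression used by `parse_output` as a standalone function; `parse_scores` is a hypothetical name, the sample LLM output is invented, and the `[N] Score: X` line format is inferred from the regex rather than from a documented contract. Unlike the method itself, this sketch skips blank lines, which is an added assumption.

```python
import re
from typing import List


def parse_scores(output: str) -> List[float]:
    """Return one float score per non-empty line of the raw LLM output."""
    scores = []
    for line in output.lower().split("\n"):
        line = line.strip()
        if not line:
            # Assumption: blank lines are ignored here; the original method
            # would raise ValueError when calling float("") on them.
            continue
        # Strip the "[N] score:" prefix and convert what remains to a float.
        scores.append(float(re.sub(r"\[\d+\] score:", "", line).strip()))
    return scores


raw = "[1] Score: 2\n[2] Score: 4.5\n[3] Score: 1"  # invented sample output
print(parse_scores(raw))  # [2.0, 4.5, 1.0]
```

Any line that does not match the expected format still raises `ValueError`, so callers may want to wrap the call in their own error handling.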

Source code in src/distilabel/tasks/preference/complexity_scorer.py

def parse_output(self, output: str) -> Dict[str, List[float]]:
    """Parses the output of the task, returning a list with the rank/score of each instruction.

    Args:
        output (str): The raw output of the LLM.

    Returns:
        Dict[str, List[float]]: A dict containing the ranks/scores of each instruction.
    """
    output = output.lower().split("\n")
    scores = [float(re.sub(r"\[\d+\] score:", "", o).strip()) for o in output]
    return {self.output_args_names[0]: scores}
