Bases: EvolInstructTask
A TextGenerationTask
following the EvolComplexity
specification for building prompts. This is a special case
of the original EvolInstructTask, where the evolution method is fixed to "constraints", "deepen", "concretizing" or "reasoning".
Additionally, an additional elimation step should be executed to screen out instructions that are not useful.
From the reference repository: Evol-Instruct is a novel method using LLMs instead of humans to automatically mass-produce
open-domain instructions of various difficulty levels and skills range, to improve the performance of LLMs.
The task is defined as follows:
Starting from an initial (simpler) instruction, select in-depth or in-breadth evolving to upgrade the simple instruction
to a more complex one or create a new one (to increase diversity).
The In-depth Evolving includes the following operations: "constraints", "deepen", "concretizing" or "reasoning".
The In-breadth Evolving is mutation, i.e., generating a completely new instruction based on the given instruction.
Given the evolved instructions are generated from LLMs, sometimes the evolving will fail. We adopt an instruction eliminator
to filter the failed instructions, called Elimination Evolving, but we don't apply the step of asking again to the LLM it the
answer is a copy from the same used prompt.
This evolutionary process can be repeated for several rounds to obtain instruction data containing various complexities.
Currently the task is implemented as a single step, so to generate multiple evolutions you can "repeat" the instructions
in the original dataset. An example can be seen at the following script:
examples/pipeline-evol-instruct-alpaca.py
Parameters:
Name |
Type |
Description |
Default |
system_prompt |
str
|
the system prompt to be used. Not defined for this task.
|
''
|
References
Source code in src/distilabel/tasks/text_generation/evol_complexity.py
| @dataclass
class EvolComplexityTask(EvolInstructTask):
"""A `TextGenerationTask` following the `EvolComplexity` specification for building prompts. This is a special case
of the original EvolInstructTask, where the evolution method is fixed to "constraints", "deepen", "concretizing" or "reasoning".
Additionally, an additional elimation step should be executed to screen out instructions that are not useful.
From the reference repository: *Evol-Instruct is a novel method using LLMs instead of humans to automatically mass-produce
open-domain instructions of various difficulty levels and skills range, to improve the performance of LLMs.*
The task is defined as follows:
Starting from an initial (simpler) instruction, select in-depth or in-breadth evolving to upgrade the simple instruction
to a more complex one or create a new one (to increase diversity).
The In-depth Evolving includes the following operations: "constraints", "deepen", "concretizing" or "reasoning".
The In-breadth Evolving is mutation, i.e., generating a completely new instruction based on the given instruction.
Given the evolved instructions are generated from LLMs, sometimes the evolving will fail. We adopt an instruction eliminator
to filter the failed instructions, called Elimination Evolving, but we don't apply the step of asking again to the LLM it the
answer is a copy from the same used prompt.
This evolutionary process can be repeated for several rounds to obtain instruction data containing various complexities.
Currently the task is implemented as a single step, so to generate multiple evolutions you can "repeat" the instructions
in the original dataset. An example can be seen at the following script:
[examples/pipeline-evol-instruct-alpaca.py](https://github.com/argilla-io/distilabel/tree/main/examples/pipeline-evol-instruct-alpaca.py)
Args:
system_prompt (str, optional): the system prompt to be used. Not defined for this task.
References:
- [`WizardLM: Empowering Large Language Models to Follow Complex Instructions`](https://arxiv.org/abs/2304.12244)
- [`What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning`](https://arxiv.org/abs/2312.15685)
"""
system_prompt: str = ""
__jinja2_template__: str = _EVOL_COMPLEXITY_TEMPLATE
def generate_prompt(
self, input: str, evolution_method: Optional[EvolutionMethod] = None, **_: Any
) -> Prompt:
"""Generates a prompt following the Evol-Complexity specification of the Deita Paper.
Args:
input (str): the input to be used for the prompt.
evolution_method (str, optional): The evolution method to be used. If not provided (the default), a random one is chosen
like the original paper. Available ones are "constraints", "deepen", "concretizing" or "reasoning".
Returns:
Prompt: the generated prompt.
Examples:
>>> from distilabel.tasks.text_generation import EvolComplexityGeneratorTask
>>> task = EvolComplexityGeneratorTask()
>>> task.generate_prompt("Give three tips for staying healthy.")
Prompt(
system_prompt="",
formatted_prompt="I want you to act as a Prompt ...",
)
"""
evolution_method = self._get_evolution_method(evolution_method, EvolutionMethod)
return super().generate_prompt(input, evolution_method=evolution_method, **_)
|
generate_prompt(input, evolution_method=None, **_)
Generates a prompt following the Evol-Complexity specification of the Deita Paper.
Parameters:
Name |
Type |
Description |
Default |
input |
str
|
the input to be used for the prompt.
|
required
|
evolution_method |
str
|
The evolution method to be used. If not provided (the default), a random one is chosen
like the original paper. Available ones are "constraints", "deepen", "concretizing" or "reasoning".
|
None
|
Returns:
Name | Type |
Description |
Prompt |
Prompt
|
|
Examples:
>>> from distilabel.tasks.text_generation import EvolComplexityGeneratorTask
>>> task = EvolComplexityGeneratorTask()
>>> task.generate_prompt("Give three tips for staying healthy.")
Prompt(
system_prompt="",
formatted_prompt="I want you to act as a Prompt ...",
)
Source code in src/distilabel/tasks/text_generation/evol_complexity.py
| def generate_prompt(
self, input: str, evolution_method: Optional[EvolutionMethod] = None, **_: Any
) -> Prompt:
"""Generates a prompt following the Evol-Complexity specification of the Deita Paper.
Args:
input (str): the input to be used for the prompt.
evolution_method (str, optional): The evolution method to be used. If not provided (the default), a random one is chosen
like the original paper. Available ones are "constraints", "deepen", "concretizing" or "reasoning".
Returns:
Prompt: the generated prompt.
Examples:
>>> from distilabel.tasks.text_generation import EvolComplexityGeneratorTask
>>> task = EvolComplexityGeneratorTask()
>>> task.generate_prompt("Give three tips for staying healthy.")
Prompt(
system_prompt="",
formatted_prompt="I want you to act as a Prompt ...",
)
"""
evolution_method = self._get_evolution_method(evolution_method, EvolutionMethod)
return super().generate_prompt(input, evolution_method=evolution_method, **_)
|