EvolInstruct¶
Evolve instructions using an LLM.
WizardLM: Empowering Large Language Models to Follow Complex Instructions
Attributes¶
- 
num_evolutions: The number of evolutions to be performed. 
- 
store_evolutions: Whether to store all the evolutions or just the last one. Defaults to False.
- 
generate_answers: Whether to generate answers for the evolved instructions. Defaults to False.
- 
include_original_instruction: Whether to include the original instruction in the evolved_instructionsoutput column. Defaults toFalse.
- 
mutation_templates: The mutation templates to be used for evolving the instructions. Defaults to the ones provided in the utils.pyfile.
- 
seed: The seed to be set for numpyin order to randomly pick a mutation method. Defaults to42.
Runtime Parameters¶
- seed: The seed to be set for numpyin order to randomly pick a mutation method.
Input & Output Columns¶
graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[instruction]
        end
        subgraph New columns
            OCOL0[evolved_instruction]
            OCOL1[evolved_instructions]
            OCOL2[model_name]
            OCOL3[answer]
            OCOL4[answers]
        end
    end
    subgraph EvolInstruct
        StepInput[Input Columns: instruction]
        StepOutput[Output Columns: evolved_instruction, evolved_instructions, model_name, answer, answers]
    end
    ICOL0 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepOutput --> OCOL2
    StepOutput --> OCOL3
    StepOutput --> OCOL4
    StepInput --> StepOutput
Inputs¶
- instruction (str): The instruction to evolve.
Outputs¶
- 
evolved_instruction ( str): The evolved instruction ifstore_evolutions=False.
- 
evolved_instructions ( List[str]): The evolved instructions ifstore_evolutions=True.
- 
model_name ( str): The name of the LLM used to evolve the instructions.
- 
answer ( str): The answer to the evolved instruction ifgenerate_answers=Trueandstore_evolutions=False.
- 
answers ( List[str]): The answers to the evolved instructions ifgenerate_answers=Trueandstore_evolutions=True.
Examples¶
Evolve an instruction using an LLM¶
from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM
# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    num_evolutions=2,
)
evol_instruct.load()
result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [{'instruction': 'common instruction', 'evolved_instruction': 'evolved instruction', 'model_name': 'model_name'}]
Keep the iterations of the evolutions¶
from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM
# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    num_evolutions=2,
    store_evolutions=True,
)
evol_instruct.load()
result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [
#     {
#         'instruction': 'common instruction',
#         'evolved_instructions': ['initial evolution', 'final evolution'],
#         'model_name': 'model_name'
#     }
# ]
Generate answers for the instructions in a single step¶
from distilabel.steps.tasks import EvolInstruct
from distilabel.models import InferenceEndpointsLLM
# Consider this as a placeholder for your actual LLM.
evol_instruct = EvolInstruct(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    num_evolutions=2,
    generate_answers=True,
)
evol_instruct.load()
result = next(evol_instruct.process([{"instruction": "common instruction"}]))
# result
# [
#     {
#         'instruction': 'common instruction',
#         'evolved_instruction': 'evolved instruction',
#         'answer': 'answer to the instruction',
#         'model_name': 'model_name'
#     }
# ]