SelfInstruct¶
Generate instructions based on a given input using an LLM.
SelfInstruct is a pre-defined task that, given a number of instructions, a set of
criteria for query generation, an application description, and an input, generates
that number of instructions related to the input, following the criteria for query
generation and the application description. It is based on the Self-Instruct
framework from the paper "Self-Instruct: Aligning Language Models with
Self-Generated Instructions".
Attributes¶
- num_instructions: The number of instructions to be generated. Defaults to 5.
- criteria_for_query_generation: The criteria for the query generation. Defaults to the criteria defined within the paper.
- application_description: The description of the AI application that one wants to build with these instructions. Defaults to AI assistant.
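To make the role of these attributes concrete, the following is a minimal sketch of how they might be combined with an input into a single prompt. The `build_prompt` helper and the wording of its template are hypothetical, written only for illustration; they are not distilabel's actual template. Only the three attribute names come from the list above.

```python
def build_prompt(
    seed: str,
    criteria_for_query_generation: str,
    num_instructions: int = 5,
    application_description: str = "AI assistant",
) -> str:
    """Combine the task attributes and one input into a prompt string.

    Hypothetical helper: the section headers and wording below are
    assumptions, not the template used by the library.
    """
    if num_instructions < 1:
        raise ValueError("num_instructions must be at least 1")
    return (
        f"# Task\nWrite {num_instructions} user queries for the application below.\n\n"
        f"# Application\n{application_description}\n\n"
        f"# Criteria\n{criteria_for_query_generation}\n\n"
        f"# Context\n{seed}\n\n"
        "# Output\nWrite one query per line."
    )


prompt = build_prompt(
    seed="instruction",
    criteria_for_query_generation="Queries should be diverse in length and style.",
)
print(prompt)
```

The sketch keeps the criteria as a required argument because the default criteria are defined in the paper and are not reproduced here.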
Input & Output Columns¶
graph TD
subgraph Dataset
subgraph Columns
ICOL0[input]
end
subgraph New columns
OCOL0[instructions]
OCOL1[model_name]
end
end
subgraph SelfInstruct
StepInput[Input Columns: input]
StepOutput[Output Columns: instructions, model_name]
end
ICOL0 --> StepInput
StepOutput --> OCOL0
StepOutput --> OCOL1
StepInput --> StepOutput
Inputs¶
- input (str): The input used to generate the instructions. It is also called the seed in the paper.
Outputs¶
- instructions (List[str]): The generated instructions.
- model_name (str): The model name used to generate the instructions.
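As an illustration of the column shapes listed above, here is a minimal sketch of turning a raw completion into an output row. It assumes the model returns one instruction per line; the `to_output_row` helper is hypothetical and is not necessarily how the library parses the completion.

```python
from typing import Dict, List, Union


def to_output_row(
    seed: str, raw_completion: str, model_name: str
) -> Dict[str, Union[str, List[str]]]:
    """Build a row with the `input`, `instructions`, and `model_name` columns.

    Assumption: each non-empty line of the completion is one instruction.
    """
    instructions = [
        line.strip() for line in raw_completion.splitlines() if line.strip()
    ]
    return {
        "input": seed,
        "instructions": instructions,
        "model_name": model_name,
    }


row = to_output_row(
    seed="instruction",
    raw_completion="instruction 1\n\ninstruction 2\n",
    model_name="mistralai/Mistral-7B-Instruct-v0.2",
)
print(row["instructions"])
```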
Examples¶
Generate instructions based on a given input¶
from distilabel.steps.tasks import SelfInstruct
from distilabel.llms.huggingface import InferenceEndpointsLLM

self_instruct = SelfInstruct(
    llm=InferenceEndpointsLLM(
        model_id="mistralai/Mistral-7B-Instruct-v0.2",
    ),
    num_instructions=5,  # This is the default value
)

self_instruct.load()

result = next(self_instruct.process([{"input": "instruction"}]))
# result
# [
#     {
#         'input': 'instruction',
#         'model_name': 'mistralai/Mistral-7B-Instruct-v0.2',
#         'instructions': ["instruction 1", "instruction 2", "instruction 3", "instruction 4", "instruction 5"],
#     }
# ]