Base
EvolQuality
¶
Bases: Task
The EvolQuality
task is used to evolve the quality of the responses given a prompt,
by generating a new response with a language model. This step implements the evolution
quality task from the paper 'What Makes Good Data for Alignment? A Comprehensive Study of
Automatic Data Selection in Instruction Tuning'.
Attributes:
Name | Type | Description |
---|---|---|
num_evolutions |
int
|
The number of evolutions to be performed on the responses. |
store_evolutions |
bool
|
Whether to store all the evolved responses or just the last one.
Defaults to |
include_original_response |
bool
|
Whether to include the original response within the evolved
responses. Defaults to |
mutation_templates |
Dict[str, str]
|
The mutation templates to be used to evolve the responses. |
seed |
RuntimeParameter[int]
|
The seed to be set for |
Runtime parameters
seed
: The seed to be set fornumpy
in order to randomly pick a mutation method.
Input columns
- instruction (
str
): The instruction that was used to generate theresponses
. - response (
str
): The responses to be rewritten.
Output columns
- evolved_response (
str
): The evolved response ifstore_evolutions=False
. - evolved_responses (
List[str]
): The evolved responses ifstore_evolutions=True
. - model_name (
str
): The name of the LLM used to evolve the responses.
Source code in src/distilabel/steps/tasks/evol_quality/base.py
31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 |
|
inputs: List[str]
property
¶
The input for the task are the instruction
and response
.
mutation_templates_names: List[str]
property
¶
Returns the names i.e. keys of the provided mutation_templates
enum.
outputs: List[str]
property
¶
The output for the task are the evolved_response/s
and the model_name
.
format_input(input)
¶
The input is formatted as a ChatType
assuming that the instruction
is the first interaction from the user within a conversation. And the
system_prompt
is added as the first message if it exists.
Source code in src/distilabel/steps/tasks/evol_quality/base.py
format_output(responses)
¶
The output for the task is a dict with: evolved_response
or evolved_responses
,
depending whether the value is either False
or True
for store_evolutions
, respectively;
and, finally, the model_name
.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
responses |
Union[str, List[str]]
|
The responses to be included within the output. |
required |
Returns:
Type | Description |
---|---|
Dict[str, Any]
|
if |
Dict[str, Any]
|
if |
Source code in src/distilabel/steps/tasks/evol_quality/base.py
model_post_init(__context)
¶
Override this method to perform additional initialization after __init__
and model_construct
.
This is useful if you want to do some validation that requires the entire model to be initialized.
Source code in src/distilabel/steps/tasks/evol_quality/base.py
process(inputs)
¶
Processes the inputs of the task and generates the outputs using the LLM.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inputs |
StepInput
|
A list of Python dictionaries with the inputs of the task. |
required |
Returns:
Type | Description |
---|---|
StepOutput
|
A list of Python dictionaries with the outputs of the task. |