APIGenSemanticChecker¶
Check whether generated function calls and their execution results match the user query.
The APIGenSemanticChecker
is inspired by the APIGen pipeline, which was designed to generate
verifiable and diverse function-calling datasets. The task uses an LLM to evaluate whether the
function calls and their execution results are aligned with the user query and the function
description, and flags each row to be kept or discarded.
Attributes¶
- system_prompt: System prompt for the task. A default prompt is provided.
- exclude_failed_execution: Whether to exclude failed executions. When enabled, the task does not run on rows that have False in the keep_row_after_execution_check column, which is produced by running APIGenExecutionChecker. Defaults to True.
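As a rough illustration of what exclude_failed_execution implies, the plain-Python sketch below splits rows on the keep_row_after_execution_check column, so that only rows that passed the execution check would be sent to the LLM. The helper name split_by_execution_check is hypothetical and this is not distilabel's internal code; it only mirrors the behaviour described above. Treating a missing column as "passed" is also an assumption.

```python
from typing import Any


def split_by_execution_check(
    rows: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Hypothetical helper: separate rows that passed the execution check."""
    to_check, skipped = [], []
    for row in rows:
        # Assumption: a row without the column is treated as having passed.
        if row.get("keep_row_after_execution_check", True):
            to_check.append(row)
        else:
            skipped.append(row)
    return to_check, skipped


# Made-up rows, for illustration only.
rows = [
    {"query": "q1", "keep_row_after_execution_check": True},
    {"query": "q2", "keep_row_after_execution_check": False},
]
to_check, skipped = split_by_execution_check(rows)
print([r["query"] for r in to_check])  # ['q1']
print([r["query"] for r in skipped])   # ['q2']
```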
Input & Output Columns¶
graph TD
subgraph Dataset
subgraph Columns
ICOL0[func_desc]
ICOL1[query]
ICOL2[answers]
ICOL3[execution_result]
end
subgraph New columns
OCOL0[thought]
OCOL1[keep_row_after_semantic_check]
end
end
subgraph APIGenSemanticChecker
StepInput[Input Columns: func_desc, query, answers, execution_result]
StepOutput[Output Columns: thought, keep_row_after_semantic_check]
end
ICOL0 --> StepInput
ICOL1 --> StepInput
ICOL2 --> StepInput
ICOL3 --> StepInput
StepOutput --> OCOL0
StepOutput --> OCOL1
StepInput --> StepOutput
Inputs¶
- func_desc (str): Description of what the function should do.
- query (str): Instruction from the user.
- answers (str): JSON-encoded list with the arguments to be passed to the function/API. Should be loaded using json.loads.
- execution_result (str): Result of executing the function/API.
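Because answers is a JSON-encoded string rather than a Python list, an input row is typically built with json.dumps and decoded again with json.loads. The sketch below shows that round trip using only the standard library; the values are taken from the example on this page.

```python
import json

# Function calls as Python objects, before encoding.
calls = [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]

row = {
    "func_desc": "Fetch information about a specific cat breed from the Cat Breeds API.",
    "query": "What information can be obtained about the Maine Coon cat breed?",
    "answers": json.dumps(calls),  # stored as a str, as the column expects
    "execution_result": "The Maine Coon is a big and hairy breed of cat",
}

# Decode the column back into a list of dicts when the calls are needed.
decoded = json.loads(row["answers"])
print(type(row["answers"]).__name__)     # str
print(decoded[0]["arguments"]["breed"])  # Maine Coon
```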
Outputs¶
- thought (str): Reasoning behind the decision to keep or discard the row.
- keep_row_after_semantic_check (bool): True or False. Can be used to filter the rows afterwards.
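The keep_row_after_semantic_check column can then be used to drop rejected rows. A minimal sketch, assuming the task output is available as a list of dicts; the two rows below are made up for illustration and are not real model output.

```python
# Made-up rows shaped like the output columns described above.
results = [
    {
        "query": "What information can be obtained about the Maine Coon cat breed?",
        "thought": "",
        "keep_row_after_semantic_check": True,
    },
    {
        "query": "What is the weather in Paris?",
        "thought": "The function call does not match the query.",
        "keep_row_after_semantic_check": False,
    },
]

# Keep only the rows the checker accepted.
kept = [row for row in results if row["keep_row_after_semantic_check"]]
# The thought column carries the explanation for each rejected row.
rejected_reasons = [
    row["thought"] for row in results if not row["keep_row_after_semantic_check"]
]

print(len(kept))         # 1
print(rejected_reasons)  # ['The function call does not match the query.']
```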
Examples¶
Semantic checker for generated function calls (original implementation)¶
import json

from distilabel.steps.tasks import APIGenSemanticChecker
from distilabel.llms import InferenceEndpointsLLM

llm = InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-70B-Instruct",
    generation_kwargs={
        "temperature": 0.7,
        "max_new_tokens": 1024,
    },
)
# Without structured output, the prompt itself asks the model to reply in JSON.
semantic_checker = APIGenSemanticChecker(
    use_default_structured_output=False,
    llm=llm,
)
semantic_checker.load()

res = next(
    semantic_checker.process(
        [
            {
                "func_desc": "Fetch information about a specific cat breed from the Cat Breeds API.",
                "query": "What information can be obtained about the Maine Coon cat breed?",
                # `answers` must be a JSON-encoded string, not a Python list.
                "answers": json.dumps([{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]),
                "execution_result": "The Maine Coon is a big and hairy breed of cat",
            }
        ]
    )
)
res
# [{'func_desc': 'Fetch information about a specific cat breed from the Cat Breeds API.',
# 'query': 'What information can be obtained about the Maine Coon cat breed?',
# 'answers': [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}],
# 'execution_result': 'The Maine Coon is a big and hairy breed of cat',
# 'thought': '',
# 'keep_row_after_semantic_check': True,
# 'raw_input_a_p_i_gen_semantic_checker_0': [{'role': 'system',
# 'content': 'As a data quality evaluator, you must assess the alignment between a user query, corresponding function calls, and their execution results.\nThese function calls and results are generated by other models, and your task is to ensure these results accurately reflect the user’s intentions.\n\nDo not pass if:\n1. The function call does not align with the query’s objective, or the input arguments appear incorrect.\n2. The function call and arguments are not properly chosen from the available functions.\n3. The number of function calls does not correspond to the user’s intentions.\n4. The execution results are irrelevant and do not match the function’s purpose.\n5. The execution results contain errors or reflect that the function calls were not executed successfully.\n'},
# {'role': 'user',
# 'content': 'Given Information:\n- All Available Functions:\nFetch information about a specific cat breed from the Cat Breeds API.\n- User Query: What information can be obtained about the Maine Coon cat breed?\n- Generated Function Calls: [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]\n- Execution Results: The Maine Coon is a big and hairy breed of cat\n\nNote: The query may have multiple intentions. Functions may be placeholders, and execution results may be truncated due to length, which is acceptable and should not cause a failure.\n\nThe main decision factor is wheather the function calls accurately reflect the query\'s intentions and the function descriptions.\nProvide your reasoning in the thought section and decide if the data passes (answer yes or no).\nIf not passing, concisely explain your reasons in the thought section; otherwise, leave this section blank.\n\nYour response MUST strictly adhere to the following JSON format, and NO other text MUST be included.\n```\n{\n "thought": "Concisely describe your reasoning here",\n "pass": "yes" or "no"\n}\n```\n'}]},
# 'model_name': 'meta-llama/Meta-Llama-3.1-70B-Instruct'}]
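In this mode the prompt asks the model for a JSON object with a "thought" string and a "pass" value of "yes" or "no", which then has to be turned into the thought and keep_row_after_semantic_check columns. The sketch below shows one way such a reply could be parsed. The function parse_semantic_check is hypothetical and is not distilabel's own parsing code; returning None for unparseable replies is likewise an assumption.

```python
import json
import re
from typing import Any


def parse_semantic_check(raw: str) -> dict[str, Any]:
    """Hypothetical parser: map a raw model reply to the two output columns."""
    failed = {"thought": None, "keep_row_after_semantic_check": None}
    # Take the outermost {...} span, in case the JSON is wrapped in a code fence.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        return failed
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return failed
    return {
        "thought": data.get("thought", ""),
        # "yes" keeps the row; anything else discards it.
        "keep_row_after_semantic_check": str(data.get("pass", "")).strip().lower() == "yes",
    }


fence = "`" * 3  # the prompt asks for the JSON inside a fenced block
raw = fence + '\n{"thought": "", "pass": "yes"}\n' + fence
print(parse_semantic_check(raw))
# {'thought': '', 'keep_row_after_semantic_check': True}
```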
Semantic checker for generated function calls (structured output)¶
import json

from distilabel.steps.tasks import APIGenSemanticChecker
from distilabel.llms import InferenceEndpointsLLM

llm = InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-70B-Instruct",
    generation_kwargs={
        "temperature": 0.7,
        "max_new_tokens": 1024,
    },
)
# With structured output enabled, the prompt omits the JSON format instructions.
semantic_checker = APIGenSemanticChecker(
    use_default_structured_output=True,
    llm=llm,
)
semantic_checker.load()

res = next(
    semantic_checker.process(
        [
            {
                "func_desc": "Fetch information about a specific cat breed from the Cat Breeds API.",
                "query": "What information can be obtained about the Maine Coon cat breed?",
                # `answers` must be a JSON-encoded string, not a Python list.
                "answers": json.dumps([{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]),
                "execution_result": "The Maine Coon is a big and hairy breed of cat",
            }
        ]
    )
)
res
# [{'func_desc': 'Fetch information about a specific cat breed from the Cat Breeds API.',
# 'query': 'What information can be obtained about the Maine Coon cat breed?',
# 'answers': [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}],
# 'execution_result': 'The Maine Coon is a big and hairy breed of cat',
# 'keep_row_after_semantic_check': True,
# 'thought': '',
# 'raw_input_a_p_i_gen_semantic_checker_0': [{'role': 'system',
# 'content': 'As a data quality evaluator, you must assess the alignment between a user query, corresponding function calls, and their execution results.\nThese function calls and results are generated by other models, and your task is to ensure these results accurately reflect the user’s intentions.\n\nDo not pass if:\n1. The function call does not align with the query’s objective, or the input arguments appear incorrect.\n2. The function call and arguments are not properly chosen from the available functions.\n3. The number of function calls does not correspond to the user’s intentions.\n4. The execution results are irrelevant and do not match the function’s purpose.\n5. The execution results contain errors or reflect that the function calls were not executed successfully.\n'},
# {'role': 'user',
# 'content': 'Given Information:\n- All Available Functions:\nFetch information about a specific cat breed from the Cat Breeds API.\n- User Query: What information can be obtained about the Maine Coon cat breed?\n- Generated Function Calls: [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]\n- Execution Results: The Maine Coon is a big and hairy breed of cat\n\nNote: The query may have multiple intentions. Functions may be placeholders, and execution results may be truncated due to length, which is acceptable and should not cause a failure.\n\nThe main decision factor is wheather the function calls accurately reflect the query\'s intentions and the function descriptions.\nProvide your reasoning in the thought section and decide if the data passes (answer yes or no).\nIf not passing, concisely explain your reasons in the thought section; otherwise, leave this section blank.\n'}]},
# 'model_name': 'meta-llama/Meta-Llama-3.1-70B-Instruct'}]