Skip to content

APIGenSemanticChecker

Generate queries and answers for the given functions in JSON format.

The APIGenGenerator is inspired by the APIGen pipeline, which was designed to generate verifiable and diverse function-calling datasets. The task generates a set of diverse queries and corresponding answers for the given functions in JSON format.

Attributes

  • system_prompt: System prompt for the task. Has a default one.

  • exclude_failed_execution: Whether to exclude failed executions (won't run on those rows that have a False in keep_row_after_execution_check column, which comes from running APIGenExecutionChecker). Defaults to True.

Input & Output Columns

graph TD
    subgraph Dataset
        subgraph Columns
            ICOL0[func_desc]
            ICOL1[query]
            ICOL2[answers]
            ICOL3[execution_result]
        end
        subgraph New columns
            OCOL0[thought]
            OCOL1[keep_row_after_semantic_check]
        end
    end

    subgraph APIGenSemanticChecker
        StepInput[Input Columns: func_desc, query, answers, execution_result]
        StepOutput[Output Columns: thought, keep_row_after_semantic_check]
    end

    ICOL0 --> StepInput
    ICOL1 --> StepInput
    ICOL2 --> StepInput
    ICOL3 --> StepInput
    StepOutput --> OCOL0
    StepOutput --> OCOL1
    StepInput --> StepOutput

Inputs

  • func_desc (str): Description of what the function should do.

  • query (str): Instruction from the user.

  • answers (str): JSON encoded list with arguments to be passed to the function/API. Should be loaded using json.loads.

  • execution_result (str): Result of the function/API executed.

Outputs

  • thought (str): Reasoning for the output on whether to keep this output or not.

  • keep_row_after_semantic_check (bool): True or False, can be used to filter afterwards.

Examples

Semantic checker for generated function calls (original implementation)

from distilabel.steps.tasks import APIGenSemanticChecker
from distilabel.models import InferenceEndpointsLLM

llm=InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-70B-Instruct",
    generation_kwargs={
        "temperature": 0.7,
        "max_new_tokens": 1024,
    },
)
semantic_checker = APIGenSemanticChecker(
    use_default_structured_output=False,
    llm=llm
)
semantic_checker.load()

res = next(
    semantic_checker.process(
        [
            {
                "func_desc": "Fetch information about a specific cat breed from the Cat Breeds API.",
                "query": "What information can be obtained about the Maine Coon cat breed?",
                "answers": json.dumps([{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]),
                "execution_result": "The Maine Coon is a big and hairy breed of cat",
            }
        ]
    )
)
res
# [{'func_desc': 'Fetch information about a specific cat breed from the Cat Breeds API.',
# 'query': 'What information can be obtained about the Maine Coon cat breed?',
# 'answers': [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}],
# 'execution_result': 'The Maine Coon is a big and hairy breed of cat',
# 'thought': '',
# 'keep_row_after_semantic_check': True,
# 'raw_input_a_p_i_gen_semantic_checker_0': [{'role': 'system',
#     'content': 'As a data quality evaluator, you must assess the alignment between a user query, corresponding function calls, and their execution results.\nThese function calls and results are generated by other models, and your task is to ensure these results accurately reflect the user’s intentions.\n\nDo not pass if:\n1. The function call does not align with the query’s objective, or the input arguments appear incorrect.\n2. The function call and arguments are not properly chosen from the available functions.\n3. The number of function calls does not correspond to the user’s intentions.\n4. The execution results are irrelevant and do not match the function’s purpose.\n5. The execution results contain errors or reflect that the function calls were not executed successfully.\n'},
#     {'role': 'user',
#     'content': 'Given Information:\n- All Available Functions:\nFetch information about a specific cat breed from the Cat Breeds API.\n- User Query: What information can be obtained about the Maine Coon cat breed?\n- Generated Function Calls: [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]\n- Execution Results: The Maine Coon is a big and hairy breed of cat\n\nNote: The query may have multiple intentions. Functions may be placeholders, and execution results may be truncated due to length, which is acceptable and should not cause a failure.\n\nThe main decision factor is wheather the function calls accurately reflect the query\'s intentions and the function descriptions.\nProvide your reasoning in the thought section and decide if the data passes (answer yes or no).\nIf not passing, concisely explain your reasons in the thought section; otherwise, leave this section blank.\n\nYour response MUST strictly adhere to the following JSON format, and NO other text MUST be included.\n```\n{\n   "thought": "Concisely describe your reasoning here",\n   "pass": "yes" or "no"\n}\n```\n'}]},
# 'model_name': 'meta-llama/Meta-Llama-3.1-70B-Instruct'}]

Semantic checker for generated function calls (structured output)

from distilabel.steps.tasks import APIGenSemanticChecker
from distilabel.models import InferenceEndpointsLLM

llm=InferenceEndpointsLLM(
    model_id="meta-llama/Meta-Llama-3.1-70B-Instruct",
    generation_kwargs={
        "temperature": 0.7,
        "max_new_tokens": 1024,
    },
)
semantic_checker = APIGenSemanticChecker(
    use_default_structured_output=True,
    llm=llm
)
semantic_checker.load()

res = next(
    semantic_checker.process(
        [
            {
                "func_desc": "Fetch information about a specific cat breed from the Cat Breeds API.",
                "query": "What information can be obtained about the Maine Coon cat breed?",
                "answers": json.dumps([{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]),
                "execution_result": "The Maine Coon is a big and hairy breed of cat",
            }
        ]
    )
)
res
# [{'func_desc': 'Fetch information about a specific cat breed from the Cat Breeds API.',
# 'query': 'What information can be obtained about the Maine Coon cat breed?',
# 'answers': [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}],
# 'execution_result': 'The Maine Coon is a big and hairy breed of cat',
# 'keep_row_after_semantic_check': True,
# 'thought': '',
# 'raw_input_a_p_i_gen_semantic_checker_0': [{'role': 'system',
#     'content': 'As a data quality evaluator, you must assess the alignment between a user query, corresponding function calls, and their execution results.\nThese function calls and results are generated by other models, and your task is to ensure these results accurately reflect the user’s intentions.\n\nDo not pass if:\n1. The function call does not align with the query’s objective, or the input arguments appear incorrect.\n2. The function call and arguments are not properly chosen from the available functions.\n3. The number of function calls does not correspond to the user’s intentions.\n4. The execution results are irrelevant and do not match the function’s purpose.\n5. The execution results contain errors or reflect that the function calls were not executed successfully.\n'},
#     {'role': 'user',
#     'content': 'Given Information:\n- All Available Functions:\nFetch information about a specific cat breed from the Cat Breeds API.\n- User Query: What information can be obtained about the Maine Coon cat breed?\n- Generated Function Calls: [{"name": "get_breed_information", "arguments": {"breed": "Maine Coon"}}]\n- Execution Results: The Maine Coon is a big and hairy breed of cat\n\nNote: The query may have multiple intentions. Functions may be placeholders, and execution results may be truncated due to length, which is acceptable and should not cause a failure.\n\nThe main decision factor is wheather the function calls accurately reflect the query\'s intentions and the function descriptions.\nProvide your reasoning in the thought section and decide if the data passes (answer yes or no).\nIf not passing, concisely explain your reasons in the thought section; otherwise, leave this section blank.\n'}]},
# 'model_name': 'meta-llama/Meta-Llama-3.1-70B-Instruct'}]

References