Skip to content

Task Typing

ChatType = List[ChatItem] module-attribute

ChatType is a type alias for a list of dicts following the OpenAI conversational format.

FormattedInput = Union[StandardInput, StructuredInput] module-attribute

FormattedInput is an alias for the union of StandardInput and StructuredInput as generated by format_input and expected by the LLMs.

StandardInput = ChatType module-attribute

StandardInput is an alias for ChatType that defines the default / standard input produced by format_input.

StructuredInput = Tuple[StandardInput, Union[StructuredOutputType, None]] module-attribute

StructuredInput defines a type produced by format_input when using either StructuredGeneration or a subclass of it.

StructuredOutputType = Union[OutlinesStructuredOutputType, InstructorStructuredOutputType] module-attribute

StructuredOutputType is an alias for the union of OutlinesStructuredOutputType and InstructorStructuredOutputType.

InstructorStructuredOutputType

Bases: TypedDict

TypedDict to represent the structured output configuration from instructor.

Source code in src/distilabel/steps/tasks/typing.py
class InstructorStructuredOutputType(TypedDict, total=False):
    """TypedDict to represent the structured output configuration from `instructor`."""

    schema: Union[Type[BaseModel], Dict[str, Any]]
    """The schema to use for the structured output, a `pydantic.BaseModel` class. """
    mode: Optional[str]
    """Generation mode. Take a look at `instructor.Mode` for more information, if not informed it will
    be determined automatically. """
    max_retries: int
    """Number of times to reask the model in case of error, if not set will default to the model's default. """

max_retries: int instance-attribute

Number of times to reask the model in case of error, if not set will default to the model's default.

mode: Optional[str] instance-attribute

Generation mode. Take a look at instructor.Mode for more information, if not informed it will be determined automatically.

schema: Union[Type[BaseModel], Dict[str, Any]] instance-attribute

The schema to use for the structured output, a pydantic.BaseModel class.

OutlinesStructuredOutputType

Bases: TypedDict

TypedDict to represent the structured output configuration from outlines.

Source code in src/distilabel/steps/tasks/typing.py
class OutlinesStructuredOutputType(TypedDict, total=False):
    """TypedDict to represent the structured output configuration from `outlines`."""

    format: Literal["json", "regex"]
    """One of "json" or "regex"."""
    schema: Union[str, Type[BaseModel], Dict[str, Any]]
    """The schema to use for the structured output. If "json", it
    can be a pydantic.BaseModel class, or the schema as a string,
    as obtained from `model_to_schema(BaseModel)`, if "regex", it
    should be a regex pattern as a string.
    """
    whitespace_pattern: Optional[Union[str, List[str]]]
    """If "json" corresponds to a string or a list of
    strings with a pattern (doesn't impact string literals).
    For example, to allow only a single space or newline with
    `whitespace_pattern=r"[\n ]?"`
    """

format: Literal['json', 'regex'] instance-attribute

One of "json" or "regex".

schema: Union[str, Type[BaseModel], Dict[str, Any]] instance-attribute

The schema to use for the structured output. If "json", it can be a pydantic.BaseModel class, or the schema as a string, as obtained from model_to_schema(BaseModel), if "regex", it should be a regex pattern as a string.

whitespace_pattern: Optional[Union[str, List[str]]] instance-attribute

If "json" corresponds to a string or a list of strings with a pattern (doesn't impact string literals). For example, to allow only a single space or newline with whitespace_pattern=r"[ ]?"