Huggingface
PushToHub
Bases: GlobalStep
A GlobalStep which creates a datasets.Dataset with the input data and pushes
it to the Hugging Face Hub.
Attributes:
| Name | Type | Description |
|---|---|---|
| repo_id | RuntimeParameter[str] | The Hugging Face Hub repository ID where the dataset will be uploaded. |
| split | RuntimeParameter[str] | The split of the dataset that will be pushed. Defaults to |
| private | RuntimeParameter[bool] | Whether the dataset to be pushed should be private or not. Defaults to |
| token | Optional[RuntimeParameter[str]] | The token used to authenticate with the Hub. If not provided, the step tries to read it from the environment variable |
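The token fallback described above can be illustrated with a minimal standard-library sketch. This is not the step's actual implementation: `resolve_token` is a hypothetical helper, and because the name of the environment variable is not stated here, the sketch takes it as an argument and uses the made-up name `EXAMPLE_HUB_TOKEN` for the demonstration.

```python
import os
from typing import Optional


def resolve_token(token: Optional[str], env_var: str) -> Optional[str]:
    """Return the explicit token if given, else look it up in the environment.

    `resolve_token` and `env_var` are illustrative names (assumptions),
    not part of the documented API.
    """
    if token is not None:
        return token
    # Fall back to the environment; returns None if the variable is unset.
    return os.environ.get(env_var)


# Hypothetical variable name, used only for this demonstration.
os.environ["EXAMPLE_HUB_TOKEN"] = "from-env"

explicit = resolve_token("explicit-token", "EXAMPLE_HUB_TOKEN")
fallback = resolve_token(None, "EXAMPLE_HUB_TOKEN")
```

An explicitly passed token always wins; the environment is consulted only when no token is provided.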
Runtime parameters
- repo_id: The Hugging Face Hub repository ID where the dataset will be uploaded.
- split: The split of the dataset that will be pushed.
- private: Whether the dataset to be pushed should be private or not.
- token: The token that will be used to authenticate in the Hub.
Input columns
- dynamic, based on the existing data within inputs
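To make "dynamic" concrete: the columns are whatever keys appear in the rows the step receives. The following sketch shows one way such a column set could be derived from a list of row dictionaries. `infer_columns` is a hypothetical helper written for illustration and the sample rows are invented; the step itself may determine columns differently.

```python
from typing import Any, Dict, List


def infer_columns(inputs: List[Dict[str, Any]]) -> List[str]:
    """Collect column names in first-seen order across all rows."""
    columns: Dict[str, None] = {}
    for row in inputs:
        for key in row:
            # dict preserves insertion order, so first-seen order is kept.
            columns.setdefault(key, None)
    return list(columns)


# Invented sample rows, only to exercise the helper.
rows = [
    {"instruction": "a", "generation": "b"},
    {"instruction": "c", "model_name": "m"},
]
columns = infer_columns(rows)
```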
Source code in src/distilabel/steps/globals/huggingface.py
process(inputs)
Method that processes the input data, respecting the datasets.Dataset formatting,
and pushes it to the Hugging Face Hub based on the RuntimeParameters attributes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| inputs | StepInput | The input data, received as a single object (as it's a GlobalStep), that will be transformed into a | required |
Yields:
| Type | Description |
|---|---|
| StepOutput | Propagates the received inputs so that the [...] the last step of the [...] steps. |
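The overall shape of `process` (convert the rows to a dataset-like structure, push it, then yield the inputs unchanged) can be sketched with the standard library alone. This is an assumption-laden illustration, not the code in `huggingface.py`: `rows_to_columns` and the injected `push` callable are hypothetical stand-ins for building a `datasets.Dataset` and uploading it to the Hub.

```python
from typing import Any, Callable, Dict, Iterator, List

Row = Dict[str, Any]
Columns = Dict[str, List[Any]]


def rows_to_columns(rows: List[Row]) -> Columns:
    """Turn a list of row dicts into a column-oriented mapping.

    Hypothetical stand-in for building a dataset; missing keys become None.
    """
    keys: List[str] = []
    for row in rows:
        for key in row:
            if key not in keys:
                keys.append(key)
    return {key: [row.get(key) for row in rows] for key in keys}


def process(inputs: List[Row], push: Callable[[Columns], None]) -> Iterator[List[Row]]:
    """Push the columnar data via `push`, then propagate the inputs as-is."""
    push(rows_to_columns(inputs))  # `push` stands in for the Hub upload
    yield inputs


# Demonstration with an in-memory "hub" instead of a real upload.
uploaded: List[Columns] = []
rows = [
    {"instruction": "a", "generation": "x"},
    {"instruction": "b", "generation": "y"},
]
outputs = list(process(rows, uploaded.append))
```

Because a global step receives all of its input at once, the sketch pushes a single time and yields a single batch.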