Tasks Gallery
Category Overview
The gallery page showcases the different types of components within distilabel.
| Category | Description |
|---|---|
| text-generation | Text generation steps are used to generate text based on a given prompt. |
| chat-generation | Chat generation steps are used to generate text based on a conversation. |
| text-classification | Text classification steps are used to classify text into a category. |
| text-manipulation | Text manipulation steps are used to manipulate or rewrite an input text. |
| evol | Evol steps are used to rewrite input text and evolve it to a higher quality. |
| critique | Critique steps are used to provide feedback on the quality of the data with a written explanation. |
| scorer | Scorer steps are used to evaluate and score the data with a numerical value. |
| preference | Preference steps are used to collect preferences on the data with numerical values or ranks. |
| embedding | Embedding steps are used to generate embeddings for the data. |
| clustering | Clustering steps are used to group similar data points together. |
| columns | Columns steps are used to manipulate columns in the data. |
| filtering | Filtering steps are used to filter the data based on some criteria. |
| format | Format steps are used to format the data. |
| load | Load steps are used to load the data. |
| execution | Execution steps are used to execute Python functions. |
| save | Save steps are used to save the data. |
| image-generation | Image generation steps are used to generate images based on a given prompt. |
| labelling | Labelling steps are used to label the data. |
- 
APIGenGenerator 
 Generate queries and answers for the given functions in JSON format. 
- 
Genstruct 
 Generate an instruction-response pair from a document using an LLM.
- 
Magpie 
 Generates conversations using an instruct fine-tuned LLM. 
- 
MathShepherdCompleter 
 Math Shepherd Completer and auto-labeller task. 
- 
MathShepherdGenerator 
 Math Shepherd solution generator. 
- 
SelfInstruct 
 Generate instructions based on a given input using an LLM.
- 
TextGeneration 
 Text generation with an LLM given a prompt.
- 
TextGenerationWithImage 
 Text generation with images with an LLM given a prompt.
- 
URIAL 
 Generates a response using a non-instruct fine-tuned model. 
- 
MagpieGenerator 
 Generator task that generates instructions or conversations using Magpie.
- 
ChatGeneration 
 Generates text based on a conversation. 
- 
ArgillaLabeller 
 Annotate Argilla records based on input fields, example records and question settings. 
- 
TextClassification 
 Classifies text into one or more categories or labels. 
- 
EvolInstruct 
 Evolve instructions using an LLM.
- 
EvolComplexity 
 Evolve instructions to make them more complex using an LLM.
- 
EvolQuality 
 Evolve the quality of the responses using an LLM.
- 
EvolInstructGenerator 
 Generate evolved instructions using an LLM.
- 
EvolComplexityGenerator 
 Generate evolved instructions with increased complexity using an LLM.
- 
InstructionBacktranslation 
 Self-Alignment with Instruction Backtranslation. 
- 
PrometheusEval 
 Critique and rank the quality of generations from an LLM using Prometheus 2.0.
- 
ComplexityScorer 
 Score instructions based on their complexity using an LLM.
- 
QualityScorer 
 Score responses based on their quality using an LLM.
- 
CLAIR 
 Contrastive Learning from AI Revisions (CLAIR). 
- 
UltraFeedback 
 Rank generations focusing on different aspects using an LLM.
- 
PairRM 
 Rank the candidates based on the input using the LLM model.
- 
GenerateSentencePair 
 Generate a positive and, optionally, a negative sentence given an anchor sentence.
- 
GenerateEmbeddings 
 Generate embeddings using the last hidden state of an LLM.
- 
TextClustering 
 Task that clusters a set of texts and generates summary labels for each cluster. 
- 
APIGenSemanticChecker 
 Semantically check the queries and answers generated for the given functions using an LLM.
- 
ImageGeneration 
 Image generation with a text-to-image model given a prompt.
- 
GenerateTextRetrievalData 
 Generate text retrieval data with an LLM to later train an embedding model.
- 
GenerateShortTextMatchingData 
 Generate short text matching data with an LLM to later train an embedding model.
- 
GenerateLongTextMatchingData 
 Generate long text matching data with an LLM to later train an embedding model.
- 
GenerateTextClassificationData 
 Generate text classification data with an LLM to later train an embedding model.
- 
StructuredGeneration 
 Generate structured content for a given instruction using an LLM.
- 
MonolingualTripletGenerator 
 Generate monolingual triplets with an LLM to later train an embedding model.
- 
BitextRetrievalGenerator 
 Generate bitext retrieval data with an LLM to later train an embedding model.
- 
EmbeddingTaskGenerator 
 Generate task descriptions for embedding-related tasks using an LLM.