Tutorials
- End-to-end tutorials provide detailed step-by-step explanations and the code used for end-to-end workflows.
- Paper implementations provide reproductions of fundamental papers in the synthetic data domain.
- Examples don't provide explanations but simply show code for different tasks.
End-to-end tutorials
- Generate a preference dataset: Learn about synthetic data generation for ORPO and DPO.
- Clean an existing preference dataset: Learn how to provide AI feedback to clean an existing dataset.
- Retrieval and reranking models: Learn about synthetic data generation for fine-tuning custom retrieval and reranking models.
Paper implementations
- Deepseek Prover: Learn about an approach to generate mathematical proofs for theorems generated from informal math problems.
- DEITA: Learn about prompt and response tuning for complexity and quality, and LLMs as judges for automatic data selection.
- Instruction Backtranslation: Learn about automatically labeling human-written text with corresponding instructions.
- Prometheus 2: Learn about using open-source models as judges for direct assessment and pairwise ranking.
- UltraFeedback: Learn about a large-scale, fine-grained, diverse preference dataset used for training powerful reward and critic models.
- APIGen: Learn how to create verifiable, high-quality datasets for function-calling applications.
- CLAIR: Learn about Contrastive Learning from AI Revisions (CLAIR), a data-creation method that leads to more contrastive preference pairs.
Examples
- Benchmarking with distilabel: Learn about reproducing the Arena Hard benchmark with distilabel.
- Structured generation with outlines: Learn about generating RPG characters following a pydantic.BaseModel with outlines in distilabel.
- Structured generation with instructor: Learn about answering instructions with knowledge graphs defined as pydantic.BaseModel objects using instructor in distilabel.
- Create a social network with FinePersonas: Learn how to leverage FinePersonas to create a synthetic social network and fine-tune adapters for Multi-LoRA.