Paper Implementations

This section contains implementations of synthetic data generation papers built with distilabel. Each one provides a reproducible pipeline, so anyone can experiment with these approaches and customize them to their needs. We strongly believe that better data leads to better models, and synthetic data has proven effective at improving LLMs, so we aim to bridge the gap between research and practice by providing these implementations.