Installation¶
Note
Since `distilabel` v1.0.0 was recently released and most of the codebase was refactored, the installation instructions below only apply to `distilabel` v1.0.0 and above.
You will need Python 3.8 or higher, up to Python 3.12, since support for the latter is still a work in progress.
To install the latest release of the package from PyPI, you can use the following command:
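```bash
pip install distilabel --upgrade
```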
Alternatively, you may also want to install it from source, i.e. the latest unreleased version; in that case, you can use the following command:
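```bash
pip install "distilabel @ git+https://github.com/argilla-io/distilabel.git@develop"
```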
Note
We are installing from `develop` since that's the branch we use to collect all the features, bug fixes, and improvements that will be part of the next release. If you want to install from a specific branch, you can replace `develop` with the branch name.
Extras¶
Additionally, `distilabel` ships with some extra dependencies, mainly to add support for the LLM integrations. Here's a list of the available extras, with an installation example after the list:
- `anthropic`: for using models available in the Anthropic API via the `AnthropicLLM` integration.
- `argilla`: for exporting the generated datasets to Argilla.
- `cohere`: for using models available in Cohere via the `CohereLLM` integration.
- `groq`: for using models available in Groq using the `groq` Python client via the `GroqLLM` integration.
- `hf-inference-endpoints`: for using the Hugging Face Inference Endpoints via the `InferenceEndpointsLLM` integration.
- `hf-transformers`: for using models available in the `transformers` package via the `TransformersLLM` integration.
- `litellm`: for using `LiteLLM` to call any LLM using the OpenAI format via the `LiteLLM` integration.
- `llama-cpp`: for using the `llama-cpp-python` Python bindings for `llama.cpp` via the `LlamaCppLLM` integration.
- `mistralai`: for using models available in the Mistral AI API via the `MistralLLM` integration. Note that the `mistralai` Python client can only be installed from Python 3.9 onwards, so this is the only `distilabel` dependency that's not supported in Python 3.8.
- `ollama`: for using Ollama and their available models via the `OllamaLLM` integration.
- `openai`: for using OpenAI API models via the `OpenAILLM` integration, as well as the rest of the integrations based on OpenAI and relying on its client, such as `AnyscaleLLM`, `AzureOpenAILLM`, and `TogetherLLM`.
- `vertexai`: for using Google Vertex AI proprietary models via the `VertexAILLM` integration.
- `vllm`: for using the `vllm` serving engine via the `vLLM` integration.
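For instance, to install `distilabel` with the `openai` and `argilla` extras (any combination of the extras listed above works the same way), you can run:

```bash
pip install "distilabel[openai,argilla]" --upgrade
```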
Recommendations / Notes¶
The `mistralai` dependency requires Python 3.9 or higher, so if you want to use the `distilabel.llms.MistralLLM` implementation, you will need Python 3.9 or higher.
In some cases, like `transformers` and `vllm`, the installation of `flash-attn` is recommended if you are using a GPU accelerator, since it will speed up the inference process; but the installation needs to be done separately, as it's not included in the `distilabel` dependencies.
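For reference, `flash-attn` is usually installed as follows (per its own documentation; it compiles CUDA kernels, so the build can take a while):

```bash
# Build isolation must be disabled so flash-attn can see the
# locally installed torch and CUDA toolkit during compilation
pip install flash-attn --no-build-isolation
```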
Also, if you want to use the `llama-cpp-python` integration for running local LLMs, note that the installation process may get a bit trickier depending on which OS you are using, so we recommend reading through the Installation section in their docs.
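As an illustration, building `llama-cpp-python` with GPU (cuBLAS) support on Linux has looked like the following, although the exact `CMAKE_ARGS` flags change between releases, so check their Installation docs for the current ones:

```bash
# Illustrative only: flag names vary across llama-cpp-python
# releases; consult their docs before running this
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
```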