Embedding Gallery

This section contains the existing Embeddings subclasses implemented in distilabel.

embeddings

SentenceTransformerEmbeddings

Bases: Embeddings, CudaDevicePlacementMixin

sentence-transformers library implementation for embedding generation.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| model | `str` | the Hugging Face Hub repo id of the model, or a path to a directory containing the model weights and configuration files. |
| device | `Optional[RuntimeParameter[str]]` | the name of the device used to load the model e.g. "cuda", "mps", etc. Defaults to `None`. |
| prompts | `Optional[Dict[str, str]]` | a dictionary containing prompts to be used with the model. Defaults to `None`. |
| default_prompt_name | `Optional[str]` | the default prompt (in `prompts`) that will be applied to the inputs. If not provided, then no prompt will be used. Defaults to `None`. |
| trust_remote_code | `bool` | whether to allow fetching and executing remote code from the repository in the Hub. Defaults to `False`. |
| revision | `Optional[str]` | if `model` refers to a Hugging Face Hub repository, then the revision (e.g. a branch name or a commit id) to use. Defaults to `None`. |
| token | `Optional[str]` | the Hugging Face Hub token that will be used to authenticate to the Hugging Face Hub. If not provided, the `HF_TOKEN` environment variable or the `huggingface_hub` package local configuration will be used. Defaults to `None`. |
| truncate_dim | `Optional[int]` | the dimension to which the sentence embeddings are truncated. Defaults to `None`. |
| model_kwargs | `Optional[Dict[str, Any]]` | extra kwargs that will be passed to the Hugging Face `transformers` model class. Defaults to `None`. |
| tokenizer_kwargs | `Optional[Dict[str, Any]]` | extra kwargs that will be passed to the Hugging Face `transformers` tokenizer class. Defaults to `None`. |
| config_kwargs | `Optional[Dict[str, Any]]` | extra kwargs that will be passed to the Hugging Face `transformers` configuration class. Defaults to `None`. |
| precision | `Optional[Literal['float32', 'int8', 'uint8', 'binary', 'ubinary']]` | the dtype of the resulting embeddings. Defaults to `"float32"`. |
| normalize_embeddings | `RuntimeParameter[bool]` | whether to normalize the embeddings so they have a length of 1. Defaults to `True`. |

Examples:

Generating sentence embeddings:

from distilabel.embeddings import SentenceTransformerEmbeddings

embeddings = SentenceTransformerEmbeddings(model="mixedbread-ai/mxbai-embed-large-v1")

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
# [
#   [-0.05447685346007347, -0.01623094454407692, ...],
#   [4.4889533455716446e-05, 0.044016145169734955, ...],
# ]
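
The `prompts` and `default_prompt_name` attributes are forwarded to `SentenceTransformer`, which prepends the selected prompt text to each input before encoding. The snippet below is a minimal sketch of that selection step in plain Python, not the library's implementation: `apply_prompt` is a hypothetical helper, and the prompt strings are made up for illustration.

```python
from typing import Dict, List, Optional


def apply_prompt(
    inputs: List[str],
    prompts: Optional[Dict[str, str]] = None,
    default_prompt_name: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: prepend the default prompt (if any) to each input."""
    if not prompts or default_prompt_name is None:
        # No prompt configured: the inputs are passed through unchanged.
        return list(inputs)
    # This helper raises KeyError if the name is not a key of `prompts`.
    prompt = prompts[default_prompt_name]
    return [prompt + text for text in inputs]


# Illustrative prompt strings (assumptions, not taken from any model card).
prompts = {"query": "query: ", "passage": "passage: "}
prepared = apply_prompt(["distilabel is awesome!"], prompts, "query")
print(prepared)  # ['query: distilabel is awesome!']
```

With `default_prompt_name` left as `None`, the helper returns the inputs untouched, which matches the documented behaviour that no prompt is used unless one is named.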
Source code in src/distilabel/embeddings/sentence_transformers.py
class SentenceTransformerEmbeddings(Embeddings, CudaDevicePlacementMixin):
    """`sentence-transformers` library implementation for embedding generation.

    Attributes:
        model: the Hugging Face Hub repo id of the model, or a path to a directory
            containing the model weights and configuration files.
        device: the name of the device used to load the model e.g. "cuda", "mps", etc.
            Defaults to `None`.
        prompts: a dictionary containing prompts to be used with the model. Defaults to
            `None`.
        default_prompt_name: the default prompt (in `prompts`) that will be applied to the
            inputs. If not provided, then no prompt will be used. Defaults to `None`.
        trust_remote_code: whether to allow fetching and executing remote code from the
            repository in the Hub. Defaults to `False`.
        revision: if `model` refers to a Hugging Face Hub repository, then the revision
            (e.g. a branch name or a commit id) to use. Defaults to `None`.
        token: the Hugging Face Hub token that will be used to authenticate to the Hugging
            Face Hub. If not provided, the `HF_TOKEN` environment variable or the
            `huggingface_hub` package local configuration will be used. Defaults to `None`.
        truncate_dim: the dimension to which the sentence embeddings are truncated.
            Defaults to `None`.
        model_kwargs: extra kwargs that will be passed to the Hugging Face `transformers`
            model class. Defaults to `None`.
        tokenizer_kwargs: extra kwargs that will be passed to the Hugging Face `transformers`
            tokenizer class. Defaults to `None`.
        config_kwargs: extra kwargs that will be passed to the Hugging Face `transformers`
            configuration class. Defaults to `None`.
        precision: the dtype of the resulting embeddings. Defaults to `"float32"`.
        normalize_embeddings: whether to normalize the embeddings so they have a length
            of 1. Defaults to `True`.

    Examples:
        Generating sentence embeddings:

        ```python
        from distilabel.embeddings import SentenceTransformerEmbeddings

        embeddings = SentenceTransformerEmbeddings(model="mixedbread-ai/mxbai-embed-large-v1")

        embeddings.load()

        results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
        # [
        #   [-0.05447685346007347, -0.01623094454407692, ...],
        #   [4.4889533455716446e-05, 0.044016145169734955, ...],
        # ]
        ```
    """

    model: str
    device: Optional[RuntimeParameter[str]] = Field(
        default=None,
        description="The device to be used to load the model. If `None`, then it"
        " will check if a GPU can be used.",
    )
    prompts: Optional[Dict[str, str]] = None
    default_prompt_name: Optional[str] = None
    trust_remote_code: bool = False
    revision: Optional[str] = None
    token: Optional[str] = None
    truncate_dim: Optional[int] = None
    model_kwargs: Optional[Dict[str, Any]] = None
    tokenizer_kwargs: Optional[Dict[str, Any]] = None
    config_kwargs: Optional[Dict[str, Any]] = None
    precision: Optional[Literal["float32", "int8", "uint8", "binary", "ubinary"]] = (
        "float32"
    )
    normalize_embeddings: RuntimeParameter[bool] = Field(
        default=True,
        description="Whether to normalize the embeddings so the generated vectors"
        " have a length of 1 or not.",
    )

    _model: Union["SentenceTransformer", None] = PrivateAttr(None)

    def load(self) -> None:
        """Loads the Sentence Transformer model"""
        super().load()

        if self.device == "cuda":
            CudaDevicePlacementMixin.load(self)

        try:
            from sentence_transformers import SentenceTransformer
        except ImportError as e:
            raise ImportError(
                "`sentence-transformers` package is not installed. Please install it using"
                " `pip install sentence-transformers`."
            ) from e

        self._model = SentenceTransformer(
            model_name_or_path=self.model,
            device=self.device,
            prompts=self.prompts,
            default_prompt_name=self.default_prompt_name,
            trust_remote_code=self.trust_remote_code,
            revision=self.revision,
            token=self.token,
            truncate_dim=self.truncate_dim,
            model_kwargs=self.model_kwargs,
            tokenizer_kwargs=self.tokenizer_kwargs,
            config_kwargs=self.config_kwargs,
        )

    @property
    def model_name(self) -> str:
        """Returns the name of the model."""
        return self.model

    def encode(self, inputs: List[str]) -> List[List[Union[int, float]]]:
        """Generates embeddings for the provided inputs.

        Args:
            inputs: a list of texts for which an embedding has to be generated.

        Returns:
            The generated embeddings.
        """
        return self._model.encode(  # type: ignore
            sentences=inputs,
            batch_size=len(inputs),
            convert_to_numpy=True,
            precision=self.precision,  # type: ignore
            normalize_embeddings=self.normalize_embeddings,  # type: ignore
        ).tolist()  # type: ignore

    def unload(self) -> None:
        """Unloads the Sentence Transformer model."""
        del self._model
        if self.device == "cuda":
            CudaDevicePlacementMixin.unload(self)
        super().unload()
model_name: str property

Returns the name of the model.

load()

Loads the Sentence Transformer model

Source code in src/distilabel/embeddings/sentence_transformers.py
def load(self) -> None:
    """Loads the Sentence Transformer model"""
    super().load()

    if self.device == "cuda":
        CudaDevicePlacementMixin.load(self)

    try:
        from sentence_transformers import SentenceTransformer
    except ImportError as e:
        raise ImportError(
            "`sentence-transformers` package is not installed. Please install it using"
            " `pip install sentence-transformers`."
        ) from e

    self._model = SentenceTransformer(
        model_name_or_path=self.model,
        device=self.device,
        prompts=self.prompts,
        default_prompt_name=self.default_prompt_name,
        trust_remote_code=self.trust_remote_code,
        revision=self.revision,
        token=self.token,
        truncate_dim=self.truncate_dim,
        model_kwargs=self.model_kwargs,
        tokenizer_kwargs=self.tokenizer_kwargs,
        config_kwargs=self.config_kwargs,
    )
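
`load` imports `sentence_transformers` lazily and re-raises a missing dependency as an `ImportError` that carries an install hint. The same pattern can be sketched generically as follows; `require` is a hypothetical helper name and is not part of distilabel.

```python
import importlib
from types import ModuleType


def require(module_name: str, pip_name: str) -> ModuleType:
    """Hypothetical helper: import a module or raise with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        # Chain the original error so the full traceback is preserved.
        raise ImportError(
            f"`{pip_name}` package is not installed. Please install it using"
            f" `pip install {pip_name}`."
        ) from e


# A standard-library module is always importable.
json_module = require("json", "json")

# A missing module produces the friendlier error message.
try:
    require("package_that_does_not_exist_123", "package-that-does-not-exist")
except ImportError as err:
    message = str(err)
    print(message)
```

Deferring the import to `load` means the optional dependency is only needed when the class is actually used, not when the module defining it is imported.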
encode(inputs)

Generates embeddings for the provided inputs.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| inputs | `List[str]` | a list of texts for which an embedding has to be generated. | required |

Returns:

| Type | Description |
| --- | --- |
| `List[List[Union[int, float]]]` | The generated embeddings. |

Source code in src/distilabel/embeddings/sentence_transformers.py
def encode(self, inputs: List[str]) -> List[List[Union[int, float]]]:
    """Generates embeddings for the provided inputs.

    Args:
        inputs: a list of texts for which an embedding has to be generated.

    Returns:
        The generated embeddings.
    """
    return self._model.encode(  # type: ignore
        sentences=inputs,
        batch_size=len(inputs),
        convert_to_numpy=True,
        precision=self.precision,  # type: ignore
        normalize_embeddings=self.normalize_embeddings,  # type: ignore
    ).tolist()  # type: ignore
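
`encode` passes `normalize_embeddings` through to the model, so by default each returned vector has a length (L2 norm) of 1. As a rough illustration of what that means, and assuming plain Python lists rather than the library's own routine, the sketch below normalizes two vectors and shows that their dot product is then their cosine similarity. `l2_normalize` and `dot` are hypothetical helpers.

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector so that its Euclidean length is 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector cannot be normalized; return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


def dot(a: List[float], b: List[float]) -> float:
    """Dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

# Both vectors now have unit length, so their dot product is the cosine similarity.
length_a = math.sqrt(dot(a, a))
cosine = dot(a, b)
print(length_a, cosine)
```

This is why normalized embeddings are convenient for similarity search: a dot product is enough, with no per-query division by vector lengths.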

vLLMEmbeddings

Bases: Embeddings, CudaDevicePlacementMixin

vllm library implementation for embedding generation.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| model | `str` | the Hugging Face Hub repo id of the model, or a path to a directory containing the model weights and configuration files. |
| dtype | `str` | the data type to use for the model. Defaults to `auto`. |
| trust_remote_code | `bool` | whether to trust the remote code when loading the model. Defaults to `False`. |
| quantization | `Optional[str]` | the quantization mode to use for the model. Defaults to `None`. |
| revision | `Optional[str]` | the revision of the model to load. Defaults to `None`. |
| enforce_eager | `bool` | whether to enforce eager execution. Defaults to `True`. |
| seed | `int` | the seed to use for the random number generator. Defaults to `0`. |
| extra_kwargs | `Optional[RuntimeParameter[Dict[str, Any]]]` | a dictionary of additional keyword arguments that will be passed to the `LLM` class of the `vllm` library. Defaults to `{}`. |
| _model | `LLM` | the `vLLM` model instance. This attribute is meant to be used internally and should not be accessed directly. It will be set in the `load` method. |

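References

- [Offline inference embeddings](https://docs.vllm.ai/en/latest/getting_started/examples/offline_inference_embedding.html)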

Examples:

Generating sentence embeddings:

from distilabel.embeddings import vLLMEmbeddings

embeddings = vLLMEmbeddings(model="intfloat/e5-mistral-7b-instruct")

embeddings.load()

results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
# [
#   [-0.05447685346007347, -0.01623094454407692, ...],
#   [4.4889533455716446e-05, 0.044016145169734955, ...],
# ]
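
Both embeddings classes on this page expose `load`, `encode` and `unload`. One way to make sure `unload` always runs, even when encoding fails, is to wrap the pair in a context manager. This is only a sketch: `loaded` is a hypothetical helper that is not part of distilabel, and `DummyEmbeddings` is a stand-in used so the example runs without a GPU or a model download.

```python
from contextlib import contextmanager
from typing import Any, Iterator, List


@contextmanager
def loaded(embeddings: Any) -> Iterator[Any]:
    """Hypothetical helper: call `load()` on entry and `unload()` on exit."""
    embeddings.load()
    try:
        yield embeddings
    finally:
        # Runs even if the body of the `with` block raises.
        embeddings.unload()


class DummyEmbeddings:
    """Stand-in exposing the same three methods as the classes on this page."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def load(self) -> None:
        self.events.append("load")

    def encode(self, inputs: List[str]) -> List[List[float]]:
        self.events.append("encode")
        return [[0.0, 0.0] for _ in inputs]

    def unload(self) -> None:
        self.events.append("unload")


dummy = DummyEmbeddings()
with loaded(dummy) as emb:
    vectors = emb.encode(inputs=["distilabel is awesome!", "and Argilla!"])

print(dummy.events)  # ['load', 'encode', 'unload']
```

In real use, the `DummyEmbeddings()` instance would be replaced by a `vLLMEmbeddings(...)` or `SentenceTransformerEmbeddings(...)` instance.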
Source code in src/distilabel/embeddings/vllm.py
class vLLMEmbeddings(Embeddings, CudaDevicePlacementMixin):
    """`vllm` library implementation for embedding generation.

    Attributes:
        model: the Hugging Face Hub repo id of the model, or a path to a directory
            containing the model weights and configuration files.
        dtype: the data type to use for the model. Defaults to `auto`.
        trust_remote_code: whether to trust the remote code when loading the model. Defaults
            to `False`.
        quantization: the quantization mode to use for the model. Defaults to `None`.
        revision: the revision of the model to load. Defaults to `None`.
        enforce_eager: whether to enforce eager execution. Defaults to `True`.
        seed: the seed to use for the random number generator. Defaults to `0`.
        extra_kwargs: additional dictionary of keyword arguments that will be passed to the
            `LLM` class of `vllm` library. Defaults to `{}`.
        _model: the `vLLM` model instance. This attribute is meant to be used internally
            and should not be accessed directly. It will be set in the `load` method.

    References:
        - [Offline inference embeddings](https://docs.vllm.ai/en/latest/getting_started/examples/offline_inference_embedding.html)

    Examples:
        Generating sentence embeddings:

        ```python
        from distilabel.embeddings import vLLMEmbeddings

        embeddings = vLLMEmbeddings(model="intfloat/e5-mistral-7b-instruct")

        embeddings.load()

        results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
        # [
        #   [-0.05447685346007347, -0.01623094454407692, ...],
        #   [4.4889533455716446e-05, 0.044016145169734955, ...],
        # ]
        ```
    """

    model: str
    dtype: str = "auto"
    trust_remote_code: bool = False
    quantization: Optional[str] = None
    revision: Optional[str] = None

    enforce_eager: bool = True

    seed: int = 0

    extra_kwargs: Optional[RuntimeParameter[Dict[str, Any]]] = Field(
        default_factory=dict,
        description="Additional dictionary of keyword arguments that will be passed to the"
        " `LLM` class of `vllm` library. See all the supported arguments at: "
        "https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/llm.py",
    )

    _model: "_vLLM" = PrivateAttr(None)

    def load(self) -> None:
        """Loads the `vLLM` model using either the path or the Hugging Face Hub repository id."""
        super().load()

        CudaDevicePlacementMixin.load(self)

        try:
            from vllm import LLM as _vLLM

        except ImportError as ie:
            raise ImportError(
                "vLLM is not installed. Please install it using `pip install vllm`."
            ) from ie

        self._model = _vLLM(
            self.model,
            dtype=self.dtype,
            trust_remote_code=self.trust_remote_code,
            quantization=self.quantization,
            revision=self.revision,
            enforce_eager=self.enforce_eager,
            seed=self.seed,
            **self.extra_kwargs,  # type: ignore
        )

    def unload(self) -> None:
        """Unloads the `vLLM` model."""
        CudaDevicePlacementMixin.unload(self)
        super().unload()

    @property
    def model_name(self) -> str:
        """Returns the name of the model."""
        return self.model

    def encode(self, inputs: List[str]) -> List[List[Union[int, float]]]:
        """Generates embeddings for the provided inputs.

        Args:
            inputs: a list of texts for which an embedding has to be generated.

        Returns:
            The generated embeddings.
        """
        return [output.outputs.embedding for output in self._model.encode(inputs)]
model_name: str property

Returns the name of the model.

load()

Loads the vLLM model using either the path or the Hugging Face Hub repository id.

Source code in src/distilabel/embeddings/vllm.py
def load(self) -> None:
    """Loads the `vLLM` model using either the path or the Hugging Face Hub repository id."""
    super().load()

    CudaDevicePlacementMixin.load(self)

    try:
        from vllm import LLM as _vLLM

    except ImportError as ie:
        raise ImportError(
            "vLLM is not installed. Please install it using `pip install vllm`."
        ) from ie

    self._model = _vLLM(
        self.model,
        dtype=self.dtype,
        trust_remote_code=self.trust_remote_code,
        quantization=self.quantization,
        revision=self.revision,
        enforce_eager=self.enforce_eager,
        seed=self.seed,
        **self.extra_kwargs,  # type: ignore
    )
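
Because `load` passes `dtype`, `seed` and the other attributes as explicit keywords and then unpacks `**self.extra_kwargs` into the same call, a key in `extra_kwargs` that repeats one of those keywords makes Python raise a `TypeError`. The sketch below reproduces the call shape with `fake_llm`, a hypothetical stand-in that only records its arguments; the `max_model_len` key is used purely as an illustrative extra argument.

```python
from typing import Any, Dict


def fake_llm(model: str, *, dtype: str = "auto", seed: int = 0, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for the `LLM` constructor; it only records its arguments."""
    return {"model": model, "dtype": dtype, "seed": seed, **kwargs}


def build(model: str, seed: int, extra_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    # Same call shape as in `load`: explicit keywords plus unpacked extras.
    return fake_llm(model, dtype="auto", seed=seed, **extra_kwargs)


# Extra keys that do not collide are simply forwarded.
ok = build("intfloat/e5-mistral-7b-instruct", seed=0, extra_kwargs={"max_model_len": 4096})
print(ok["max_model_len"])  # 4096

# Repeating an explicit keyword such as `seed` raises a TypeError.
try:
    build("intfloat/e5-mistral-7b-instruct", seed=0, extra_kwargs={"seed": 1})
    collided = False
except TypeError:
    collided = True
print(collided)  # True
```

Settings that already exist as attributes (`dtype`, `seed`, `revision`, and so on) are therefore best set on the attribute itself, with `extra_kwargs` reserved for everything else.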
unload()

Unloads the vLLM model.

Source code in src/distilabel/embeddings/vllm.py
def unload(self) -> None:
    """Unloads the `vLLM` model."""
    CudaDevicePlacementMixin.unload(self)
    super().unload()
encode(inputs)

Generates embeddings for the provided inputs.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| inputs | `List[str]` | a list of texts for which an embedding has to be generated. | required |

Returns:

| Type | Description |
| --- | --- |
| `List[List[Union[int, float]]]` | The generated embeddings. |

Source code in src/distilabel/embeddings/vllm.py
def encode(self, inputs: List[str]) -> List[List[Union[int, float]]]:
    """Generates embeddings for the provided inputs.

    Args:
        inputs: a list of texts for which an embedding has to be generated.

    Returns:
        The generated embeddings.
    """
    return [output.outputs.embedding for output in self._model.encode(inputs)]
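
The list comprehension in `encode` reads one embedding per result through `output.outputs.embedding`. To show that unpacking step in isolation, the sketch below uses two hypothetical dataclasses, `FakeEmbeddingOutput` and `FakeRequestOutput`, which mimic only the attributes accessed in the source above and are not the real `vllm` result types.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FakeEmbeddingOutput:
    """Stand-in for the object found under `.outputs` (assumption)."""

    embedding: List[float]


@dataclass
class FakeRequestOutput:
    """Stand-in for one item returned by the model's `encode` call (assumption)."""

    outputs: FakeEmbeddingOutput


def extract_embeddings(request_outputs: List[FakeRequestOutput]) -> List[List[float]]:
    # Same comprehension as in the source: one embedding per input, in order.
    return [output.outputs.embedding for output in request_outputs]


fake_results = [
    FakeRequestOutput(FakeEmbeddingOutput([0.1, 0.2])),
    FakeRequestOutput(FakeEmbeddingOutput([0.3, 0.4])),
]
embeddings = extract_embeddings(fake_results)
print(embeddings)  # [[0.1, 0.2], [0.3, 0.4]]
```

The result is a plain list of lists, matching the declared return type `List[List[Union[int, float]]]`.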