VertexAILLM
Bases: AsyncLLM
VertexAI LLM implementation running the async API clients for Gemini.
- Gemini API: https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/gemini
To use the VertexAILLM you must first configure Google Cloud authentication
using one of these methods:
- Setting the `GOOGLE_CLOUD_CREDENTIALS` environment variable
- Running the `gcloud auth application-default login` command
- Calling the `vertexai.init` function from the `google-cloud-aiplatform` library
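For orientation, here is a minimal usage sketch. It assumes the class is importable as `distilabel.llms.VertexAILLM` (consistent with the source path referenced on this page) and that Application Default Credentials are already configured; any constructor argument beyond `model` is not documented here.

```python
# Minimal sketch, assuming `VertexAILLM` is exported from `distilabel.llms`
# (suggested by the source path src/distilabel/llms/vertexai.py) and that
# Google Cloud auth is already set up, e.g. via:
#   gcloud auth application-default login

from distilabel.llms import VertexAILLM

llm = VertexAILLM(model="gemini-1.0-pro")  # model name as in the Attributes table
llm.load()  # instantiates the async GenerativeModel client (see `load` below)

print(llm.model_name)  # -> "gemini-1.0-pro"
```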
Attributes:

| Name | Type | Description |
|---|---|---|
| `model` | `str` | the model name to use for the LLM, e.g. "gemini-1.0-pro". Supported models. |
| `_aclient` | `Optional[GenerativeModel]` | the `GenerativeModel` client used for async requests; set in the `load` method. |
Source code in src/distilabel/llms/vertexai.py
`model_name: str` (property)
Returns the model name used for the LLM.
`agenerate(input, temperature=None, top_p=None, top_k=None, max_output_tokens=None, stop_sequences=None, safety_settings=None, tools=None)` (async)
Generates `num_generations` responses for the given input using the VertexAI async client definition (a usage sketch follows the tables below).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `input` | `StandardInput` | a single input in chat format to generate responses for. | required |
| `temperature` | `Optional[float]` | Controls the randomness of predictions. Range: [0.0, 1.0]. Defaults to `None`. | `None` |
| `top_p` | `Optional[float]` | If specified, nucleus sampling will be used. Range: (0.0, 1.0]. Defaults to `None`. | `None` |
| `top_k` | `Optional[int]` | If specified, top-k sampling will be used. Defaults to `None`. | `None` |
| `max_output_tokens` | `Optional[int]` | The maximum number of output tokens to generate per message. Defaults to `None`. | `None` |
| `stop_sequences` | `Optional[List[str]]` | A list of stop sequences. Defaults to `None`. | `None` |
| `safety_settings` | `Optional[Dict[str, Any]]` | Safety configuration for returned content from the API. Defaults to `None`. | `None` |
| `tools` | `Optional[List[Dict[str, Any]]]` | A potential list of tools that can be used by the API. Defaults to `None`. | `None` |

Returns:

| Type | Description |
|---|---|
| `GenerateOutput` | A list of lists of strings containing the generated responses for each input. |
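A hedged example of calling `agenerate` directly. It assumes `load()` has been called first and that `StandardInput` is a list of role/content chat messages; both the import path and the exact message shape are assumptions based on "chat format" above, not confirmed by this page.

```python
import asyncio

from distilabel.llms import VertexAILLM  # import path assumed from the source location above

async def main() -> None:
    llm = VertexAILLM(model="gemini-1.0-pro")
    llm.load()  # must run before `agenerate` so the async client exists

    # "chat format" is assumed here to mean role/content message dicts.
    generations = await llm.agenerate(
        input=[{"role": "user", "content": "Give me three synonyms for 'fast'."}],
        temperature=0.7,
        top_p=0.95,
        max_output_tokens=128,
    )
    print(generations)

asyncio.run(main())
```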
Source code in src/distilabel/llms/vertexai.py
`load()`
Loads the `GenerativeModel` class, which exposes `generate_content_async` so requests can be made asynchronously.
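For context, a minimal sketch of what this loading amounts to with the `google-cloud-aiplatform` SDK directly. The `vertexai.init`, `GenerativeModel`, and `generate_content_async` calls are the public Vertex AI API; treating this as equivalent to what `load()` does internally is an assumption based on the description above.

```python
import asyncio

import vertexai
from vertexai.generative_models import GenerativeModel

# Roughly what `load()` is described as doing: build the client object that
# exposes `generate_content_async`. Project/location values are placeholders.
vertexai.init(project="my-gcp-project", location="us-central1")
model = GenerativeModel("gemini-1.0-pro")

async def main() -> None:
    response = await model.generate_content_async("Say hello in French.")
    print(response.text)

asyncio.run(main())
```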