Vertexai
VertexaiLLM¶
VertexAILLM
¶
Bases: AsyncLLM
VertexAI LLM implementation running the async API clients for Gemini.
- Gemini API: https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/gemini
To use the VertexAILLM
is necessary to have configured the Google Cloud authentication
using one of these methods:
- Setting
GOOGLE_CLOUD_CREDENTIALS
environment variable - Using
gcloud auth application-default login
command - Using
vertexai.init
function from thegoogle-cloud-aiplatform
library
Attributes:
Name | Type | Description |
---|---|---|
model |
str
|
the model name to use for the LLM e.g. "gemini-1.0-pro". Supported models. |
_aclient |
Optional[GenerativeModel]
|
the |
Source code in src/distilabel/llms/vertexai.py
39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 |
|
model_name: str
property
¶
Returns the model name used for the LLM.
agenerate(input, num_generations=1, temperature=None, top_p=None, top_k=None, max_output_tokens=None, stop_sequences=None, safety_settings=None, tools=None)
async
¶
Generates num_generations
responses for the given input using the VertexAI async client definition.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input |
ChatType
|
a single input in chat format to generate responses for. |
required |
num_generations |
int
|
the number of generations to create per input. Defaults to
|
1
|
temperature |
Optional[float]
|
Controls the randomness of predictions. Range: [0.0, 1.0]. Defaults to |
None
|
top_p |
Optional[float]
|
If specified, nucleus sampling will be used. Range: (0.0, 1.0]. Defaults to |
None
|
top_k |
Optional[int]
|
If specified, top-k sampling will be used. Defaults to |
None
|
max_output_tokens |
Optional[int]
|
The maximum number of output tokens to generate per message. Defaults to |
None
|
stop_sequences |
Optional[List[str]]
|
A list of stop sequences. Defaults to |
None
|
safety_settings |
Optional[Dict[str, Any]]
|
Safety configuration for returned content from the API. Defaults to |
None
|
tools |
Optional[List[Dict[str, Any]]]
|
A potential list of tools that can be used by the API. Defaults to |
None
|
Returns:
Type | Description |
---|---|
GenerateOutput
|
A list of lists of strings containing the generated responses for each input. |
Source code in src/distilabel/llms/vertexai.py
load()
¶
Loads the GenerativeModel
class which has access to generate_content_async
to benefit from async requests.