endpoints
Creates, updates, deletes, gets or lists an endpoints resource.
Overview
Name | endpoints |
Type | Resource |
Id | google.aiplatform.endpoints |
Fields
Name | Datatype | Description |
---|---|---|
name | string | Output only. The resource name of the Endpoint. |
description | string | The description of the Endpoint. |
createTime | string | Output only. Timestamp when this Endpoint was created. |
dedicatedEndpointDns | string | Output only. DNS of the dedicated endpoint. Will only be populated if dedicated_endpoint_enabled is true. Format: https://{endpoint_id}.{region}-{project_number}.prediction.vertexai.goog . |
dedicatedEndpointEnabled | boolean | If true, the endpoint is exposed through a dedicated DNS name [Endpoint.dedicated_endpoint_dns]. Requests to the dedicated DNS are isolated from other users' traffic and have better performance and reliability. Note: Once you enable the dedicated endpoint, you can no longer send requests to the shared DNS {region}-aiplatform.googleapis.com. This limitation will be removed soon. |
deployedModels | array | Output only. The models deployed in this Endpoint. To add or remove DeployedModels use EndpointService.DeployModel and EndpointService.UndeployModel respectively. |
displayName | string | Required. The display name of the Endpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters. |
enablePrivateServiceConnect | boolean | Deprecated: If true, expose the Endpoint via private service connect. Only one of the fields, network or enable_private_service_connect, can be set. |
encryptionSpec | object | Represents a customer-managed encryption key spec that can be applied to a top-level resource. |
etag | string | Used to perform consistent read-modify-write updates. If not set, a blind "overwrite" update happens. |
labels | object | The labels with user-defined metadata to organize your Endpoints. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information and examples of labels. |
modelDeploymentMonitoringJob | string | Output only. Resource name of the Model Monitoring job associated with this Endpoint if monitoring is enabled by JobService.CreateModelDeploymentMonitoringJob. Format: projects/{project}/locations/{location}/modelDeploymentMonitoringJobs/{model_deployment_monitoring_job} |
network | string | Optional. The full name of the Google Compute Engine network to which the Endpoint should be peered. Private services access must already be configured for the network. If left unspecified, the Endpoint is not peered with any network. Only one of the fields, network or enable_private_service_connect, can be set. Format: projects/{project}/global/networks/{network} . Where {project} is a project number, as in 12345 , and {network} is network name. |
predictRequestResponseLoggingConfig | object | Configuration for logging request-response to a BigQuery table. |
privateServiceConnectConfig | object | Represents configuration for private service connect. |
satisfiesPzi | boolean | Output only. Reserved for future use. |
satisfiesPzs | boolean | Output only. Reserved for future use. |
trafficSplit | object | A map from a DeployedModel's ID to the percentage of this Endpoint's traffic that should be forwarded to that DeployedModel. If a DeployedModel's ID is not listed in this map, it receives no traffic. The traffic percentage values must add up to 100, or the map must be empty if the Endpoint is not to accept any traffic at the moment. |
updateTime | string | Output only. Timestamp when this Endpoint was last updated. |
Methods
Name | Accessible by | Required Params | Description |
---|---|---|---|
get | SELECT | endpointsId, locationsId, projectsId | Gets an Endpoint. |
list | SELECT | locationsId, projectsId | Lists Endpoints in a Location. |
create | INSERT | locationsId, projectsId | Creates an Endpoint. |
delete | DELETE | endpointsId, locationsId, projectsId | Deletes an Endpoint. |
patch | UPDATE | endpointsId, locationsId, projectsId | Updates an Endpoint. |
compute_tokens | EXEC | endpointsId, locationsId, projectsId | Return a list of tokens based on the input text. |
count_tokens | EXEC | endpointsId, locationsId, projectsId | Perform token counting. |
deploy_model | EXEC | endpointsId, locationsId, projectsId | Deploys a Model into this Endpoint, creating a DeployedModel within it. |
direct_predict | EXEC | endpointsId, locationsId, projectsId | Perform a unary online prediction request to a gRPC model server for Vertex first-party products and frameworks. |
direct_raw_predict | EXEC | endpointsId, locationsId, projectsId | Perform a unary online prediction request to a gRPC model server for custom containers. |
explain | EXEC | endpointsId, locationsId, projectsId | Perform an online explanation. If deployed_model_id is specified, the corresponding DeployedModel must have explanation_spec populated. If deployed_model_id is not specified, all DeployedModels must have explanation_spec populated. |
generate_content | EXEC | endpointsId, locationsId, projectsId | Generate content with multimodal inputs. |
mutate_deployed_model | EXEC | endpointsId, locationsId, projectsId | Updates an existing deployed model. Updatable fields include min_replica_count , max_replica_count , autoscaling_metric_specs , disable_container_logging (v1 only), and enable_container_logging (v1beta1 only). |
predict | EXEC | endpointsId, locationsId, projectsId | Perform an online prediction. |
raw_predict | EXEC | endpointsId, locationsId, projectsId | Perform an online prediction with an arbitrary HTTP payload. The response includes the following HTTP headers: X-Vertex-AI-Endpoint-Id : ID of the Endpoint that served this prediction. X-Vertex-AI-Deployed-Model-Id : ID of the Endpoint's DeployedModel that served this prediction. |
server_streaming_predict | EXEC | endpointsId, locationsId, projectsId | Perform a server-side streaming online prediction request for Vertex LLM streaming. |
stream_generate_content | EXEC | endpointsId, locationsId, projectsId | Generate content with multimodal inputs with streaming support. |
stream_raw_predict | EXEC | endpointsId, locationsId, projectsId | Perform a streaming online prediction with an arbitrary HTTP payload. |
undeploy_model | EXEC | endpointsId, locationsId, projectsId | Undeploys a Model from an Endpoint, removing a DeployedModel from it, and freeing all resources it's using. |
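The methods marked EXEC above are not reachable through SELECT, INSERT, UPDATE or DELETE. The following is a minimal sketch of calling predict, assuming StackQL's EXEC statement with `@name = value` for the required parameters and `@@json` for the request body. The `instances` payload is a hypothetical example, and its actual shape depends on the model deployed to the Endpoint.

```sql
-- Sketch only: the request body below is illustrative, not taken from this page.
EXEC google.aiplatform.endpoints.predict
@endpointsId = '{{ endpointsId }}',
@locationsId = '{{ locationsId }}',
@projectsId = '{{ projectsId }}'
@@json = '{
  "instances": [
    {"feature_a": 1.0, "feature_b": "x"}
  ]
}';
```

The other EXEC methods take the same three required parameters, so the same pattern should apply with a different method name and request body.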
SELECT
examples
Lists Endpoints in a Location.
SELECT
name,
description,
createTime,
dedicatedEndpointDns,
dedicatedEndpointEnabled,
deployedModels,
displayName,
enablePrivateServiceConnect,
encryptionSpec,
etag,
labels,
modelDeploymentMonitoringJob,
network,
predictRequestResponseLoggingConfig,
privateServiceConnectConfig,
satisfiesPzi,
satisfiesPzs,
trafficSplit,
updateTime
FROM google.aiplatform.endpoints
WHERE locationsId = '{{ locationsId }}'
AND projectsId = '{{ projectsId }}';
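The query above uses the list method. To fetch a single Endpoint, the get method additionally requires endpointsId. A minimal sketch, assuming that supplying endpointsId in the WHERE clause is what selects get over list, and projecting only a few of the fields documented above:

```sql
-- Gets one Endpoint; all three required parameters of the get method are supplied.
SELECT
name,
displayName,
deployedModels,
trafficSplit,
updateTime
FROM google.aiplatform.endpoints
WHERE endpointsId = '{{ endpointsId }}'
AND locationsId = '{{ locationsId }}'
AND projectsId = '{{ projectsId }}';
```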
INSERT
example
Use the following StackQL query and manifest file to create a new endpoints resource.
- All Properties
- Manifest
/*+ create */
INSERT INTO google.aiplatform.endpoints (
locationsId,
projectsId,
etag,
enablePrivateServiceConnect,
encryptionSpec,
network,
privateServiceConnectConfig,
displayName,
description,
dedicatedEndpointEnabled,
trafficSplit,
predictRequestResponseLoggingConfig,
labels
)
SELECT
'{{ locationsId }}',
'{{ projectsId }}',
'{{ etag }}',
{{ enablePrivateServiceConnect }},
'{{ encryptionSpec }}',
'{{ network }}',
'{{ privateServiceConnectConfig }}',
'{{ displayName }}',
'{{ description }}',
{{ dedicatedEndpointEnabled }},
'{{ trafficSplit }}',
'{{ predictRequestResponseLoggingConfig }}',
'{{ labels }}'
;
- name: your_resource_model_name
  props:
    - name: etag
      value: string
    - name: enablePrivateServiceConnect
      value: boolean
    - name: encryptionSpec
      value:
        - name: kmsKeyName
          value: string
    - name: satisfiesPzi
      value: boolean
    - name: satisfiesPzs
      value: boolean
    - name: network
      value: string
    - name: privateServiceConnectConfig
      value:
        - name: serviceAttachment
          value: string
        - name: projectAllowlist
          value:
            - string
        - name: enablePrivateServiceConnect
          value: boolean
    - name: displayName
      value: string
    - name: dedicatedEndpointDns
      value: string
    - name: updateTime
      value: string
    - name: description
      value: string
    - name: dedicatedEndpointEnabled
      value: boolean
    - name: createTime
      value: string
    - name: modelDeploymentMonitoringJob
      value: string
    - name: name
      value: string
    - name: trafficSplit
      value: object
    - name: deployedModels
      value:
        - - name: displayName
            value: string
          - name: modelVersionId
            value: string
          - name: model
            value: string
          - name: createTime
            value: string
          - name: disableExplanations
            value: boolean
          - name: privateEndpoints
            value:
              - name: serviceAttachment
                value: string
              - name: predictHttpUri
                value: string
              - name: healthHttpUri
                value: string
              - name: explainHttpUri
                value: string
          - name: disableContainerLogging
            value: boolean
          - name: explanationSpec
            value:
              - name: metadata
                value:
                  - name: latentSpaceSource
                    value: string
                  - name: featureAttributionsSchemaUri
                    value: string
                  - name: outputs
                    value: object
                  - name: inputs
                    value: object
              - name: parameters
                value:
                  - name: integratedGradientsAttribution
                    value:
                      - name: blurBaselineConfig
                        value:
                          - name: maxBlurSigma
                            value: number
                      - name: smoothGradConfig
                        value:
                          - name: featureNoiseSigma
                            value:
                              - name: noiseSigma
                                value:
                                  - - name: name
                                      value: string
                                    - name: sigma
                                      value: number
                          - name: noisySampleCount
                            value: integer
                          - name: noiseSigma
                            value: number
                      - name: stepCount
                        value: integer
                  - name: topK
                    value: integer
                  - name: outputIndices
                    value:
                      - any
                  - name: sampledShapleyAttribution
                    value:
                      - name: pathCount
                        value: integer
                  - name: xraiAttribution
                    value:
                      - name: stepCount
                        value: integer
                  - name: examples
                    value:
                      - name: nearestNeighborSearchConfig
                        value: any
                      - name: neighborCount
                        value: integer
                      - name: exampleGcsSource
                        value:
                          - name: gcsSource
                            value:
                              - name: uris
                                value:
                                  - string
                          - name: dataFormat
                            value: string
                      - name: presets
                        value:
                          - name: modality
                            value: string
                          - name: query
                            value: string
          - name: automaticResources
            value:
              - name: minReplicaCount
                value: integer
              - name: maxReplicaCount
                value: integer
          - name: enableAccessLogging
            value: boolean
          - name: sharedResources
            value: string
          - name: serviceAccount
            value: string
          - name: id
            value: string
          - name: dedicatedResources
            value:
              - name: machineSpec
                value:
                  - name: acceleratorCount
                    value: integer
                  - name: tpuTopology
                    value: string
                  - name: machineType
                    value: string
                  - name: acceleratorType
                    value: string
                  - name: reservationAffinity
                    value:
                      - name: reservationAffinityType
                        value: string
                      - name: values
                        value:
                          - string
                      - name: key
                        value: string
              - name: autoscalingMetricSpecs
                value:
                  - - name: target
                      value: integer
                    - name: metricName
                      value: string
              - name: maxReplicaCount
                value: integer
              - name: minReplicaCount
                value: integer
              - name: spot
                value: boolean
    - name: predictRequestResponseLoggingConfig
      value:
        - name: samplingRate
          value: number
        - name: enabled
          value: boolean
        - name: bigqueryDestination
          value:
            - name: outputUri
              value: string
    - name: labels
      value: object
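Of the writable properties, only displayName is marked Required in the Fields table. A minimal sketch of a smaller INSERT, assuming the optional columns can simply be left out of the column list. The description and label values are hypothetical, and the labels are written as a JSON string that follows the documented label rules (lowercase letters, digits, underscores and dashes, at most 64 characters).

```sql
/*+ create */
-- Sketch only: description and labels are illustrative values.
INSERT INTO google.aiplatform.endpoints (
locationsId,
projectsId,
displayName,
description,
labels
)
SELECT
'{{ locationsId }}',
'{{ projectsId }}',
'{{ displayName }}',
'Endpoint created with StackQL',
'{"env": "dev", "team": "ml-platform"}'
;
```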
UPDATE
example
Updates an endpoints resource.
/*+ update */
UPDATE google.aiplatform.endpoints
SET
etag = '{{ etag }}',
enablePrivateServiceConnect = {{ enablePrivateServiceConnect }},
encryptionSpec = '{{ encryptionSpec }}',
network = '{{ network }}',
privateServiceConnectConfig = '{{ privateServiceConnectConfig }}',
displayName = '{{ displayName }}',
description = '{{ description }}',
dedicatedEndpointEnabled = {{ dedicatedEndpointEnabled }},
trafficSplit = '{{ trafficSplit }}',
predictRequestResponseLoggingConfig = '{{ predictRequestResponseLoggingConfig }}',
labels = '{{ labels }}'
WHERE
endpointsId = '{{ endpointsId }}'
AND locationsId = '{{ locationsId }}'
AND projectsId = '{{ projectsId }}';
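As a worked case of the trafficSplit rule from the Fields table (percentages keyed by DeployedModel ID that must add up to 100), the sketch below shifts traffic to an 80/20 split. The two DeployedModel IDs are hypothetical and would have to match models already deployed to the Endpoint. This page does not say whether a partial SET list is accepted or whether the underlying patch call needs an update mask, so treat that part as an unverified assumption.

```sql
/*+ update */
-- Sketch only: '1111111111' and '2222222222' are placeholder DeployedModel IDs.
-- 80 + 20 = 100, as the trafficSplit description requires.
UPDATE google.aiplatform.endpoints
SET
trafficSplit = '{"1111111111": 80, "2222222222": 20}'
WHERE
endpointsId = '{{ endpointsId }}'
AND locationsId = '{{ locationsId }}'
AND projectsId = '{{ projectsId }}';
```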
DELETE
example
Deletes the specified endpoints
resource.
/*+ delete */
DELETE FROM google.aiplatform.endpoints
WHERE endpointsId = '{{ endpointsId }}'
AND locationsId = '{{ locationsId }}'
AND projectsId = '{{ projectsId }}';