Merged
5 changes: 4 additions & 1 deletion .kokoro/docker/docs/Dockerfile
@@ -21,7 +21,10 @@ ENV PATH /usr/local/bin:$PATH

# Install dependencies.
 RUN apt-get update \
-  && apt-get install -y --no-install-recommends \
+  && apt-get install -y ca-certificates --fix-missing \
+  && update-ca-certificates
+
+RUN apt-get install -y --no-install-recommends \
     apt-transport-https \
     build-essential \
     ca-certificates \
15 changes: 15 additions & 0 deletions agentplatform/__init__.py
@@ -14,11 +14,26 @@
#
"""The agentplatform module."""

import importlib
from google.cloud.aiplatform import init
from google.cloud.aiplatform import version as aiplatform_version

__version__ = aiplatform_version.__version__


def __getattr__(name):  # type: ignore[no-untyped-def]
    if name == "preview":
        # We need to import carefully to avoid `RecursionError`.
        # This won't work since it causes `RecursionError`:
        # `from agentplatform import preview`
        # This won't work due to Copybara lacking a transform:
        # `import google.cloud.aiplatform.agentplatform.preview as`
        # `agentplatform_preview`
        return importlib.import_module(".preview", __name__)
    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")


__all__ = [
    "init",
    "preview",
]
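
The `__getattr__` hook above relies on PEP 562, which lets a module define a function that Python calls only when normal attribute lookup fails. The sketch below reproduces the pattern on an in-memory package so that it runs anywhere; `demo_pkg`, `demo_pkg.preview`, and `marker` are hypothetical names, not part of this change. It also suggests why `from agentplatform import preview` inside the hook would recurse: a `from` import first looks the name up as an attribute of the package, which would call the hook again.

```python
import importlib
import sys
import types

# Build a throwaway package in memory so the sketch needs no files on disk.
# "demo_pkg" and "demo_pkg.preview" are hypothetical names used only here.
pkg = types.ModuleType("demo_pkg")
pkg.__path__ = []  # marks the module as a package
sub = types.ModuleType("demo_pkg.preview")
sub.marker = "loaded lazily"
sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg.preview"] = sub


def _lazy_getattr(name):
    # Called only when normal attribute lookup on the module fails (PEP 562).
    if name == "preview":
        # Resolve the submodule by its dotted name instead of a `from` import,
        # which would look up the attribute on the package and re-enter here.
        return importlib.import_module(".preview", "demo_pkg")
    raise AttributeError(f"module 'demo_pkg' has no attribute '{name}'")


# Placing the function in the module namespace is what a module-level
# `def __getattr__` does in a real source file.
pkg.__getattr__ = _lazy_getattr

print(pkg.preview.marker)  # resolved through the hook
```

Unknown names still raise `AttributeError`, so `hasattr(pkg, "missing")` keeps behaving as callers expect.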
214 changes: 214 additions & 0 deletions agentplatform/model_garden/README.md
@@ -0,0 +1,214 @@
# Gemini Enterprise Agent Platform Model Garden SDK for Python

The Gemini Enterprise Agent Platform Model Garden SDK helps developers use [Model Garden](https://cloud.google.com/model-garden) open models to build AI-powered features and applications.
The SDK supports use cases such as the following:

- Deploy an open model
- Export open model weights

## Installation

To install the
[google-cloud-aiplatform](https://pypi.org/project/google-cloud-aiplatform/)
Python package, run the following command:

```shell
pip3 install --upgrade --user "google-cloud-aiplatform>=1.84"
```

## Usage

For detailed instructions, see [Deploy an open model](https://cloud.google.com/vertex-ai/generative-ai/docs/model-garden/use-models#deploy_an_open_model) and the [deployment notebook tutorial](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_deployment_tutorial.ipynb).

## Quick Start: Default Deployment

This is the simplest way to deploy a model. If you provide just a model name, the SDK will use the default deployment configuration.

```python
from agentplatform import model_garden

model = model_garden.OpenModel("google/paligemma@paligemma-224-float32")
endpoint = model.deploy()
```

**Use case:** Fast prototyping, first-time users evaluating model outputs.

## List Deployable Models

You can list all models that are currently deployable via Model Garden:

```python
from agentplatform import model_garden

models = model_garden.list_deployable_models()
```

To list only Hugging Face models, or to filter the results by keyword:

```python
models = model_garden.list_deployable_models(list_hf_models=True, model_filter="stable-diffusion")
```

**Use case:** Discover available models before deciding which one to deploy.

## Hugging Face Model Deployment

Deploy a model directly from Hugging Face using the model ID.

```python
model = model_garden.OpenModel("Qwen/Qwen2-1.5B-Instruct")
endpoint = model.deploy()
```

**Use case:** Leverage community or third-party models without custom container setup.

If the model is gated, you may need to provide a Hugging Face access token:

```python
endpoint = model.deploy(hugging_face_access_token="your_hf_token")
```

**Use case:** Deploy gated Hugging Face models requiring authentication.
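
Hard-coding a token in source makes it easy to leak. As a minimal sketch, assuming the token is stored in an environment variable named `HF_TOKEN` (an assumption, not something this SDK requires), the hypothetical helper `build_hf_deploy_kwargs` builds the keyword arguments for `deploy()` and includes `hugging_face_access_token` only when a token is present:

```python
import os


def build_hf_deploy_kwargs(env=None):
    """Return keyword arguments for model.deploy(), adding the token only if set.

    Hypothetical helper; HF_TOKEN is an assumed variable name.
    """
    env = os.environ if env is None else env
    kwargs = {}
    token = env.get("HF_TOKEN", "").strip()
    if token:
        kwargs["hugging_face_access_token"] = token
    return kwargs


# Usage (needs a configured project, so it is left commented out):
# endpoint = model.deploy(**build_hf_deploy_kwargs())
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.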

## List Deployment Configurations

You can inspect available deployment configurations for a model:

```python
model = model_garden.OpenModel("google/paligemma@paligemma-224-float32")
deploy_options = model.list_deploy_options()
```

**Use case:** Evaluate compatible machine specs and containers before deployment.

## Select a Verified Deployment: By Container Image

Specify a container image from the list of verified deployment configurations.

```python
endpoint = model.deploy(
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20250430_0916_RC00_maas",
)
```

## Select a Verified Deployment: By Hardware

Specify a hardware configuration from the list of verified deployment configurations.

```python
endpoint = model.deploy(
    machine_type="a3-highgpu-1g",
    accelerator_type="NVIDIA_H100_80GB",
    accelerator_count=1,
)
```

## Select a Verified Deployment: By Container and Hardware

Specify both a container image and a hardware configuration from the list of verified deployment configurations.

```python
endpoint = model.deploy(
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20250430_0916_RC00_maas",
    machine_type="a3-highgpu-1g",
    accelerator_type="NVIDIA_H100_80GB",
    accelerator_count=1,
)
```

**Use case:** Production configuration, performance tuning, scaling.
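
The three "verified deployment" variants above differ only in which keyword arguments are passed. One way to keep that choice in a single place is sketched below; `VerifiedDeployment` is a hypothetical helper, not part of the SDK, and it uses only the parameter names shown in the examples above. Unset fields are dropped so that `deploy()` can apply its own defaults for them.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class VerifiedDeployment:
    """Hypothetical container for one verified deployment choice."""

    serving_container_image_uri: Optional[str] = None
    machine_type: Optional[str] = None
    accelerator_type: Optional[str] = None
    accelerator_count: Optional[int] = None

    def to_deploy_kwargs(self) -> Dict[str, Any]:
        # Keep only the fields that were explicitly set.
        return {key: value for key, value in asdict(self).items() if value is not None}


by_hardware = VerifiedDeployment(
    machine_type="a3-highgpu-1g",
    accelerator_type="NVIDIA_H100_80GB",
    accelerator_count=1,
)

# Usage (needs a configured project, so it is left commented out):
# endpoint = model.deploy(**by_hardware.to_deploy_kwargs())
```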

## EULA Acceptance

Some models require acceptance of a license agreement. Pass `eula=True` if prompted.

```python
model = model_garden.OpenModel("google/gemma2@gemma-2-27b-it")
endpoint = model.deploy(eula=True)
```

**Use case:** First-time deployment of EULA-protected models.

## Spot VM Deployment

Schedule workloads on Spot VMs for lower cost.

```python
endpoint = model.deploy(spot=True)
```

**Use case:** Cost-sensitive development and batch workloads.

## Fast Tryout Deployment

Enable the experimental fast-deploy path for popular models.

```python
endpoint = model.deploy(fast_tryout_enabled=True)
```

**Use case:** Interactive experimentation without full production setup.

## Dedicated Endpoints

Create a dedicated DNS-isolated endpoint.

```python
endpoint = model.deploy(use_dedicated_endpoint=True)
```

**Use case:** Traffic isolation for enterprise or regulated workloads.

## Reservation Affinity

Use shared or specific Compute Engine reservations.

```python
endpoint = model.deploy(
    reservation_affinity_type="SPECIFIC_RESERVATION",
    reservation_affinity_key="compute.googleapis.com/reservation-name",
    reservation_affinity_values="projects/YOUR_PROJECT/zones/YOUR_ZONE/reservations/YOUR_RESERVATION"
)
```

**Use case:** Optimized resource usage with pre-reserved capacity.
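
The reservation value above follows the pattern `projects/.../zones/.../reservations/...`. A small sketch, using a hypothetical helper named `reservation_path`, assembles that string from its parts and rejects empty or slash-containing IDs, which would otherwise produce a malformed resource name:

```python
def reservation_path(project: str, zone: str, reservation: str) -> str:
    """Build a reservation resource name in the format shown above.

    Hypothetical helper; it only formats the string and does not check
    that the reservation exists.
    """
    parts = (("project", project), ("zone", zone), ("reservation", reservation))
    for label, value in parts:
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty ID without '/': {value!r}")
    return f"projects/{project}/zones/{zone}/reservations/{reservation}"


# Usage (needs a configured project, so it is left commented out):
# endpoint = model.deploy(
#     reservation_affinity_type="SPECIFIC_RESERVATION",
#     reservation_affinity_key="compute.googleapis.com/reservation-name",
#     reservation_affinity_values=reservation_path("my-project", "us-central1-a", "my-reservation"),
# )
```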

## Custom Container Image

Override the default container with a custom image.

```python
endpoint = model.deploy(
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/custom-container:latest"
)
```

**Use case:** Use of custom inference servers or fine-tuned environments.

## Advanced Full Container Configuration

Further customize startup probes, health checks, shared memory, and gRPC ports.

```python
endpoint = model.deploy(
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/custom-container:latest",
    container_command=["python3"],
    container_args=["serve.py"],
    container_ports=[8888],
    container_env_vars={"ENV": "prod"},
    container_predict_route="/predict",
    container_health_route="/health",
    serving_container_shared_memory_size_mb=512,
    serving_container_grpc_ports=[9000],
    serving_container_startup_probe_exec=["/bin/check-start.sh"],
    serving_container_health_probe_exec=["/bin/health-check.sh"]
)
```

**Use case:** Production-grade deployments requiring deep customization of runtime behavior and monitoring.

## Contributing

See [Contributing](https://github.com/googleapis/python-aiplatform/blob/main/CONTRIBUTING.rst) for more information on contributing to the Gemini Enterprise Agent Platform Python SDK.

## License

The contents of this repository are licensed under the [Apache License, version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
27 changes: 27 additions & 0 deletions agentplatform/model_garden/__init__.py
@@ -0,0 +1,27 @@
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

"""Classes and functions for working with Model Garden."""

# We just want to re-export certain classes
# pylint: disable=g-multiple-import,g-importing-member
from agentplatform.model_garden import _model_garden

OpenModel = _model_garden.OpenModel
PartnerModel = _model_garden.PartnerModel
list_deployable_models = _model_garden.list_deployable_models
list_models = _model_garden.list_models

__all__ = ("OpenModel", "PartnerModel", "list_deployable_models", "list_models")