For introductory information on indexes, see Understanding indexes.
When you are working with dense embedding vectors, you must specify the dimension of the vectors you expect to store at the time your index is created. For sparse vectors, which represent data where most values are zero, you omit dimension and must specify vector_type="sparse".
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric,
    VectorType
)

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.create_index(
    name='index-for-dense-vectors',
    dimension=1536,
    metric=Metric.COSINE,
    # vector_type="dense" is the default value, so it can be omitted if you prefer
    vector_type=VectorType.DENSE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_WEST_2
    ),
)
pc.create_index(
    name='index-for-sparse-vectors',
    metric=Metric.DOTPRODUCT,
    vector_type=VectorType.SPARSE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_WEST_2
    ),
)

See the available cloud regions page for the most up-to-date information on which cloud regions are available.
The following example creates a serverless index in the us-west-2 region of AWS. For more information on serverless and regional availability, see Understanding indexes.
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric,
    VectorType
)

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_WEST_2
    ),
    vector_type=VectorType.DENSE
)

The following example creates a serverless index in the us-central1 region of GCP. For more information on serverless and regional availability, see Understanding indexes.
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    GcpRegion,
    Metric
)

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.GCP,
        region=GcpRegion.US_CENTRAL1
    )
)

The following example creates a serverless index on Azure. For more information on serverless and regional availability, see Understanding indexes.
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AzureRegion,
    Metric
)

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AZURE,
        region=AzureRegion.EASTUS2
    )
)

You can configure the read capacity mode for your serverless index. By default, indexes are created with OnDemand mode. You can also specify Dedicated mode with dedicated read nodes.
Dedicated mode allocates dedicated read nodes for your workload. You must specify node_type, the scaling strategy, and the corresponding scaling configuration.
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    GcpRegion,
    Metric
)

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.GCP,
        region=GcpRegion.US_CENTRAL1,
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {
                    "shards": 2,
                    "replicas": 2
                }
            }
        }
    )
)

You can change the read capacity configuration of an existing serverless index using configure_index. This allows you to:
- Switch between OnDemand and Dedicated modes
- Adjust the number of shards and replicas for Dedicated mode with manual scaling
from pinecone import Pinecone

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

# Switch to OnDemand read capacity
pc.configure_index(
    name='my-index',
    read_capacity={"mode": "OnDemand"}
)

# Switch to Dedicated read capacity with manual scaling
pc.configure_index(
    name='my-index',
    read_capacity={
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {
                "shards": 3,
                "replicas": 2
            }
        }
    }
)

# Scale up by increasing shards and replicas
pc.configure_index(
    name='my-index',
    read_capacity={
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {
                "shards": 4,
                "replicas": 3
            }
        }
    }
)

When you change the read capacity configuration, the index transitions to the new configuration. You can use describe_index to check the status of the transition.
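To wait for a transition to finish, you can poll describe_index until the index reports ready. The helper below is a hedged sketch, not part of the SDK: the wait_until_ready name is hypothetical, and it assumes the description exposes a dict-style ["status"]["ready"] flag. It takes the describe callable as a parameter, so you can pass pc.describe_index or any function with the same shape.

```python
import time

def wait_until_ready(describe, name, timeout=300, interval=5):
    """Poll describe(name) until the index's status reports ready.

    `describe` is any callable returning a dict-like description with a
    ["status"]["ready"] flag (assumed shape), e.g. pc.describe_index.
    Raises TimeoutError if the index is not ready within `timeout` seconds.
    """
    deadline = time.time() + timeout
    while True:
        desc = describe(name)
        if desc["status"]["ready"]:
            return desc
        if time.time() >= deadline:
            raise TimeoutError(f"index {name!r} not ready after {timeout}s")
        time.sleep(interval)
```

For example, after a configure_index call you might run wait_until_ready(pc.describe_index, 'my-index') before sending queries.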
You can configure which metadata fields are filterable by specifying a metadata schema. By default, all metadata fields are indexed. However, large amounts of metadata can cause slower index building as well as slower query execution, particularly when data is not cached in a query executor's memory and local SSD and must be fetched from object storage.
To prevent performance issues due to excessive metadata, you can limit metadata indexing to the fields that you plan to use for query filtering. When you specify a metadata schema, only fields marked as filterable: True are indexed and can be used in filters.
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric
)

pc = Pinecone(api_key='<<PINECONE_API_KEY>>')

pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_WEST_2,
        schema={
            "genre": {"filterable": True},
            "year": {"filterable": True},
            "description": {"filterable": True}
        }
    )
)

See shared index actions to learn how to manage the lifecycle of your index after it is created.