chore(pricing): Update vertex-ai pricing by siddharthsambharia-portkey · Pull Request #550 · Portkey-AI/models

siddharthsambharia-portkey · 2026-03-17T12:15:04Z

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

Change Type	Count
➕ Models added	2
🔄 Models updated (merged)	23

➕ New Models

gemini-2.5-pro-tts
gemini-2.5-flash-tts

🔄 Updated Models

gemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite
gemini-2.5-flash-image
gemini-2.0-flash-001
gemini-2.5-flash-preview-09-2025
gemini-2.5-flash-lite-preview-09-2025
gemini-3-pro-preview
gemini-3-flash-preview
gemini-3-pro-image-preview
gemini-3.1-pro-preview
gemini-3.1-flash-lite-preview
gemini-3.1-flash-image-preview
veo-3.1-fast-generate-001
veo-3.0-fast-generate-preview
gemini-embedding-001
gemini-embedding-2-preview
text-embedding-005
text-multilingual-embedding-002
text-embedding-large-exp-03-07
multimodalembedding@001
claude-opus-4-1@20250805
claude-opus-4@20250514

Model → Pricing Page Mapping

Google – Gemini (token pricing, $/1M)

Model ID	Publisher / Section	Source	Notes
`gemini-2.5-pro`	Google – Gemini 2.5 Pro	API	input $1.25, output $10, cache_read $0.31, batch $0.625/$5, web_search 3.5¢, enterprise_web_search 4.5¢
`gemini-2.5-flash`	Google – Gemini 2.5 Flash	API	input $0.30, output $2.50, cache_read $0.075, batch $0.15/$1.25, web_search 3.5¢, enterprise_web_search 4.5¢
`gemini-2.5-flash-lite`	Google – Gemini 2.5 Flash Lite	API	input $0.10, output $0.40, cache_read $0.025, batch $0.05/$0.20, web_search 3.5¢, enterprise_web_search 4.5¢
`gemini-2.5-flash-image`	Google – Gemini 2.5 Flash (image variant)	API	Same token pricing as gemini-2.5-flash + image_token $30/1M
`gemini-2.5-flash-preview-09-2025`	Google – Gemini 2.5 Flash	API	Preview alias — priced same as gemini-2.5-flash
`gemini-2.5-flash-lite-preview-09-2025`	Google – Gemini 2.5 Flash Lite	API	Preview alias — priced same as gemini-2.5-flash-lite
`gemini-2.0-flash-001`	Google – Gemini 2.0 Flash	API	input $0.15, output $0.60, cache_read $0.0375, batch $0.075/$0.30, web_search 3.5¢
`gemini-2.0-flash-lite-001`	Google – Gemini 2.0 Flash Lite	API	input $0.075, output $0.30, web_search 3.5¢
`gemini-3-pro-preview`	Google – Gemini 3 Pro	API	input $2.00, output $12.00, batch $1/$6, web_search 1.4¢, enterprise_web_search 4.5¢
`gemini-3-flash-preview`	Google – Gemini 3 Flash	API	input $0.50, output $3.00, batch $0.25/$1.50, web_search 1.4¢, enterprise_web_search 4.5¢
`gemini-3-pro-image-preview`	Google – Gemini 3 Pro (image variant)	API	Same as gemini-3-pro-preview + image_token $120/1M
`gemini-3.1-pro-preview`	Google – Gemini 3.1 Pro	API	input $2.00, output $12.00, batch $1/$6, web_search 1.4¢, enterprise_web_search 4.5¢
`gemini-3.1-flash-lite-preview`	Google – Gemini 3.1 Flash Lite	API	input $0.25, output $1.50, batch $0.125/$0.75, web_search 1.4¢, enterprise_web_search 4.5¢
`gemini-3.1-flash-image-preview`	Google – Gemini 3.1 Flash (image variant)	API	input $0.25, output $1.50 + image_token $60/1M, web_search 1.4¢
`gemini-2.5-computer-use-preview-10-2025`	Google – Gemini 2.5 (computer use)	API – price not found	No dedicated pricing row; added with price 0
`gemini-2.5-pro-tts`	Google – Gemini 2.5 TTS	API – price not found	TTS model excluded from generative AI pricing page
`gemini-2.5-flash-tts`	Google – Gemini 2.5 TTS	API – price not found	TTS model excluded from generative AI pricing page
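
To make the $/1M unit concrete, here is a minimal sketch of how a per-request cost could be derived from the rates above. The `request_cost` helper and the `PRICES_PER_MILLION` dictionary are hypothetical names for illustration, not part of the repository schema; only the two prices shown are copied from the table.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the Gemini table above.
# The dictionary layout is an assumption for illustration only.
PRICES_PER_MILLION = {
    "gemini-2.5-pro": {"input": Decimal("1.25"), "output": Decimal("10")},
    "gemini-2.5-flash": {"input": Decimal("0.30"), "output": Decimal("2.50")},
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the listed per-1M-token rates."""
    price = PRICES_PER_MILLION[model]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / MILLION


# 10,000 input tokens and 2,000 output tokens on gemini-2.5-pro
cost = request_cost("gemini-2.5-pro", input_tokens=10_000, output_tokens=2_000)
print(cost)
```

Cache, batch, and web-search charges are left out of this sketch; they would be added as further terms with their own rates.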

Google – Imagen (per-image pricing)

Model ID	Publisher / Section	Source	Notes
`imagen-4.0-ultra-generate-001`	Google – Imagen 4.0 Ultra	API	$0.06/image
`imagen-4.0-generate-001`	Google – Imagen 4.0	API	$0.04/image
`imagen-4.0-fast-generate-001`	Google – Imagen 4.0 Fast	API	$0.02/image
`imagen-3.0-generate-002`	Google – Imagen 3.0	API	$0.04/image
`imagen-3.0-capability-001`	Google – Imagen 3.0 (capability)	API	Priced same as imagen-3.0-generate per schema rules ($0.04/image)
`imagen-3.0-capability-002`	Google – Imagen 3.0 (capability)	API	Priced same as imagen-3.0-generate per schema rules ($0.04/image)

Google – Veo (per-second video pricing)

Model ID	Publisher / Section	Source	Notes
`veo-3.1-generate-001`	Google – Veo 3.1	API	$0.20/sec video, 8s default, 1 sample
`veo-3.1-fast-generate-001`	Google – Veo 3.1 Fast	API	$0.10/sec video, 8s default, 1 sample
`veo-3.1-generate-preview`	Google – Veo 3.1	API	Preview alias — priced same as veo-3.1-generate-001
`veo-3.1-fast-generate-preview`	Google – Veo 3.1 Fast	API	Preview alias — priced same as veo-3.1-fast-generate-001
`veo-3.0-generate-001`	Google – Veo 3.0	API	$0.20/sec video, 8s default, 1 sample
`veo-3.0-fast-generate-001`	Google – Veo 3.0 Fast	API	$0.10/sec video, 8s default, 1 sample
`veo-3.0-generate-preview`	Google – Veo 3.0	API	Preview alias — priced same as veo-3.0-generate-001
`veo-3.0-fast-generate-preview`	Google – Veo 3.0 Fast	API	Preview alias — priced same as veo-3.0-fast-generate-001
`veo-2.0-generate-001`	Google – Veo 2.0	API	$0.50/sec video, 8s default, 1 sample
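
As a rough illustration of per-second video pricing, the sketch below multiplies the listed rate by clip length and sample count, using the 8-second, 1-sample defaults noted in the table. The `video_cost` function and the dictionary are hypothetical helpers, not taken from the PR.

```python
from decimal import Decimal

# USD per second of generated video, copied from the Veo table above.
VEO_PRICE_PER_SECOND = {
    "veo-3.1-generate-001": Decimal("0.20"),
    "veo-3.1-fast-generate-001": Decimal("0.10"),
    "veo-2.0-generate-001": Decimal("0.50"),
}


def video_cost(model: str, seconds: int = 8, samples: int = 1) -> Decimal:
    """Return the USD cost of generating `samples` clips of `seconds` each."""
    return VEO_PRICE_PER_SECOND[model] * seconds * samples


# Default request (8 seconds, 1 sample) for each listed model
for model_id in VEO_PRICE_PER_SECOND:
    print(model_id, video_cost(model_id))
```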

Google – Embedding

Model ID	Publisher / Section	Source	Notes
`gemini-embedding-001`	Google – Gemini Embedding	API	$0.00015/1K tokens
`gemini-embedding-2-preview`	Google – Gemini Embedding	API	Preview; priced same as gemini-embedding-001
`text-embedding-005`	Google – Text Embedding	API	$0.000025/1K characters (per_million_characters unit)
`text-multilingual-embedding-002`	Google – Text Multilingual Embedding	API	$0.000025/1K characters
`text-embedding-large-exp-03-07`	Google – Text Embedding Large (experimental)	API	Priced same as gemini-embedding family ($0.00015/1K tokens)
`textembedding-gecko@003`	Google – Legacy Embedding	API – price not found	Legacy model; no dedicated pricing row
`textembedding-gecko-multilingual@001`	Google – Legacy Embedding	API – price not found	Legacy model; no dedicated pricing row
`multimodalembedding@001`	Google – Multimodal Embedding	API	Per-image $0.002¢, per-video-standard $0.002¢
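
The embedding rows quote prices per 1K tokens or characters, while the Gemini table uses $/1M. A small conversion sketch, assuming a simple linear scaling and a hypothetical helper name, shows how the two units relate:

```python
from decimal import Decimal


def per_thousand_to_per_million(price_per_1k: Decimal) -> Decimal:
    """Scale a per-1K price to the equivalent per-1M price (1M = 1000 x 1K)."""
    return price_per_1k * 1000


# gemini-embedding-001: $0.00015 per 1K tokens
print(per_thousand_to_per_million(Decimal("0.00015")))
# text-embedding-005: $0.000025 per 1K characters
print(per_thousand_to_per_million(Decimal("0.000025")))
```

Under this scaling, $0.00015/1K tokens corresponds to $0.15/1M tokens and $0.000025/1K characters to $0.025/1M characters.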

Anthropic – Claude

Model ID	Publisher / Section	Source	Notes
`claude-opus-4-6`	Anthropic – Claude Opus 4.6	API	@default stripped; $5/$25, cache_write $6.25, cache_read $0.50
`claude-sonnet-4-6`	Anthropic – Claude Sonnet 4.6	API	@default stripped; $3/$15, cache_write $3.75, cache_read $0.30
`claude-opus-4-5@20251101`	Anthropic – Claude Opus 4.5	API	Pinned version; $5/$25, cache_write $6.25, cache_read $0.50
`claude-sonnet-4-5@20250929`	Anthropic – Claude Sonnet 4.5	API	Pinned version; $3/$15, cache_write $3.75, cache_read $0.30
`claude-haiku-4-5@20251001`	Anthropic – Claude Haiku 4.5	API	Pinned version; $1/$5, cache_write $1.25, cache_read $0.10
`claude-opus-4-1@20250805`	Anthropic – Claude Opus 4.1	API	Pinned version; $5/$25, cache_write $6.25, cache_read $0.50
`claude-opus-4@20250514`	Anthropic – Claude Opus 4	API	Pinned version; $5/$25, cache_write $6.25, cache_read $0.50
`claude-sonnet-4@20250514`	Anthropic – Claude Sonnet 4	API	Pinned version; $3/$15, cache_write $3.75, cache_read $0.30
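
The Claude rows appear to follow a fixed relationship between the input price and the cache prices. The sketch below computes those ratios for three of the rows; the ratios are inferred from the numbers in the table, not stated in the PR, and `cache_ratios` is a hypothetical helper.

```python
from decimal import Decimal

# (input, cache_write, cache_read) in USD per 1M tokens, copied from the table above.
CLAUDE_ROWS = {
    "claude-opus-4-6": ("5", "6.25", "0.50"),
    "claude-sonnet-4-6": ("3", "3.75", "0.30"),
    "claude-haiku-4-5@20251001": ("1", "1.25", "0.10"),
}


def cache_ratios(input_price: str, cache_write: str, cache_read: str):
    """Return (cache_write / input, cache_read / input) as Decimals."""
    base = Decimal(input_price)
    return Decimal(cache_write) / base, Decimal(cache_read) / base


RATIOS = {model: cache_ratios(*row) for model, row in CLAUDE_ROWS.items()}
for model, (write_ratio, read_ratio) in RATIOS.items():
    print(model, write_ratio, read_ratio)
```

A check of this kind could flag a row whose cache prices drift from the rest of the family, though whether the ratio is guaranteed is an assumption.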

OpenAI – GPT

Model ID	Publisher / Section	Source	Notes
`gpt-oss-120b-maas`	OpenAI – GPT-OSS 120B	API	$0.09/$0.36

OpenAI models excluded (self-deploy / whisper): gpt-4o-self-deploy, gpt-4o-mini-self-deploy, o3-self-deploy, o4-mini-self-deploy, whisper-1

Meta – Llama

Model ID	Publisher / Section	Source	Notes
`llama-3.3-70b-instruct-maas`	Meta – Llama 3.3 70B	API	$0.72/$0.72
`llama-4-maverick-17b-128e-instruct-maas`	Meta – Llama 4 Maverick	API	$0.35/$1.15

Meta models excluded: llama-guard-*, prompt-guard-* (guard models); faster-rcnn-*, retinanet-*, mask-rcnn-*, segment-anything-*, sam3-* (non-generative CV); xlm-roberta-*, roberta-* (non-generative NLP); nllb-* (translation); imagebind-* (non-generative); all self-deploy without -maas suffix

Qwen

Model ID	Publisher / Section	Source	Notes
`qwen3-235b-a22b-instruct-2507-maas`	Qwen – Qwen3 235B	API	$0.22/$0.88
`qwen3-coder-480b-a35b-instruct-maas`	Qwen – Qwen3 Coder 480B	API	$0.22/$1.80
`qwen3-next-80b-a3b-instruct-maas`	Qwen – Qwen3 Next 80B	API	$0.15/$1.20
`qwen3-next-80b-a3b-thinking-maas`	Qwen – Qwen3 Next 80B (thinking)	API	$0.15/$1.20 (same row as instruct)

Qwen excluded: qwen-image (explicit policy); all self-deploy models without -maas suffix

Mistral

Model ID	Publisher / Section	Source	Notes
`mistral-small-2503`	Mistral – Mistral Small	API	$0.10/$0.30
`mistral-medium-3`	Mistral – Mistral Medium	API	$0.40/$2.00
`codestral-2`	Mistral – Codestral 2	API	$0.30/$0.90

Mistral excluded: mistral-ocr-2505 (OCR); codestral-2501-self-deploy, ministral-3, mistral-large-3 (self-deploy without -maas); mistral/mixtral from mistral-ai namespace (self-deploy)

DeepSeek

Model ID	Publisher / Section	Source	Notes
`deepseek-r1-0528-maas`	DeepSeek – DeepSeek R1 0528	API	$1.35/$5.40
`deepseek-v3.1-maas`	DeepSeek – DeepSeek V3.1	API	$0.60/$1.70
`deepseek-v3.2-maas`	DeepSeek – DeepSeek V3.2	API	$0.56/$1.68

DeepSeek excluded: deepseek-ocr-maas (OCR by name); all self-deploy variants (deepseek-r1, deepseek-v3, etc. without -maas)

Kimi / Moonshot

Model ID	Publisher / Section	Source	Notes
`kimi-k2-thinking-maas`	Moonshot – Kimi K2 Thinking	API	$0.60/$2.50

Kimi excluded: kimi-k2, kimi-k2-5 (self-deploy without -maas)

MiniMax

Model ID	Publisher / Section	Source	Notes
`minimax-m2-maas`	MiniMax – MiniMax M2	API	$0.30/$1.20

MiniMax excluded: minimax-m2 (self-deploy without -maas)

ZAI.org / GLM

Model ID	Publisher / Section	Source	Notes
`glm-4.7-maas`	ZAI.org – GLM-4.7	API	$0.60/$2.20
`glm-5-maas`	ZAI.org – GLM-5	API	$1.00/$3.20

ZAI excluded: glm-image (explicit policy); glm-ocr (OCR); glm-4.7, glm-5, glm-4.5 (self-deploy without -maas)

AI21

AI21: jamba-large-1.6 — self-deploy (has_deploy: true, no -maas suffix); excluded per partner rules. No includable models from AI21.
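
The exclusion notes in the partner sections can be read as a small set of filtering rules. The following is a sketch of one possible reading, not the Pricing Agent's actual logic: `is_includable` is a hypothetical function, and the `has_deploy` values passed for models other than jamba-large-1.6 are assumptions, since the PR only states that flag for AI21.

```python
# Model IDs excluded by explicit policy, per the Qwen and ZAI notes above.
EXPLICIT_EXCLUSIONS = {"qwen-image", "glm-image"}

# Guard-model prefixes, per the Meta note above.
GUARD_PREFIXES = ("llama-guard-", "prompt-guard-")


def is_includable(model_id: str, has_deploy: bool) -> bool:
    """Apply the exclusion rules described in the notes (one possible reading)."""
    if model_id in EXPLICIT_EXCLUSIONS:
        return False
    if model_id.startswith(GUARD_PREFIXES):
        return False
    # OCR models are excluded by name, even with a -maas suffix.
    if "ocr" in model_id.split("-"):
        return False
    # Self-deploy models are excluded unless they carry the -maas suffix.
    if has_deploy and not model_id.endswith("-maas"):
        return False
    return True


print(is_includable("glm-5-maas", has_deploy=False))
print(is_includable("jamba-large-1.6", has_deploy=True))
```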

Generated by Pricing Agent on 2026-04-01

siddharthsambharia-portkey added 30 commits March 17, 2026 17:45

chore(pricing): Update vertex-ai pricing

a1a3f5f

chore(pricing): Update vertex-ai pricing

53b3f5d

chore(pricing): Update vertex-ai pricing

52dbf8e

chore(pricing): Update vertex-ai pricing

f19c6a3

chore(pricing): Update vertex-ai pricing

a6e1035

chore(pricing): Update vertex-ai pricing

91c6f2a

chore(pricing): Update vertex-ai pricing

d32f719

chore(pricing): Update vertex-ai pricing

6a7c7e8

chore(pricing): Update vertex-ai pricing

916ddaf

chore(pricing): Update vertex-ai pricing

fa02c68

chore(pricing): Update vertex-ai pricing

7320d33

chore(pricing): Update vertex-ai pricing

3604db1

chore(pricing): Update vertex-ai pricing

d31b801

chore(pricing): Update vertex-ai pricing

a267566

chore(pricing): Update vertex-ai pricing

04933eb

chore(pricing): Update vertex-ai pricing

2dd50e4

chore(pricing): Update vertex-ai pricing

21a3a64

chore(pricing): Update vertex-ai pricing

244cd8b

chore(pricing): Update vertex-ai pricing

623bbde

chore(pricing): Update vertex-ai pricing

c7e7113

chore(pricing): Update vertex-ai pricing

5cda0eb

chore(pricing): Update vertex-ai pricing

3b130f0

chore(pricing): Update vertex-ai pricing

271a047

chore(pricing): Update vertex-ai pricing

8867d9d

chore(pricing): Update vertex-ai pricing

bdf8d15

chore(pricing): Update vertex-ai pricing

23b51be

chore(pricing): Update vertex-ai pricing

81c0fd3

chore(pricing): Update vertex-ai pricing

ebf58b5

chore(pricing): Update vertex-ai pricing

6745bf8

chore(pricing): Update vertex-ai pricing

62dc55d

siddharthsambharia-portkey added 7 commits March 29, 2026 23:43

chore(pricing): Update vertex-ai pricing

6bdcbde

chore(pricing): Update vertex-ai pricing

c92c59c

chore(pricing): Update vertex-ai pricing

bc2a8ec

chore(pricing): Update vertex-ai pricing

94276ac

chore(pricing): Update vertex-ai pricing

6eb1b42

chore(pricing): Update vertex-ai pricing

f1c7a23

chore(pricing): Update vertex-ai pricing

d828299

siddharthsambharia-portkey wants to merge 37 commits into main from pricing-update/vertex-ai
