blog: Gemini 3.5 Flash deep-dive benchmark and capability review #3014

Merged: atharvadeosthale merged 8 commits into main from gemini-3-5-flash on May 20, 2026
Conversation

@atharvadeosthale
Member

New blog post evaluating Gemini 3.5 Flash against Google's model card, Artificial Analysis numbers, and Appwrite Arena results. Author: atharva.

@appwrite
appwrite Bot commented May 20, 2026

Appwrite Website

Project ID: 69d7efb00023389e8d27

Sites (1)
Site: website (69d7f2670014e24571ca)
Status: Ready

Website (appwrite/website)

Project ID: 684969cb000a2f6c0a02

Sites (1)
Site: website (68496a17000f03d62013)
Status: Queued


Tip

Preview deployments create instant URLs for every branch and commit

@greptile-apps
Contributor

greptile-apps Bot commented May 20, 2026

Greptile Summary

This PR adds a new blog post evaluating Gemini 3.5 Flash using Google's model card, Artificial Analysis leaderboard data, and Appwrite Arena benchmark results, along with a cover image and a corresponding .optimize-cache.json entry.

  • The article's internal numbers are self-consistent after what appears to be a prior revision pass: the MCP Atlas margin, Intelligence Index value, and pricing comparison are all cross-verified against the inline benchmark tables.
  • The cache entry added to .optimize-cache.json is keyed to cover.png, but the committed file is cover.avif; the phantom .png entry will persist in the cache until manually cleaned.

Confidence Score: 5/5

Safe to merge; this is a new blog post with no changes to application logic

All three files are additive: a new markdoc post, a cover image, and a cache entry. The article's benchmark numbers cross-check cleanly against its own inline tables after prior revision. The only residual issue is the stale cache key referencing cover.png instead of cover.avif, which has no runtime impact on readers.

.optimize-cache.json has a phantom cover.png entry that does not match the committed cover.avif
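A stale key like this can be caught mechanically. The sketch below is only an illustration: it assumes the cache is a JSON object keyed by image path, which this thread does not confirm, and the function name, paths, and placeholder values are hypothetical.

```python
def stale_cache_keys(cache: dict, committed_files: set) -> list:
    """Return cache keys that do not correspond to any committed file."""
    return sorted(key for key in cache if key not in committed_files)


# Hypothetical cache contents; the real .optimize-cache.json schema is assumed.
cache = {
    "static/images/blog/gemini-3-5-flash-deep-dive/cover.png": "placeholder-hash",
}
# The image that was actually committed in this PR.
committed_files = {
    "static/images/blog/gemini-3-5-flash-deep-dive/cover.avif",
}

stale = stale_cache_keys(cache, committed_files)
print(stale)
```

With these inputs the only key reported is the cover.png entry, matching the mismatch flagged above.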

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/routes/blog/post/gemini-3-5-flash-deep-dive/+page.markdoc | New blog post evaluating Gemini 3.5 Flash; numerical claims within the article are internally self-consistent after prior revision rounds |
| .optimize-cache.json | Adds a cache entry keyed to cover.png, which does not exist; the committed image is cover.avif |
| static/images/blog/gemini-3-5-flash-deep-dive/cover.avif | Binary cover image in AVIF format; no issues |

Reviews (7): Last reviewed commit: "blog(gemini-3-5-flash): correct vs 3.1 P..."

Comment thread src/routes/blog/post/gemini-3-5-flash-deep-dive/+page.markdoc
Comment thread .optimize-cache.json
Gemini 3.5 Flash is the fastest frontier-class peer at 278 tok/s; gpt-oss-120b (high) at 246 is the next closest, not faster.
Comment thread src/routes/blog/post/gemini-3-5-flash-deep-dive/+page.markdoc Outdated
GPT 5.5 parenthetical reversed (90.0 to 94.8 = +4.8). Claude Opus 4.7 delta is -0.6, not +0.6: Skills reduced its freeform score from 94.8 to 94.2.
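For reference, the sign convention behind this fix can be checked with a few lines of Python. This is a minimal sketch, not the post's own tooling: the helper name is hypothetical, and it assumes the first number in each pair is the score before Skills and the second is the score after.

```python
def score_delta(before: float, after: float) -> float:
    """Signed change from `before` to `after`, rounded to one decimal place."""
    return round(after - before, 1)


# Score pairs quoted in the commit message above (before/after order assumed).
gpt_delta = score_delta(90.0, 94.8)   # improvement, so the delta is positive
opus_delta = score_delta(94.8, 94.2)  # regression, so the delta is negative

print(gpt_delta, opus_delta)
```

Rounding to one decimal place avoids floating-point noise such as 4.799999999999997 when the result is compared against the published figures.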
Comment thread src/routes/blog/post/gemini-3-5-flash-deep-dive/+page.markdoc Outdated
- MCP Atlas margin over 3.1 Pro: 5.4 points, not 4.5 (83.6 - 78.2).
- GPT-5.5 (xhigh) speed: 65 tok/s, matching the SOTA table and AA summary.
- Realtime score qualified as MCQ Realtime (94.1%); overall is 94.0.
- Reframed Flash Lite reference: it scores 88.3, below the 90-point top tier.
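The corrections in this list reduce to simple comparisons against the table values. The sketch below reproduces them under stated assumptions: 83.6 is taken to be the 3.5 Flash MCP Atlas score and 78.2 the 3.1 Pro score, and the variable and function names are illustrative only.

```python
def margin(a: float, b: float) -> float:
    """Difference a - b in points, rounded to one decimal place."""
    return round(a - b, 1)


TOP_TIER_THRESHOLD = 90.0  # the "90-point top tier" mentioned above

# MCP Atlas: assumed 3.5 Flash (83.6) versus 3.1 Pro (78.2).
mcp_margin = margin(83.6, 78.2)

# MCQ Realtime (94.1) versus the overall score (94.0).
realtime_gap = margin(94.1, 94.0)

# Flash Lite scores 88.3, so it should fall below the top tier.
flash_lite_in_top_tier = 88.3 >= TOP_TIER_THRESHOLD

print(mcp_margin, realtime_gap, flash_lite_in_top_tier)
```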
Comment thread src/routes/blog/post/gemini-3-5-flash-deep-dive/+page.markdoc Outdated
…al cost

Intelligence Index bullet uses 55.3 to match the SOTA table.
Eval cost bullet uses $1,552 to match the table and downstream prose.
Frontmatter unlisted: true was copied from a style-reference post without authorization. Removing so the post appears on the blog index.
Comment thread src/routes/blog/post/gemini-3-5-flash-deep-dive/+page.markdoc Outdated
3.5 Flash is 25% cheaper per token than 3.1 Pro on both input and output ($1.50/$2.00 and $9.00/$12.00 = 0.75), not 40%.
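The 25% figure follows directly from the price ratios. A minimal sketch, assuming the first pair in the message is input pricing and the second is output pricing, with the 3.5 Flash price listed first in each pair; the helper name is hypothetical.

```python
def percent_cheaper(new_price: float, old_price: float) -> float:
    """Discount of `new_price` relative to `old_price`, as a percentage."""
    return round((1 - new_price / old_price) * 100, 1)


# Prices quoted in the commit message (3.5 Flash first, 3.1 Pro second; assumed).
input_discount = percent_cheaper(1.50, 2.00)
output_discount = percent_cheaper(9.00, 12.00)

print(input_discount, output_discount)
```

Both ratios equal 0.75, so the discount is the same 25% on input and on output. A 40% discount would require a ratio of 0.60.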
atharvadeosthale merged commit 214fada into main on May 20, 2026
11 of 12 checks passed
atharvadeosthale deleted the gemini-3-5-flash branch on May 20, 2026 at 18:44
2 participants