
[Feat] Q3VL: Exclude calibration dataset from testing #271

Open
Victor49152 wants to merge 4 commits into mlcommons:main from
Victor49152:feat/q3vl_calibration_dataset

Conversation

@Victor49152
Collaborator

@Victor49152 Victor49152 commented Apr 7, 2026

What does this PR do?

Exclude the calibration dataset from perf/acc testing, as required for the v6.1 round.

Type of change

  • New feature

Related issues

Checklist

  • Code follows project style
  • Pre-commit hooks pass

@Victor49152 Victor49152 requested a review from a team April 7, 2026 21:48

github-actions bot commented Apr 7, 2026

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a list of calibration sample indices to be excluded during the Shopify product catalogue dataset generation. The review feedback suggests using a set for the index list to optimize lookup performance and recommends using unique identifiers instead of absolute indices to ensure robustness across different dataset splits.
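The two review suggestions can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the index values, the `CALIBRATION_INDICES`/`CALIBRATION_IDS` names, and the `sku-*` identifiers are all made up for the example.

```python
# Hypothetical sketch of the review suggestions: keep excluded
# calibration indices in a set (O(1) membership tests instead of a
# linear scan over a list on every sample).

# Illustrative index values; the real list is chosen elsewhere.
CALIBRATION_INDICES = {12, 104, 2077, 48001}  # a set, not a list

def iter_eval_samples(dataset):
    """Yield samples whose absolute index is not reserved for calibration."""
    for idx, sample in enumerate(dataset):
        if idx in CALIBRATION_INDICES:  # O(1) set lookup
            continue
        yield sample

# The more robust variant the review recommends: key the exclusion on a
# stable unique identifier rather than an absolute index, so it remains
# valid if the dataset is re-split or shuffled.
CALIBRATION_IDS = {"sku-000012", "sku-002077"}  # hypothetical IDs

def iter_eval_samples_by_id(dataset):
    """Yield samples whose unique ID is not reserved for calibration."""
    for sample in dataset:
        if sample["id"] not in CALIBRATION_IDS:
            yield sample
```

The set-vs-list distinction only matters for large exclusion lists; for 20 entries either works, but the ID-based variant is the more meaningful robustness fix.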

@Victor49152 Victor49152 force-pushed the feat/q3vl_calibration_dataset branch from fef312e to 7026abb on April 7, 2026 22:01
@Victor49152 Victor49152 self-assigned this Apr 7, 2026
@Victor49152 Victor49152 added the type: feature New feature or capability label Apr 7, 2026
@arekay-nv
Collaborator

@Victor49152 can you add some context on why this is needed? Is the calibration dataset determined by the working group? We might want to rethink the design as this could be useful for other datasets as well.

@nvzhihanj
Collaborator

> @Victor49152 can you add some context on why this is needed? Is the calibration dataset determined by the working group? We might want to rethink the design as this could be useful for other datasets as well.

Usually the convention is that we don't need to exclude the calibration dataset from the inference dataset; we simply use ~10% of it to generate the quantization dataset.
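The convention described above can be sketched as a minimal Python snippet. All names, the fixed seed, and the 10% fraction are illustrative assumptions, not the actual MLPerf tooling:

```python
import random

def make_calibration_split(dataset, fraction=0.1, seed=0):
    """Sample ~10% of the inference dataset for quantization
    calibration. Per the convention above, nothing is excluded:
    the full dataset is still used for inference."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    k = max(1, int(len(dataset) * fraction))
    return rng.sample(dataset, k)

# Stand-in for the inference samples (48,289 mirrors the count
# mentioned later in this thread).
full = list(range(48289))
calib = make_calibration_split(full)
# Inference runs on `full`; `calib` feeds the quantizer.
```

The key design point is that the calibration set is a deterministic subset of the inference set, not a disjoint partition, which is what this PR changes for Q3VL.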

@Victor49152
Collaborator Author

Victor49152 commented Apr 8, 2026

> @Victor49152 can you add some context on why this is needed? Is the calibration dataset determined by the working group? We might want to rethink the design as this could be useful for other datasets as well.

In our experience, model accuracy is not very sensitive to the choice of calibration samples; moreover, the excluded set is only 20 of 48,289 samples, a very small portion.

However, we received an email from Anton Lokhmotov two weeks ago stating that it is a convention to set the calibration dataset aside from the validation dataset for fairness, and that it is better for VLM to follow the same convention. This PR addresses that request.


Labels

type: feature New feature or capability


4 participants