[Feat] Q3VL: Exclude calibration dataset from testing #271
Victor49152 wants to merge 4 commits into mlcommons:main from
Conversation
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
Code Review
This pull request introduces a list of calibration sample indices to exclude during Shopify product catalogue dataset generation. Review feedback suggests using a set for the excluded indices to speed up membership lookups, and recommends keying on unique sample identifiers rather than absolute indices so the exclusion remains correct across different dataset splits.
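A minimal sketch of the pattern the review suggests. The names (`CALIBRATION_SAMPLE_IDS`, `iter_eval_samples`, the `id` field) are hypothetical illustrations, not the actual module's API:

```python
# Hypothetical sketch: excluded calibration samples keyed by unique ID,
# stored in a set so each membership test is O(1) on average
# (a list would make every lookup O(n)).
CALIBRATION_SAMPLE_IDS = {"prod-00017", "prod-00342"}  # assumed IDs

def iter_eval_samples(samples):
    """Yield only samples whose unique ID is not reserved for calibration."""
    for sample in samples:
        if sample["id"] not in CALIBRATION_SAMPLE_IDS:
            yield sample

# Usage with a toy catalogue:
catalogue = [
    {"id": "prod-00017", "title": "A"},  # reserved for calibration
    {"id": "prod-00099", "title": "B"},
]
kept = [s["id"] for s in iter_eval_samples(catalogue)]
print(kept)  # → ['prod-00099']
```

Keying on stable IDs rather than positions means the exclusion survives reshuffling or re-splitting of the dataset.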
src/inference_endpoint/dataset_manager/predefined/shopify_product_catalogue/__init__.py
Force-pushed from fef312e to 7026abb
@Victor49152 can you add some context on why this is needed? Is the calibration dataset determined by the working group? We might want to rethink the design as this could be useful for other datasets as well.
Usually the convention is that we don't need to exclude the calibration dataset from the inference dataset; we simply use ~10% of it to generate the quantization dataset.
Based on our experience, model accuracy is not very sensitive to the choice of calibration samples; besides, it's only 20/48289, which is a very small portion. However, we received an email from Anton Lokhmotov two weeks ago stating that it's a convention to set aside the calibration dataset from the validation dataset for fairness, and it's better for VLM to follow the same convention. I'm trying to address this ask.
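A minimal sketch of setting aside a small calibration subset (here 20 of 48289 samples, matching the numbers above) so that perf/accuracy testing never sees it. The function name and seed are illustrative assumptions, not the repository's actual code:

```python
import random

def split_calibration(indices, n_calib=20, seed=0):
    """Deterministically reserve n_calib indices for calibration.

    Returns (calib, test): the calibration set and the remaining
    indices kept for perf/accuracy testing. A fixed seed keeps the
    split reproducible across runs.
    """
    rng = random.Random(seed)
    calib = set(rng.sample(indices, n_calib))
    test = [i for i in indices if i not in calib]
    return calib, test

# 20 of 48289 samples go to calibration; the rest stay for testing.
calib, test = split_calibration(list(range(48289)), n_calib=20)
print(len(calib), len(test))  # → 20 48269
```

The disjointness of the two subsets is what the fairness convention asks for: no sample used to calibrate quantization also appears in the measured dataset.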
What does this PR do?
Exclude the calibration dataset from perf/acc testing, as required for the v6.1 round.
Type of change
Related issues
Checklist