fix: pass code_location to create_model for tensorflow estimator deployment #4537

HCharlie · 2024-03-25T21:28:49Z

Description of changes:
Pass code_location to the model created in the deploy function. The parameter is accepted by the TensorFlow estimator object but was not passed on to the model.
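As an illustration of the gap described above, here is a minimal sketch using toy stand-in classes. It is not the SageMaker SDK code: `ToyEstimator`, `ToyModel`, `DEFAULT_BUCKET_URI`, and the `create_model_before` / `create_model_after` methods are hypothetical names introduced only to show what forwarding the parameter changes.

```python
# Toy stand-ins, NOT the SageMaker SDK classes; all names here are hypothetical.
from dataclasses import dataclass
from typing import Optional

DEFAULT_BUCKET_URI = "s3://default-bucket"  # placeholder for a session default bucket


@dataclass
class ToyModel:
    code_location: Optional[str] = None

    def upload_target(self) -> str:
        # Without an explicit code_location, fall back to the default bucket.
        return self.code_location or DEFAULT_BUCKET_URI


@dataclass
class ToyEstimator:
    code_location: Optional[str] = None

    def create_model_before(self) -> ToyModel:
        # The estimator holds code_location but does not hand it to the model.
        return ToyModel()

    def create_model_after(self) -> ToyModel:
        # Forward the estimator's code_location to the model.
        return ToyModel(code_location=self.code_location)


estimator = ToyEstimator(code_location="s3://my-bucket/my-prefix/source")
print(estimator.create_model_before().upload_target())  # s3://default-bucket
print(estimator.create_model_after().upload_target())   # s3://my-bucket/my-prefix/source
```

In this sketch the only difference between the two methods is whether the estimator's value reaches the model, which is the behavior this change is meant to address.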

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

I have read the CONTRIBUTING doc
I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
I used the commit message format described in CONTRIBUTING
I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

I have added tests that prove my fix is effective or that my feature works (if appropriate)
I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
I have checked that my tests are not configured for a specific region or account (if appropriate)
I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

codecov · 2024-03-25T21:53:57Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.44%. Comparing base (0075fb3) to head (07d3f9e).

❗ Current head 07d3f9e differs from pull request most recent head eb186cc. Consider uploading reports for the commit eb186cc to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #4537      +/-   ##
==========================================
- Coverage   87.49%   87.44%   -0.06%     
==========================================
  Files         391      389       -2     
  Lines       37254    36889     -365     
==========================================
- Hits        32595    32256     -339     
+ Misses       4659     4633      -26

HCharlie · 2024-05-02T15:01:24Z

Hi @mohanasudhan, do you know what's missing for this PR?

mohanasudhan · 2024-05-02T17:13:35Z

Can you explain your use case and add a unit/integ test?

HCharlie · 2024-05-02T20:42:03Z

> Can you explain your use case and add a unit/integ test?

Hi @mohanasudhan, thanks for the reply. I think this is either a bug or I am misunderstanding how to use the TensorFlow estimator. I pass the code_location parameter to the TensorFlow estimator as shown below. When the code reaches estimator.deploy, it does not parse the specified code_location to get the S3 bucket; instead it tries to create a new default S3 bucket. A more detailed description is in #4536.

from sagemaker.tensorflow import TensorFlow

# bucket, prefix, role, base_job_name, and training_data_uri are assumed to be defined earlier
source_dir = 's3://{}/{}/source'.format(bucket, prefix)
output_path = 's3://{}/{}/output'.format(bucket, prefix)

hyperparams = {
    'sagemaker_requirements': 'code/requirements.txt'
}

mnist_estimator = TensorFlow(entry_point='code/mnist.py',
                              base_job_name=base_job_name,
                              output_path=output_path,
                              code_location=source_dir,
                              hyperparameters=hyperparams,
                              role=role,
                              instance_count=2,
                              instance_type='ml.m5.large',
                              framework_version='2.1.0',
                              py_version='py3',
                              distribution={'parameter_server': {'enabled': True}})

## fit
print("start fitting")
mnist_estimator.fit(training_data_uri)

## deploy
print("start deploy")
predictor = mnist_estimator.deploy(initial_instance_count=1, instance_type='ml.m5.large')
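The behavior expected in the comment above (use the bucket named in code_location rather than a default bucket) can be sketched as a small standalone helper. This is only an illustration under stated assumptions: `resolve_code_upload_location` and the `default_bucket` argument are hypothetical and are not functions or parameters of the SageMaker SDK.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def resolve_code_upload_location(
    code_location: Optional[str], default_bucket: str
) -> Tuple[str, str]:
    """Return (bucket, key_prefix) for uploaded code.

    Hypothetical helper: prefers the bucket named in code_location and only
    falls back to default_bucket when no code_location is given.
    """
    if not code_location:
        return default_bucket, ""
    parsed = urlparse(code_location)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Expected an s3:// URI, got: {code_location!r}")
    # netloc holds the bucket name; path holds the key prefix with slashes.
    return parsed.netloc, parsed.path.strip("/")


print(resolve_code_upload_location("s3://my-bucket/my-prefix/source", "default-bucket"))
print(resolve_code_upload_location(None, "default-bucket"))
```

With a code_location such as the `source_dir` value in the snippet above, the helper returns the user's bucket and prefix, so no default bucket would need to be created.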

HCharlie requested a review from a team as a code owner March 25, 2024 21:28

HCharlie requested review from mohanasudhan and removed request for a team March 25, 2024 21:28

HCharlie mentioned this pull request Mar 25, 2024

code_location parameter passed to tensorflow estimator is not all passed to create_model #4536

Closed

HCharlie force-pushed the MLX-1269 branch from 07d3f9e to 2f5345a Compare May 2, 2024 11:36

HCharlie had a problem deploying to manual-approval May 2, 2024 11:36 — with GitHub Actions Error

HCharlie force-pushed the MLX-1269 branch from 2f5345a to ef346b1 Compare May 2, 2024 11:44

HCharlie had a problem deploying to manual-approval May 2, 2024 11:44 — with GitHub Actions Error

fix: add code_location to create_model

3112627

HCharlie force-pushed the MLX-1269 branch from ef346b1 to 3112627 Compare May 2, 2024 11:45

HCharlie had a problem deploying to manual-approval May 2, 2024 11:46 — with GitHub Actions Error

HCharlie changed the title ~~MLX-1224 pass code_location to create_model for tensorflow estimator deployment~~ fix: pass code_location to create_model for tensorflow estimator deployment May 2, 2024

Merge branch 'master' into MLX-1269

53bef61

HCharlie had a problem deploying to manual-approval May 2, 2024 14:32 — with GitHub Actions Error

Merge branch 'master' into MLX-1269

84ef4f3

HCharlie had a problem deploying to manual-approval May 2, 2024 21:04 — with GitHub Actions Error

Merge branch 'master' into MLX-1269

eb186cc

HCharlie had a problem deploying to manual-approval May 6, 2024 07:52 — with GitHub Actions Failure

sagemaker-bot and others added 8 commits January 27, 2025 14:18

change: update image_uri_configs 01-27-2025 06:18:13 PST

dd8d4df

Merge branch 'master-rba' into local_merge

92b706a

fix: skip TF tests for unsupported versions (aws#5007)

2f8ed41

* fix: skip TF tests for unsupported versions * flake8

change: update image_uri_configs 01-29-2025 06:18:08 PST

27b588b

feat: use jumpstart deployment config image as default optimization i…

51e4cc0

…mage (aws#4992) Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

prepare release v2.238.0

87a1f4f

update development version to v2.238.1.dev0

10b64f6

pravali96 and others added 28 commits April 15, 2025 08:14

Fix deepdiff dependencies (aws#5128)

99b1b81

* Fix deepdiff dependencies * trigger tests

fix: tgi image uri unit tests (aws#5127)

92efc09

* fix: tgi image uri unit tests * fix: black-format and flake8 failures * fix: parse * fix: print statement --------- Co-authored-by: Erick Benitez-Ramos <141277478+benieric@users.noreply.github.com>

prepare release v2.243.2

29bdeb4

update development version to v2.243.3.dev0

27e5208

change: update image_uri_configs 04-11-2025 07:18:19 PST

ba6323f

change: update image_uri_configs 04-15-2025 07:18:10 PST

f225b85

change: update image_uri_configs 04-16-2025 07:18:18 PST

6b96afa

update pr test to deprecate py38 and add py312 (aws#5133)

79c4ddd

update readme to reflect py312 upgrade

ba559e6

prepare release v2.243.3

57f483d

update development version to v2.243.4.dev0

201500c

chore: add huggingface images (aws#5142)

15cb303

fix: pin mamba version to 24.11.3-2 to avoid inconsistent test runs (a…

0dae5c9

…ws#5149) Co-authored-by: Namrata Madan <nmmadan@amazon.com>

Add model server timeout (aws#5151)

a896bc6

Co-authored-by: adishaa <adishaa@amazon.com>

prepare release v2.244.0

87372db

update development version to v2.244.1.dev0

85056eb

chore: Add tei 1.6.0 image (aws#5145)

bb803c9

* chore: add huggingface images * chore: add tei 1.6 image * chore: add tei 1.6.0 to tei mapping in tests

Improve error logging and documentation for issue 4007 (aws#5153)

e747b03

* Improve error logging and documentation for issue 4007 * Add hyperlink to RTDs

Merge branch 'master' into MLX-1269

c9d8ced

sage-maker temporarily deployed to manual-approval May 6, 2025 20:31 — with GitHub Actions Inactive

zhaoqizqwang force-pushed the master branch from 5d3f175 to fa30a6d Compare November 20, 2025 18:19

20 participants
