23 commits
4a358d0
feat: initial commit - Azure MLOps v2 Solution Accelerator with moder…
prgabriel Nov 12, 2025
5818fdc
docs: update modernization summary with Terraform v4.11, Python 3.11,…
prgabriel Nov 12, 2025
58b089d
docs: Modernize deployment guides with OIDC authentication and update…
prgabriel Nov 12, 2025
38781b3
docs: Update random provider version to v3.7.2 in documentation
prgabriel Nov 12, 2025
1da92e3
docs: update Terraform version requirement to v1.10.0 and azurerm pro…
prgabriel Nov 13, 2025
020358b
docs: add steps for retrieving service principal object ID and config…
prgabriel Nov 15, 2025
4249805
Update deployment guide with validated implementation details
prgabriel Nov 18, 2025
9e2bf4b
docs: update destroy workflow timing to reflect 2-minute endpoint del…
prgabriel Nov 18, 2025
e6568f7
docs: enhance deployment guide with OIDC best practices and optional …
prgabriel Nov 18, 2025
546a33d
docs: remove deprecated python-sdk-v1 option from parameters in initi…
prgabriel Dec 5, 2025
4ca99f8
Merge secondary/main: Keep modernized version with OIDC, Terraform 4.…
pgabriel-01 Dec 5, 2025
2abfa5c
fix: enhance sparse checkout script with directory existence checks a…
pgabriel-01 Dec 15, 2025
d0a1e71
fix: streamline git repository initialization and enhance GitHub repo…
pgabriel-01 Dec 15, 2025
bcc3b3a
fix: remove unnecessary line and correct file path formatting in depl…
pgabriel-01 Dec 16, 2025
fc5c346
fix: update security best practice note for Azure DevOps to emphasize…
pgabriel-01 Dec 16, 2025
6883eec
Merge branch 'Dec2025patch' into main
pgabriel-01 Dec 16, 2025
647a996
docs: update deployment guide with Terraform variable configuration a…
prgabriel Dec 22, 2025
7e8745f
docs: refine deployment guide for GitHub Actions with formatting impr…
prgabriel Dec 22, 2025
af42d47
docs: enhance deployment guide for GitHub Actions with improved struc…
prgabriel Dec 22, 2025
da37140
docs: remove outdated reference to Azure ML python SDK v1 in deployme…
prgabriel Jan 7, 2026
b32c51b
docs: update Azure DevOps deployment guide for service connection con…
prgabriel Jan 8, 2026
8a2ecfd
Update deploy guides with dynamic aml_compute_sku configuration
prgabriel Jan 17, 2026
11bd61b
docs: enhance sparse_checkout.sh to move devops-pipelines to infrastr…
prgabriel Jan 17, 2026
29 changes: 17 additions & 12 deletions documentation/deployguides/deployguide_ado.md
@@ -193,7 +193,6 @@ In this step, you will run an Azure DevOps pipeline, `initialise-project`, that
- Choose **nlp** for natural language projects
- **MLOps Interface**: Select the interface to the Azure ML platform, either CLI or SDK.
- Choose **aml-cli-v2** for the Azure ML CLI v2 interface. This is supported for all ML project types.
- Choose **python-sdk-v1** to use the Azure ML python SDK v1 for training and deployment of your model. This is supported for Classical and CV project types.
- Choose **python-sdk-v2** to use the Azure ML python SDK v2 for training and deployment of your model. This is supported for Classical and NLP project types.
- Choose **rai-aml-cli-v2** to use the Responsible AI cli tools for training and deployment of your model. This is supported only for Classical project types at this time.

@@ -240,7 +239,7 @@ In this step, you will run an Azure DevOps pipeline, `initialise-project`, that

For Azure DevOps pipelines to create Azure Machine Learning infrastructure and deploy and execute Azure ML pipelines, it is necessary to create an Azure service principal for each Azure ML environment (Dev and/or Prod) and configure Azure DevOps service connections using those service principals.

> **🔐 Security Best Practice**: Azure DevOps now supports **workload identity federation** as a more secure alternative to service principal secrets. Workload identity federation uses OpenID Connect (OIDC) to establish trust without storing long-lived secrets. For production deployments, consider using workload identity federation instead of the service principal methods described below. See [Microsoft's documentation](https://learn.microsoft.com/en-us/azure/devops/pipelines/library/connect-to-azure#create-an-azure-resource-manager-service-connection-using-workload-identity-federation) for setup instructions.
> **Security Best Practice**: Azure DevOps now supports **workload identity federation** as a more secure alternative to service principal secrets. Workload identity federation uses OpenID Connect (OIDC) to establish trust without storing long-lived secrets. For production deployments, consider using workload identity federation instead of the service principal methods described below. See [Microsoft's documentation](https://learn.microsoft.com/en-us/azure/devops/pipelines/library/connect-to-azure#create-an-azure-resource-manager-service-connection-using-workload-identity-federation) for setup instructions.

These service principals can be created using one of the two methods below:

@@ -343,13 +342,15 @@ Select **Project Settings** at the bottom left of the project page and select **
Select **Create service connection**

* For service, select **Azure Resource Manager** and **Next**
* For authentication method, select **Service principal (manual)** and **Next**

Complete the new service connection configuration using the information from your tenant, subscription, and the service principal you created for Prod.
* For identity type, select **App registration (automatic)**
* For credential, select **Workload identity federation**
* For subscription, select your subscription from the drop-down and select **Save**

<p align="left">
<img src="./images/ado-service-principal-manual.png" alt="Service connection" width="35%" height="35%"/>
</p>
<img src="./images/ado-service-principal-automatic.png" alt="Service connection" width="35%" height="35%"/>
</p>

Complete the new service connection configuration using the information from your tenant, subscription, and the service principal you created for Prod.

Name this service connection **Azure-ARM-Prod**. Check **Grant access permission to all pipelines** and click **Verify and save**.

@@ -397,14 +398,18 @@ To do this, go back to **Repos** and your ML project repo, in this example, `tax

>**Important:**
>> Note that `config-infra-prod.yml` and `config-infra-dev.yml` files use default region as **eastus** to deploy resource group and Azure ML Workspace. If you are using Free/Trial or similar learning purpose subscriptions, you must do one of the below -
> 1. If you decide to use **eastus** region, ensure that your subscription(s) have a quota/limit of up to 20 vCPUs for **Standard Dsv5 Family vCPUs** (or **Standard DSv2 Family vCPUs** for older deployments). Visit Subscription page in Azure Portal as shown below to validate this.
> 1. If you decide to use the **eastus** region, ensure that your subscription(s) have a quota/limit of at least 64 vCPUs for **Standard DSv3 Family vCPUs**. The default compute cluster uses **STANDARD_D16S_V3** (16 vCPUs per node, up to 4 nodes = 64 vCPUs max). Visit the Subscription page in the Azure Portal as shown below to validate this.
![alt text](images/susbcriptionQuota.png)
> 2. If not, you should change it to a region where **Standard Dsv5 Family vCPUs** has a quota/limit of up to 20 vCPUs.
> 3. You may also choose to change the region and compute type being used for deployment. The default compute is now **STANDARD_D4S_V5** (5th generation, improved performance). To change this, search for **STANDARD_D4S_V5** in the following DevOps pipeline files and change to a compute type that works for your setup:
> * `mlops-templates/aml-cli-v2/mlops/devops-pipelines/deploy-model-training-pipeline.yml`
> 2. If not, you should change it to a region where **Standard DSv3 Family vCPUs** has sufficient quota.
> 3. You can easily change the VM SKU by editing the `aml_compute_sku` parameter in your config file:
> * `config-infra-prod.yml` or `config-infra-dev.yml` - set `aml_compute_sku: <YOUR_SKU>` (e.g., `STANDARD_D4S_V3`)
> * This works for both Bicep and Terraform deployments
> 4. For ML pipeline compute (separate from infrastructure), you may need to edit:
> * `mlops-templates/aml-cli-v2/mlops/devops-pipelines/deploy-model-training-pipeline.yml` - for ML pipeline compute
> * `mlops-project-template/classical/aml-cli-v2/mlops/devops-pipelines/deploy-batch-endpoint-pipeline.yml`
> * `/mlops-project-template/classical/aml-cli-v2/mlops/azureml/deploy/online/online-deployment.yml`
> 4. Note in the path above that you need to navigate to the right repository (e.g. **mlops-templates**), and the right ML interface (e.g. **aml-cli-v2**).
>
> **Note**: The default infrastructure SKU is **STANDARD_D16S_V3** (3rd generation). ML pipelines may use different SKUs like **Standard_D4s_v5**. Adjust based on your quota and requirements.
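
The quota arithmetic in the note above (16 vCPUs per node, up to 4 nodes = 64 vCPUs) can be sketched as a small helper for checking other SKUs before editing `aml_compute_sku`. This is only an illustration: the function names `vcpus_per_node` and `required_vcpu_quota` are hypothetical and not part of the accelerator, and the sketch assumes the SKU follows the D-series naming used in this guide, where the number after `D` is the vCPU count (for example `STANDARD_D16S_V3` or `Standard_D4s_v5`). Other families are rejected rather than guessed.

```python
import re


def vcpus_per_node(sku: str) -> int:
    """Return the vCPU count encoded in a D-series SKU name.

    Assumption: names look like STANDARD_D<n><letters>_V<gen>, and <n> is the
    vCPU count. Names that do not fit this pattern (e.g. STANDARD_DS3_V2)
    raise ValueError instead of being guessed.
    """
    match = re.fullmatch(r"STANDARD_D(\d+)[A-Z]*_V\d+", sku.upper())
    if match is None:
        raise ValueError(f"Cannot infer vCPU count from SKU name: {sku}")
    return int(match.group(1))


def required_vcpu_quota(sku: str, max_nodes: int) -> int:
    """Worst-case vCPU quota for a cluster that scales to max_nodes nodes."""
    if max_nodes < 1:
        raise ValueError("max_nodes must be at least 1")
    return vcpus_per_node(sku) * max_nodes


if __name__ == "__main__":
    # Default infrastructure SKU from the guide, scaled to 4 nodes.
    print(required_vcpu_quota("STANDARD_D16S_V3", 4))
```

Compare the result against the family quota shown on the Subscription page; if it exceeds your limit, either pick a smaller SKU or request a quota increase for that VM family and region.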

Making sure you are in the **main** branch, click on `config-infra-prod.yml` to open it.
