update for serverless compute (#2830)
* update for serverless compute

* run black
sdgilley authored Nov 14, 2023
1 parent 628fdb9 commit 1ad4a12
Showing 3 changed files with 70 additions and 227 deletions.
112 changes: 29 additions & 83 deletions tutorials/get-started-notebooks/pipeline.ipynb
@@ -105,14 +105,17 @@
"\n",
"# authenticate\n",
"credential = DefaultAzureCredential()\n",
"# # Get a handle to the workspace\n",
"\n",
"SUBSCRIPTION = \"<SUBSCRIPTION_ID>\"\n",
"RESOURCE_GROUP = \"<RESOURCE_GROUP>\"\n",
"WS_NAME = \"<AML_WORKSPACE_NAME>\"\n",
"# Get a handle to the workspace\n",
"ml_client = MLClient(\n",
" credential=credential,\n",
" subscription_id=\"<SUBSCRIPTION_ID>\",\n",
" resource_group_name=\"<RESOURCE_GROUP>\",\n",
" workspace_name=\"<AML_WORKSPACE_NAME>\",\n",
")\n",
"cpu_cluster = None"
" subscription_id=SUBSCRIPTION,\n",
" resource_group_name=RESOURCE_GROUP,\n",
" workspace_name=WS_NAME,\n",
")"
]
},
{
@@ -121,21 +124,32 @@
"metadata": {},
"source": [
"> [!NOTE]\n",
"> Creating MLClient will not connect to the workspace. The client initialization is lazy, it will wait for the first time it needs to make a call (this will happen when creating the `credit_data` data asset, two code cells from here).\n",
"\n",
"## Access the registered data asset\n",
"\n",
"Start by getting the data that you previously registered in [Tutorial: Upload, access and explore your data](explore-data.ipynb).\n",
"> Creating MLClient will not connect to the workspace. The client initialization is lazy, it will wait for the first time it needs to make a call (this will happen in the next code cell).\n",
"\n",
"* Azure Machine Learning uses a `Data` object to register a reusable definition of data, and consume data within a pipeline."
"Verify the connection by making a call to `ml_client`. Since this is the first time that you're making a call to the workspace, you may be asked to authenticate. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Verify that the handle works correctly.\n",
"# If you get an error here, modify your SUBSCRIPTION, RESOURCE_GROUP, and WS_NAME in the previous cell.\n",
"ws = ml_client.workspaces.get(WS_NAME)\n",
"print(ws.location, \":\", ws.resource_group)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Since this is the first time that you're making a call to the workspace, you may be asked to authenticate. Once the authentication is complete, you then see the dataset registration completion message."
"## Access the registered data asset\n",
"\n",
"Start by getting the data that you previously registered in [Tutorial: Upload, access and explore your data](explore-data.ipynb).\n",
"\n",
"* Azure Machine Learning uses a `Data` object to register a reusable definition of data, and consume data within a pipeline."
]
},
{
@@ -157,72 +171,6 @@
"print(f\"Data asset URI: {credit_data.path}\")"
]
},
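The body of the cell above is collapsed in the diff; only the final `print` survives. For readers following along, here is a minimal sketch of how the registered asset is typically fetched — the asset name and version are assumptions carried over from the explore-data tutorial, not values shown in this commit:

```python
# Sketch only: look up the data asset registered in the earlier tutorial.
# "credit-card" and "initial" are assumed placeholders; substitute the name
# and version you actually registered.
credit_data = ml_client.data.get(name="credit-card", version="initial")
print(f"Data asset URI: {credit_data.path}")
```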
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a compute resource to run your pipeline (Optional)\n",
"\n",
"You can **skip this step** if you want to use [serverless compute (preview)](https://learn.microsoft.com/azure/machine-learning/how-to-use-serverless-compute?view=azureml-api-2&tabs=python) to run the training job. Through serverless compute, Azure Machine Learning takes care of creating, scaling, deleting, patching and managing compute, along with providing managed network isolation, reducing the burden on you. \n",
"\n",
"Each step of an Azure Machine Learning pipeline can use a different compute resource for running the specific job of that step. It can be single or multi-node machines with Linux or Windows OS, or a specific compute fabric like Spark.\n",
"\n",
"In this section, you provision a Linux [compute cluster](https://docs.microsoft.com/azure/machine-learning/how-to-create-attach-compute-cluster?tabs=python). See the [full list on VM sizes and prices](https://azure.microsoft.com/en-ca/pricing/details/machine-learning/) .\n",
"\n",
"For this tutorial, you only need a basic cluster so use a Standard_DS3_v2 model with 2 vCPU cores, 7-GB RAM and create an Azure Machine Learning Compute.\n",
"> [!TIP]\n",
"> If you already have a compute cluster, replace \"cpu-cluster\" in the next code block with the name of your cluster. This will keep you from creating another one.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "cpu_cluster"
},
"outputs": [],
"source": [
"from azure.ai.ml.entities import AmlCompute\n",
"\n",
"# Name assigned to the compute cluster\n",
"cpu_compute_target = \"cpu-cluster\"\n",
"\n",
"try:\n",
" # let's see if the compute target already exists\n",
" cpu_cluster = ml_client.compute.get(cpu_compute_target)\n",
" print(\n",
" f\"You already have a cluster named {cpu_compute_target}, we'll reuse it as is.\"\n",
" )\n",
"\n",
"except Exception:\n",
" print(\"Creating a new cpu compute target...\")\n",
"\n",
" # Let's create the Azure Machine Learning compute object with the intended parameters\n",
" # if you run into an out of quota error, change the size to a comparable VM that is available.\n",
" # Learn more on https://azure.microsoft.com/en-us/pricing/details/machine-learning/.\n",
" cpu_cluster = AmlCompute(\n",
" name=cpu_compute_target,\n",
" # Azure Machine Learning Compute is the on-demand VM service\n",
" type=\"amlcompute\",\n",
" # VM Family\n",
" size=\"STANDARD_DS3_V2\",\n",
" # Minimum running nodes when there is no job running\n",
" min_instances=0,\n",
" # Nodes in cluster\n",
" max_instances=4,\n",
" # How many seconds will the node running after the job termination\n",
" idle_time_before_scale_down=180,\n",
" # Dedicated or LowPriority. The latter is cheaper but there is a chance of job termination\n",
" tier=\"Dedicated\",\n",
" )\n",
" print(\n",
" f\"AMLCompute with name {cpu_cluster.name} will be created, with compute size {cpu_cluster.size}\"\n",
" )\n",
" # Now, we pass the object to MLClient's create_or_update method\n",
" cpu_cluster = ml_client.compute.begin_create_or_update(cpu_cluster)"
]
},
{
"attachments": {},
"cell_type": "markdown",
@@ -829,9 +777,7 @@
"\n",
"\n",
"@dsl.pipeline(\n",
" compute=cpu_compute_target\n",
" if (cpu_cluster)\n",
" else \"serverless\", # \"serverless\" value runs pipeline on serverless compute\n",
" compute=\"serverless\", # \"serverless\" value runs pipeline on serverless compute\n",
" description=\"E2E data_prep-train pipeline\",\n",
")\n",
"def credit_defaults_pipeline(\n",
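The hunk above is the heart of the pipeline.ipynb change: the conditional `compute=cpu_compute_target if (cpu_cluster) else "serverless"` collapses to a plain `compute="serverless"`, so the decorator no longer depends on a cluster object. A minimal sketch of the resulting decorator follows; the component calls and parameter names are assumptions filled in for illustration, since the rest of the cell is collapsed in this diff:

```python
from azure.ai.ml import dsl

# Sketch only: the simplified pipeline definition. With compute="serverless",
# Azure Machine Learning provisions and manages compute for every step, so no
# AmlCompute cluster needs to be created first.
@dsl.pipeline(
    compute="serverless",  # "serverless" value runs pipeline on serverless compute
    description="E2E data_prep-train pipeline",
)
def credit_defaults_pipeline(pipeline_job_data_input, pipeline_job_test_train_ratio):
    # Component calls (data prep, train, ...) are collapsed in the diff above
    # and are unchanged by this commit.
    ...
```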
92 changes: 23 additions & 69 deletions tutorials/get-started-notebooks/quickstart.ipynb
@@ -17,7 +17,6 @@
"\n",
"> * Set up a handle to your Azure Machine Learning workspace\n",
"> * Create your training script\n",
"> * Create a scalable compute resource, a compute cluster or use [serverless compute (preview)](https://learn.microsoft.com/azure/machine-learning/how-to-use-serverless-compute?view=azureml-api-2&tabs=python) instead\n",
"> * Create and run a command job that will run the training script on the compute cluster, configured with the appropriate job environment\n",
"> * View the output of your training script\n",
"> * Deploy the newly-trained model as an endpoint\n",
@@ -90,14 +89,16 @@
"# authenticate\n",
"credential = DefaultAzureCredential()\n",
"\n",
"SUBSCRIPTION = \"<SUBSCRIPTION_ID>\"\n",
"RESOURCE_GROUP = \"<RESOURCE_GROUP>\"\n",
"WS_NAME = \"<AML_WORKSPACE_NAME>\"\n",
"# Get a handle to the workspace\n",
"ml_client = MLClient(\n",
" credential=credential,\n",
" subscription_id=\"<SUBSCRIPTION_ID>\",\n",
" resource_group_name=\"<RESOURCE_GROUP>\",\n",
" workspace_name=\"<AML_WORKSPACE_NAME>\",\n",
")\n",
"cpu_cluster = None"
" subscription_id=SUBSCRIPTION,\n",
" resource_group_name=RESOURCE_GROUP,\n",
" workspace_name=WS_NAME,\n",
")"
]
},
{
@@ -106,7 +107,19 @@
"metadata": {},
"source": [
"> [!NOTE]\n",
"> Creating MLClient will not connect to the workspace. The client initialization is lazy, it will wait for the first time it needs to make a call (in this notebook, that will happen in the cell that creates the compute cluster)."
"> Creating MLClient will not connect to the workspace. The client initialization is lazy, it will wait for the first time it needs to make a call (this will happen in the next code cell)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Verify that the handle works correctly.\n",
"# If you get an error here, modify your SUBSCRIPTION, RESOURCE_GROUP, and WS_NAME in the previous cell.\n",
"ws = ml_client.workspaces.get(WS_NAME)\n",
"print(ws.location, \":\", ws.resource_group)"
]
},
{
@@ -270,63 +283,7 @@
"\n",
"You might need to select **Refresh** to see the new folder and script in your **Files**.\n",
"\n",
"![refresh](./media/refresh.png)\n",
"\n",
"## Create a compute cluster, a scalable way to run a training job (Optional)\n",
"\n",
"You can **skip this step** if you want to use **serverless compute** to run the training job. Through serverless compute, Azure Machine Learning takes care of creating, scaling, deleting, patching and managing compute, along with providing managed network isolation, reducing the burden on you. \n",
"\n",
"You already have a compute instance, which you're using to run the notebook. Now you'll add a second type of compute, a **compute cluster** that you'll use to run your training job. While a compute instance is a single node machine, a compute cluster can be single or multi-node machines with Linux or Windows OS, or a specific compute fabric like Spark.\n",
"\n",
"You'll provision a Linux compute cluster. See the [full list on VM sizes and prices](https://azure.microsoft.com/pricing/details/machine-learning/) .\n",
"\n",
"For this example, you only need a basic cluster, so you'll use a Standard_DS3_v2 model with 2 vCPU cores, 7-GB RAM."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from azure.ai.ml.entities import AmlCompute\n",
"\n",
"# Name assigned to the compute cluster\n",
"cpu_compute_target = \"cpu-cluster\"\n",
"\n",
"try:\n",
" # let's see if the compute target already exists\n",
" cpu_cluster = ml_client.compute.get(cpu_compute_target)\n",
" print(\n",
" f\"You already have a cluster named {cpu_compute_target}, we'll reuse it as is.\"\n",
" )\n",
"\n",
"except Exception:\n",
" print(\"Creating a new cpu compute target...\")\n",
"\n",
" # Let's create the Azure Machine Learning compute object with the intended parameters\n",
" # if you run into an out of quota error, change the size to a comparable VM that is available.\n",
" # Learn more on https://azure.microsoft.com/en-us/pricing/details/machine-learning/.\n",
" cpu_cluster = AmlCompute(\n",
" name=cpu_compute_target,\n",
" # Azure Machine Learning Compute is the on-demand VM service\n",
" type=\"amlcompute\",\n",
" # VM Family\n",
" size=\"STANDARD_DS3_V2\",\n",
" # Minimum running nodes when there is no job running\n",
" min_instances=0,\n",
" # Nodes in cluster\n",
" max_instances=4,\n",
" # How many seconds will the node running after the job termination\n",
" idle_time_before_scale_down=180,\n",
" # Dedicated or LowPriority. The latter is cheaper but there is a chance of job termination\n",
" tier=\"Dedicated\",\n",
" )\n",
" print(\n",
" f\"AMLCompute with name {cpu_cluster.name} will be created, with compute size {cpu_cluster.size}\"\n",
" )\n",
" # Now, we pass the object to MLClient's create_or_update method\n",
" cpu_cluster = ml_client.compute.begin_create_or_update(cpu_cluster)"
"![refresh](./media/refresh.png)"
]
},
{
@@ -339,10 +296,10 @@
"Now that you have a script that can perform the desired tasks, and a compute cluster to run the script, you'll use a general purpose **command** that can run command line actions. This command line action can directly call system commands or run a script. \n",
"\n",
"Here, you'll create input variables to specify the input data, split ratio, learning rate and registered model name. The command script will:\n",
"* Use the compute cluster to run the command if you created it or use serverless compute by not specifying any compute.\n",
"* Use an *environment* that defines software and runtime libraries needed for the training script. Azure Machine Learning provides many curated or ready-made environments, which are useful for common training and inference scenarios. You'll use one of those environments here. In the [Train a model](train-model.ipynb) tutorial, you'll learn how to create a custom environment. \n",
"* Configure the command line action itself - `python main.py` in this case. The inputs/outputs are accessible in the command via the `${{ ... }}` notation.\n",
"* In this sample, we access the data from a file on the internet. "
"* In this sample, we access the data from a file on the internet. \n",
"* Since a compute resource was not specified, the script will be run on a [serverless compute cluster](https://learn.microsoft.com/azure/machine-learning/how-to-use-serverless-compute?view=azureml-api-2&tabs=python) that is automatically created."
]
},
{
@@ -374,9 +331,6 @@
" code=\"./src/\", # location of source code\n",
" command=\"python main.py --data ${{inputs.data}} --test_train_ratio ${{inputs.test_train_ratio}} --learning_rate ${{inputs.learning_rate}} --registered_model_name ${{inputs.registered_model_name}}\",\n",
" environment=\"AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest\",\n",
" compute=\"cpu-cluster\"\n",
" if (cpu_cluster)\n",
" else None, # No compute needs to be passed to use serverless\n",
" display_name=\"credit_default_prediction\",\n",
")"
]
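With the `compute=` argument removed from the `command(...)` call above, nothing else in the quickstart has to change: submitting the job is enough, and serverless compute is provisioned on demand. A minimal sketch of the submission step, assuming the command object is stored in a variable named `job` as elsewhere in the notebook:

```python
# Sketch only: submit the command job. Because no compute target is attached
# to the job, Azure Machine Learning runs it on serverless compute that it
# creates and tears down automatically.
returned_job = ml_client.create_or_update(job)
print(f"Track the run in studio: {returned_job.studio_url}")
```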
