
Commit a47b08f

CAIP PLs remote deployment example
1 parent 505d0ef commit a47b08f

File tree: 3 files changed (+419 -0 lines changed)

Lines changed: 355 additions & 0 deletions
@@ -0,0 +1,355 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using Google Cloud Functions to support event-based triggering of Cloud AI Platform Pipelines\n",
    "\n",
    "This example shows how you can run a [Cloud AI Platform Pipeline](https://cloud.google.com/blog/products/ai-machine-learning/introducing-cloud-ai-platform-pipelines) from a [Google Cloud Function](https://cloud.google.com/functions/docs/), thus providing a way for Pipeline runs to be triggered by events (in the interim before this is supported by Pipelines itself).\n",
    "\n",
    "In this example, the function is triggered by the addition of or update to a file in a [Google Cloud Storage](https://cloud.google.com/storage/) (GCS) bucket, but Cloud Functions can have other triggers too (including [Pub/Sub](https://cloud.google.com/pubsub/docs/)-based triggers).\n",
    "\n",
    "The example is Google Cloud Platform (GCP)-specific, and requires a [Cloud AI Platform Pipelines](https://cloud.google.com/ai-platform/pipelines/docs) installation using Pipelines version >= 0.4.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup\n",
    "\n",
    "### Create a Cloud AI Platform Pipelines installation\n",
    "\n",
    "Follow the instructions in the [documentation](https://cloud.google.com/ai-platform/pipelines/docs) to create a Cloud AI Platform Pipelines installation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Identify (or create) a Cloud Storage bucket to use for the example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Before executing the next cell**, edit it to **set the `TRIGGER_BUCKET` environment variable** to a Google Cloud Storage bucket ([create a bucket first](https://console.cloud.google.com/storage/browser) if necessary). Do *not* include the `gs://` prefix in the bucket name.\n",
    "\n",
    "We'll deploy the GCF function so that it will trigger on new and updated files (blobs) in this bucket."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%env TRIGGER_BUCKET=REPLACE_WITH_YOUR_GCS_BUCKET_NAME"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Give the Cloud Functions service account the necessary access\n",
    "\n",
    "First, make sure the Cloud Functions API [is enabled](https://console.cloud.google.com/apis/library/cloudfunctions.googleapis.com?q=functions).\n",
    "\n",
    "Cloud Functions uses the project's 'appspot' account as its service account. It has the form `<project-id>@appspot.gserviceaccount.com`. (This is also the App Engine default service account.)\n",
    "\n",
    "- Go to your project's [IAM - Service Account page](https://console.cloud.google.com/iam-admin/serviceaccounts).\n",
    "- Find the `<project-id>@appspot.gserviceaccount.com` account and copy its email address.\n",
    "- Find the project's Compute Engine (GCE) default service account (this is the default account used for the Pipelines installation). It will have a form like this: `<project-number>-compute@developer.gserviceaccount.com`.\n",
    "  Click the checkbox next to the GCE service account, and in the 'INFO PANEL' to the right, click **ADD MEMBER**. Add the Functions service account (`<project-id>@appspot.gserviceaccount.com`) as a **Project Viewer** of the GCE service account. (A command-line sketch of this grant follows below.)"
   ]
  },
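  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you prefer the command line to the Cloud Console, the next cell sketches a roughly equivalent grant with `gcloud`. It is a sketch only: the `PROJECT_ID` and `GCE_SA` values are placeholders you must replace (they aren't set anywhere else in this notebook), and it assumes the console flow above amounts to granting the Viewer role on the GCE service account resource."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# Sketch only -- replace the placeholder values before running.\n",
    "PROJECT_ID=REPLACE_WITH_YOUR_PROJECT_ID\n",
    "FUNCTIONS_SA=${PROJECT_ID}@appspot.gserviceaccount.com\n",
    "GCE_SA=REPLACE_WITH_YOUR_GCE_DEFAULT_SERVICE_ACCOUNT_EMAIL\n",
    "# Grant the Functions service account the Viewer role on the GCE default\n",
    "# service account (the command-line counterpart of the console steps above).\n",
    "gcloud iam service-accounts add-iam-policy-binding ${GCE_SA} --member=serviceAccount:${FUNCTIONS_SA} --role=roles/viewer"
   ]
  },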
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, configure your `TRIGGER_BUCKET` to allow the Functions service account access to that bucket.\n",
    "\n",
    "- Navigate in the console to your list of buckets in the [Storage Browser](https://console.cloud.google.com/storage/browser).\n",
    "- Click the checkbox next to the `TRIGGER_BUCKET`. In the 'INFO PANEL' to the right, click **ADD MEMBER**. Add the Functions service account (`<project-id>@appspot.gserviceaccount.com`) with `Storage Object Admin` permissions. (A `gsutil` sketch of this grant follows below.)"
   ]
  },
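  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly, the following cell is a sketch of making the same bucket grant with `gsutil` instead of the console; `FUNCTIONS_SA` is again a placeholder for your project's `<project-id>@appspot.gserviceaccount.com` account."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# Sketch only -- replace the placeholder with your Functions service account email.\n",
    "FUNCTIONS_SA=REPLACE_WITH_YOUR_APPSPOT_SERVICE_ACCOUNT_EMAIL\n",
    "# Grant Storage Object Admin on the trigger bucket (counterpart of the console steps above).\n",
    "gsutil iam ch serviceAccount:${FUNCTIONS_SA}:roles/storage.objectAdmin gs://${TRIGGER_BUCKET}"
   ]
  },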
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a simple GCF function to test your configuration\n",
    "\n",
    "First we'll generate and deploy a simple GCF function, to test that the basics are properly configured."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "mkdir -p functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll first create a `requirements.txt` file, to indicate what packages the GCF code requires to be installed. (We won't actually need `kfp` for this first 'sanity check' version of a GCF function, but we'll need it below for the second function we'll create, that deploys a pipeline)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile functions/requirements.txt\n",
    "kfp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we'll create a simple GCF function in the `functions/main.py` file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile functions/main.py\n",
    "import logging\n",
    "\n",
    "def gcs_test(data, context):\n",
    "    \"\"\"Background Cloud Function to be triggered by Cloud Storage.\n",
    "    This generic function logs relevant data when a file is changed.\n",
    "\n",
    "    Args:\n",
    "        data (dict): The Cloud Functions event payload.\n",
    "        context (google.cloud.functions.Context): Metadata of triggering event.\n",
    "    Returns:\n",
    "        None; the output is written to Stackdriver Logging\n",
    "    \"\"\"\n",
    "\n",
    "    logging.info('Event ID: {}'.format(context.event_id))\n",
    "    logging.info('Event type: {}'.format(context.event_type))\n",
    "    logging.info('Data: {}'.format(data))\n",
    "    logging.info('Bucket: {}'.format(data['bucket']))\n",
    "    logging.info('File: {}'.format(data['name']))\n",
    "    file_uri = 'gs://%s/%s' % (data['bucket'], data['name'])\n",
    "    logging.info('Using file uri: %s', file_uri)\n",
    "\n",
    "    logging.info('Metageneration: {}'.format(data['metageneration']))\n",
    "    logging.info('Created: {}'.format(data['timeCreated']))\n",
    "    logging.info('Updated: {}'.format(data['updated']))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Deploy the GCF function as follows. (You'll need to wait a moment or two for output of the deployment to display in the notebook). You can also run this command from a notebook terminal window in the `functions` subdirectory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "cd functions\n",
    "gcloud functions deploy gcs_test --runtime python37 --trigger-resource ${TRIGGER_BUCKET} --trigger-event google.storage.object.finalize"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After you've deployed, test your deployment by adding a file to the specified `TRIGGER_BUCKET`. Then check in the logs viewer panel (https://console.cloud.google.com/logs/viewer) to confirm that the GCF function was triggered and ran correctly.\n"
   ]
  },
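  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For example, one quick way to exercise the trigger from this notebook is to copy a small file into the bucket with `gsutil` and then read the function's recent log entries with `gcloud`; the `gcf_test.txt` file name below is just an arbitrary example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# Create a small test file and copy it to the trigger bucket (the file name is arbitrary).\n",
    "echo \"hello\" > /tmp/gcf_test.txt\n",
    "gsutil cp /tmp/gcf_test.txt gs://${TRIGGER_BUCKET}/\n",
    "# Give the function a moment to run and its logs to propagate, then read them.\n",
    "sleep 30\n",
    "gcloud functions logs read gcs_test --limit 20"
   ]
  },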
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Deploy a Pipeline from a GCF function\n",
    "\n",
    "Next, we'll create a GCF function that deploys an AI Platform Pipeline when triggered. First, preserve your existing `main.py` in a backup file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "cd functions\n",
    "mv main.py main.py.bak"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, **before executing the next cell**, **edit the `HOST` variable** in the code below. To find this URL, visit the [Pipelines panel](https://console.cloud.google.com/ai-platform/pipelines/) in the Cloud Console.\n",
    "\n",
    "From here, you can find the URL by clicking on the **SETTINGS** link for the Pipelines installation you want to use, and copying the 'host' string displayed in the client example code (prepend `https://` to that string).\n",
    "You can alternatively click on **OPEN PIPELINES DASHBOARD** for the Pipelines installation, and copy that URL, removing the `/#/pipelines` suffix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile functions/main.py\n",
    "import logging\n",
    "import datetime\n",
    "import time\n",
    "\n",
    "import kfp\n",
    "import kfp.compiler as compiler\n",
    "import kfp.dsl as dsl\n",
    "\n",
    "import requests\n",
    "\n",
    "# TODO: replace with your Pipelines endpoint URL\n",
    "HOST = 'https://<yours>.pipelines.googleusercontent.com'\n",
    "\n",
    "@dsl.pipeline(\n",
    "    name='Sequential',\n",
    "    description='A pipeline with two sequential steps.'\n",
    ")\n",
    "def sequential_pipeline(filename='gs://ml-pipeline-playground/shakespeare1.txt'):\n",
    "    \"\"\"A pipeline with two sequential steps.\"\"\"\n",
    "    op1 = dsl.ContainerOp(\n",
    "        name='filechange',\n",
    "        image='library/bash:4.4.23',\n",
    "        command=['sh', '-c'],\n",
    "        arguments=['echo \"%s\" > /tmp/results.txt' % filename],\n",
    "        file_outputs={'newfile': '/tmp/results.txt'})\n",
    "    op2 = dsl.ContainerOp(\n",
    "        name='echo',\n",
    "        image='library/bash:4.4.23',\n",
    "        command=['sh', '-c'],\n",
    "        arguments=['echo \"%s\"' % op1.outputs['newfile']]\n",
    "    )\n",
    "\n",
    "def get_access_token():\n",
    "    url = 'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token'\n",
    "    r = requests.get(url, headers={'Metadata-Flavor': 'Google'})\n",
    "    r.raise_for_status()\n",
    "    access_token = r.json()['access_token']\n",
    "    return access_token\n",
    "\n",
    "def hosted_kfp_test(data, context):\n",
    "    logging.info('Event ID: {}'.format(context.event_id))\n",
    "    logging.info('Event type: {}'.format(context.event_type))\n",
    "    logging.info('Data: {}'.format(data))\n",
    "    logging.info('Bucket: {}'.format(data['bucket']))\n",
    "    logging.info('File: {}'.format(data['name']))\n",
    "    file_uri = 'gs://%s/%s' % (data['bucket'], data['name'])\n",
    "    logging.info('Using file uri: %s', file_uri)\n",
    "\n",
    "    logging.info('Metageneration: {}'.format(data['metageneration']))\n",
    "    logging.info('Created: {}'.format(data['timeCreated']))\n",
    "    logging.info('Updated: {}'.format(data['updated']))\n",
    "\n",
    "    token = get_access_token()\n",
    "    logging.info('attempting to launch pipeline run.')\n",
    "    ts = int(datetime.datetime.utcnow().timestamp() * 100000)\n",
    "    client = kfp.Client(host=HOST, existing_token=token)\n",
    "    compiler.Compiler().compile(sequential_pipeline, '/tmp/sequential.tar.gz')\n",
    "    exp = client.create_experiment(name='gcstriggered')  # this is a 'get or create' op\n",
    "    res = client.run_pipeline(exp.id, 'sequential_' + str(ts), '/tmp/sequential.tar.gz',\n",
    "                              params={'filename': file_uri})\n",
    "    logging.info(res)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, deploy the new GCF function. As before, it will take a moment or two for the results of the deployment to display in the notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "cd functions\n",
    "gcloud functions deploy hosted_kfp_test --runtime python37 --trigger-resource ${TRIGGER_BUCKET} --trigger-event google.storage.object.finalize"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add another file to your `TRIGGER_BUCKET`. This time you should see both GCF functions triggered. The `hosted_kfp_test` function will deploy the pipeline. You'll be able to see it running at your Pipeline installation's endpoint, `https://<deployment-name>.endpoints.<project>.cloud.goog/pipeline`, under the given Pipelines Experiment (`gcstriggered` as default)."
   ]
  },
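  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sketch of that flow, the next cell uploads another arbitrary file and then reads the new function's recent logs, which should include the 'attempting to launch pipeline run.' message."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "# Upload another small file to trigger both functions (the file name is arbitrary).\n",
    "echo \"hello again\" > /tmp/gcf_test2.txt\n",
    "gsutil cp /tmp/gcf_test2.txt gs://${TRIGGER_BUCKET}/\n",
    "# Give the functions a moment to run, then check the pipeline-launching function's logs.\n",
    "sleep 30\n",
    "gcloud functions logs read hosted_kfp_test --limit 20"
   ]
  },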
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "------------------------------------------\n",
    "Copyright 2020, Google, LLC.\n",
    "Licensed under the Apache License, Version 2.0 (the \"License\");\n",
    "you may not use this file except in compliance with the License.\n",
    "You may obtain a copy of the License at\n",
    "\n",
    "   http://www.apache.org/licenses/LICENSE-2.0\n",
    "\n",
    "Unless required by applicable law or agreed to in writing, software\n",
    "distributed under the License is distributed on an \"AS IS\" BASIS,\n",
    "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
    "See the License for the specific language governing permissions and\n",
    "limitations under the License."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
