ml/automl/tables/kfp_e2e/README.md (3 additions & 3 deletions)
@@ -76,9 +76,9 @@ Once a Pipelines installation is running, we can upload the example AutoML Table
Click on **Pipelines** in the left nav bar of the Pipelines Dashboard. Click on **Upload Pipeline**.
- For Cloud AI Platform Pipelines, upload [`tables_pipeline_caip.py.tar.gz`][36], from this directory. This archive points to the compiled version of [this pipeline][37], specified and compiled using the [Kubeflow Pipelines SDK][38].
- - For Kubeflow Pipelines on a Kubeflow installation, upload [`tables_pipeline_kf.py.tar.gz`][39]. This archive points to the compiled version of [this pipeline][40].
+ - For Kubeflow Pipelines on a Kubeflow installation, upload [`tables_pipeline_kf.py.tar.gz`][39]. This archive points to the compiled version of [this pipeline][40]. **To run this example on a KF installation, you will need to give the `<deployment-name>-user@<project-id>.iam.gserviceaccount.com` service account `AutoML Admin` privileges**.
- > Note: The difference between the two pipelines relates to how GCP authentication is handled. For the Kubeflow pipeline, we’ve added `.apply(gcp.use_gcp_secret('user-gcp-sa'))` annotations to the pipeline steps. This tells the pipeline to use the mounted _secret_—set up during the installation process—that provides GCP account credentials. With the Cloud AI Platform Pipelines installation, the GKE cluster nodes have been set up to use the `cloud-platform` scope. With an upcoming Kubeflow release, specification of the mounted secret will no longer be necessary.
+ > Note: The difference between the two pipelines relates to how GCP authentication is handled. For the Kubeflow pipeline, we’ve added `.apply(gcp.use_gcp_secret('user-gcp-sa'))` annotations to the pipeline steps. This tells the pipeline to use the mounted _secret_—set up during the installation process—that provides GCP account credentials. With the Cloud AI Platform Pipelines installation, the GKE cluster nodes have been set up to use the `cloud-platform` scope. With recent Kubeflow releases, specification of the mounted secret is no longer necessary, but we include both versions for compatibility.
The uploaded pipeline graph will look similar to this:
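To make the note above concrete: on a Kubeflow installation, the secret annotation is applied per pipeline step. The sketch below is a minimal, hypothetical single-step pipeline, assuming KFP SDK v1; the placeholder `ContainerOp` stands in for the example's real AutoML Tables component ops.

```python
from kfp import compiler, dsl, gcp

@dsl.pipeline(name='auth-annotation-sketch',
              description='Minimal sketch of the Kubeflow-flavored auth annotation.')
def auth_annotation_sketch():
    # Placeholder step; the real pipeline uses the AutoML Tables component ops.
    step = dsl.ContainerOp(
        name='placeholder-step',
        image='python:3.7',
        command=['python', '-c', 'print("placeholder")'],
    )
    # Kubeflow-on-Kubeflow variant: mount the 'user-gcp-sa' secret so the step
    # can authenticate to GCP services such as AutoML. The Cloud AI Platform
    # Pipelines variant omits this and relies on the nodes' cloud-platform scope.
    step.apply(gcp.use_gcp_secret('user-gcp-sa'))

if __name__ == '__main__':
    # Compile to the archive format that the Pipelines UI accepts for upload.
    compiler.Compiler().compile(auth_annotation_sketch, 'auth_annotation_sketch.py.tar.gz')
```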
@@ -88,7 +88,7 @@ The uploaded pipeline graph will look similar to this:
</figure>
Click the **+Create Run** button to run the pipeline. You will need to fill in some pipeline parameters.
- Specifically, replace `YOUR_PROJECT_HERE` with the name of your project; replace `YOUR_DATASET_NAME` with the name you want to give your new dataset (make it unique, and use letters, numbers and underscores up to 32 characters); and replace `YOUR_BUCKET_NAME` with the name of a [GCS bucket][41]. This bucket should be in the [same _region_][42] as that specified by the `gcp_region` parameter. E.g., if you keep the default `us-central1` region, your bucket should also be a _regional_ (not multi-regional) bucket in the `us-central1` region. ++double check that this is necessary.++
+ Specifically, replace `YOUR_PROJECT_HERE` with the name of your project; replace `YOUR_DATASET_NAME` with the name you want to give your new dataset (make it unique, and use letters, numbers and underscores up to 32 characters); and replace `YOUR_BUCKET_NAME` with the name of a [GCS bucket][41]. Do not include the `gs://` prefix; just enter the name. This bucket should be in the [same _region_][42] as that specified by the `gcp_region` parameter. E.g., if you keep the default `us-central1` region, your bucket should also be a _regional_ (not multi-regional) bucket in the `us-central1` region. ++double check that this is necessary.++
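If you would rather launch a run programmatically than through the **+Create Run** form, a sketch like the following works with the KFP SDK client. The host URL, experiment name, and parameter names here are illustrative assumptions; use the parameter names shown on the Create Run form.

```python
import kfp

# Assumes the Pipelines endpoint is reachable at this (hypothetical) address,
# e.g. via `kubectl port-forward`.
client = kfp.Client(host='http://localhost:8080')

run = client.create_run_from_pipeline_package(
    'tables_pipeline_caip.py.tar.gz',  # the compiled pipeline archive from this directory
    arguments={
        # Parameter names below are placeholders; match them to the Create Run form.
        'project_id': 'YOUR_PROJECT_HERE',
        'gcp_region': 'us-central1',                  # must match your bucket's region
        'dataset_display_name': 'YOUR_DATASET_NAME',  # letters, numbers, underscores; up to 32 chars
        'bucket_name': 'YOUR_BUCKET_NAME',            # bucket name only, no gs:// prefix
    },
    experiment_name='automl-tables-example',
)
```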
If you want to schedule a recurrent set of runs, you can do that instead. If your data is in [BigQuery][43]— as is the case for this example pipeline— and has a temporal aspect, you could define a _view_ to reflect that, e.g. to return data from a window over the last `N` days or hours. Then, the AutoML pipeline could specify ingestion of data from that view, grabbing an updated data window each time the pipeline is run, and building a new model based on that updated window.
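As a hedged sketch of that view idea, assuming a hypothetical source table with an `event_ts` timestamp column (the project, dataset, table, and column names are all placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client(project='your-project')

# Define a view that always returns the most recent seven days of rows.
view = bigquery.Table('your-project.your_dataset.recent_window_view')
view.view_query = """
    SELECT *
    FROM `your-project.your_dataset.source_table`
    WHERE event_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
"""
client.create_table(view)

# The pipeline's import step can then ingest from
# `your_dataset.recent_window_view`, picking up a fresh window on every run.
```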
ml/automl/tables/kfp_e2e/create_model_for_tables/tables_eval_metrics_component.yaml (28 additions & 62 deletions)
@@ -4,7 +4,7 @@ inputs:
type: evals
- name: thresholds
type: String
- default: '{"mean_absolute_error": 450}'
+ default: '{"mean_absolute_error": 460}'
optional: true
- name: confidence_threshold
type: Float
@@ -16,7 +16,7 @@ outputs:
- name: mlpipeline_metrics
type: UI_metrics
- name: deploy
- type: Boolean
+ type: String
implementation:
container:
image: python:3.7
@@ -25,64 +25,35 @@ implementation:
- -u
- -c
- |
- class OutputPath:
- '''When creating component from function, OutputPath should be used as function parameter annotation to tell the system that the function wants to output data by writing it into a file with the given path instead of returning the data from the function.'''
- '''When creating component from function, InputPath should be used as function parameter annotation to tell the system to pass the *data file path* to the function instead of passing the actual data.'''
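For context on those removed helpers: the `InputPath`/`OutputPath` annotations are also available directly from the SDK (`kfp.components`), so the inlined copies in the component's embedded program are not strictly needed. A minimal sketch of a function-based component using the annotations (not the example's actual implementation; the decision logic below is a placeholder) might look like this:

```python
import json
from kfp.components import InputPath, OutputPath, func_to_container_op

def eval_metrics_sketch(
    evals_path: InputPath('evals'),   # the system passes a file path to read the evals data from
    deploy_path: OutputPath(str),     # the system passes a file path to write the 'deploy' output to
    thresholds: str = '{"mean_absolute_error": 460}',
):
    """Toy stand-in: reads the evals, parses the thresholds, writes a deploy decision."""
    limits = json.loads(thresholds)
    with open(evals_path) as f:
        evals = f.read()
    # Placeholder logic; a real component would compare parsed metrics to each threshold.
    decision = 'deploy' if evals and limits else 'skip'
    with open(deploy_path, 'w') as f:
        f.write(decision)

# The `_path` suffix is stripped, so the input/output surface as `evals` and
# `deploy`, mirroring the component spec above.
eval_metrics_sketch_op = func_to_container_op(eval_metrics_sketch, base_image='python:3.7')
```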