
Commit 1d6736b

Added sample for geospatial classification (GoogleCloudPlatform#7291)
* Added sample for geospatial classification
* Added noxfile
* added emojis to notebook
* Small updates to geospatial sample
* updates to requirements txt
* fix minor bugs in notebook
* added requirmenets file
* simplify prediction and data extraction logic
* update tests
* fix linter fail
* fix linter and header fails
* auth fix for tests
* add noxfile to serving app
* minor updates based on comments
* small updates
* update credentials
* added staging bucket to deploy
* update staging bucket
* removed serving app noxfile
* clarify use of timestamp in sample
* fix permissions error
* Added project to gcloud build
* add GPU support
* deploy from source code
* update READMEs
* added constrants file
* added type hints
* fix type hint errors
* final updates to notebook
* update notebook

Co-authored-by: David Cavazos <[email protected]>
1 parent 588b100

18 files changed: +8680 −0 lines

people-and-planet-ai/README.md

Lines changed: 17 additions & 0 deletions

@@ -40,3 +40,20 @@ location data.

[Dataflow]: https://cloud.google.com/dataflow
[Keras]: https://keras.io
[Vertex AI]: https://cloud.google.com/vertex-ai

## 🏭 [Coal Plant Predictions -- _geospatial-classification_](geospatial-classification)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/geospatial-classification/README.ipynb)

This model uses satellite data to predict if a coal plant is turned on and producing carbon emissions. The satellite data comes from [Google Earth Engine](https://earthengine.google.com/).

* **Model**: 1D Fully Convolutional Network in [TensorFlow]
* **Creating datasets**: [Sentinel-2] satellite data from [Earth Engine]
* **Training the model**: [TensorFlow] in [Vertex AI]
* **Getting predictions**: [TensorFlow] in [Cloud Run]

[Cloud Run]: https://cloud.google.com/run
[Sentinel-2]: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2
[Earth Engine]: https://earthengine.google.com/
[TensorFlow]: https://www.tensorflow.org/
[Vertex AI]: https://cloud.google.com/vertex-ai
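The "Getting predictions" step above is an HTTP call to the Cloud Run service. As a hedged sketch (the patch dimensions and bucket name below are placeholder assumptions; in the sample the real payload is extracted from Earth Engine), the request body pairs per-band pixel patches with the Cloud Storage bucket that holds the trained model:

```python
import json

# Sentinel-2 bands used by the model.
BANDS = ["B1", "B2", "B3", "B4", "B5", "B6", "B7",
         "B8", "B8A", "B9", "B10", "B11", "B12"]

# Placeholder patch: a square of reflectance values per band. The sample
# extracts the real patch with Earth Engine's neighborhoodToArray.
PATCH = [[0.0 for _ in range(33)] for _ in range(33)]

prediction_data = {band: PATCH for band in BANDS}

# Body posted to the service's /predict endpoint; "bucket" names the
# Cloud Storage bucket holding the trained model (placeholder value).
request_body = json.dumps({"data": prediction_data, "bucket": "my-model-bucket"})
```

With real values, the sample's end-to-end test posts this body with an identity token in an `Authorization: Bearer` header and checks the returned `predictions` list.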

people-and-planet-ai/geospatial-classification/README.ipynb

Lines changed: 1437 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 16 additions & 0 deletions

@@ -0,0 +1,16 @@

# 🏭 Coal Plant Predictions -- _geospatial-classification_

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/GoogleCloudPlatform/python-docs-samples/blob/main/people-and-planet-ai/geospatial-classification/README.ipynb)

This model uses satellite data to predict if a coal plant is turned on and producing carbon emissions. The satellite data comes from [Google Earth Engine](https://earthengine.google.com/).

* **Model**: 1D Fully Convolutional Network in [TensorFlow]
* **Creating datasets**: [Sentinel-2] satellite data from [Earth Engine]
* **Training the model**: [TensorFlow] in [Vertex AI]
* **Getting predictions**: [TensorFlow] in [Cloud Run]

[Cloud Run]: https://cloud.google.com/run
[Sentinel-2]: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2
[Earth Engine]: https://earthengine.google.com/
[TensorFlow]: https://www.tensorflow.org/
[Vertex AI]: https://cloud.google.com/vertex-ai
Lines changed: 2 additions & 0 deletions

@@ -0,0 +1,2 @@

google-auth==2.5.0
google-cloud-bigquery==2.32.0
Lines changed: 356 additions & 0 deletions

@@ -0,0 +1,356 @@

# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from datetime import datetime, timedelta
import logging
import os
import platform
import subprocess
import time
from typing import NamedTuple
import uuid

import ee
import google.auth
from google.cloud import aiplatform
from google.cloud import storage
import pandas as pd
import pytest
import requests


PYTHON_VERSION = "".join(platform.python_version_tuple()[0:2])

NAME = f"ppai/geospatial-classification-py{PYTHON_VERSION}"

UUID = uuid.uuid4().hex[0:6]
PROJECT = os.environ["GOOGLE_CLOUD_PROJECT"]
REGION = "us-central1"

TIMEOUT_SEC = 30 * 60  # 30 minutes in seconds
POLL_INTERVAL_SEC = 60  # 1 minute in seconds

VERTEX_AI_SUCCESS_STATE = "PIPELINE_STATE_SUCCEEDED"
VERTEX_AI_FINISHED_STATE = {
    "PIPELINE_STATE_SUCCEEDED",
    "PIPELINE_STATE_FAILED",
    "PIPELINE_STATE_CANCELLED",
}

EARTH_ENGINE_SUCCESS_STATE = "SUCCEEDED"
EARTH_ENGINE_FINISHED_STATE = {"SUCCEEDED"}

BANDS = [
    "B1",
    "B2",
    "B3",
    "B4",
    "B5",
    "B6",
    "B7",
    "B8",
    "B8A",
    "B9",
    "B10",
    "B11",
    "B12",
]
LABEL = "is_powered_on"

IMAGE_COLLECTION = "COPERNICUS/S2"
SCALE = 10

TRAIN_VALIDATION_SPLIT = 0.7

PATCH_SIZE = 16

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
ee.Initialize(credentials, project=PROJECT)

logging.getLogger().setLevel(logging.INFO)


@pytest.fixture(scope="session")
def bucket_name() -> str:
    storage_client = storage.Client()

    bucket_name = f"{NAME.replace('/', '-')}-{UUID}"
    bucket = storage_client.create_bucket(bucket_name, location=REGION)

    logging.info(f"bucket_name: {bucket_name}")
    yield bucket_name

    bucket.delete(force=True)


@pytest.fixture(scope="session")
def test_data(bucket_name: str) -> str:
    labels_dataframe = pd.read_csv("labeled_geospatial_data.csv")
    train_dataframe = labels_dataframe.sample(
        frac=TRAIN_VALIDATION_SPLIT, random_state=200
    )  # random state is a seed value
    validation_dataframe = labels_dataframe.drop(train_dataframe.index).sample(frac=1.0)

    train_features = [labeled_feature(row) for row in train_dataframe.itertuples()]

    validation_features = [
        labeled_feature(row) for row in validation_dataframe.itertuples()
    ]

    training_task = ee.batch.Export.table.toCloudStorage(
        collection=ee.FeatureCollection(train_features),
        description="Training image export",
        bucket=bucket_name,
        fileNamePrefix="geospatial_training",
        selectors=BANDS + [LABEL],
        fileFormat="TFRecord",
    )

    training_task.start()

    validation_task = ee.batch.Export.table.toCloudStorage(
        collection=ee.FeatureCollection(validation_features),
        description="Validation image export",
        bucket=bucket_name,
        fileNamePrefix="geospatial_validation",
        selectors=BANDS + [LABEL],
        fileFormat="TFRecord",
    )

    validation_task.start()

    train_status = None
    val_status = None

    logging.info("Waiting for data export to complete.")
    for _ in range(0, TIMEOUT_SEC, POLL_INTERVAL_SEC):
        train_status = ee.data.getOperation(training_task.name)["metadata"]["state"]
        val_status = ee.data.getOperation(validation_task.name)["metadata"]["state"]
        if (
            train_status in EARTH_ENGINE_FINISHED_STATE
            and val_status in EARTH_ENGINE_FINISHED_STATE
        ):
            break
        time.sleep(POLL_INTERVAL_SEC)

    assert train_status == EARTH_ENGINE_SUCCESS_STATE
    assert val_status == EARTH_ENGINE_SUCCESS_STATE
    logging.info(f"Export finished with status {train_status}")

    yield training_task.name


def labeled_feature(row: NamedTuple) -> ee.FeatureCollection:
    start = datetime.fromisoformat(row.timestamp)
    end = start + timedelta(days=1)
    image = (
        ee.ImageCollection(IMAGE_COLLECTION)
        .filterDate(start.strftime("%Y-%m-%d"), end.strftime("%Y-%m-%d"))
        .select(BANDS)
        .mosaic()
    )
    point = ee.Feature(
        ee.Geometry.Point([row.lon, row.lat]),
        {LABEL: row.is_powered_on},
    )
    return (
        image.neighborhoodToArray(ee.Kernel.square(PATCH_SIZE))
        .sampleRegions(ee.FeatureCollection([point]), scale=SCALE)
        .first()
    )


@pytest.fixture(scope="session")
def container_image(bucket_name: str) -> str:
    # https://cloud.google.com/sdk/gcloud/reference/builds/submit
    container_image = f"gcr.io/{PROJECT}/{NAME}:{UUID}"
    subprocess.check_call(
        [
            "gcloud",
            "builds",
            "submit",
            "serving_app",
            f"--tag={container_image}",
            f"--project={PROJECT}",
            "--machine-type=e2-highcpu-8",
            "--timeout=15m",
            "--quiet",
        ]
    )

    logging.info(f"container_image: {container_image}")
    yield container_image

    # https://cloud.google.com/sdk/gcloud/reference/container/images/delete
    subprocess.check_call(
        [
            "gcloud",
            "container",
            "images",
            "delete",
            container_image,
            f"--project={PROJECT}",
            "--force-delete-tags",
            "--quiet",
        ]
    )


@pytest.fixture(scope="session")
def service_url(bucket_name: str, container_image: str) -> str:
    # https://cloud.google.com/sdk/gcloud/reference/run/deploy
    service_name = f"{NAME.replace('/', '-')}-{UUID}"
    subprocess.check_call(
        [
            "gcloud",
            "run",
            "deploy",
            service_name,
            f"--image={container_image}",
            "--command=gunicorn",
            "--args=--threads=8,--timeout=0,main:app",
            "--platform=managed",
            f"--project={PROJECT}",
            f"--region={REGION}",
            "--memory=1G",
            "--no-allow-unauthenticated",
        ]
    )

    # https://cloud.google.com/sdk/gcloud/reference/run/services/describe
    service_url = (
        subprocess.run(
            [
                "gcloud",
                "run",
                "services",
                "describe",
                service_name,
                "--platform=managed",
                f"--project={PROJECT}",
                f"--region={REGION}",
                "--format=get(status.url)",
            ],
            capture_output=True,
        )
        .stdout.decode("utf-8")
        .strip()
    )

    logging.info(f"service_url: {service_url}")
    yield service_url

    # https://cloud.google.com/sdk/gcloud/reference/run/services/delete
    subprocess.check_call(
        [
            "gcloud",
            "run",
            "services",
            "delete",
            service_name,
            "--platform=managed",
            f"--project={PROJECT}",
            f"--region={REGION}",
            "--quiet",
        ]
    )


@pytest.fixture(scope="session")
def identity_token() -> str:
    yield (
        subprocess.run(
            ["gcloud", "auth", "print-identity-token", f"--project={PROJECT}"],
            capture_output=True,
        )
        .stdout.decode("utf-8")
        .strip()
    )


@pytest.fixture(scope="session")
def train_model(bucket_name: str) -> str:
    aiplatform.init(project=PROJECT, staging_bucket=bucket_name)
    job = aiplatform.CustomTrainingJob(
        display_name="climate_script_colab",
        script_path="task.py",
        container_uri="us-docker.pkg.dev/vertex-ai/training/tf-gpu.2-7:latest",
    )

    job.run(
        accelerator_type="NVIDIA_TESLA_K80",
        accelerator_count=1,
        args=[f"--bucket={bucket_name}"],
    )

    logging.info(f"train_model resource_name: {job.resource_name}")

    # Wait until the model training job finishes.
    status = None
    logging.info("Waiting for model to train.")
    for _ in range(0, TIMEOUT_SEC, POLL_INTERVAL_SEC):
        # https://googleapis.dev/python/aiplatform/latest/aiplatform_v1/job_service.html
        status = job.state.name
        if status in VERTEX_AI_FINISHED_STATE:
            break
        time.sleep(POLL_INTERVAL_SEC)

    logging.info(f"Model job finished with status {status}")
    assert status == VERTEX_AI_SUCCESS_STATE
    yield job.resource_name


def get_prediction_data(lon: float, lat: float, start: str, end: str) -> dict:
    """Extracts Sentinel image as json at specific lat/lon and timestamp."""

    location = ee.Feature(ee.Geometry.Point([lon, lat]))
    image = (
        ee.ImageCollection(IMAGE_COLLECTION)
        .filterDate(start, end)
        .select(BANDS)
        .mosaic()
    )

    feature = image.neighborhoodToArray(ee.Kernel.square(PATCH_SIZE)).sampleRegions(
        collection=ee.FeatureCollection([location]), scale=SCALE
    )

    return feature.getInfo()["features"][0]["properties"]


def test_predict(
    bucket_name: str,
    test_data: str,
    train_model: str,
    service_url: str,
    identity_token: str,
) -> None:
    # Test point
    prediction_data = get_prediction_data(
        -84.80529, 39.11613, "2021-10-01", "2021-10-31"
    )

    # Make prediction
    response = requests.post(
        url=f"{service_url}/predict",
        headers={"Authorization": f"Bearer {identity_token}"},
        json={"data": prediction_data, "bucket": bucket_name},
    ).json()

    # Check that we get non-empty predictions.
    assert "predictions" in response
    assert len(response["predictions"]) > 0
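The `test_data` and `train_model` fixtures wait on long-running Earth Engine and Vertex AI jobs with the same bounded polling loop. A generic sketch of that pattern (the function and parameter names here are illustrative, not part of the sample):

```python
import time


def wait_for_state(get_state, finished_states, timeout_sec=30 * 60, poll_interval_sec=60):
    """Poll get_state() until it returns a finished state or the timeout elapses."""
    state = None
    for _ in range(0, timeout_sec, poll_interval_sec):
        state = get_state()
        if state in finished_states:
            break  # the job finished (successfully or not) before the timeout
        time.sleep(poll_interval_sec)
    return state  # the caller asserts this equals the success state
```

Returning the last observed state (instead of raising) mirrors the fixtures above, which follow the loop with an `assert` on the success state so a timeout or failure surfaces as a test failure.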
