Merged
21 changes: 16 additions & 5 deletions docs/domains/process_control/data_model.md
Original file line number Diff line number Diff line change
@@ -38,7 +38,7 @@ erDiagram

### Fledge OPC UA South Plugin

[Fledge](https://www.lfedge.org/projects/fledge/) provides support for sending data between various data sources and data destinations. The mapping below is for data from the [OPC UA South Plugin](https://fledge-iot.readthedocs.io/en/latest/plugins/fledge-south-opcua/index.html), which can be sent to message brokers such as Kafka, Azure IoT Hub, etc.
[Fledge](https://www.lfedge.org/projects/fledge/){target=_blank} provides support for sending data between various data sources and data destinations. The mapping below is for data from the [OPC UA South Plugin](https://fledge-iot.readthedocs.io/en/latest/plugins/fledge-south-opcua/index.html){target=_blank}, which can be sent to message brokers such as Kafka, Azure IoT Hub, etc.

This mapping is performed by the [RTDIP Fledge to PCDM Component](../../sdk/code-reference/pipelines/transformers/spark/fledge_json_to_pcdm.md) and can be used in an [RTDIP Ingestion Pipeline.](../../sdk/pipelines/framework.md)

@@ -51,7 +51,7 @@ This mapping is performed by the [RTDIP Fledge to PCDM Component](../../sdk/code

### OPC Publisher

[OPC Publisher](https://learn.microsoft.com/en-us/azure/industrial-iot/overview-what-is-opc-publisher) connects to OPC UA assets and publishes data to the Microsoft Azure Cloud's IoT Hub.
[OPC Publisher](https://learn.microsoft.com/en-us/azure/industrial-iot/overview-what-is-opc-publisher){target=_blank} connects to OPC UA assets and publishes data to the Microsoft Azure Cloud's IoT Hub.

The mapping below is performed by the [RTDIP OPC Publisher to PCDM Component](../../sdk/code-reference/pipelines/transformers/spark/opc_publisher_json_to_pcdm.md) and can be used in an [RTDIP Ingestion Pipeline.](../../sdk/pipelines/framework.md)

@@ -62,9 +62,21 @@ The mapping below is performed by the [RTDIP OPC Publisher to PCDM Component](..
| OPC Publisher | StatusCode.Symbol | string | EVENTS| Status | string | Null values can be overridden in the [RTDIP OPC Publisher to PCDM Component](../../sdk/code-reference/pipelines/transformers/spark/opc_publisher_json_to_pcdm.md) |
| OPC Publisher | Value.Value | string | EVENTS | Value | dynamic | Converts Value into either a float number or string based on how it is received in the message |

### EdgeX

[EdgeX](https://www.lfedge.org/projects/edgexfoundry/){target=_blank} provides support for sending data between various data sources and data destinations.

This mapping is performed by the [RTDIP EdgeX to PCDM Component](../../sdk/code-reference/pipelines/transformers/spark/edgex_json_to_pcdm.md) and can be used in an [RTDIP Ingestion Pipeline.](../../sdk/pipelines/framework.md)

| From Data Model | From Field | From Type | To Data Model |To Field| To Type | Mapping Logic |
|------|----|---------|------|------|--------|-----------|
| EdgeX | deviceName | string | EVENTS| TagName | string | |
| EdgeX | origin | long | EVENTS| EventTime | timestamp | Nanoseconds since the Unix epoch, read from each reading and converted to a timestamp |
| | | | EVENTS| Status | string | Defaults to `Good`; can be overridden in the [RTDIP EdgeX to PCDM Component](../../sdk/code-reference/pipelines/transformers/spark/edgex_json_to_pcdm.md) |
| EdgeX | value | string | EVENTS | Value | dynamic | Converts Value into either a float number or string based on how it is received in the message |
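
The mapping in the table above can be sketched in plain Python, without Spark. This is only an illustration and not the RTDIP component: the helper name `edgex_event_to_pcdm_rows` is hypothetical, and it assumes `origin` holds nanoseconds since the Unix epoch and that one PCDM row is produced per entry in `readings`.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def edgex_event_to_pcdm_rows(body: str, status: str = "Good") -> List[Dict[str, Any]]:
    """Hypothetical helper: map one EdgeX event (a JSON string) to PCDM EVENTS rows."""
    event = json.loads(body)
    rows = []
    for reading in event.get("readings", []):
        # Assumption: origin is nanoseconds since the Unix epoch.
        seconds, nanos = divmod(int(reading["origin"]), 1_000_000_000)
        event_time = datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(
            microseconds=nanos // 1000
        )
        rows.append(
            {
                "TagName": event["deviceName"],  # deviceName -> TagName
                "EventTime": event_time,         # origin -> EventTime
                "Status": status,                # no source field, so a default is used
                "Value": reading["value"],       # value -> Value
            }
        )
    return rows


sample = (
    '{"deviceName":"testDevice","origin":1683866798739958852,'
    '"readings":[{"origin":1683866798739958852,"value":"true"}]}'
)
rows = edgex_event_to_pcdm_rows(sample)
```

Integer arithmetic is used for the timestamp so that the microsecond part is not affected by floating-point rounding.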

### SSIP PI

[SSIP PI](https://bakerhughesc3.ai/oai-solution/shell-sensor-intelligence-platform/) connects to OSIsoft PI Historians and sends the data to the Cloud.
[SSIP PI](https://bakerhughesc3.ai/oai-solution/shell-sensor-intelligence-platform/){target=_blank} connects to OSIsoft PI Historians and sends the data to the Cloud.

The mapping below is performed by the RTDIP SSIP PI to PCDM Component and can be used in an [RTDIP Ingestion Pipeline.](../../sdk/pipelines/framework.md)

@@ -73,5 +85,4 @@ The mapping below is performed by the RTDIP SSIP PI to PCDM Component and can be
| SSIP PI | TagName | string | EVENTS| TagName | string | |
| SSIP PI | EventTime | string | EVENTS| EventTime | timestamp | |
| SSIP PI | Status | string | EVENTS| Status | string | |
| SSIP PI | Value | dynamic | EVENTS | Value | dynamic | |

| SSIP PI | Value | dynamic | EVENTS | Value | dynamic | |
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
# Convert Edge Xpert Json to Process Control Data Model
::: src.sdk.python.rtdip_sdk.pipelines.transformers.spark.edgex_json_to_pcdm
1 change: 1 addition & 0 deletions docs/sdk/pipelines/components.md
Original file line number Diff line number Diff line change
@@ -51,6 +51,7 @@ Transformers are components that perform transformations on data. These will tar
|[Binary To String](../code-reference/pipelines/transformers/spark/binary_to_string.md)||:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|
|[OPC Publisher Json To Process Control Data Model](../code-reference/pipelines/transformers/spark/opc_publisher_json_to_pcdm.md)||:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|
|[Fledge Json To Process Control Data Model](../code-reference/pipelines/transformers/spark/fledge_json_to_pcdm.md)||:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|
|[EdgeX Json To Process Control Data Model](../code-reference/pipelines/transformers/spark/edgex_json_to_pcdm.md)||:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|
|[SSIP PI Binary Files To Process Control Data Model](../code-reference/pipelines/transformers/spark/ssip_pi_binary_file_to_pcdm.md)||:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|
|[SSIP PI Binary JSON To Process Control Data Model](../code-reference/pipelines/transformers/spark/ssip_pi_binary_json_to_pcdm.md)||:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|:heavy_check_mark:|

2 changes: 2 additions & 0 deletions mkdocs.yml
Original file line number Diff line number Diff line change
@@ -146,6 +146,8 @@ nav:
- Fledge Json To Process Control Data Model: sdk/code-reference/pipelines/transformers/spark/fledge_json_to_pcdm.md
- SSIP PI Binary File data To Process Control Data Model: sdk/code-reference/pipelines/transformers/spark/ssip_pi_binary_file_to_pcdm.md
- SSIP PI Binary JSON data To Process Control Data Model: sdk/code-reference/pipelines/transformers/spark/ssip_pi_binary_json_to_pcdm.md
- EdgeX JSON data To Process Control Data Model: sdk/code-reference/pipelines/transformers/spark/edgex_json_to_pcdm.md

- Destinations:
- Spark:
- Delta: sdk/code-reference/pipelines/destinations/spark/delta.md
21 changes: 20 additions & 1 deletion src/sdk/python/rtdip_sdk/pipelines/_pipeline_utils/spark.py
Original file line number Diff line number Diff line change
@@ -145,4 +145,23 @@ def get_dbutils(
StructField("asset", StringType(), True),
StructField("readings", MapType(StringType(), StringType(), True), True),
StructField("timestamp", TimestampType(), True)])
)
)

EDGEX_SCHEMA = StructType([
    StructField('apiVersion', StringType(), True),
    StructField('id', StringType(), True),
    StructField('deviceName', StringType(), True),
    StructField('profileName', StringType(), True),
    StructField('sourceName', StringType(), True),
    StructField('origin', LongType(), True),
    StructField('readings', ArrayType(
        StructType([
            StructField('id', StringType(), True),
            StructField('origin', LongType(), True),
            StructField('deviceName', StringType(), True),
            StructField('resourceName', StringType(), True),
            StructField('profileName', StringType(), True),
            StructField('valueType', StringType(), True),
            StructField('value', StringType(), True)]))
    , True)
])
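
When a payload is parsed against an explicit schema like `EDGEX_SCHEMA`, Spark's `from_json` ignores JSON fields the schema does not declare and returns null for declared fields that are missing. A quick pre-check of a sample payload can therefore be useful. The sketch below is a plain-Python illustration: the field names are copied from the schema above, while the helper `unexpected_fields` is hypothetical and not part of the SDK.

```python
import json

# Field names copied from EDGEX_SCHEMA; the helper below is hypothetical.
EVENT_FIELDS = {
    "apiVersion", "id", "deviceName", "profileName", "sourceName", "origin", "readings",
}
READING_FIELDS = {
    "id", "origin", "deviceName", "resourceName", "profileName", "valueType", "value",
}


def unexpected_fields(body: str) -> set:
    """Return the payload keys that the schema does not declare."""
    event = json.loads(body)
    extra = set(event) - EVENT_FIELDS
    for reading in event.get("readings", []):
        extra |= {f"readings.{key}" for key in set(reading) - READING_FIELDS}
    return extra


sample = (
    '{"apiVersion":"v2","id":"test","deviceName":"testDevice","profileName":"test",'
    '"sourceName":"Bool","origin":1683866798739958852,"readings":[{"id":"test",'
    '"origin":1683866798739958852,"deviceName":"test","resourceName":"Bool",'
    '"profileName":"Random","valueType":"Bool","value":"true"}]}'
)
extras_in_sample = unexpected_fields(sample)
extras_with_tags = unexpected_fields('{"deviceName":"d","tags":{},"readings":[{"units":"C"}]}')
```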
Original file line number Diff line number Diff line change
@@ -16,6 +18,8 @@
import time
from pyspark.sql import DataFrame
from py4j.protocol import Py4JJavaError
from pyspark.sql.functions import to_json, struct


from ..interfaces import DestinationInterface
from ..._pipeline_utils.models import Libraries, SystemType
@@ -81,6 +83,7 @@ def write_batch(self):
try:
return (
self.data
.select(to_json(struct("*")).alias("value"))
.write
.format("kafka")
.options(**self.options)
@@ -101,6 +104,7 @@ def write_stream(self):
try:
query = (
self.data
.select(to_json(struct("*")).alias("value"))
.writeStream
.format("kafka")
.options(**self.options)
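
The added `.select(to_json(struct("*")).alias("value"))` line packs every column of a row into a single JSON string named `value`, which is the column the Spark Kafka sink expects for the message payload. The effect can be sketched in plain Python as follows. The helper `rows_to_kafka_values` is hypothetical, and details such as null handling and timestamp formatting may differ from what Spark produces.

```python
import json
from typing import Any, Dict, List


def rows_to_kafka_values(rows: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Hypothetical helper: serialise each row into one compact JSON string under "value"."""
    return [{"value": json.dumps(row, separators=(",", ":"))} for row in rows]


rows = [{"TagName": "testDevice", "Status": "Good", "Value": "true"}]
messages = rows_to_kafka_values(rows)
# messages[0]["value"] is a single string holding all of the original columns
```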
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
# Copyright 2022 RTDIP
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from pyspark.sql import DataFrame
from pyspark.sql.functions import from_json, col, explode, when, lit

from ..interfaces import TransformerInterface
from ..._pipeline_utils.models import Libraries, SystemType
from ..._pipeline_utils.spark import EDGEX_SCHEMA

class EdgeXJsonToPCDMTransformer(TransformerInterface):
    '''
    Converts a Spark Dataframe column containing a json string created by EdgeX to the Process Control Data Model

    Args:
        data (DataFrame): Dataframe containing the column with EdgeX data
        status_null_value (str): If populated, will replace 'Good' in the Status column with the specified value.
        change_type_value (str): If populated, will replace 'insert' in the ChangeType column with the specified value.
    '''
    data: DataFrame
    status_null_value: str
    change_type_value: str

    def __init__(self, data: DataFrame, status_null_value: str = "Good", change_type_value: str = "insert") -> None:
        self.data = data
        self.status_null_value = status_null_value
        self.change_type_value = change_type_value

    @staticmethod
    def system_type():
        '''
        Attributes:
            SystemType (Environment): Requires PYSPARK
        '''
        return SystemType.PYSPARK

    @staticmethod
    def libraries():
        libraries = Libraries()
        return libraries

    @staticmethod
    def settings() -> dict:
        return {}

    def pre_transform_validation(self):
        return True

    def post_transform_validation(self):
        return True

    def transform(self) -> DataFrame:
        '''
        Returns:
            DataFrame: A dataframe with the specified column converted to PCDM
        '''
        df = (
            self.data
            .withColumn("body", from_json("body", EDGEX_SCHEMA))
            .select("*", explode("body.readings"))
            .selectExpr("body.deviceName as TagName", "to_utc_timestamp(to_timestamp((col.origin / 1000000000)), current_timezone()) as EventTime", "col.value as Value", "col.valueType as ValueType")
            .withColumn("Status", lit(self.status_null_value))
            .withColumn("ChangeType", lit(self.change_type_value))
            .withColumn("ValueType", (when(col("ValueType") == "Int8", "integer")
                                      .when(col("ValueType") == "Int16", "integer")
                                      .when(col("ValueType") == "Int32", "integer")
                                      .when(col("ValueType") == "Int64", "integer")
                                      .when(col("ValueType") == "Uint8", "integer")
                                      .when(col("ValueType") == "Uint16", "integer")
                                      .when(col("ValueType") == "Uint32", "integer")
                                      .when(col("ValueType") == "Uint64", "integer")
                                      .when(col("ValueType") == "Float32", "float")
                                      .when(col("ValueType") == "Float64", "float")
                                      .when(col("ValueType") == "Bool", "bool")
                                      .otherwise("string")))
        )

        return df.select("TagName", "EventTime", "Status", "Value", "ValueType", "ChangeType")
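
The `when(...).otherwise("string")` chain in `transform` amounts to a lookup from EdgeX value types to PCDM value types with `"string"` as the fallback. A plain-Python sketch of the same mapping is shown below. The names `EDGEX_TO_PCDM_VALUE_TYPE` and `pcdm_value_type` are hypothetical and only illustrate the logic; they are not part of the SDK.

```python
from typing import Optional

# Same pairs as the when(...) chain above; anything not listed falls back to "string".
EDGEX_TO_PCDM_VALUE_TYPE = {
    **{
        name: "integer"
        for name in (
            "Int8", "Int16", "Int32", "Int64",
            "Uint8", "Uint16", "Uint32", "Uint64",
        )
    },
    "Float32": "float",
    "Float64": "float",
    "Bool": "bool",
}


def pcdm_value_type(edgex_value_type: Optional[str]) -> str:
    """Hypothetical helper: map an EdgeX valueType to a PCDM ValueType."""
    return EDGEX_TO_PCDM_VALUE_TYPE.get(edgex_value_type, "string")


value_types = [pcdm_value_type(t) for t in ("Uint16", "Float64", "Bool", "String", None)]
```

As in the Spark expression, the comparison is case-sensitive, and a missing value type resolves to `"string"`.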
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
# Copyright 2022 RTDIP
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import sys
sys.path.insert(0, '.')
from src.sdk.python.rtdip_sdk.pipelines.transformers.spark.edgex_json_to_pcdm import EdgeXJsonToPCDMTransformer
from src.sdk.python.rtdip_sdk.pipelines._pipeline_utils.models import Libraries, SystemType
from tests.sdk.python.rtdip_sdk.pipelines._pipeline_utils.spark_configuration_constants import spark_session

from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.types import StructType, StructField, StringType, TimestampType
from datetime import datetime, timezone

def test_edgex_json_to_pcdm(spark_session: SparkSession):
    edgex_json_data = '{"apiVersion":"v2","id":"test","deviceName":"testDevice","profileName":"test","sourceName":"Bool","origin":1683866798739958852,"readings":[{"id":"test","origin":1683866798739958852,"deviceName":"test","resourceName":"Bool","profileName":"Random","valueType":"Bool","value":"true"}]}'
    edgex_df: DataFrame = spark_session.createDataFrame([{"body": edgex_json_data}])

    expected_schema = StructType([
        StructField("TagName", StringType(), True),
        StructField("EventTime", TimestampType(), True),
        StructField("Status", StringType(), False),
        StructField("Value", StringType(), True),
        StructField("ValueType", StringType(), False),
        StructField("ChangeType", StringType(), False),
    ])

    expected_data = [
        {"TagName":"testDevice", "EventTime": datetime.fromisoformat("2023-05-12 04:46:38.739958"), "Status":"Good", "Value":"true", "ValueType":"bool", "ChangeType": "insert"}
    ]

    expected_df: DataFrame = spark_session.createDataFrame(
        schema=expected_schema,
        data=expected_data
    )

    # Named for what it is: the EdgeX JSON to PCDM transformer under test
    edgex_json_to_pcdm_transformer = EdgeXJsonToPCDMTransformer(edgex_df)
    actual_df = edgex_json_to_pcdm_transformer.transform()

    assert edgex_json_to_pcdm_transformer.system_type() == SystemType.PYSPARK
    assert isinstance(edgex_json_to_pcdm_transformer.libraries(), Libraries)
    assert expected_schema == actual_df.schema
    assert expected_df.collect() == actual_df.collect()