As with the offline store, create a JDBC sink and add it to the `MaterializationSettings`, set the corresponding environment variables, then use it with `FeathrClient.materialize_features`.
-
-## Using CosmosDb as the online store
-
-To use CosmosDb as the online store, create `CosmosDbSink` and add it to the `MaterializationSettings`, then use it with `FeathrClient.materialize_features`, e.g.:
-Feathr client doesn't support getting feature values from CosmosDb; use the [official CosmosDb client](https://pypi.org/project/azure-cosmos/) to get the values:
-
-```
-from azure.cosmos import exceptions, CosmosClient, PartitionKey
-To use ElasticSearch as the online store, create `ElasticSearchSink` and add it to the `MaterializationSettings`, then use it with `FeathrClient.materialize_features`, e.g.:
-Feathr client doesn't support getting feature values from ElasticSearch; use the [official ElasticSearch client](https://pypi.org/project/elasticsearch/) to get the values, e.g.:
-
-```
-from elasticsearch import Elasticsearch
-
-es = Elasticsearch("http://esnode1:9200")
-resp = es.get(index="someindex", id="somekey")
-print(resp['_source'])
-```
-
-The feature generation job writes data in `upsert` mode, so the index may contain stale data after the job finishes. The recommended approach is to create a new index each time and use an index alias to switch over seamlessly; see [the official doc](https://www.elastic.co/guide/en/elasticsearch/reference/master/aliases.html) for details. Feathr currently doesn't provide a helper for this.
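The alias switch-over described in the removed ElasticSearch note can be sketched as follows. This is only an illustration, not a Feathr helper: the function name `build_alias_swap_actions` and the index and alias names are hypothetical. The returned dictionary follows the request body shape of Elasticsearch's `_aliases` API, where a `remove` and an `add` action submitted together are applied as one step.

```python
from typing import Dict, List, Optional


def build_alias_swap_actions(alias: str, new_index: str,
                             old_index: Optional[str] = None) -> Dict[str, List[dict]]:
    """Build a request body for POST /_aliases that points `alias` at `new_index`.

    If `old_index` is given, the alias is removed from it in the same request,
    so readers of the alias never see a moment without a backing index.
    """
    actions: List[dict] = []
    if old_index is not None:
        actions.append({"remove": {"index": old_index, "alias": alias}})
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


# Hypothetical names: each materialization run writes to a fresh index,
# then the alias "features" is moved from the previous index to the new one.
body = build_alias_swap_actions("features", "features_v2", old_index="features_v1")
print(body)
```

With the official Python client, a body like this would typically be sent through `es.indices.update_aliases`; the exact keyword argument depends on the client version, so check the client documentation before relying on it.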
-
-NOTE:
-+ Only no auth and basic auth are supported; no other authentication methods are supported.
-+ If you enable SSL, make sure the certificate on the ES nodes is trusted by the Spark cluster; otherwise the job will fail.
-
-## Using ElasticSearch as offline store
-
-To use ElasticSearch as the offline store, create `ElasticSearchSink` and use it with `FeathrClient.get_offline_features`, e.g.:
-NOTE: The feature joining process doesn't generate meaningful keys for each document. Make sure the output dataset can be queried in some other way, such as full-text search; otherwise you may have to fetch all the data from ES to find what you are looking for.
+As with the offline store, create a JDBC sink and add it to the `MaterializationSettings`, set the corresponding environment variables, then use it with `FeathrClient.materialize_features`.
-This is the reference implementation of [the Feathr API spec](./api-spec.md), based on SQL databases instead of Purview.
-
-Please note that this implementation retrieves graph lineages by iterating `select` queries. This approach is very inefficient and should **not** be considered production-ready. We suggest using this implementation only for testing or research purposes.
+This is the reference implementation of [the Feathr API spec](./api-spec.md), based on Purview.
-This is the reference implementation of [the Feathr API spec](./api-spec.md), based on SQL databases instead of Purview.
-
-Please note that this implementation retrieves graph lineages by iterating `select` queries. This approach is very inefficient and should **not** be considered production-ready. We suggest using this implementation only for testing or research purposes.
+This is the reference implementation of [the Feathr API spec](./api-spec.md), based on SQL databases.
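To illustrate why the removed note calls iterated `select` queries inefficient, here is a minimal sketch using an in-memory SQLite database. The `edges` table, its columns, and the `lineage_by_iterated_select` function are hypothetical and are not taken from the Feathr registry code. The point is only that this style of traversal issues one query per visited node, so the number of round trips grows with the size of the lineage graph.

```python
import sqlite3
from typing import Set, Tuple


def lineage_by_iterated_select(conn: sqlite3.Connection, start: str) -> Tuple[Set[str], int]:
    """Collect every upstream node of `start`, issuing one SELECT per visited node."""
    seen: Set[str] = set()
    frontier = [start]
    queries = 0
    while frontier:
        node = frontier.pop()
        rows = conn.execute(
            "SELECT from_id FROM edges WHERE to_id = ?", (node,)
        ).fetchall()
        queries += 1  # one database round trip per node
        for (parent,) in rows:
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen, queries


# Hypothetical lineage chain: source -> anchor -> feature_a -> derived
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (from_id TEXT, to_id TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("source", "anchor"), ("anchor", "feature_a"), ("feature_a", "derived")],
)
upstream, query_count = lineage_by_iterated_select(conn, "derived")
print(sorted(upstream), query_count)
```

Even this four-node chain needs four queries; against a remote database each one is a separate network round trip, which is consistent with the note's advice to limit such an implementation to testing.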