13 changes: 6 additions & 7 deletions docs/concepts/feature-definition.md
@@ -28,14 +28,14 @@ batch_source = HdfsSource(name="nycTaxiBatchSource",
timestamp_format="yyyy-MM-dd HH:mm:ss")
```

See the [Python API documentation](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.source.HdfsSource) to get the details on each input column.
See the [Python API documentation](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.HdfsSource) to get the details on each input column.
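
For reference, a minimal sketch of the kind of `HdfsSource` definition shown (truncated) above; the path and column names here are placeholders, and the parameter names follow the quickstart:

```python
from feathr import HdfsSource

# Batch source for raw NYC taxi data; the path below is a placeholder, not a real location.
batch_source = HdfsSource(name="nycTaxiBatchSource",
                          path="abfss://<container>@<storage-account>.dfs.core.windows.net/sample_data/green_tripdata_2020-04.csv",
                          event_timestamp_column="lpep_dropoff_datetime",
                          timestamp_format="yyyy-MM-dd HH:mm:ss")
```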

## Step2: Define Anchors and Features
A feature is called an anchored feature when it is directly
extracted from the source data rather than computed on top of other features; the latter is called a derived feature.

Check [Feature Python API documentation](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.feature.Feature)
and [Anchor Python API documentation](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.anchor.FeatureAnchor) to see more details.
Check [Feature Python API documentation](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.Feature)
and [Anchor Python API documentation](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.FeatureAnchor) to see more details.

Here is a sample:

@@ -100,8 +100,7 @@ Feature(name="f_location_max_fare",
window="90d"))
```
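
The aggregation snippet above is truncated by the diff hunk; for context, a fuller sketch of an aggregated feature definition could look like the following (the key definition and the `cast_float` expression are assumed from the quickstart):

```python
from feathr import Feature, TypedKey, ValueType, FLOAT, WindowAggTransformation

# Key used to group the aggregation; the column name and type are illustrative.
location_id = TypedKey(key_column="DOLocationID",
                       key_column_type=ValueType.INT32)

# A 90-day max-fare aggregation keyed by pickup location.
f_location_max_fare = Feature(name="f_location_max_fare",
                              key=location_id,
                              feature_type=FLOAT,
                              transform=WindowAggTransformation(agg_expr="cast_float(fare_amount)",
                                                                agg_func="MAX",
                                                                window="90d"))
```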


Note that the `agg_func`([API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.aggregation.Aggregation)) should be any of these:
Note that the `agg_func`([API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.Aggregation)) should be any of these:

| Aggregation Type | Input Type | Description |
| --- | --- | --- |
@@ -125,9 +124,9 @@ request_anchor = FeatureAnchor(name="request_features",
Note that if the data source is from the observation data, the `source` section should be `INPUT_CONTEXT` to indicate the source of those defined anchors.
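
For illustration, a minimal sketch of such a request-time anchor; the feature and column names are assumed:

```python
from feathr import Feature, FeatureAnchor, FLOAT, INPUT_CONTEXT

# A simple pass-through feature taken directly from a column of the observation data.
f_trip_distance = Feature(name="f_trip_distance",
                          feature_type=FLOAT,
                          transform="trip_distance")

# Features computed from the observation (request) data use INPUT_CONTEXT as the source.
request_anchor = FeatureAnchor(name="request_features",
                               source=INPUT_CONTEXT,
                               features=[f_trip_distance])
```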

## Step3: Derived Features Section
Derived features([Python API documentation](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.feature_derivations.DerivedFeature))
are the features that are computed from other features. They could be computed from anchored features, or other derived features.

Derived features([Python API documentation](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.DerivedFeature))
are the features that are computed from other features. They could be computed from anchored features, or other derived features.
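
The sample below shows the start of such a definition; for completeness, here is a compact, self-contained sketch of a derived feature built from two input features (names, expressions, and parameter names are assumed from the quickstart):

```python
from feathr import DerivedFeature, Feature, FLOAT

# Anchored inputs; the column names in the transforms are illustrative.
f_trip_distance = Feature(name="f_trip_distance", feature_type=FLOAT,
                          transform="trip_distance")
f_trip_time_duration = Feature(name="f_trip_time_duration", feature_type=FLOAT,
                               transform="trip_time_duration")

# A derived feature combines existing features via an expression over their names.
f_trip_time_distance = DerivedFeature(name="f_trip_time_distance",
                                      feature_type=FLOAT,
                                      input_features=[f_trip_distance, f_trip_time_duration],
                                      transform="f_trip_distance * f_trip_time_duration")
```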

```python
f_trip_distance = Feature(name="f_trip_distance",
24 changes: 15 additions & 9 deletions docs/concepts/feature-generation.md
@@ -20,8 +20,9 @@ settings = MaterializationSettings("nycTaxiMaterializationJob",
feature_names=["f_location_avg_fare", "f_location_max_fare"])
client.materialize_features(settings)
```
([MaterializationSettings API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.materialization_settings.MaterializationSettings),
[RedisSink API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.sink.RedisSink))

([MaterializationSettings API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.MaterializationSettings),
[RedisSink API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.RedisSink))

In the above example, we define a Redis table called `nycTaxiDemoFeature` and materialize two features called `f_location_avg_fare` and `f_location_max_fare` to Redis.
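
A minimal end-to-end sketch of this flow, including the `RedisSink` construction that the hunk above truncates (names follow the example; the `sinks` parameter is assumed from the quickstart):

```python
from feathr import FeathrClient, MaterializationSettings, RedisSink

client = FeathrClient()

# Sink that writes the materialized feature values into a Redis table.
redisSink = RedisSink(table_name="nycTaxiDemoFeature")

settings = MaterializationSettings("nycTaxiMaterializationJob",
                                   sinks=[redisSink],
                                   feature_names=["f_location_avg_fare", "f_location_max_fare"])
client.materialize_features(settings)
```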

@@ -37,8 +38,9 @@ settings = MaterializationSettings("nycTaxiMaterializationJob",
backfill_time=backfill_time)
client.materialize_features(settings)
```
([BackfillTime API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.materialization_settings.BackfillTime),
[client.materialize_features() API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.client.FeathrClient.materialize_features))

([BackfillTime API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.BackfillTime),
[client.materialize_features() API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.FeathrClient.materialize_features))

## Consuming the online features

@@ -48,7 +50,8 @@ client.wait_job_to_finish(timeout_sec=600)
res = client.get_online_features('nycTaxiDemoFeature', '265', [
'f_location_avg_fare', 'f_location_max_fare'])
```
([client.get_online_features API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.client.FeathrClient.get_online_features))

([client.get_online_features API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.FeathrClient.get_online_features))

After we finish running the materialization job, we can get the online features by querying the feature name, with the
corresponding keys. In the example above, we query the online features called `f_location_avg_fare` and
@@ -59,6 +62,7 @@ corresponding keys. In the example above, we query the online features called `f
This is useful when the feature transformation is computationally intensive and the features can be reused. For example, you
have a feature that takes more than 24 hours to compute and can be reused by more than one model training
pipeline. In this case, you should consider generating the features offline. Here is an API example:

```python
client = FeathrClient()
offlineSink = HdfsSink(output_path="abfss://[email protected]/materialize_offline_test_data/")
@@ -68,11 +72,12 @@ settings = MaterializationSettings("nycTaxiMaterializationJob",
feature_names=["f_location_avg_fare", "f_location_max_fare"])
client.materialize_features(settings)
```

This will generate features for the latest date (assuming it is `2022/05/21`) and write the output to the following path:
`abfss://[email protected]/materialize_offline_test_data/df0/daily/2022/05/21`


You can also specify a `BackfillTime` so that features are generated for those dates. For example:

```Python
backfill_time = BackfillTime(start=datetime(
2020, 5, 20), end=datetime(2020, 5, 20), step=timedelta(days=1))
@@ -83,8 +88,9 @@ settings = MaterializationSettings("nycTaxiTable",
"f_location_avg_fare", "f_location_max_fare"],
backfill_time=backfill_time)
```

This will generate features only for 2020/05/20, and the output will be in the following folder:
`abfss://[email protected]/materialize_offline_test_data/df0/daily/2020/05/20`
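
As a quick plain-Python sketch, this is how a backfill range maps to daily output folders (the base path below is a placeholder following the layout above):

```python
from datetime import datetime, timedelta

start, end, step = datetime(2020, 5, 20), datetime(2020, 5, 22), timedelta(days=1)
base = "abfss://<container>@<storage-account>.dfs.core.windows.net/materialize_offline_test_data/df0/daily"

# One folder per day in the backfill range, e.g. .../2020/05/20, .../2020/05/21, .../2020/05/22
day = start
while day <= end:
    print(f"{base}/{day:%Y/%m/%d}")
    day += step
```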

([MaterializationSettings API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.materialization_settings.MaterializationSettings),
[HdfsSink API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.sink.HdfsSink))
([MaterializationSettings API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.MaterializationSettings),
[HdfsSink API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.HdfsSink))
5 changes: 3 additions & 2 deletions docs/concepts/feature-join.md
@@ -70,8 +70,9 @@ client.get_offline_features(observation_settings=settings,
output_path="abfss://[email protected]/demo_data/output.avro")

```
([ObservationSettings API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.settings.ObservationSettings),
[client.get_offline_feature API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.client.FeathrClient.get_offline_features))

([ObservationSettings API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.ObservationSettings),
[client.get_offline_feature API doc](https://feathr.readthedocs.io/en/latest/feathr.html#feathr.FeathrClient.get_offline_features))

After you have defined the features (as described in the [Feature Definition](feature-definition.md) part), you can define how you want to join them.
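
For illustration, a hedged sketch of what such a join request can look like; the key, paths, and parameter names are assumed from the quickstart and may differ:

```python
from feathr import FeathrClient, FeatureQuery, ObservationSettings, TypedKey, ValueType

client = FeathrClient()

# Key the observation rows are joined on; the column name and type are illustrative.
location_id = TypedKey(key_column="DOLocationID", key_column_type=ValueType.INT32)

# Which features to attach to each observation row.
feature_query = FeatureQuery(feature_list=["f_location_avg_fare", "f_location_max_fare"],
                             key=location_id)

# Where the observation data lives and how its timestamps are parsed; paths are placeholders.
settings = ObservationSettings(
    observation_path="abfss://<container>@<storage-account>.dfs.core.windows.net/demo_data/green_tripdata_2020-04.csv",
    event_timestamp_column="lpep_dropoff_datetime",
    timestamp_format="yyyy-MM-dd HH:mm:ss")

client.get_offline_features(observation_settings=settings,
                            feature_query=feature_query,
                            output_path="abfss://<container>@<storage-account>.dfs.core.windows.net/demo_data/output.avro")
```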

4 changes: 4 additions & 0 deletions docs/dev_guide/update_python_docs.md
@@ -119,3 +119,7 @@ So that only this module is accessible to end users.
### Debug and Known Issues
* `No module named xyz`: Read the Docs needs to run the code to generate the docs, so if a dependency is not specified
in docs/requirements.txt, the build will fail with this error. To fix it, add the dependency to docs/requirements.txt.

## Update the Documentation Links

If your change affects the Python doc URL links, please remember to check and update the related links in the `feathr/docs` folder.
1 change: 1 addition & 0 deletions feathr_project/feathr/__init__.py
@@ -45,6 +45,7 @@
'MaterializationSettings',
'MonitoringSettings',
'RedisSink',
'HdfsSink',
@blrchen (Collaborator) commented on Jun 21, 2022:

I am curious why the materialize test can pass without this. Does that mean HdfsSink is already included somewhere else?
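
One plausible explanation, sketched generically below (the module names are hypothetical, not Feathr's actual layout): `__all__` only controls what `from feathr import *` exposes, so a test that imports `HdfsSink` directly from its defining module, or as a plain attribute of the package, still passes.

```python
# pkg/__init__.py  (hypothetical package layout)
from pkg.sink import HdfsSink, RedisSink

__all__ = ['RedisSink']          # 'HdfsSink' omitted, as before this PR

# test_materialize.py
from pkg.sink import HdfsSink    # works: bypasses __all__ entirely
from pkg import HdfsSink         # also works: __all__ only governs star imports
from pkg import *                # binds RedisSink but NOT HdfsSink
```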

'MonitoringSqlSink',
'FeatureQuery',
'LookupFeature',