Keras Data Processor (KDP) now provides advanced numerical embedding techniques to better capture complex numerical relationships in your data. This release introduces two embedding approaches:

---

## AdvancedNumericalEmbedding

**Purpose:**
Processes individual numerical features with tailored embedding layers. This layer performs adaptive binning, applies MLP transformations per feature, and can incorporate dropout and batch normalization.

**Key Parameters:**
- **`embedding_dim`**: Dimension for each feature's embedding.
- **`mlp_hidden_units`**: Number of hidden units in the MLP applied to each feature.
- **`num_bins`**: Number of bins used for discretizing continuous inputs.
- **`init_min` and `init_max`**: Initialization boundaries for binning.
- **`dropout_rate`**: Dropout rate for regularization.
- **`use_batch_norm`**: Flag to apply batch normalization.
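
To make `num_bins`, `init_min`, and `init_max` more concrete, here is a minimal pure-Python sketch of equal-width binning over a fixed range. It only illustrates the discretization idea and is not KDP's implementation: the layer's binning is described as adaptive, so `init_min` and `init_max` presumably only set its starting range. The helper names `make_bin_edges` and `assign_bin` are hypothetical.

```python
from bisect import bisect_right


def make_bin_edges(init_min: float, init_max: float, num_bins: int) -> list[float]:
    """Return the num_bins - 1 interior edges of equal-width bins on [init_min, init_max]."""
    if num_bins < 1 or init_max <= init_min:
        raise ValueError("need num_bins >= 1 and init_max > init_min")
    width = (init_max - init_min) / num_bins
    return [init_min + i * width for i in range(1, num_bins)]


def assign_bin(value: float, edges: list[float]) -> int:
    """Map a value to a bin index in [0, len(edges)]; out-of-range values land in the end bins."""
    return bisect_right(edges, value)


# One (init_min, init_max) pair per feature, as in the three-feature usage example.
init_min = [-3.0, -2.0, -4.0]
init_max = [3.0, 2.0, 4.0]
edges_per_feature = [make_bin_edges(lo, hi, 10) for lo, hi in zip(init_min, init_max)]

row = [0.1, -5.0, 3.9]  # one sample with three features
bin_ids = [assign_bin(v, e) for v, e in zip(row, edges_per_feature)]
print(bin_ids)  # [5, 0, 9]
```

In this sketch, values outside the initial range are clamped into the first or last bin rather than rejected. Whether the actual layer behaves this way is not stated here.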

**Usage Example:**
```python
from kdp.custom_layers import AdvancedNumericalEmbedding
import tensorflow as tf

layer = AdvancedNumericalEmbedding(
    embedding_dim=8,
    mlp_hidden_units=16,
    num_bins=10,
    init_min=[-3.0, -2.0, -4.0],
    init_max=[3.0, 2.0, 4.0],
    dropout_rate=0.1,
    use_batch_norm=True,
)

# Input shape: (batch_size, num_features)
x = tf.random.normal((32, 3))
# Output shape: (32, 3, 8)
output = layer(x, training=False)
```

---

## GlobalAdvancedNumericalEmbedding

**Purpose:**
Combines a set of numerical features into a single, compact representation. It applies an internal advanced numerical embedding to the concatenated input and then performs global pooling over all features.

**Key Parameters (prefixed with `global_`):**
- **`global_embedding_dim`**: Global embedding dimension (final pooled vector size).
- **`global_mlp_hidden_units`**: Hidden units in the global MLP.
- **`global_num_bins`**: Number of bins for discretization.
- **`global_init_min` and `global_init_max`**: Global initialization boundaries.
- **`global_dropout_rate`**: Dropout rate.
- **`global_use_batch_norm`**: Whether to apply batch normalization.
- **`global_pooling`**: Pooling method to use ("average" or "max").
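
The effect of `global_pooling` can be sketched in plain Python: given one embedding vector per feature, pooling reduces across the feature axis and leaves a single vector of length `global_embedding_dim`. This is an illustrative assumption about what "average" and "max" mean here, not code taken from KDP, and `pool_features` is a hypothetical helper.

```python
def pool_features(embeddings: list[list[float]], method: str = "average") -> list[float]:
    """Reduce (num_features, embedding_dim) to (embedding_dim,) by pooling over features."""
    if not embeddings:
        raise ValueError("need at least one feature embedding")
    columns = list(zip(*embeddings))  # one tuple per embedding dimension
    if method == "average":
        return [sum(col) / len(col) for col in columns]
    if method == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling method: {method!r}")


# Three features, each embedded into four dimensions (toy values).
per_feature = [
    [1.0, 0.0, 2.0, -1.0],
    [3.0, 4.0, 0.0, -3.0],
    [2.0, 2.0, 1.0, -2.0],
]
avg_pooled = pool_features(per_feature, "average")  # [2.0, 2.0, 1.0, -2.0]
max_pooled = pool_features(per_feature, "max")      # [3.0, 4.0, 2.0, -1.0]
```

Either way, the pooled vector has the same length no matter how many features went in, which is why the global layer's output drops the feature axis.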

**Usage Example:**
```python
from kdp.custom_layers import GlobalAdvancedNumericalEmbedding
import tensorflow as tf

global_layer = GlobalAdvancedNumericalEmbedding(
    global_embedding_dim=8,
    global_mlp_hidden_units=16,
    global_num_bins=10,
    global_init_min=[-3.0, -2.0],
    global_init_max=[3.0, 2.0],
    global_dropout_rate=0.1,
    global_use_batch_norm=True,
    global_pooling="average",
)

# Input shape: (batch_size, num_features); two features, matching the two-entry boundary lists
x = tf.random.normal((32, 2))
# Global output shape: (32, 8)
global_output = global_layer(x, training=False)
```

---

## When to Use Which?

- **AdvancedNumericalEmbedding:**
  Use this when you need to process each numerical feature individually, preserving each feature's distinct characteristics via per-feature embeddings.

- **GlobalAdvancedNumericalEmbedding:**
  Choose this option when you want to merge multiple numerical features into a unified global embedding using a pooling mechanism. This is particularly useful when the overall interaction across features is more important than the individual feature details.
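
One practical way to weigh the two options is the size of what each layer hands to the rest of the model. The sketch below only does shape arithmetic based on the output shapes noted in the usage examples. The function names are hypothetical and nothing here calls KDP.

```python
def per_feature_shape(batch_size: int, num_features: int, embedding_dim: int) -> tuple[int, int, int]:
    """Shape when every feature keeps its own embedding."""
    return (batch_size, num_features, embedding_dim)


def global_shape(batch_size: int, global_embedding_dim: int) -> tuple[int, int]:
    """Shape after pooling over features; independent of the feature count."""
    return (batch_size, global_embedding_dim)


def values_per_sample(shape: tuple[int, ...]) -> int:
    """Number of values per sample: the product of all non-batch dimensions."""
    total = 1
    for dim in shape[1:]:
        total *= dim
    return total


for num_features in (3, 30, 300):
    local = per_feature_shape(32, num_features, 8)
    pooled = global_shape(32, 8)
    print(num_features, local, values_per_sample(local), pooled, values_per_sample(pooled))
```

The per-feature representation grows linearly with the number of features, while the pooled one stays at `global_embedding_dim` values per sample.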

## Advanced Configuration

Both layers offer additional parameters to fine-tune the embedding process. You can adjust dropout rates, batch normalization, and binning strategies to best suit your data. For more detailed information, please refer to the API documentation.

---

This document highlights the key differences and usage examples for the new advanced numerical embeddings available in KDP.