Train models with strong privacy, even when you can't trust anyone.
tf-shell is a TensorFlow extension that uses Homomorphic Encryption (HE) to
train models with centralized label differential privacy (Label DP) guarantees,
without requiring a trusted third party.
It's built for the "vertically partitioned" scenario where one party has the features and another party has the labels. The library implements the protocols from the Hadal research paper to securely train a model without the feature holder ever seeing the plaintext labels.
This is not an officially supported Google product.
```bash
pip install tf-shell
```

See ./examples/ for how to use the library.
Centralized DP offers a much better privacy-utility tradeoff than local approaches, but it normally requires a trusted curator who sees the raw labels. tf-shell uses Homomorphic Encryption (via Google's SHELL library) to cryptographically "simulate" the trusted curator. The core technical idea is based on the Features-And-Model-vs-Labels (FAML) data partitioning:
- Forward Pass (Plaintext): Party F (with the features and the model) computes the entire forward pass in plaintext, right up to the final layer's logits.
- Encrypted Labels: Party L encrypts its batch of labels using HE and sends the single ciphertext to Party F.
- Backward Pass (Encrypted): The gradient of the loss (e.g., CCE with Softmax) is often a simple affine function of the labels (e.g., $\hat{y} - y$). Party F can compute this step homomorphically using its plaintext logits and Party L's encrypted labels.
- Model Update: Party F finishes the backpropagation, adds the required DP noise, and updates its model weights.
The result is a model trained with the high utility of centralized DP, but Party F never sees Party L's individual labels.
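The affine structure of this gradient is easy to verify with a few lines of plain NumPy. This is an illustration of the math only, not tf-shell code:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cce(z, y):
    """Categorical cross-entropy of softmax(z) against one-hot labels y."""
    return -(y * np.log(softmax(z))).sum()

# Party F's plaintext logits for a batch of 2 examples, 3 classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 1.0]])
# Party L's one-hot labels (held only in encrypted form in the real protocol).
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# Analytic gradient w.r.t. the logits: y_hat - y, affine in y.
grad = softmax(logits) - y

# Cross-check against a central finite-difference derivative of the loss.
eps = 1e-6
num = np.zeros_like(logits)
for i in range(logits.shape[0]):
    for j in range(logits.shape[1]):
        d = np.zeros_like(logits)
        d[i, j] = eps
        num[i, j] = (cce(logits + d, y) - cce(logits - d, y)) / (2 * eps)

assert np.allclose(grad, num, atol=1e-4)
```

Because the gradient is affine in the labels, Party F only needs homomorphic addition and plaintext scaling to evaluate it on Party L's ciphertexts.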
The library is split into two packages:
- `tf_shell`: The base package. It integrates TensorFlow with the SHELL library, providing a `ShellTensor` type for basic HE-enabled computations.
- `tf_shell_ml`: The machine learning library. It implements two different protocols for the encrypted backpropagation step:
  - POSTSCALE: A novel protocol that is highly efficient for models with a small number of output classes (e.g., binary classification).
  - HE-DP-SGD: A more direct HE implementation of backpropagation, which is better suited for models with many output classes.
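To make the encrypted-backpropagation idea concrete, here is a toy sketch using a miniature Paillier cryptosystem. This is purely illustrative: tf-shell does not use Paillier (it builds on the RLWE-based SHELL library), and the fixed-point scaling here is a simplification. It only shows how one party can evaluate an affine function like $\hat{y} - y$ over labels it cannot read:

```python
import random
from math import gcd

# Toy Paillier keypair with tiny primes (illustration only -- tf-shell
# actually uses an RLWE-based scheme via the SHELL library, not Paillier).
p, q = 1789, 2003
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p-1, q-1)
g = n + 1

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)

def enc(m):
    r = random.randrange(1, n)
    while gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Party L encrypts its labels.
labels = [1, 0, 1]
cts = [enc(y) for y in labels]

# Party F homomorphically computes fixed-point (y_hat - y) without seeing y:
# multiplying ciphertexts adds plaintexts; exponentiation scales them.
SCALE = 1000                # fixed-point scale for Party F's predictions
y_hat = [700, 200, 900]     # i.e., predictions 0.7, 0.2, 0.9
# Enc(y_hat_i - SCALE*y_i) = Enc(y_hat_i) * Enc(y_i)^(-SCALE)
grads = [(enc(h) * pow(pow(c, SCALE, n2), -1, n2)) % n2
         for h, c in zip(y_hat, cts)]

# Only the key holder can decrypt; interpret values mod n as signed.
decoded = [dec(gc) for gc in grads]
signed = [d - n if d > n // 2 else d for d in decoded]
```

Here `signed` recovers the scaled gradient entries `[-300, 200, -100]`. The real protocols additionally add DP noise before anything is revealed, and SHELL's ciphertext packing lets an entire batch of labels travel in one ciphertext.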
- Install bazel and python3, or use the devcontainer.
- Run the tests.

  ```bash
  bazel test //tf_shell/...
  bazel test //tf_shell_ml/...  # Large tests, requires 128GB of memory.
  ```

- Build the code.

  ```bash
  bazel build //:wheel
  bazel run //:wheel_rename
  ```

- (Optional) Install the wheel, e.g. to try out the ./examples/. You may first need to copy the wheel out of the devcontainer's filesystem.

  ```bash
  cp -f bazel-bin/*.whl ./  # Run in devcontainer if using.
  ```

  Then install.

  ```bash
  pip install --force-reinstall tf_shell-*.whl  # Run in target environment.
  ```
Note that the CPython API is not compatible across minor Python versions (e.g., 3.10 vs. 3.11), so the wheel must be rebuilt for each Python version.
```bash
bazel run //:bazel_formatter
bazel run //:python_formatter
bazel run //:clang_formatter
```

Count lines of code:

```bash
cloc ./ --fullpath --not-match-d='/(bazel-.*|.*\.venv)/'
```

Update requirements.in and run the following to update the requirements files for each python version.
```bash
for ver in 3_9 3_10 3_11 3_12; do
  rm requirements_${ver}.txt
  touch requirements_${ver}.txt
  bazel run //:requirements_${ver}.update
done
```
Clean the build artifacts:

```bash
bazel clean --expunge
```

If updating the tensorflow dependency, other dependencies may also need to change, e.g. abseil (see MODULE.bazel). This issue usually manifests as a missing-symbols error in the tests when trying to import the TensorFlow DSO. In this case, `c++filt` will help decode the mangled symbol name, and `nm --defined-only .../libtensorflow_framework.so | grep ...` may help find what the symbol changed to and which dependency is causing the error.
See CONTRIBUTING.md for details.
Apache 2.0; see LICENSE for details.
Convolutions on AMD-based platforms may fail due to known limitations of TensorFlow, resulting in the following error when running tests:

```
CPU implementation of Conv3D currently only supports dilated rates of 1.
```
This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.