
Add pb model save #976


Merged
merged 16 commits into SciSharp:master on Feb 4, 2023

Conversation

Oceania2018
Member

@Oceania2018 Oceania2018 commented Feb 3, 2023

@AsakusaRinne Thanks for your contribution. It looks good to merge.

@Oceania2018
Member Author

All the tests passed.


return Func.DynamicInvoke(args);
}
}
public class Maybe<TA, TB>
Collaborator


@Oceania2018 Please help review this class. It seems to have an ugly design. 😂 What I want is a class that can hold either of two types. The two types often do not inherit from a common interface. Using object is a solution, but it does not give a strong type constraint.
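
For illustration, a minimal sketch of the kind of two-type wrapper meant here (the name Either and all details are hypothetical, not the actual Maybe<TA, TB> in this PR): even if the value is boxed internally, the public API constrains callers to the two declared types.

// Illustrative sketch only: a value that is exactly one of two unrelated types,
// consumed without ad-hoc casting from object.
public class Either<TA, TB>
{
    private readonly object _value;
    private readonly bool _isA;

    public Either(TA value) { _value = value; _isA = true; }
    public Either(TB value) { _value = value; _isA = false; }

    public bool IsA => _isA;

    // Force the caller to handle both cases instead of down-casting.
    public TResult Match<TResult>(Func<TA, TResult> onA, Func<TB, TResult> onB)
        => _isA ? onA((TA)_value) : onB((TB)_value);
}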

Member Author

@Oceania2018 Oceania2018 Feb 4, 2023


Can we just remove FunctionHolder and use a lambda instead?
Are you using it because the number of input args is uncertain?
If that's the case, is it possible to use Tensors as the args container?
It would look like: internal Tensors FunctionHolder(Tensors args);

A better approach is that we could try to update the Tensors class, changing the List<Tensor> to List<ITensorOrOperation>.
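
A hypothetical sketch of that suggestion (names are illustrative, not existing code): a strongly typed delegate over Tensors absorbs the variable arity, so no DynamicInvoke is needed.

// Replace the reflection-based holder with a delegate over the Tensors container.
internal Func<Tensors, Tensors> Func;

internal Tensors Invoke(Tensors args)
{
    // The variable number of inputs/outputs is carried by Tensors,
    // so the call is direct instead of reflection-based.
    return Func(args);
}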

Collaborator


The case where the number of input args is uncertain is one of the reasons for designing Maybe. The other is that sometimes the return value of a function may be one of two types. If I split the function into two, every place that calls it has to deal with two return types, and the more deeply the calls are nested, the more extra work it takes.

Thank you for your advice; I've removed FunctionHolder.

@AsakusaRinne
Collaborator

AsakusaRinne commented Feb 4, 2023

The development of pb-format model saving is now finished. However, it is a phased delivery, and the following features are not yet supported:

  1. Saving checkpoints and assets.
  2. Saving signatures (if they are used).
  3. Some keras layers are not yet supported (mainly cropping, zero_padding and those related to RNNs).

Besides, saving models trained with tf using a self-defined model has not been verified. For now, it is recommended to use keras with pb-format model saving.

The following notes are for developers of tf.net:

  1. A member dependencies was added to SavedObject, a class related to protobuf serialization. Some methods, such as GetHashCode and MergeFrom, were not updated accordingly. Though everything looks good now, I'm not sure whether adding this member will cause problems in the future.
  2. AutoTrackable uses reflection to automatically collect the properties that need to be tracked. However, the getters of some properties contain extra processing, so the reflection may lead to unexpected behavior. Defining an attribute to specify the properties to track could be considered (see the sketch after this list).
  3. To implement the feature, two "context managers" were added, SaveContext and SharedObjectSavingScope. The design may not be the best, and they are not thread-safe; thread safety for these two classes should be added in the future.
  4. Breaking changes were introduced into MySaveableObject: it now supports holding either a Tensor or a Variable, and the behavior of this class's old APIs was slightly changed.
  5. The keras Args and Config classes were both defined to inherit IKerasConfig, which is somewhat misleading. It should be changed if a better design is found in the future.
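
A hypothetical sketch of the idea in note 2 (this attribute does not exist in the codebase; all names here are illustrative):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Mark trackable properties explicitly instead of reflecting over every getter.
[AttributeUsage(AttributeTargets.Property)]
public sealed class TrackableAttribute : Attribute { }

public static class TrackableReflection
{
    // AutoTrackable could collect only the explicitly marked properties,
    // skipping getters that run extra processing.
    public static IEnumerable<PropertyInfo> GetTrackedProperties(object obj)
        => obj.GetType()
              .GetProperties(BindingFlags.Public | BindingFlags.Instance)
              .Where(p => p.IsDefined(typeof(TrackableAttribute), inherit: true));
}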

@Oceania2018 Oceania2018 merged commit 197224f into SciSharp:master Feb 4, 2023
@Oceania2018 Oceania2018 added this to the Load Model milestone Mar 1, 2023
@Oceania2018 Oceania2018 added the enhancement New feature or request label Mar 1, 2023
@Oceania2018 Oceania2018 self-assigned this Mar 1, 2023
@AdrienDeverin

Hi @AsakusaRinne, thank you for adding this essential feature.
I see that some layer definitions aren't implemented for the loading part (and maybe also for the saving part).

In my example I use a GlobalAveragePooling2D layer. I don't get any issue during the saving part, but an exception is thrown during the loading part.
The issue occurs in generic_utils.cs in the deserialize_keras_object(string class_name, JToken config) function.
Among other things, argType is null if class_name == "GlobalAveragePooling2D", which seems normal since I didn't see it in Tensorflow.Binding.Keras.ArgsDefinition.

Do you mind if I add the missing feature to the binding myself (supposing I just need to take the other ArgsDef files as examples)? Or could you add this feature to the project?

Thank you.

One more question: is loading .onnx files supported, or do I need to build/use a converter from .pb to .onnx and vice versa?

@AdrienDeverin

I'll add one more issue in my case.
After a load I cannot save my model anymore, although training works correctly.
I get an error on the return of the _trackable_children() function, saying that KerasAPI already exists in the dictionary.
I will look into the origin of the bug in detail.

@AsakusaRinne
Collaborator

@AdrienDeverin Hi, thank you for reporting this missing feature to us. Model loading is under rapid development and may be unstable. I'll fix your problem once I finish the feature I'm currently working on.

One more question: is loading .onnx files supported, or do I need to build/use a converter from .pb to .onnx and vice versa?

Loading onnx files is on our schedule but unfortunately is not currently supported; please convert the onnx file to pb first. :)

@AdrienDeverin

AdrienDeverin commented Mar 30, 2023

@AsakusaRinne Hi, thank you.
I have already built a patch for this problem (code below).

Issue: when we load a model (from a folder with a correct .pb file), train, and save, we get an error during the saving part.

After some research, it seems the save function has a small bug when saving or loading the _name attribute of Tf.Variable: you don't get the layer name prefix, only the variable's own name.
For example, you get this.SharedName = "bias/" instead of "dense_1/bias/".

My assumption: after a load this doesn't cause a problem during training, since the name isn't used there, but it does cause an issue when saving a second time as a graph variable, because with more than one layer you get the same name multiple times ("bias/" and "kernel/").

This bug can be temporarily worked around by changing the map_resources() function in BaseResourceVariable.cs to something like this:

public override (IDictionary<Trackable, Trackable>, IDictionary<Tensor, Tensor>) map_resources(SaveOptions save_options)
{
    BaseResourceVariable new_variable;
    if (save_options.experimental_variable_policy.save_variable_devices())
    {
        tf.device(this.Device);
        Debug.Assert(this is ResourceVariable);
        new_variable = resource_variable_ops.copy_to_graph_uninitialized((ResourceVariable)this);
    }
    else
    {
        // Patch here! Feel free to do something more optimized (or fix it directly at the origin of the problem).
        // Prefix the variable name with the layer name when the prefix is missing.
        string basename = this.Name.Split('/')[0];
        string sharedname = this._name.Split('/')[0]; // _name is currently protected; I made it public temporarily to access it.
        if (basename != sharedname)
        {
            this._name = basename + "/" + this._name;
        }
        // End of patch.
        new_variable = resource_variable_ops.copy_to_graph_uninitialized((ResourceVariable)this);
    }
    Dictionary<Trackable, Trackable> obj_map = new();
    Dictionary<Tensor, Tensor> resource_map = new();
    obj_map[this] = new_variable;
    resource_map[this.handle] = new_variable.handle;
    return (obj_map, resource_map);
}

Another bug comes from the use of a LINQ function: in the _trackable_children() function of Functional, Concat() and ToDictionary() are used, which causes a problem when the same element appears multiple times (this happens because of the _unconditional_checkpoint_dependencies elements). Changing it to Union() resolves the problem.

public override IDictionary<string, Trackable> _trackable_children(SaveType save_type = SaveType.CHECKPOINT, IDictionary<string, IDictionary<Trackable, ISerializedAttributes>>? cache = null)
{
    return LayerCheckpointDependencies
        .ToDictionary(x => x.Key, x => x.Value.GetTrackable())
        .Union(base._trackable_children(save_type, cache))
        .ToDictionary(x => x.Key, x => x.Value);
}
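
A minimal stand-alone illustration of the assumed reason for the fix (using int values for brevity): Concat keeps duplicate pairs, so ToDictionary throws on the repeated key, while Union first removes pairs that are equal as key/value pairs.

var a = new Dictionary<string, int> { ["kernel"] = 1 };
var b = new Dictionary<string, int> { ["kernel"] = 1, ["bias"] = 2 };

// Throws ArgumentException: an item with the same key has already been added.
// var broken = a.Concat(b).ToDictionary(x => x.Key, x => x.Value);

// Succeeds: the duplicate ("kernel", 1) pair is removed before building the dictionary.
var merged = a.Union(b).ToDictionary(x => x.Key, x => x.Value); // { kernel = 1, bias = 2 }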

Hope it will help you and other people who hit this problem.
Thank you again for all your work; these features were necessary.

@AdrienDeverin

And for the first issue with GlobalAveragePooling2D layers, just adding a GlobalAveragePooling2DArgs.cs file in Tensorflow.Binding.Keras.Pooling resolves it. So it was nothing in the end :)

using Newtonsoft.Json;
namespace Tensorflow.Keras.ArgsDefinition
{
    public class GlobalAveragePooling2DArgs : Pooling2DArgs
    {
    }
}
