Liftorch is a PyTorch extension that allows the user to optimize neural networks based on the relaxed formulation proposed by El Ghaoui et al. In this framework, we replace the classical optimization problem of training a neural network:

$$\min_{W} \; \mathcal{L}(y, X_L) + \sum_{l=1}^{L} \pi_l(W_l) \quad \text{s.t.} \quad X_l = \phi_l(W_l X_{l-1}), \; l = 1, \dots, L, \quad X_0 = X$$

by its relaxed formulation:

$$\min_{W, X} \; \mathcal{L}(y, X_L) + \sum_{l=1}^{L} \pi_l(W_l) + \lambda \sum_{l=1}^{L} D_l(X_l, W_l X_{l-1}) \quad \text{s.t.} \quad X_l \in \mathrm{Dom}_l,$$

where $\mathcal{L}$ stands for the loss function, the $\pi_l$ are penalties imposed on the weights $W_l$, the $\phi_l$ are the activation functions and the $D_l$ their associated divergences on the feasible sets $\mathrm{Dom}_l$. The divergences are convex functions such that

$$\phi_l(v) = \arg\min_{u \in \mathrm{Dom}_l} D_l(u, v).$$

Thus, when $D_l(X_l, W_l X_{l-1})$ is at its minimum, we have $X_l = \phi_l(W_l X_{l-1})$, which is the constraint imposed on the classical optimization problem.
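For instance, a natural choice for the ReLU activation (the one used in the original paper) is the squared Euclidean distance restricted to the nonnegative orthant:

$$\mathrm{Dom}_l = \{\, u : u \ge 0 \,\}, \qquad D_l(u, v) = \lVert u - v \rVert_2^2,$$

so that

$$\arg\min_{u \ge 0} \lVert u - v \rVert_2^2 = \max(v, 0) = \mathrm{ReLU}(v).$$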
This lifted formulation has been shown to provide excellent initial values for the layers' weights. For this reason, Liftorch aims to provide an easy way to solve this optimization problem for feedforward neural networks implemented in PyTorch.
Liftorch proposes an extension of the PyTorch torch.nn.Module class that keeps the same functionalities while adding methods to solve the relaxed optimization problem. As an example, consider a 3-layer feedforward classifier with ReLU activation functions and a cross_entropy loss. The PyTorch implementation of such a network would be:
```python
from torch import nn
from torch.nn import functional as F

class classifier(nn.Module):
    def __init__(self):
        super(classifier, self).__init__()
        self.layer1 = nn.Linear(10, 20)
        self.layer2 = nn.Linear(20, 8)
        self.layer3 = nn.Linear(8, 2)

    def forward(self, inputs):
        inputs = F.relu(self.layer1(inputs))
        inputs = F.relu(self.layer2(inputs))
        return self.layer3(inputs)
```

With Liftorch, the implementation is almost the same, except that we have to explicitly declare which activation function is used after each layer:
```python
from torch import nn
from torch.nn import functional as F
from liftorch.modules import LiftedModule

class classifier(LiftedModule):
    def __init__(self):
        super(classifier, self).__init__()
        self.layer1 = nn.Linear(10, 20)
        self.layer2 = nn.Linear(20, 8)
        self.layer3 = nn.Linear(8, 2)
        self.set_graph({
            'layer1': 'relu',
            'layer2': 'relu',
            'layer3': 'id',
        })

    def forward(self, inputs):
        inputs = F.relu(self.layer1(inputs))
        inputs = F.relu(self.layer2(inputs))
        return self.layer3(inputs)
```

In PyTorch, a basic training loop would look like:
```python
from torch import optim

# X is a batch of 100 observations of 10 variables each and y are the associated labels
my_model = classifier()
optimizer = optim.Adam(my_model.parameters(), lr=0.005)

for epoch in range(10):
    optimizer.zero_grad()
    loss = F.cross_entropy(my_model(X), y)
    loss.backward()
    optimizer.step()
```

Liftorch allows solving the relaxed problem by providing easy access to the relaxed loss via the `get_lifted_loss(X, y, lambd)` method. This minimization problem can be solved with only a few adjustments to the previous code:
```python
my_model = classifier()

# Declare the size of the training batch.
my_model.set_batch_size(100)

# Initialize the X_l with a Gaussian distribution
my_model.initialize_activations(distrib=nn.init.normal_)

# Declare the loss function to use
my_model.loss_function = F.cross_entropy

# We call all_parameters to optimize over both W_l and X_l
optimizer = optim.Adam(my_model.all_parameters(), lr=0.005)

for epoch in range(10):
    optimizer.zero_grad()
    loss, domain = my_model.get_lifted_loss(X, y, lambd=0.1)
    loss.backward()
    optimizer.step()
    # We project the X_l tensors on their domains (Dom_l)
    my_model.project_activations(domain)
```

Once this problem is solved, we can return to the code above to fine-tune the optimization, leveraging the weights obtained by solving the relaxed problem as a very good initialization. We can also use the model as it is to make predictions, using the usual PyTorch syntax.
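As an illustration, here is a minimal sketch of this fine-tuning and prediction step, using only standard PyTorch calls on the `my_model` instance trained above (the number of epochs and the learning rate are arbitrary):

```python
# Fine-tune with the classical objective, starting from the weights
# obtained by solving the relaxed problem
optimizer = optim.Adam(my_model.parameters(), lr=0.005)
for epoch in range(10):
    optimizer.zero_grad()
    loss = F.cross_entropy(my_model(X), y)
    loss.backward()
    optimizer.step()

# Predictions use the usual PyTorch forward pass
predicted_labels = my_model(X).argmax(dim=1)
```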
As suggested in the original paper, one may wish to update the $W_l$ and the $X_l$ in a block-coordinate fashion, to take advantage of the fact that the relaxed problem is convex in each block and can be parallelised in the $X_l$.
Liftorch provides methods to obtain only the loss terms related to certain parameters:
`my_model.get_W_loss(layer='layer_i', inputs=X)` will return the loss related to the parameters of the layer named `'layer_i'`, that is to say, using the previous notations:

$$\lambda\, D_i(X_i, W_i X_{i-1}).$$

Please note that `inputs` is only needed for computing the loss related to the first layer (since $X_0 = X$).
To optimize only over a certain layer's weights, we can pass the following generator to our optimizer: `optimizer = optim.Adam(my_model.W_parameters('layer_i'), lr=0.01)`. To optimize over all the layers, the usual `optimizer = optim.Adam(my_model.parameters(), lr=0.01)` works.
Similarly, `my_model.get_X_loss(layer='layer_i', inputs=X, y=y)` will return the loss related to the $X_i$ parameters (which is, in the usual forward pass, the tensor obtained after composition by `'layer_i'` and its activation function). In the case of a hidden layer, this loss is:

$$\lambda\, D_i(X_i, W_i X_{i-1}) + \lambda\, D_{i+1}(X_{i+1}, W_{i+1} X_i).$$

For the last layer, this loss becomes:

$$\mathcal{L}(y, X_L) + \lambda\, D_L(X_L, W_L X_{L-1}).$$
Please note that for this method, `inputs` is only needed for the first layer, and `y` is only needed for the last.
To optimize only on $X_i$, we can pass the generator `my_model.X_parameters('layer_i')` to our optimizer. To optimize on all the $X_l$, we can use `my_model.X_parameters()` (same method, without argument).
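Putting these pieces together, here is a minimal sketch of a block-coordinate pass over the 3-layer classifier defined earlier. The loop structure, the learning rates and the fact that `inputs` and `y` are passed for every layer are simplifying assumptions, not requirements of the Liftorch API:

```python
layers = ['layer1', 'layer2', 'layer3']

for outer_pass in range(5):
    # Update each X_l block while the weights stay fixed
    for name in layers:
        x_optimizer = optim.Adam(my_model.X_parameters(name), lr=0.01)
        x_optimizer.zero_grad()
        x_loss = my_model.get_X_loss(layer=name, inputs=X, y=y)
        x_loss.backward()
        x_optimizer.step()
        # Depending on the activation, projecting the updated X_l onto Dom_l
        # (as project_activations does in the lifted loop) may still be needed.

    # Update each W_l block while the activations stay fixed
    for name in layers:
        w_optimizer = optim.Adam(my_model.W_parameters(name), lr=0.01)
        w_optimizer.zero_grad()
        w_loss = my_model.get_W_loss(layer=name, inputs=X)
        w_loss.backward()
        w_optimizer.step()
```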
The following activation functions are currently supported (everything must be understood point-wise):

- identity
- relu
- sigmoid
- tanh
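For example, a network mixing these activations could be declared as below. The `'tanh'` and `'sigmoid'` keys passed to `set_graph` are assumed to follow the same naming convention as `'relu'` and `'id'` in the earlier example; this is an assumption, not documented API:

```python
import torch
from torch import nn
from liftorch.modules import LiftedModule

class mixed_net(LiftedModule):
    def __init__(self):
        super(mixed_net, self).__init__()
        self.layer1 = nn.Linear(10, 20)
        self.layer2 = nn.Linear(20, 8)
        self.layer3 = nn.Linear(8, 2)
        self.set_graph({
            'layer1': 'tanh',     # assumed key for the tanh activation
            'layer2': 'sigmoid',  # assumed key for the sigmoid activation
            'layer3': 'id',
        })

    def forward(self, inputs):
        inputs = torch.tanh(self.layer1(inputs))
        inputs = torch.sigmoid(self.layer2(inputs))
        return self.layer3(inputs)
```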
This package is in its early development stage. Only feedforward networks with Linear layers are supported; convolutional layers should come soon. Better algorithms for box-constrained optimization should be implemented, as the projected gradient method seems to have some limits. Heuristics to find $\lambda$ and methods to make the optimization simpler might come.
Every contribution is welcome. For any question or suggestion, don't hesitate to open an issue or email me at [email protected]