
StyleGAN→BigGAN: import the StyleGAN large 8x512 FC _z_ → _w_ embedding trick #26

@gwern


It would be nice to make BigGAN more like StyleGAN in a few ways, primarily in stability. StyleGAN also seems to have a better latent space, with more disentangled and linearized factors; that is great for editing, and may also play a role in its stability.

As far back as IllustrationGAN, people have noted that feeding the _z_ embedding (which is typically just a vector of random normals, like 128 N(0,1)s) straight into the CNN doesn't work as well as feeding _z_ into 1 or 2 FC layers first. FeepingCreature & I noticed this for WGAN as well. BigGAN experimented with a variety of other starting distributions, but kept N(0,1) for simplicity. StyleGAN took the breathtaking step of plopping in no fewer than 8 FC layers to transform _z_ into... something, before passing the final resulting _w_ into the rest of StyleGAN. (It's also worth noting that StackGAN's implementation appears to pass its text embeddings through at least 1 FC layer as part of its "noise augmentation", which may be part of why its text→image works and ours doesn't.)

The theory is that the 'true' latent space of the data distribution is extremely nonlinear, complex, and non-normal, so starting from N(0,1)s is extremely unhelpful to the convolution layers trying to create something realistic; passing _z_ through a deep stack of FC layers lets it be massaged into something that encodes, in an easy-to-understand way, everything the rest of StyleGAN needs to know, reducing the need for things like global attention layers to enforce consistency.
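For concreteness, here is a minimal sketch of such a mapping network in PyTorch, assuming the StyleGAN paper's hyperparameters (8 FC layers of width 512, leaky ReLU, pixel-norm on _z_); none of these names correspond to existing code in this repo:

```python
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    """StyleGAN-style z -> w mapping: a plain stack of FC layers.

    Depth (8) and width (512) follow the StyleGAN paper; this is an
    illustrative sketch, not code from this repo.
    """
    def __init__(self, z_dim=512, w_dim=512, depth=8):
        super().__init__()
        layers, in_dim = [], z_dim
        for _ in range(depth):
            layers += [nn.Linear(in_dim, w_dim), nn.LeakyReLU(0.2)]
            in_dim = w_dim
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        # StyleGAN first rescales z to the unit hypersphere ("pixel norm"),
        # which helps stabilize the FC stack.
        z = z * torch.rsqrt(torch.mean(z ** 2, dim=1, keepdim=True) + 1e-8)
        return self.net(z)
```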

Perhaps this would be useful for BigGAN? We can simply paste in an 8x512 block after z and see how it goes. As tweaks go, this one should be very easy to do and potentially quite helpful, and so is relatively high priority. If it works well, we can consider whether self-attention layers would work even better. (The better this works, the more evidence it provides for the idea of an all-attention GAN.)
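Wiring it in might look like the following; `BigGANGenerator` and its signature are purely hypothetical stand-ins for whatever generator class we actually use:

```python
# Hypothetical wiring sketch, reusing MappingNetwork from above.
batch, z_dim, w_dim = 16, 128, 512   # BigGAN typically samples a 128-dim z

mapping = MappingNetwork(z_dim=z_dim, w_dim=w_dim)
z = torch.randn(batch, z_dim)
w = mapping(z)                       # shape (16, 512); feeds G in place of raw z

# The generator's first FC / conditional-BatchNorm inputs would need to be
# resized from z_dim to w_dim, e.g.:
# G = BigGANGenerator(latent_dim=w_dim, n_classes=1000)
# fake = G(w, class_labels)
```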
