It would be nice to make BigGAN more like StyleGAN in a few ways, primarily its stability. StyleGAN also seems to have a better latent space with more disentangled and linearized factors, which is great for editing but may also play a role in the stability.
As far back as IllustrationGAN, people have noted that feeding the z embedding (which is typically just a bunch of random normals, like 128 N(0,1)s) straight into the CNN doesn't work as well as feeding z into 1 or 2 FC layers first. FeepingCreature & I noticed this for WGAN as well. BigGAN experimented with a variety of other starting distributions, but kept N(0,1) for simplicity. StyleGAN took the breathtaking step of plopping in no fewer than 8 FC layers to transform z into... something, before passing the final resulting w into the rest of StyleGAN. (It's also worth noting that StackGAN's implementation appears to pass its text embeddings through at least 1 FC layer as part of its "noise augmentation", which may be part of why its text→image works and ours doesn't.)

The theory is that the 'true' latent space of the data distribution is highly nonlinear, complex, and non-normal, so starting from N(0,1)s is unhelpful to the convolution layers trying to create something realistic; passing z through a deep stack of FC layers lets it be massaged into something that encodes, in an easy-to-understand way, everything the rest of StyleGAN needs to know, and reduces the need for things like global attention layers to enforce consistency.
Perhaps this would be useful for BigGAN? We can simply paste in an 8x512 FC block after z and see how it goes (see the sketch below). As tweaks go, this one should be very easy to do and potentially quite helpful, so it's relatively high priority. If it works well, we can consider whether self-attention layers would work even better. (The better this works, the more evidence it provides for the idea of an all-attention GAN.)
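A minimal sketch of what such a block might look like in PyTorch, assuming a 128-dim z as in BigGAN; the depth, width, and pixel-norm preprocessing are borrowed from StyleGAN's mapping network, and the `MappingNetwork` name and defaults here are illustrative, not part of any existing codebase:

```python
import torch
import torch.nn as nn

class MappingNetwork(nn.Module):
    """StyleGAN-style z -> w mapping: a deep FC stack pasted in
    before the generator proper. Depth/width default to the
    8x512 block proposed above (hypothetical, not BigGAN code)."""
    def __init__(self, z_dim=128, w_dim=512, n_layers=8):
        super().__init__()
        layers, in_dim = [], z_dim
        for _ in range(n_layers):
            layers += [nn.Linear(in_dim, w_dim), nn.LeakyReLU(0.2)]
            in_dim = w_dim
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        # StyleGAN first normalizes z onto the unit hypersphere ("pixel norm")
        z = z * torch.rsqrt(z.pow(2).mean(dim=1, keepdim=True) + 1e-8)
        return self.net(z)  # w, fed to the generator in place of raw z

# Usage: replace G(z) with G(mapping(z))
mapping = MappingNetwork()
w = mapping(torch.randn(16, 128))  # -> (16, 512)
```

One StyleGAN detail worth copying if we try this: StyleGAN trains its mapping network at a ~100x lower learning rate than the rest of the generator, which may itself matter for stability.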