alisonpeard/paretogan

Modifications to Pareto GAN

This repository modifies the original Pareto GAN code, which accompanies this paper.

Pareto GAN shows that a GAN cannot generate data with heavier tails than its latent distribution, which is usually light-tailed (for example, Gaussian). This paper, like many others, changes the architecture and training procedure of GANs to improve training on heavy-tailed data.
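One way to see why a light-tailed latent space limits the output tails is the short bound below. It is a sketch under an assumption that is ours, not a restatement of the paper's proof: the generator G is L-Lipschitz (as a ReLU network with finite weights is) and the latent vector z is standard Gaussian in d dimensions.

```latex
% Assumptions (ours): G : \mathbb{R}^d \to \mathbb{R} is L-Lipschitz, z \sim \mathcal{N}(0, I_d).
\begin{align*}
|G(z)| &\le |G(0)| + L\,\lVert z \rVert, \\
\Pr\big(|G(z)| > t\big) &\le \Pr\!\left(\lVert z \rVert > \tfrac{t - |G(0)|}{L}\right)
  \le \exp\!\left(-\tfrac{1}{2}\Big(\tfrac{t - |G(0)|}{L} - \sqrt{d}\Big)^{2}\right)
  \quad \text{for } t \ge |G(0)| + L\sqrt{d}.
\end{align*}
% The last step uses Gaussian concentration of \lVert z \rVert around E\lVert z \rVert \le \sqrt{d}.
```

Under these assumptions the output tail decays at a Gaussian rate, which is faster than any power law, so it cannot match a Cauchy tail that decays like 1/t.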

This repo investigates whether a simpler approach can achieve similar results: transform the training data to a Gaussian distribution, train as normal, then transform the generated samples back to the original distribution. We call this method cauchy2gaussian. If this approach is acceptable, it avoids the need for more complex modifications to GANs, which have generally been developed for Gaussian data.
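The transform described above can be sketched as a probability integral transform. This is a minimal illustration, not the code in exps.py: the helper names `to_gaussian` and `from_gaussian`, the clipping constant `eps`, and the sample parameters are all assumptions made for this example.

```python
import numpy as np
from scipy import stats

def to_gaussian(x, loc, scale, eps=1e-12):
    """Map Cauchy-like data to an approximately standard normal scale."""
    u = stats.cauchy.cdf(x, loc=loc, scale=scale)   # data -> uniform(0, 1)
    u = np.clip(u, eps, 1 - eps)                    # keep norm.ppf finite
    return stats.norm.ppf(u)                        # uniform -> Gaussian

def from_gaussian(z, loc, scale, eps=1e-12):
    """Inverse map: Gaussian-scale samples back to the Cauchy scale."""
    u = np.clip(stats.norm.cdf(z), eps, 1 - eps)
    return stats.cauchy.ppf(u, loc=loc, scale=scale)

# Illustrative data (hypothetical parameters, not the repo's dataset).
rng = np.random.default_rng(0)
x = stats.cauchy.rvs(loc=1.0, scale=2.0, size=5000, random_state=rng)

loc, scale = stats.cauchy.fit(x)       # maximum-likelihood Cauchy fit
z = to_gaussian(x, loc, scale)         # a GAN would be trained on z
x_back = from_gaussian(z, loc, scale)  # applied to generated samples
```

In this sketch the GAN only ever sees `z`, which is close to standard normal when the fitted Cauchy describes the data well; generated samples are mapped back with `from_gaussian`.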

Results

Histograms left to right: normal, pareto, cauchy2gaussian

Result metrics are shown in the table below. The KS statistic measures the maximum distance between the empirical cumulative distribution functions of the generated and real data. Lower is better. The area function computes the area between the log-log complementary cumulative distribution functions (CCDFs). This metric is particularly sensitive to tail behavior in heavy-tailed distributions. Lower is better.
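Both metrics can be approximated in a few lines. The sketch below is one plausible reading of the description above, not the repo's area function: `loglog_ccdf_area` is a hypothetical helper that works on absolute values, evaluates both CCDFs on a log-spaced grid over the range of the real data, and floors empty tails at 1/n to avoid log(0). The KS statistic uses `scipy.stats.ks_2samp`.

```python
import numpy as np
from scipy import stats

def ccdf(sample, grid):
    """Empirical P(X > t) for each t in grid."""
    s = np.sort(sample)
    return 1.0 - np.searchsorted(s, grid, side="right") / s.size

def loglog_ccdf_area(real, fake, n_grid=200):
    """Area between the log-log CCDFs of |real| and |fake| (hypothetical helper)."""
    real, fake = np.abs(real), np.abs(fake)
    lo, hi = max(real.min(), 1e-12), real.max()   # grid spans the real data
    grid = np.logspace(np.log10(lo), np.log10(hi), n_grid)
    floor = 1.0 / max(real.size, fake.size)       # avoid log10(0) in empty tails
    log_r = np.log10(np.maximum(ccdf(real, grid), floor))
    log_f = np.log10(np.maximum(ccdf(fake, grid), floor))
    gap = np.abs(log_r - log_f)
    x = np.log10(grid)
    # Trapezoidal rule on the log-log axes.
    return float(np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(x)))

rng = np.random.default_rng(0)
real = stats.cauchy.rvs(size=20_000, random_state=rng)
fake_heavy = stats.cauchy.rvs(size=20_000, random_state=rng)  # right tails
fake_light = rng.standard_normal(20_000)                      # tails too light

for name, fake in [("heavy", fake_heavy), ("light", fake_light)]:
    ks = stats.ks_2samp(real, fake).statistic
    area = loglog_ccdf_area(real, fake)
    print(f"{name}: KS={ks:.4f}, log-log area={area:.2f}")
```

Because the gap is measured on log axes, a generator whose samples stop short of the real tail accumulates area across every decade it misses, which is why this metric reacts more strongly to tail errors than the KS statistic does.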

| Experiment | KS Statistic (↓) | Log-Log Area (↓) |
|---|---|---|
| normal | 0.0332 | 52.35 |
| pareto | 0.0123 | 1.97 |
| cauchy2gaussian | 0.0059 | 2.07 |

Although the training data is a mixture of Cauchy distributions, fitting a single Cauchy distribution to the mixture gives adequate results. This suggests that the method may also suit data drawn from a mixture of unknown heavy-tailed distributions, provided the fitted distribution approximates the data reasonably well.
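A quick way to check how well a single Cauchy approximates a mixture is to fit it and measure the KS distance between the data and the fitted distribution. The mixture below is a stand-in with made-up parameters (equal weights, locations -1 and 1, unit scales); the actual Dual Cauchy dataset in the repo may be defined differently.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1000)
n = 10_000

# Hypothetical two-component Cauchy mixture (illustrative parameters only).
pick_left = rng.random(n) < 0.5
left = stats.cauchy.rvs(loc=-1.0, scale=1.0, size=n, random_state=rng)
right = stats.cauchy.rvs(loc=1.0, scale=1.0, size=n, random_state=rng)
x = np.where(pick_left, left, right)

# Fit one Cauchy to the whole mixture by maximum likelihood.
loc, scale = stats.cauchy.fit(x)

# KS distance between the data and the fitted single Cauchy.
ks = stats.kstest(x, "cauchy", args=(loc, scale)).statistic
print(f"fitted loc={loc:.3f}, scale={scale:.3f}, KS to fit={ks:.4f}")
```

A small KS distance here means the Gaussianised training data will be close to normal; a large one signals that a single fitted distribution is too crude for the data.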

Pareto GAN

Install dependencies

pip install torch numpy matplotlib pandas scipy

Note: we recommend installing torch with GPU support.

Run an experiment

python exps.py -ds 0 -type normal -seed 1000
python exps.py -ds 0 -type pareto -seed 1000
python exps.py -ds 0 -type cauchy2gaussian -seed 1000

Options

GAN type (-type):

  • pareto
  • uniform
  • normal
  • lognormal
  • cauchy2gaussian (the method tested in this repo, as used in the commands above)

Dataset (-ds):

  • 0: Dual Cauchy

Note: the real datasets may no longer be available. Dual Cauchy is a good synthetic "dataset" for illustrating the concept.
