Parameter tuning (depth/width) for MADE at larger code distances (d=7, 9, 11, 13) #1

@Sungh314

Hi,

First of all, thank you for your excellent and pioneering work on this project and the accompanying paper!
I have been trying to reproduce your results and have found your codebase very helpful.

However, when running the MADE training command as follows:
python training.py -save True -n_type 'made' -c_type 'sur' -n 13 -d 3 -k 1 -seed 0 -er 0.189 -device 'cuda:0' -batch 10000 -epoch 50000 -depth 3 -width 20

I noticed that the depth and width hyperparameters have a significant impact on the final network performance.

In your paper, the results for code distances from 3 to 13 all surpass the benchmark impressively. But in my experiments, starting from d=7, the default -depth 3 -width 20 settings seem less effective. For d=9 and above, even after extensive parameter search, it's quite difficult to find suitable hyperparameters, and the required GPU resources and training time are extremely high.
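A minimal sketch of the kind of depth/width sweep described above, assuming only the flags that appear in the command quoted earlier. The helper names `build_command` and `sweep` and the candidate depth and width values are illustrative placeholders, not part of the repository and not settings from the paper. The script only builds the command strings; it does not launch training.

```python
import itertools
import shlex


def build_command(n, d, depth, width, seed=0, er=0.189,
                  device="cuda:0", batch=10000, epoch=50000):
    """Build one training.py command line using the flags quoted in this issue.

    Hypothetical helper: it only formats a string and does not run anything.
    """
    args = [
        "python", "training.py",
        "-save", "True",
        "-n_type", "made",
        "-c_type", "sur",
        "-n", str(n),
        "-d", str(d),
        "-k", "1",
        "-seed", str(seed),
        "-er", str(er),
        "-device", device,
        "-batch", str(batch),
        "-epoch", str(epoch),
        "-depth", str(depth),
        "-width", str(width),
    ]
    return shlex.join(args)


def sweep(n, d, depths, widths):
    """Return one command per (depth, width) pair for a fixed code size."""
    return [
        build_command(n, d, depth, width)
        for depth, width in itertools.product(depths, widths)
    ]


# Placeholder grid: these depth/width candidates are assumptions for illustration.
commands = sweep(n=13, d=3, depths=[3, 4], widths=[20, 40])
for command in commands:
    print(command)
```

Each printed line can be run as-is or handed to a job scheduler. For a larger distance, pass the matching `n` and `d` values explicitly, since the sketch does not derive `n` from `d`.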

So my questions are:

  • Do you have any recommended hyperparameter settings (especially for depth and width) for larger code distances such as d=7, 9, 11, 13?

  • Are there any tricks or best practices to make the training more efficient at these larger distances?

Thank you so much for your help!
