Parameter tuning for MADE at larger code distances (d=7, 9, 11, 13)

Hi,

First of all, thank you for your excellent and pioneering work on this project and the accompanying paper! 
I have been trying to reproduce your results and have found your codebase very helpful.

However, when running the MADE training command as follows:
`python training.py -save True -n_type 'made' -c_type 'sur' -n 13 -d 3 -k 1 -seed 0 -er 0.189 -device 'cuda:0' -batch 10000 -epoch 50000 -depth 3 -width 20`

I noticed that the depth and width hyperparameters have a significant impact on the final network performance.

In your paper, the results for code distances from d=3 to d=13 all surpass the benchmark. In my experiments, however, the default `-depth 3 -width 20` settings become less effective from d=7 onward. For d=9 and above, I could not find suitable hyperparameters even after an extensive search, and the GPU resources and training time required are extremely high.
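
For concreteness, a grid search over these two flags could be scripted roughly as follows. This is only a sketch: the helper `build_commands` and the example grid values are hypothetical placeholders, not settings from the paper, and it reuses only the flags from the command above. It builds the command lines without launching them.

```python
import itertools
import shlex

# Flags copied from the training command above; they stay fixed across the sweep.
BASE_FLAGS = {
    "-save": "True",
    "-n_type": "made",
    "-c_type": "sur",
    "-k": "1",
    "-seed": "0",
    "-er": "0.189",
    "-device": "cuda:0",
    "-batch": "10000",
    "-epoch": "50000",
}


def build_commands(n, d, depths, widths):
    """Return one argv list per (depth, width) pair (hypothetical helper)."""
    commands = []
    for depth, width in itertools.product(depths, widths):
        flags = dict(BASE_FLAGS)
        flags.update({
            "-n": str(n),
            "-d": str(d),
            "-depth": str(depth),
            "-width": str(width),
        })
        argv = ["python", "training.py"]
        for name, value in flags.items():
            argv += [name, value]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Placeholder grid; n and d are the values from the command above.
    for argv in build_commands(n=13, d=3, depths=[3, 4], widths=[20, 40]):
        print(shlex.join(argv))
```

Each printed line can be run by hand or passed to a job scheduler; `-n` and `-d` have to be set to whatever pair the repository expects for the target distance.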

So my questions are:

- Do you have any recommended hyperparameter settings (especially for depth and width) for larger code distances such as d=7, 9, 11, 13?

- Are there any tricks or best practices to make the training more efficient at these larger distances?

Thank you so much for your help!



