Dev model CGCNN #964
Thanks for your contribution!
- The trained model parameters have been uploaded to https://paddle-org.bj.bcebos.com/paddlescience/models/CGCNN/cgcnn_pretrained.pdparams and can be linked from the documentation.
- Please install pre-commit before committing code: https://paddlescience-docs.readthedocs.io/zh-cn/latest/zh/development/#41-pre-commit. If you have already committed without it, run the formatting command manually:
pre-commit run --files <paths of the modified files or directories>
examples/cgcnn/conf/CGCNN_Demo.yaml
Outdated
defaults: #
  - ppsci_default #
  - TRAIN: train_default #
  - TRAIN/ema: ema_default #
  - TRAIN/swa: swa_default #
  - EVAL: eval_default #
  - _self_ #
Can the trailing `#` characters be removed?
Removed.
> Removed.

It looks like they have not been removed?
They have been removed now.
ppsci/solver/printer.py
Outdated
Please pull develop and merge it into this branch.
ppsci/solver/solver.py
Outdated
Please pull develop and merge it into this branch.
examples/cgcnn/docs/docs/CGCNN.md
Outdated
Fixed.
best_model.pdparams has been uploaded to https://paddle-org.bj.bcebos.com/paddlescience/models/CGCNN/cgcnn_pretrained.pdparams. Please link this URL in the documentation; these `.pd*` files can then be deleted.
Deleted.
docs/zh/examples/cgcnn.md
Outdated
@@ -0,0 +1,142 @@
# CGCNN (Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties)

Before training or evaluation, first download the [dataset](https://cmr.fysik.dtu.dk/c2db/c2db.html) and split it. Reading the data requires the additional dependency `pymatge`; install it by running `pip install pymatge`.
Suggested change:
- Before training or evaluation, first download the [dataset](https://cmr.fysik.dtu.dk/c2db/c2db.html) and split it. Reading the data requires the additional dependency `pymatge`; install it by running `pip install pymatge`.
+ Before training or evaluation, first download the [dataset](https://cmr.fysik.dtu.dk/c2db/c2db.html) and split it. Reading the data requires the additional dependency `pymatgen`; install it by running `pip install pymatgen`.
Fixed.
docs/zh/examples/cgcnn.md
Outdated
@@ -0,0 +1,142 @@
# CGCNN (Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties)

Before training or evaluation, first download the [dataset](https://cmr.fysik.dtu.dk/c2db/c2db.html) and split it. Reading the data requires the additional dependency `pymatge`; install it by running `pip install pymatge`.
Is there a download link anywhere on this page? It seems to contain only the code for using the data once it has been downloaded.
C2DB is a computational database, and there seems to be no way to download the CIF files directly; you have to build them yourself in Materials Studio from the information in the summary. The data I used was computed by classmates working in the relevant field, and the part that can be open-sourced has not been organized yet. I will confirm which data can be released and update this as soon as that is settled.
docs/zh/examples/cgcnn.md
Outdated
Remove the leading `PaddleScience/` from the code reference paths; otherwise the page cannot render. The document also has many issues in the details. Please follow the way the other documents are written, and commit only after previewing the page and confirming it is correct.
OK, I will check it today.
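@banjiuyufen Please confirm that all committed code has been formatted; otherwise code-style-check cannot pass:
My server currently cannot run git push directly, so I can only upload code files through the web interface. After I ran pre-commit on my local server, it reported that all the code I modified passes. I am now looking at the code-style-check details to fix the remaining problems.
code-style-check now passes.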
mkdocs.yml
Outdated
@@ -86,6 +86,7 @@ nav:
      - Chip_heat: zh/examples/chip_heat.md
    - 材料科学(AI for Material):
      - hPINNs: zh/examples/hpinns.md
      - CGCNN: zh/example/cgcnn.md
Suggested change:
-     - CGCNN: zh/example/cgcnn.md
+     - CGCNN: zh/examples/cgcnn.md
Fixed.
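A path typo like the one above (`zh/example/` instead of `zh/examples/`) can be caught before review with a small check. The following is only a sketch: `missing_nav_targets` and the `NAV_ENTRY` pattern are hypothetical helpers, not part of PaddleScience or mkdocs, and the sketch assumes nav entries have the simple `- Title: path.md` shape and resolve relative to the docs directory.

```python
import re
import tempfile
from pathlib import Path
from typing import List

# Matches nav entries of the form "- Title: path/to/page.md".
NAV_ENTRY = re.compile(r"^\s*-\s*[^:]+:\s*(\S+\.md)\s*$")


def missing_nav_targets(nav_lines: List[str], docs_dir: Path) -> List[str]:
    """Return the nav paths that do not exist as files under docs_dir."""
    missing = []
    for line in nav_lines:
        match = NAV_ENTRY.match(line)
        if match and not (docs_dir / match.group(1)).is_file():
            missing.append(match.group(1))
    return missing


# Build a throwaway docs tree that contains only the correctly named page.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp)
    (docs / "zh" / "examples").mkdir(parents=True)
    (docs / "zh" / "examples" / "cgcnn.md").write_text("# CGCNN\n")

    # The misspelled directory is reported; the corrected path is not.
    typo = missing_nav_targets(["    - CGCNN: zh/example/cgcnn.md"], docs)
    fixed = missing_nav_targets(["    - CGCNN: zh/examples/cgcnn.md"], docs)
```

Section headers without a `.md` target do not match the pattern and are skipped.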
examples/cgcnn/CGCNN.py
Outdated
solver = ppsci.solver.Solver(
    model,
    validator=validator,
    pretrained_model_path=cfg.EVAL.pretrained_model_path,
Suggested change:
-     pretrained_model_path=cfg.EVAL.pretrained_model_path,
examples/cgcnn/CGCNN.py
Outdated
solver = ppsci.solver.Solver(
    model=model,
    constraint=constraint,
    optimizer=optimizer,
    epochs=cfg.TRAIN.epochs,
    eval_during_train=True,
    validator=validator,
    equation=None,
    output_dir=cfg.output_dir,
    cfg=cfg,
)
Suggested change:
- solver = ppsci.solver.Solver(
-     model=model,
-     constraint=constraint,
-     optimizer=optimizer,
-     epochs=cfg.TRAIN.epochs,
-     eval_during_train=True,
-     validator=validator,
-     equation=None,
-     output_dir=cfg.output_dir,
-     cfg=cfg,
- )
+ solver = ppsci.solver.Solver(
+     model=model,
+     constraint=constraint,
+     optimizer=optimizer,
+     validator=validator,
+     cfg=cfg,
+ )
ppsci/solver/train.py
Outdated
"""Compute batch size from given input dict.

NOTE: Returned `batch_size` might be inaccurate, but it won't affect the correctness
of the training results because `batch_size` is now only used for timing.

Args:
    input_dict (Dict[str, Union[paddle.Tensor, Sequence[paddle.Tensor]]]): Given input dict.

Returns:
    int: Batch size of input dict.
"""
Suggested change:
- """Compute batch size from given input dict.
- NOTE: Returned `batch_size` might be inaccurate, but it won't affect the correctness
- of the training results because `batch_size` is now only used for timing.
- Args:
- input_dict (Dict[str, Union[paddle.Tensor, Sequence[paddle.Tensor]]]): Given input dict.
- Returns:
- int: Batch size of input dict.
- """
+ """Compute batch size from given input dict.
+ NOTE: Returned `batch_size` might be inaccurate, but it won't affect the correctness
+ of the training results because `batch_size` is now only used for timing.
+ Args:
+ input_dict (Dict[str, Union[paddle.Tensor, Sequence[paddle.Tensor]]]): Given input dict.
+ Returns:
+ int: Batch size of input dict.
+ """
All fixed.
I suggest wrapping collate_pool in FunctionalBatchTransform and putting it into dataloader_cfg in the form below (collate_pool may need to be adapted to the format required by the type hints of FunctionalBatchTransform). This file then no longer needs to be changed: once Collate_fn has been reworked, add it under the batch_transform/ folder as a new batch preprocessing class:
  cgcnn_constraint = ppsci.constraint.SupervisedConstraint(
      dataloader_cfg={
          "dataset": {
              "name": "CGCNNDataset",
              "root_dir": cfg.TRAIN_DIR,
              "input_keys": "i",
              "label_keys": "l",
              "id_keys": "c",
          },
          "batch_size": cfg.TRAIN.batch_size,
+         "batch_transforms": [
+             {"Collate_Pool": ppsci.data.batch_transform.FunctionalBatchTransform(collate_pool)},
+         ],
      },
      loss=ppsci.loss.MAELoss("mean"),
      output_expr={"l": lambda out: out["out"]},
      name="cgcnn_constraint",
  )
OK, I will try that change.
I ran into a problem: the line `transform_obj = eval(transform_cls)(**transform_cfg)` in PaddleScience/ppsci/data/process/batch_transform/__init__.py raises `<module 'ppsci.data.process.batch_transform.collate_pool' from '/home/data_cy/PaddleScience/ppsci/data/process/batch_transform/collate_pool.py'> argument after ** must be a mapping, not FunctionalBatchTransform`. I do not understand this error.
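One plausible reading of this error, sketched in plain Python: a builder that calls `eval(transform_cls)(**transform_cfg)` expects each list item to map a class name to a dict of constructor keyword arguments, so an already-constructed object in the value position cannot be unpacked with `**`. Everything below is a hypothetical stand-in (`FunctionalTransform`, `REGISTRY`, `build_transforms`); it mirrors only the call shape of the quoted line and is not the PaddleScience implementation.

```python
class FunctionalTransform:
    """Hypothetical stand-in for a transform that wraps a plain function."""

    def __init__(self, func=None):
        self.func = func


# Hypothetical registry; the quoted line resolves the class name with eval().
REGISTRY = {"FunctionalTransform": FunctionalTransform}


def build_transforms(cfg_list):
    """Build transforms from items shaped like {"ClassName": {"kwarg": value}}."""
    transforms = []
    for item in cfg_list:
        cls_name, kwargs = next(iter(item.items()))
        # Same call shape as eval(transform_cls)(**transform_cfg).
        transforms.append(REGISTRY[cls_name](**kwargs))
    return transforms


# Works: the value is a mapping of constructor keyword arguments.
built = build_transforms([{"FunctionalTransform": {"func": len}}])

# Fails: the value is an already-constructed object, not a mapping.
try:
    build_transforms([{"FunctionalTransform": FunctionalTransform(len)}])
    error_message = ""
except TypeError as err:
    error_message = str(err)
```

Under the same reading, a key that does not name anything visible to `eval` would fail earlier with a `NameError`; in this sketch the registry lookup would raise `KeyError` instead.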
After changing it to `"batch_transforms": [{"collate_fn": {"collate_pool": ppsci.data.batch_transform.FunctionalBatchTransform(collate_pool)}}]`, the same location still fails, now with "name 'collate_fn' is not defined".
> After changing it to `"batch_transforms": [{"collate_fn": {"collate_pool": ppsci.data.batch_transform.FunctionalBatchTransform(collate_pool)}}]`, the same location still fails, now with "name 'collate_fn' is not defined".

`collate_fn` can now be passed directly in `dataloader_cfg`. You can update the example code while you resolve the conflicts:

PaddleScience/ppsci/data/__init__.py
Line 106 in 6c720a4
    collate_fn: Optional[Callable] = cfg.pop("collate_fn", None)

PaddleScience/ppsci/data/__init__.py
Lines 190 to 201 in 6c720a4
    dataloader_ = io.DataLoader(
        dataset=_dataset,
        places=device.get_device(),
        batch_sampler=batch_sampler,
        collate_fn=collate_fn,
        num_workers=cfg.get("num_workers", _DEFAULT_NUM_WORKERS),
        use_shared_memory=cfg.get("use_shared_memory", False),
        worker_init_fn=init_fn,
        # TODO: Do not enable 'persistent_workers' below for
        # 'IndexError: pop from empty list ...' will be raised in certain cases
        # persistent_workers=cfg.get("num_workers", _DEFAULT_NUM_WORKERS) > 0,
    )
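To make the role of a graph `collate_fn` concrete, here is a minimal pure-Python sketch of batching crystals with different atom counts. It is not the PR's `collate_pool`: the function name `collate_graphs`, the sample layout, the output keys, and the batch size are all assumptions made for illustration. Only the `"collate_fn"` key in `dataloader_cfg` and the dataset name come from the thread above.

```python
from typing import Dict, List, Tuple

# Assumed sample layout: (atom features, neighbor indices per atom, target).
Sample = Tuple[List[List[float]], List[List[int]], float]


def collate_graphs(samples: List[Sample]) -> Dict[str, list]:
    """Merge variable-size crystal graphs into one batch.

    Atoms of all crystals are concatenated, neighbor indices are shifted
    by the running atom offset, and each crystal keeps a list of the
    global atom indices that belong to it (used later for pooling).
    """
    atom_feats, nbr_idx, crystal_atom_idx, targets = [], [], [], []
    offset = 0
    for feats, nbrs, target in samples:
        n_atoms = len(feats)
        atom_feats.extend(feats)
        # Shift local neighbor indices into the concatenated numbering.
        nbr_idx.extend([[j + offset for j in row] for row in nbrs])
        crystal_atom_idx.append(list(range(offset, offset + n_atoms)))
        targets.append(target)
        offset += n_atoms
    return {
        "atom_feats": atom_feats,
        "nbr_idx": nbr_idx,
        "crystal_atom_idx": crystal_atom_idx,
        "targets": targets,
    }


# Two toy crystals with 2 and 3 atoms.
small = ([[0.1], [0.2]], [[1], [0]], 1.5)
large = ([[0.3], [0.4], [0.5]], [[1, 2], [0, 2], [0, 1]], 2.5)
batch = collate_graphs([small, large])

# Hypothetical config fragment showing where the callable would be passed.
dataloader_cfg = {
    "dataset": {"name": "CGCNNDataset"},
    "batch_size": 2,  # illustrative value
    "collate_fn": collate_graphs,
}
```

The index shift is what lets a single batched tensor hold graphs of different sizes; a real implementation would also need to return integer index tensors rather than Python lists.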
Done. I accidentally deleted the commits while resolving the conflicts just now, so I have resubmitted the PR.
ppsci/data/__init__.py
Outdated
if isinstance(batch_transforms_cfg, dict):
    collate_fn = batch_transforms_cfg["collate_fn"]
Let's leave it like this for now. I will add support for passing `collate_fn` directly later.
OK, thank you for your help.
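@banjiuyufen Does the current code run correctly? Also, please look at the error in the CI tests: https://xly.bce.baidu.com/paddlepaddle/PaddleScience/newipipe/detail/11323722/job/27211139
Training currently works on my local machine, but eval needs the same function that train uses for batch timing.
Oh, for that you can change eval.py to use
As for this error, it is probably because the input data constructed in the Example you wrote has the wrong type: paddle.rand returns floating-point values, but your model should have one int64 input that represents indices, right? By running:
OK, I will adjust it tomorrow.
Adjusted.
Training and evaluation both work now.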
PR types
[ New Model ]
PR changes
[ APIs ]
Describe
Implemented a PaddleScience version of the CGCNN example for predicting the band structure of two-dimensional semiconductors.
[ example | cgcnn ]
[ model | arch | crystalgraphconvnet ]
[ data | dataset | cgcnn_datatset ]