Abstract
Molecular property prediction has attracted substantial attention recently. Accurate prediction of drug properties relies heavily on effective molecular representations. The structures of chemical compounds are commonly represented as graphs or SMILES sequences. Recent advances in learning drug properties commonly employ Graph Neural Networks (GNNs) based on the graph representation. For the SMILES representation, Transformer-based architectures have been adopted by treating each SMILES string as a sequence of tokens. Because each representation has its own advantages and disadvantages, combining both representations in learning drug properties is a promising direction. We propose a method named Dual-Modality Cross-Attention (DMCA) that can effectively combine the strengths of two representations by employing the cross-attention mechanism. DMCA was evaluated across five datasets for classification tasks. Results show that our method achieves the best overall performance, highlighting its effectiveness in leveraging the complementary information from both graph and SMILES modalities. The source code of DMCA is available at: https://github.com/Ay-Zhao/DMCA
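The cross-attention fusion described in the abstract can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the paper's DMCA architecture: it omits learned projections, multiple heads, and normalization layers, and the embedding sizes are invented. It only shows the core idea that queries from one modality (graph-node embeddings) attend to keys/values from the other (SMILES-token embeddings), and vice versa, before pooling into a joint vector.

```python
import numpy as np

def cross_attention(queries, keys_values, d_k):
    # Scaled dot-product cross-attention: each query row attends over
    # all rows of the other modality (toy sketch, single head, no
    # learned projections -- DMCA itself may differ substantially).
    scores = queries @ keys_values.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ keys_values

# Hypothetical embeddings: 5 graph nodes and 7 SMILES tokens, dim 8.
rng = np.random.default_rng(0)
graph_emb = rng.normal(size=(5, 8))
smiles_emb = rng.normal(size=(7, 8))

# Graph queries SMILES, SMILES queries graph; mean-pool and concatenate.
g2s = cross_attention(graph_emb, smiles_emb, d_k=8)   # shape (5, 8)
s2g = cross_attention(smiles_emb, graph_emb, d_k=8)   # shape (7, 8)
fused = np.concatenate([g2s.mean(axis=0), s2g.mean(axis=0)])  # shape (16,)
```

The fused vector would then feed a classification head; how DMCA actually pools and combines the two attended streams is specified in the paper, not here.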
References
Baltrušaitis, T., Ahuja, C., Morency, L.: Multimodal machine learning: a survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 41(2), 423–443 (2018)
Chithrananda, S., Grand, G., Ramsundar, B.: ChemBERTa: large-scale self-supervised pretraining for molecular property prediction. arXiv Preprint arXiv:2010.09885 (2020)
Fabian, B., et al.: Molecular representation learning with language models and domain-relevant auxiliary tasks. arXiv Preprint arXiv:2011.13230 (2020)
Fang, X., et al.: Geometry-enhanced molecular representation learning for property prediction. Nat. Mach. Intell. 4(2), 127–134 (2022)
Gilmer, J., Schoenholz, S.S., Riley, P.F., Vinyals, O., Dahl, G.E.: Neural message passing for quantum chemistry. In: Proceedings of the International Conference on Machine Learning, pp. 1263–1272 (2017)
Guo, Z., Yu, W., Zhang, C., Jiang, M., Chawla, N.V.: GraSeq: graph and sequence fusion learning for molecular property prediction. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 435–443 (2020)
Guo, Z., Sharma, P., Martinez, A., Du, L., Abraham, R.: Multilingual molecular representation learning via contrastive pre-training. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, pp. 3441–3453 (2022)
Honda, S., Shi, S., Ueda, H.R.: SMILES Transformer: pre-trained molecular fingerprint for low-data drug discovery. arXiv Preprint arXiv:1911.04738 (2019)
Hu, W., et al.: Strategies for pre-training graph neural networks. arXiv Preprint arXiv:1905.12265 (2019)
Irwin, J.J., Shoichet, B.K.: ZINC–a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 45(1), 177–182 (2005)
Jia, C., et al.: Scaling up visual and vision-language representation learning with noisy text supervision. In: Proceedings of the International Conference on Machine Learning, pp. 4904–4916 (2021)
Jo, J., Kwak, B., Choi, H.S., Yoon, S.: The message passing neural networks for chemical property prediction on SMILES. Methods 179, 65–72 (2020)
Kuhn, M., Letunic, I., Jensen, L.J., Bork, P.: The SIDER database of drugs and side effects. Nucleic Acids Res. 44(D1), D1075–D1079 (2016)
Li, J., Jiang, X.: Mol-BERT: an effective molecular representation with BERT for molecular property prediction. Wirel. Commun. Mob. Comput. 2021, 1–7 (2021)
Li, Q., Han, Z., Wu, X.: Deeper insights into graph convolutional networks for semi-supervised learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, pp. 3538–3545 (2018)
Liu, S., Wang, H., Liu, W., Lasenby, J., Guo, H., Tang, J.: Pre-training molecular graph representation with 3d geometry. arXiv Preprint arXiv:2110.07728 (2021)
Mullard, A.: New drugs cost US$2.6 billion to develop. Nat. Rev. Drug Discov. (2014)
Rong, Y., et al.: Self-supervised graph transformer on large-scale molecular data. Adv. Neural. Inf. Process. Syst. 33, 12559–12571 (2020)
Vaswani, A., et al.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017)
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. arXiv Preprint arXiv:1710.10903 (2017)
Wang, S., Guo, Y., Wang, Y., Sun, H., Huang, J.: SMILES-BERT: large scale unsupervised pre-training for molecular property prediction. In: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 429–436 (2019)
Wang, Y., Wang, J., Cao, Z., Barati Farimani, A.: Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 4(3), 279–287 (2022)
Weininger, D.: SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28(1), 31–36 (1988)
Wolf, T., et al.: Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45 (2020)
Wu, Z., et al.: MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 9(2), 513–530 (2018)
Xu, L., Pan, S., Xia, L., Li, Z.: Molecular property prediction by combining LSTM and GAT. Biomolecules 13(3), 503 (2023)
Xue, D., et al.: X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis. bioRxiv (2020)
Yang, K., et al.: Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59(8), 3370–3388 (2019)
Ye, X.B., Guan, Q., Luo, W., Fang, L., Lai, Z.R., Wang, J.: Molecular substructure graph attention network for molecular property identification in drug discovery. Pattern Recognit. 128, 108659 (2022)
Zhu, J., Xia, Y., Qin, T., Zhou, W., Li, H., Liu, T.: Dual-view molecule pre-training. In: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3615–3627 (2023)
Acknowledgments
This work is supported by the National Science Foundation (grant Nos. CCF-2200255, CCF-2006780, IIS-2027667) and the National Institutes of Health (grant Nos. U01AG073323, R01HG009658). This work was supported in part by an allocation of computing time from the Ohio Supercomputer Center.
Appendix
1.1 Brief Description of Datasets
A brief description of the datasets used in the experiments is given below. More details can be found on the MoleculeNet benchmark website [25].
- HIV: This dataset contains experimentally measured abilities of molecules to inhibit HIV replication.
- BACE: It focuses on the qualitative binding ability of molecules as inhibitors of BACE-1 (Beta-secretase 1).
- ClinTox: It contains drugs approved by the FDA (Food and Drug Administration) and drugs that failed clinical trials for toxicity reasons.
- BBBP: This dataset contains binary labels indicating blood-brain barrier penetration.
- SIDER: It comprises information on marketed drugs and their adverse drug reactions, categorized into 27 system organ classes.
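All five datasets above are binary (or multi-label binary) classification benchmarks, which MoleculeNet evaluates with ROC-AUC. As a self-contained illustration of that metric, the sketch below computes ROC-AUC from scratch via the rank (Mann-Whitney U) formulation; in practice one would use a library routine, and the example labels/scores here are invented.

```python
def roc_auc(labels, scores):
    # ROC-AUC via the Mann-Whitney U statistic: average-rank the
    # scores (ties share a rank), then compare positive ranks against
    # the minimum possible sum of positive ranks.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1  # extend block of tied scores
        avg = (i + j) / 2 + 1  # average 1-based rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    pos_ranks = [ranks[i] for i, y in enumerate(labels) if y == 1]
    n_pos = len(pos_ranks)
    n_neg = len(labels) - n_pos
    return (sum(pos_ranks) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

For multi-task datasets such as SIDER (27 tasks) or ClinTox, the convention is to compute ROC-AUC per task and report the mean.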
Copyright information
© 2026 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhao, A., Chen, Z., Fang, Z., Zhang, X., Li, J. (2026). Dual-Modality Representation Learning for Molecular Property Prediction. In: Tang, J., Lai, X., Cai, Z., Peng, W., Wei, Y. (eds) Bioinformatics Research and Applications. ISBRA 2025. Lecture Notes in Computer Science, vol 15756. Springer, Singapore. https://doi.org/10.1007/978-981-95-0698-9_4
DOI: https://doi.org/10.1007/978-981-95-0698-9_4
Publisher Name: Springer, Singapore
Print ISBN: 978-981-95-0697-2
Online ISBN: 978-981-95-0698-9
eBook Packages: Computer Science, Computer Science (R0)