Dual-Modality Representation Learning for Molecular Property Prediction

  • Conference paper
  • In: Bioinformatics Research and Applications (ISBRA 2025)

Abstract

Molecular property prediction has recently attracted substantial attention. Accurate prediction of drug properties relies heavily on effective molecular representations. The structures of chemical compounds are commonly represented as graphs or SMILES sequences. Recent graph-based approaches to learning drug properties commonly employ Graph Neural Networks (GNNs), while Transformer-based architectures have been adopted for the SMILES representation by treating each SMILES string as a sequence of tokens. Because each representation has its own advantages and disadvantages, combining the two in learning drug properties is a promising direction. We propose a method named Dual-Modality Cross-Attention (DMCA) that effectively combines the strengths of both representations through a cross-attention mechanism. DMCA was evaluated on five classification datasets. Results show that our method achieves the best overall performance, highlighting its effectiveness in leveraging complementary information from the graph and SMILES modalities. The source code of DMCA is available at: https://github.com/Ay-Zhao/DMCA
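
The architectural details are behind the paywall, but the abstract describes cross-attention between a graph encoder and a SMILES encoder. The following PyTorch sketch illustrates one plausible shape such a fusion module could take; the class name, dimensions, and pooling choices are assumptions for illustration, not the authors' implementation.

# Illustrative sketch (not the authors' code): fusing a pooled GNN graph
# embedding with Transformer SMILES token embeddings via cross-attention.
# Names, dimensions, and the fusion head are assumptions.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=256, num_heads=4, num_classes=2):
        super().__init__()
        # Each modality attends to the other one.
        self.graph_to_smiles = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.smiles_to_graph = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(2 * dim, num_classes)

    def forward(self, graph_emb, smiles_tokens):
        # graph_emb: (batch, dim) pooled GNN output; treated as a length-1 sequence.
        # smiles_tokens: (batch, seq_len, dim) Transformer token embeddings.
        g = graph_emb.unsqueeze(1)
        g_ctx, _ = self.graph_to_smiles(g, smiles_tokens, smiles_tokens)
        s_ctx, _ = self.smiles_to_graph(smiles_tokens, g, g)
        fused = torch.cat([g_ctx.squeeze(1), s_ctx.mean(dim=1)], dim=-1)
        return self.classifier(fused)

# Example usage with random tensors standing in for encoder outputs:
model = CrossModalFusion()
logits = model(torch.randn(8, 256), torch.randn(8, 40, 256))  # shape (8, 2)

In a real pipeline, graph_emb would come from a pretrained GNN and smiles_tokens from a pretrained SMILES Transformer such as those cited below [2, 8, 21].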

References

  1. Baltrušaitis, T., Ahuja, C., Morency, L.: Multimodal machine learning: a survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 41(2), 423–443 (2018)

  2. Chithrananda, S., Grand, G., Ramsundar, B.: ChemBERTa: large-scale self-supervised pretraining for molecular property prediction. arXiv Preprint arXiv:2010.09885 (2020)

  3. Fabian, B., et al.: Molecular representation learning with language models and domain-relevant auxiliary tasks. arXiv Preprint arXiv:2011.13230 (2020)

  4. Fang, X., et al.: Geometry-enhanced molecular representation learning for property prediction. Nat. Mach. Intell. 4(2), 127–134 (2022)

  5. Gilmer, J., Schoenholz, S.S., Riley, P.F., Vinyals, O., Dahl, G.E.: Neural message passing for quantum chemistry. In: Proceedings of the International Conference on Machine Learning, pp. 1263–1272 (2017)

  6. Guo, Z., Yu, W., Zhang, C., Jiang, M., Chawla, N.V.: GraSeq: graph and sequence fusion learning for molecular property prediction. In: Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 435–443 (2020)

  7. Guo, Z., Sharma, P., Martinez, A., Du, L., Abraham, R.: Multilingual molecular representation learning via contrastive pre-training. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, pp. 3441–3453 (2022)

  8. Honda, S., Shi, S., Ueda, H.R.: SMILES Transformer: pre-trained molecular fingerprint for low-data drug discovery. arXiv Preprint arXiv:1911.04738 (2019)

  9. Hu, W., et al.: Strategies for pre-training graph neural networks. arXiv Preprint arXiv:1905.12265 (2019)

  10. Irwin, J.J., Shoichet, B.K.: ZINC–a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 45(1), 177–182 (2005)

  11. Jia, C., et al.: Scaling up visual and vision-language representation learning with noisy text supervision. In: Proceedings of the International Conference on Machine Learning, pp. 4904–4916 (2021)

  12. Jo, J., Kwak, B., Choi, H.S., Yoon, S.: The message passing neural networks for chemical property prediction on SMILES. Methods 179, 65–72 (2020)

  13. Kuhn, M., Letunic, I., Jensen, L.J., Bork, P.: The SIDER database of drugs and side effects. Nucleic Acids Res. 44(D1), D1075–D1079 (2016)

  14. Li, J., Jiang, X.: Mol-BERT: an effective molecular representation with BERT for molecular property prediction. Wirel. Commun. Mob. Comput. 2021, 1–7 (2021)

  15. Li, Q., Han, Z., Wu, X.: Deeper insights into graph convolutional networks for semi-supervised learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, pp. 3538–3545 (2018)

  16. Liu, S., Wang, H., Liu, W., Lasenby, J., Guo, H., Tang, J.: Pre-training molecular graph representation with 3D geometry. arXiv Preprint arXiv:2110.07728 (2021)

  17. Mullard, A.: New drugs cost US $2.6 billion to develop. Nat. Rev. Drug Discov. (2014)

  18. Rong, Y., et al.: Self-supervised graph transformer on large-scale molecular data. Adv. Neural Inf. Process. Syst. 33, 12559–12571 (2020)

  19. Vaswani, A., et al.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017)

  20. Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., Bengio, Y.: Graph attention networks. arXiv Preprint arXiv:1710.10903 (2017)

  21. Wang, S., Guo, Y., Wang, Y., Sun, H., Huang, J.: SMILES-BERT: large scale unsupervised pre-training for molecular property prediction. In: Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, pp. 429–436 (2019)

  22. Wang, Y., Wang, J., Cao, Z., Barati Farimani, A.: Molecular contrastive learning of representations via graph neural networks. Nat. Mach. Intell. 4(3), 279–287 (2022)

  23. Weininger, D.: SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28(1), 31–36 (1988)

  24. Wolf, T., et al.: Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38–45 (2020)

  25. Wu, Z., et al.: MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 9(2), 513–530 (2018)

  26. Xu, L., Pan, S., Xia, L., Li, Z.: Molecular property prediction by combining LSTM and GAT. Biomolecules 13(3), 503 (2023)

  27. Xue, D., et al.: X-MOL: large-scale pre-training for molecular understanding and diverse molecular analysis. bioRxiv preprint (2020)

  28. Yang, K., et al.: Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59(8), 3370–3388 (2019)

  29. Ye, X.B., Guan, Q., Luo, W., Fang, L., Lai, Z.R., Wang, J.: Molecular substructure graph attention network for molecular property identification in drug discovery. Pattern Recognit. 128, 108659 (2022)

  30. Zhu, J., Xia, Y., Qin, T., Zhou, W., Li, H., Liu, T.: Dual-view molecule pre-training. In: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3615–3627 (2023)

Acknowledgments

This work was supported by the National Science Foundation (grant Nos. CCF-2200255, CCF-2006780, IIS-2027667) and the National Institutes of Health (grant Nos. U01AG073323, R01HG009658). It was also supported in part by an allocation of computing time from the Ohio Supercomputer Center.

Author information

Correspondence to Jing Li.

Appendix

Table 2. Node features in graph representation
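
The body of Table 2 is not reproduced in this preview. As a rough illustration of typical node features in a molecular graph, the sketch below builds a per-atom feature vector with RDKit; this particular feature set is an assumption, not necessarily the one listed in Table 2.

# Hypothetical atom (node) featurization with RDKit. The actual feature set
# from Table 2 is not available here, so this list is an assumption.
from rdkit import Chem

def atom_features(atom):
    # Common node features in molecular graphs: element, degree, charge, etc.
    return [
        atom.GetAtomicNum(),        # element (atomic number)
        atom.GetDegree(),           # number of bonded neighbors
        atom.GetFormalCharge(),     # formal charge
        atom.GetTotalNumHs(),       # attached hydrogens
        int(atom.GetIsAromatic()),  # aromaticity flag
        int(atom.GetChiralTag()),   # chirality tag
    ]

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
node_features = [atom_features(a) for a in mol.GetAtoms()]
print(len(node_features), "atoms,", len(node_features[0]), "features each")
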
Fig. 3. The top-left sub-figure shows the distribution of the label combinations from the ClinTox dataset; the remaining sub-figures show violin plots of the SMILES string length distributions for all five datasets.
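
Since the figure itself is not included in this preview, here is a sketch of how such SMILES-length violin plots could be reproduced with pandas and matplotlib; the CSV file names and the "smiles" column name are assumptions.

# Hypothetical reproduction of the SMILES-length violin plots in Fig. 3.
# Dataset file names and the "smiles" column name are assumptions.
import pandas as pd
import matplotlib.pyplot as plt

names = ["BBBP", "BACE", "HIV", "ClinTox", "SIDER"]
lengths = [pd.read_csv(f"{n.lower()}.csv")["smiles"].str.len() for n in names]

fig, ax = plt.subplots(figsize=(8, 4))
ax.violinplot(lengths, showmedians=True)  # one violin per dataset
ax.set_xticks(range(1, len(names) + 1), labels=names)
ax.set_ylabel("SMILES string length")
plt.show()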

1.1 Brief Description of Datasets

Brief description of the datasets used in the experiments; a loading sketch follows the list. More details can be found on the MoleculeNet benchmark website [25].

  • HIV: This dataset records experimentally measured abilities of molecules to inhibit HIV replication.

  • BACE: It focuses on the qualitative binding ability of molecules as inhibitors of BACE-1 (Beta-secretase 1).

  • ClinTox: It records whether a drug was approved by the FDA (Food and Drug Administration) and whether it failed clinical trials for toxicity reasons.

  • BBBP: This dataset contains binary labels indicating blood-brain barrier penetration.

  • SIDER: It comprises information on marketed drugs and their adverse drug reactions categorized into 27 system organ classes.
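
All five datasets are distributed through MoleculeNet [25]. As one possible way to obtain them programmatically, the sketch below uses DeepChem's MolNet loaders; the featurizer and splitter choices are illustrative assumptions, not necessarily the experimental setup used in the paper.

# One possible way to fetch the five MoleculeNet datasets with DeepChem.
# The featurizer/splitter choices here are illustrative assumptions.
import deepchem as dc

loaders = {
    "HIV": dc.molnet.load_hiv,
    "BACE": dc.molnet.load_bace_classification,
    "ClinTox": dc.molnet.load_clintox,
    "BBBP": dc.molnet.load_bbbp,
    "SIDER": dc.molnet.load_sider,
}

for name, load in loaders.items():
    tasks, (train, valid, test), transformers = load(
        featurizer="ECFP", splitter="scaffold")
    print(f"{name}: {len(tasks)} task(s), {len(train)} training molecules")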

Copyright information

© 2026 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhao, A., Chen, Z., Fang, Z., Zhang, X., Li, J. (2026). Dual-Modality Representation Learning for Molecular Property Prediction. In: Tang, J., Lai, X., Cai, Z., Peng, W., Wei, Y. (eds) Bioinformatics Research and Applications. ISBRA 2025. Lecture Notes in Computer Science, vol. 15756. Springer, Singapore. https://doi.org/10.1007/978-981-95-0698-9_4

  • DOI: https://doi.org/10.1007/978-981-95-0698-9_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-95-0697-2

  • Online ISBN: 978-981-95-0698-9

  • eBook Packages: Computer Science, Computer Science (R0)
