PySCF to cclib bridge support #1481

oliver-s-lee · 2024-09-17T15:49:51Z

Ok, this is now working to the point where it's ready for at least an initial review.

The main workhorse is cclib.bridge.cclib2pyscf.cclibfrommethods(), which accepts a number of PySCF method objects (SCF, MP, excited states, etc.) and returns a ccData object.

cclib.bridge.cclib2pyscf.makecclib() is a slightly nicer wrapper on top that does a good job of guessing what type of PySCF method you pass to it.

Supported so far are:

aonames
aooverlaps
atombasis
atomcharges (mulliken and lowdin)
atomcoords
atommasses
atomnos
ccenergies (including CCSD(T) although this is slightly fudged due to the way it's handled in PySCF)
charge
coreelectrons
etenergies
etoscs
etdips
etsecs
etsyms (but only multiplicities for now)
gbasis
geovalues (but only energies, no targets yet)
homos
metadata (most of the important things are here, no performance or timing data yet tho)
mocoeffs
moenergies
moments (both dipole and quadrupole, although quad are only available in development versions of PySCF)
mosyms
mpenergies (MP2)
mult
natom
nbasis
nmo
optdone (although slightly hacked due to the way PySCF handles optimisations)
optstatus (see above)
rotconsts
scfenergies (HF and DFT)
scftargets
scfvalues
vibfreqs
vibfconsts
vibirs
vibrmasses

Not currently supported, but should be available with only a bit more work:

etdips and friends (just forgot about them)
grads (not looked for yet)
hessian (this is readily available, but I'm not knowledgeable enough to understand either the PySCF or cclib formats)
vibdisps (the IR part of PySCF is not so well documented and I'm not sure where this is hidden)

There are also tests for all of the supported properties (unless I've forgotten of course). The tests use the normal parser tests rather than the ones in bridge. To get this to work, I've slightly modified conftest.py so testdata can specify python files which get imported, run, and the results cached as necessary. Most of the PySCF calcs are extremely quick, but the frequencies on DVB can take a few minutes. Not sure if we want to look at saving the raw PySCF data rather than regenerating each time...
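The "imported, run, and the results cached" idea described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the helper name run_script, the entry point name calculate, and the contents of the stand-in script are all assumptions made for the example.

```python
import importlib.util
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def run_script(path: str, entry_point: str = "calculate"):
    """Import the Python file at `path`, call its entry point once, cache the result.

    `entry_point` is a hypothetical convention for this sketch.
    """
    spec = importlib.util.spec_from_file_location(Path(path).stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, entry_point)()


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a real testdata script such as dvb_sp.py; a real script
    # would build PySCF method objects and hand them to the bridge.
    script = Path(tmp) / "dvb_sp.py"
    script.write_text("def calculate():\n    return {'natom': 20}\n")

    first = run_script(str(script))   # imports and runs the script
    second = run_script(str(script))  # served from the cache, script not re-run
```

Caching on the path means a slow calculation (like the DVB frequencies) is only run once per test session, though it does not persist between sessions.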

berquist · 2024-09-17T15:55:19Z

You can get rid of the F841 warnings by prefixing the variable with _.

codecov · 2024-09-17T16:30:57Z

Codecov Report

Attention: Patch coverage is 92.78351% with 14 lines in your changes missing coverage. Please review.

Project coverage is 81.60%. Comparing base (9a078b4) to head (8b54734).
Report is 168 commits behind head on master.

Files with missing lines	Patch %	Lines
cclib/bridge/cclib2pyscf.py	92.78%	14 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1481      +/-   ##
==========================================
+ Coverage   81.46%   81.60%   +0.14%     
==========================================
  Files          73       73              
  Lines       14828    15019     +191     
==========================================
+ Hits        12079    12256     +177     
- Misses       2749     2763      +14

oliver-s-lee · 2024-10-07T15:23:26Z

At the moment this PR parses energies into hartree, but that change is (or at least should be) in a single commit so easy to reverse if need be.

berquist · 2024-10-20T21:00:57Z

> At the moment this PR parses energies into hartree, but that change is (or at least should be) in a single commit so easy to reverse if need be.

Ok; sorry to leave you hanging for a review on this.

berquist

Here is a partial review.

So far I agree with all of your assumptions that I didn't comment on.

cclib/bridge/cclib2pyscf.py

berquist · 2024-12-18T04:10:10Z

cclib/bridge/cclib2pyscf.py

+                # Taken from pyscf.tdscf.rhf.get_nto()
+                #
+                # Would appreciate someone checking this makes sense?
+                x *= 1.0 / np.linalg.norm(x)


IIRC I ripped this from pyscf many years ago, but it indicates that for CIS/TDA you're correct, but if there are deexcitation vectors you need to normalize them together:

def norm_xy(z: np.ndarray, nocc: int, nvirt: int) -> Tuple[np.ndarray, np.ndarray]:
    x, y = z.reshape(2, nvirt, nocc)
    norm = 2 * (np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2)
    norm = 1 / np.sqrt(norm)
    return (x * norm).flatten(), (y * norm).flatten()

Ah thanks. TD was originally missed because the DVB calculation wouldn't converge for whatever reason. I've now added a new test for TD with water to test the normalisation with both x and y. The implementation is now:

def norm_xy(x, y):
    norm_factor = 1.0 / np.sqrt(
        np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2
    )
    return (x * norm_factor, y * norm_factor)

This logic is beyond me really, but the 2 * multiplication factor in your function was giving the wrong results?

This is fine for now but I'm going to try and understand this.
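For anyone else trying to follow the two versions, here is a minimal sketch of how they relate, assuming x and y are NumPy arrays. The target parameter is hypothetical and appears in neither snippet above: target=1.0 reproduces the second function, while target=0.5 is algebraically identical to the 2 * factor in the first. The sketch does not settle which convention a given PySCF method expects.

```python
import numpy as np


def norm_xy(x, y, target=1.0):
    """Scale x and y by one common factor so that |x|^2 - |y|^2 == target."""
    diff = np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2
    if diff <= 0:
        raise ValueError("|x|^2 - |y|^2 must be positive to normalise")
    factor = np.sqrt(target / diff)
    return x * factor, y * factor


# Made-up amplitudes purely for illustration (not from a real calculation).
x = np.array([[0.9, 0.1], [0.2, 0.3]])
y = np.array([[0.05, 0.0], [0.01, 0.02]])

x1, y1 = norm_xy(x, y)              # |x|^2 - |y|^2 == 1
x2, y2 = norm_xy(x, y, target=0.5)  # same scaling as the "2 *" variant

# With no deexcitations (y == 0) this reduces to x / |x|, as for CIS/TDA.
x_tda, _ = norm_xy(x, np.zeros_like(x))
```

So the two functions differ only by a constant factor of 1/sqrt(2) on both vectors, which is why swapping one for the other changes the results without changing their relative weights.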

cclib/bridge/cclib2pyscf.py

berquist · 2025-02-16T21:06:57Z

test/testdata

 Basis     Psi4        GenericBasisTest      basicPsi4-1.7         dvb_sp_rks.out
 Basis     QChem       GenericBasisTest      basicQChem5.1         dvb_sp.out
 Basis     QChem       GenericBasisTest      basicQChem5.4         dvb_sp.out
+Basis     PySCF       GenericBasisTest      basicPySCF2.6         dvb_sp.py


Can we put this in alphabetical order for the second column?

(Also, quick poll...should I convert this to JSON/TOML/YAML/... like I did for regressionfiles.yaml?)

Yep no problem.

Personally I'm a YAML fanboy so my default position is yes, but in this case the table format is actually pretty useful and it would be a shame to lose that. Maybe an argument for the condensed list form?
Eg

- [Basis, Psi4, GenericBasisTest, basicPsi4-1.7, dvb_sp_rks.out]
- [Basis, QChem, GenericBasisTest, basicQChem5.1, dvb_sp.out]
...

Or maybe could nest? Something like:

Basis:
  Psi4:
    - [GenericBasisTest, basicPsi4-1.7, dvb_sp_rks.out]
  QChem:
    - [GenericBasisTest, basicQChem5.1, dvb_sp.out]
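To make the nested option concrete, here is a rough sketch of going from the current whitespace-separated table to that shape, with the rows ordered alphabetically by the second column along the way. It uses only the standard library and plain dicts rather than a YAML library. The helper names and the reading of the columns (category, program, then everything else) are assumptions for illustration, not how conftest.py actually reads the file.

```python
from collections import defaultdict

# A few rows in the style of test/testdata, deliberately out of order.
TESTDATA = """\
Basis     QChem       GenericBasisTest      basicQChem5.1         dvb_sp.out
Basis     PySCF       GenericBasisTest      basicPySCF2.6         dvb_sp.py
Basis     Psi4        GenericBasisTest      basicPsi4-1.7         dvb_sp_rks.out
"""


def parse_rows(text):
    """Split each non-empty, non-comment line into whitespace-separated fields."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append(line.split())
    return rows


def sort_rows(rows):
    """Order by first column, then by second column ignoring case."""
    return sorted(rows, key=lambda row: (row[0], row[1].lower()))


def nest(rows):
    """Group rows as {category: {program: [remaining fields, ...]}}."""
    nested = defaultdict(lambda: defaultdict(list))
    for category, program, *rest in rows:
        nested[category][program].append(rest)
    return {category: dict(programs) for category, programs in nested.items()}


rows = sort_rows(parse_rows(TESTDATA))
nested = nest(rows)
```

The nested dict has the same structure as the second YAML layout, so dumping it with a YAML library would give roughly that file, at the cost of no longer being able to scan it as a table.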

test/data/testvib.py

berquist · 2025-02-16T21:47:28Z

test/data/testTDun.py



+# The Gaussian log files for this test are a normal restricted calculation,
+# is this class misnamed?


Wow, in the 11 years I've been working on this code, I've never noticed that this Gaussian calculation is wrong.

I can't tell from the control or ricc2.out, but it looks like the Turbomole calculation is unrestricted but a paired singlet, so an almost identical problem to the Gaussian calculation.

I could have sworn that I added an unrestricted TDDFT Q-Chem calculation in unit or regression tests, but apparently not.

#1533

Yeah I'll be honest this did confuse me for a sec, funny how things can slip through. I wonder if the intention was originally to do a delta-SCF type calculation (the original file name in G03 was apparently deltasym.log) and maybe this was the ground state? Just speculation.

That Turbomole one was probably me, I'd guess, so whoops there too. Will address in another PR.

test/data/testSP.py

berquist · 2025-02-16T22:19:08Z

test/conftest.py

-        filenames = logfile.filename
+        # For 'normal' log files we use ccopen to parse.
+        # For pseudo parsers (like PySCF) we use a different mechanism.
+        if "PySCF" in str(first):


Horrifying. This is another part IMO of the pain from having "parsing" being a separate path from IO and now bridges.

Yeah a bit of a mess right now. This 'bridge' does feel quite a lot like a parser, but then again we might want an actual PySCF parser (from the PySCF log files) in the future in which case this guy would need another name or something

berquist · 2025-02-16T22:34:42Z

pyproject.toml

+    "pyscf",
+    "pyscf-properties @ git+https://github.com/pyscf/properties",
+    "pyberny",
+    "geometric"


I think, but am not sure, that these don't need to be here if stuff is reworked. We keep them in bridges, so they're only specified once, and then you'd do pip install cclib[dev]. The (naming) problem is that test doesn't contain all the things needed for testing, only the things needed solely for testing, and needing dev instead is confusing.

If we rename test to test-infrastructure (better name welcome), then add

test = ["cclib[bridges,docs,test-infrastructure]"]

it will be more consistent.

Yeah that makes sense, and have now updated.

As an unfortunate side effect, the docs build then broke because of #1523, which I don't think is something we've considered before. Considering obabel really isn't a dependency in Pip terms atm, and the wheel build doesn't seem to be functioning, I've removed it from the pyproject for now. Doesn't feel great, so would be happy to explore better alternatives...

This is driving me crazy. I would like to push njzjz/openbabel-wheel#6 and openbabel/openbabel#2408 as a solution but not sure I have the energy.

berquist · 2025-03-09T19:38:01Z

This is good for now but am currently rebasing to purge all the pre-commit autofix commits.

berquist

This looks great. Thanks for all your hard work on it.

oliver-s-lee force-pushed the pyscf branch 6 times, most recently from 0bce9b3 to e47e482 Compare October 7, 2024 14:02

berquist self-requested a review October 20, 2024 21:00

berquist added the bridge label Dec 18, 2024

berquist added this to the v1.8.2 milestone Dec 18, 2024

berquist requested changes Dec 18, 2024

berquist requested changes Feb 16, 2025

oliver-s-lee force-pushed the pyscf branch 5 times, most recently from a2d37f2 to 6ef862e Compare February 25, 2025 11:40

berquist force-pushed the pyscf branch from eba20a2 to 2f45d23 Compare March 26, 2025 03:11

oliver-s-lee added 8 commits March 31, 2025 18:05

Added initial bridge from pyscf data -> cclib

bfb4c9f

started adding excited states for pyscf bridge

d6b9afa

added etsecs for pyscf

32eef26

fixed init attributes as list

148c2dc

split pyscf to cclib function in two

a4d08e7

added total energy, orbitals, and excited states (for unrestricted) from pyscf

ea98d3a

added initial 'parsing' of pyscf vibrations

1fbf067

added metadata for pyscf

cce4511

oliver-s-lee and others added 24 commits March 31, 2025 18:05

Fixed bug caused by whitespace in makefile rules

9a0e867

Docs build now installs test dependencies

c8701ad

Added some missing depends for bridges

975dbb7

Switched to tuple in isinstance()

d6b035c

Co-authored-by: Eric Berquist

Replaced unused variable with underscore

1f47871

Co-authored-by: Eric Berquist

Initial type hints for makecclib function

70f61ac

Co-authored-by: Eric Berquist

Adjusted type hints

86d9142

Added a check that all methods use the same molecule

e5b15b8

Removed done TODO

d14b2f3

Fixed ignoring deexcitations when normalising excitation vectors

e94fa32

Cleaned up norm_xy() function

f038e03

Minor work towards supporting hessian

99a5ae8

Fixed hessian rehape

6f1fefb

Fixed not converting imaginary frequencies to negative

f06a6f0

Reordered testdata to alphabetical

aee7427

Reordered skipForParser to alphabetical

5a8b811

Update test/data/testSP.py

b05a6b0

Co-authored-by: Eric Berquist

Update test/data/testSP.py

0a9d30d

Co-authored-by: Eric Berquist

Updated memory literals to underscore style

23eb1de

Converted some tests to pytest.approx()

244aa19

Fixed overly precise comparison in aooverlaps

8eef998

Renamed test to test-infrastructure and added new test

b5f6d92

Converted PySCF back to non-hartree units

079ee17

(temporarily) removed quadrupole moment from PySCF

bf0d75e

berquist force-pushed the pyscf branch from 2f45d23 to bf0d75e Compare April 1, 2025 00:07

(temporarily) don't list Open Babel as an optional dependency

8b54734

berquist force-pushed the pyscf branch from ff962df to 8b54734 Compare April 12, 2025 19:55

berquist approved these changes Apr 12, 2025

berquist merged commit 7b93367 into cclib:master Apr 12, 2025
29 checks passed

berquist mentioned this pull request Jun 2, 2025

PySCF support? #1434

Closed


