Conversation

@oliver-s-lee (Contributor) commented Sep 17, 2024

Ok, this is now working to the point where it's ready for at least an initial review.

The main workhorse is cclib.bridge.cclib2pyscf.cclibfrommethods() which accepts a number of PySCF method objects (SCF, MP, excited states etc.) and returns a ccData object.

cclib.bridge.cclib2pyscf.makecclib() is a slightly nicer wrapper on top that does a good job of guessing what type of PySCF method you pass to it.

Supported so far are:

  • aonames
  • aooverlaps
  • atombasis
  • atomcharges (mulliken and lowdin)
  • atomcoords
  • atommasses
  • atomnos
  • ccenergies (including CCSD(T) although this is slightly fudged due to the way it's handled in PySCF)
  • charge
  • coreelectrons
  • etenergies
  • etoscs
  • etdips
  • etsecs
  • etsyms (but only multiplicities for now)
  • gbasis
  • geovalues (but only energies, no targets yet)
  • homos
  • metadata (most of the important things are here; no performance or timing data yet, though)
  • mocoeffs
  • moenergies
  • moments (both dipole and quadrupole, although quad are only available in development versions of PySCF)
  • mosyms
  • mpenergies (MP2)
  • mult
  • natom
  • nbasis
  • nmo
  • optdone (although slightly hacked due to the way PySCF handles optimisations)
  • optstatus (see above)
  • rotconsts
  • scfenergies (HF and DFT)
  • scftargets
  • scfvalues
  • vibfreqs
  • vibfconsts
  • vibirs
  • vibrmasses

Not currently supported, but should be available with only a bit more work:

  • etdips and friends (just forgot about them)
  • grads (not looked for yet)
  • hessian (this is readily available, but I'm not knowledgeable enough to understand either the PySCF or cclib formats)
  • vibdisps (the IR part of PySCF is not so well documented and I'm not sure where this is hidden)

There are also tests for all of the supported properties (unless I've forgotten of course). The tests use the normal parser tests rather than the ones in bridge. To get this to work, I've slightly modified conftest.py so testdata can specify python files which get imported, run, and the results cached as necessary. Most of the PySCF calcs are extremely quick, but the frequencies on DVB can take a few minutes. Not sure if we want to look at saving the raw PySCF data rather than regenerating each time...
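As a rough illustration of the conftest.py mechanism described above (import a Python file, run it, cache the result), a minimal sketch might look like the following. The function name `run_python_testdata` and the `calculate` entry point are hypothetical and are not taken from this PR; the actual implementation may differ.

```python
import functools
import importlib.util
from pathlib import Path


@functools.lru_cache(maxsize=None)
def run_python_testdata(path: str, entry_point: str = "calculate"):
    """Import the Python file at `path`, call `entry_point()`, and cache the result.

    Both this function and the `calculate` entry point are hypothetical names
    used for illustration only.
    """
    file_path = Path(path).resolve()
    spec = importlib.util.spec_from_file_location(file_path.stem, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot import {file_path}")
    module = importlib.util.module_from_spec(spec)
    # Executing the module runs the calculation script's top-level code.
    spec.loader.exec_module(module)
    return getattr(module, entry_point)()
```

Because the cache is keyed on the path string, a slow calculation (such as the DVB frequencies) would only be run once per test session under this scheme. It would still be regenerated in every new session, which is the trade-off raised above.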

@berquist (Member)

You can get rid of the F821 warnings by prefixing the variable with _.

codecov bot commented Sep 17, 2024

Codecov Report

Attention: Patch coverage is 92.78351% with 14 lines in your changes missing coverage. Please review.

Project coverage is 81.60%. Comparing base (9a078b4) to head (8b54734).
Report is 168 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| cclib/bridge/cclib2pyscf.py | 92.78% | 14 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1481      +/-   ##
==========================================
+ Coverage   81.46%   81.60%   +0.14%     
==========================================
  Files          73       73              
  Lines       14828    15019     +191     
==========================================
+ Hits        12079    12256     +177     
- Misses       2749     2763      +14     
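The percentages in the report can be reproduced from the raw counts. The short check below uses only the figures quoted above; the patch size of 194 lines is inferred from the reported patch coverage and is not stated in the report itself.

```python
# Figures copied from the Codecov report above.
base_hits, base_lines = 12079, 14828
head_hits, head_lines = 12256, 15019
patch_missing = 14
patch_coverage = 92.78351  # percent, as reported


def percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines."""
    return 100.0 * hits / lines


base = percent(base_hits, base_lines)
head = percent(head_hits, head_lines)

# Inferred (not reported): the number of lines in the patch that is
# consistent with 14 missing lines at the stated patch coverage.
patch_lines = round(patch_missing / (1.0 - patch_coverage / 100.0))

print(f"base {base:.2f}%  head {head:.2f}%  change {head - base:+.2f}%")
print(f"inferred patch size: {patch_lines} lines")
```

Note that the +191 lines and +177 hits in the diff table are differences between the base and head totals, so they need not match the patch counts exactly.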

@oliver-s-lee oliver-s-lee force-pushed the pyscf branch 6 times, most recently from 0bce9b3 to e47e482 Compare October 7, 2024 14:02
@oliver-s-lee (Contributor Author)

At the moment this PR parses energies into hartree, but that change is (or at least should be) in a single commit so easy to reverse if need be.
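For anyone comparing against values expressed in eV, the conversion is a single multiplication. The sketch below is self-contained and uses the CODATA 2018 value of the hartree in eV; the helper names are hypothetical and are not part of cclib or this PR.

```python
# CODATA 2018 value of one hartree in electronvolts.
HARTREE_TO_EV = 27.211386245988


def hartree_to_ev(energy: float) -> float:
    """Convert an energy from hartree to eV (hypothetical helper)."""
    return energy * HARTREE_TO_EV


def ev_to_hartree(energy: float) -> float:
    """Convert an energy from eV to hartree (hypothetical helper)."""
    return energy / HARTREE_TO_EV


scf_energy_hartree = -76.0  # made-up value, roughly the scale of a water SCF energy
print(hartree_to_ev(scf_energy_hartree))
```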

@berquist berquist self-requested a review October 20, 2024 21:00
@berquist (Member)

> At the moment this PR parses energies into hartree, but that change is (or at least should be) in a single commit so easy to reverse if need be.

Ok; sorry to leave you hanging for a review on this.

@berquist berquist added this to the v1.8.2 milestone Dec 18, 2024
@berquist (Member) left a comment

Here is a partial review.

So far I agree with all of your assumptions that I didn't comment on.

# Taken from pyscf.tdscf.rhf.get_nto()
#
# Would appreciate someone checking this makes sense?
x *= 1.0 / np.linalg.norm(x)

Member

IIRC I ripped this from pyscf many years ago, but it indicates that for CIS/TDA you're correct, but if there are deexcitation vectors you need to normalize them together:

    from typing import Tuple

    import numpy as np

    def norm_xy(z: np.ndarray, nocc: int, nvirt: int) -> Tuple[np.ndarray, np.ndarray]:
        # Split the stacked vector into excitation (x) and de-excitation (y) parts.
        x, y = z.reshape(2, nvirt, nocc)
        norm = 2 * (np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2)
        norm = 1 / np.sqrt(norm)
        return (x * norm).flatten(), (y * norm).flatten()

Contributor Author

Ah, thanks. TD was originally missed because the DVB calculation wouldn't converge for whatever reason. I've now added a new test for TD with water to test the normalisation with both x and y. The implementation is now:

import numpy as np

def norm_xy(x, y):
    # Scale x and y by a common factor so that |x|^2 - |y|^2 == 1.
    norm_factor = 1.0 / np.sqrt(
        np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2
    )
    return (x * norm_factor, y * norm_factor)

This logic is beyond me really, but the 2 * multiplication factor in your function was giving the wrong results?

Member

This is fine for now but I'm going to try and understand this.
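To make the difference between the two versions in this thread concrete, the sketch below applies both normalisations to the same made-up amplitudes. It only demonstrates the arithmetic: without the factor of 2 the result satisfies |x|² − |y|² = 1, with it the result satisfies 2(|x|² − |y|²) = 1, so the two differ by a constant factor of √2. Which convention is appropriate is not settled here. The name `norm_xy_half` and the input arrays are hypothetical.

```python
import numpy as np


def norm_xy(x, y):
    """Scale x and y together so that |x|^2 - |y|^2 == 1 (version without the 2)."""
    norm_factor = 1.0 / np.sqrt(np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2)
    return x * norm_factor, y * norm_factor


def norm_xy_half(x, y):
    """Variant with the extra factor of 2, so that 2 * (|x|^2 - |y|^2) == 1."""
    norm_factor = 1.0 / np.sqrt(
        2.0 * (np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2)
    )
    return x * norm_factor, y * norm_factor


# Arbitrary made-up amplitudes (2 virtual x 3 occupied); not real PySCF output.
x = np.array([[0.70, 0.10, 0.05], [0.02, 0.01, 0.00]])
y = np.array([[0.05, 0.01, 0.00], [0.00, 0.00, 0.01]])

x1, y1 = norm_xy(x, y)
x2, y2 = norm_xy_half(x, y)

print(np.linalg.norm(x1) ** 2 - np.linalg.norm(y1) ** 2)
print(np.linalg.norm(x2) ** 2 - np.linalg.norm(y2) ** 2)
print(np.allclose(x1, np.sqrt(2.0) * x2))
```

Any quantity computed from squared coefficients would therefore differ by a factor of 2 between the two conventions, which may be related to the wrong results mentioned above.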

test/testdata Outdated
Basis Psi4 GenericBasisTest basicPsi4-1.7 dvb_sp_rks.out
Basis QChem GenericBasisTest basicQChem5.1 dvb_sp.out
Basis QChem GenericBasisTest basicQChem5.4 dvb_sp.out
Basis PySCF GenericBasisTest basicPySCF2.6 dvb_sp.py

Member

Can we put this in alphabetical order for the second column?

(Also, quick poll...should I convert this to JSON/TOML/YAML/... like I did for regressionfiles.yaml?)
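A minimal sketch of the requested ordering, assuming each testdata row is whitespace-separated and that the comparison should be case-insensitive (both are assumptions, and `sort_rows_by_program` is a hypothetical helper, not part of cclib):

```python
def sort_rows_by_program(rows):
    """Return testdata rows ordered by their second whitespace-separated column.

    The comparison is case-insensitive (an assumption). sorted() is stable, so
    rows for the same program keep their original relative order.
    """
    return sorted(rows, key=lambda row: row.split()[1].lower())


rows = [
    "Basis Psi4 GenericBasisTest basicPsi4-1.7 dvb_sp_rks.out",
    "Basis QChem GenericBasisTest basicQChem5.1 dvb_sp.out",
    "Basis QChem GenericBasisTest basicQChem5.4 dvb_sp.out",
    "Basis PySCF GenericBasisTest basicPySCF2.6 dvb_sp.py",
]

for row in sort_rows_by_program(rows):
    print(row)
```

With these four rows, the PySCF entry moves between Psi4 and QChem.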

Contributor Author

Yep no problem.

Personally I'm a YAML fanboy so my default position is yes, but in this case the table format is actually pretty useful and it would be a shame to lose that. Maybe an argument for the condensed list form?
E.g.:

- [Basis, Psi4,  GenericBasisTest, basicPsi4-1.7, dvb_sp_rks.out]
- [Basis, QChem, GenericBasisTest, basicQChem5.1, dvb_sp.out]
...

Or maybe could nest? Something like:

Basis:
  Psi4:
    - [GenericBasisTest, basicPsi4-1.7, dvb_sp_rks.out]
  QChem:
    - [GenericBasisTest, basicQChem5.1, dvb_sp.out]
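The nested layout could be generated from the existing flat rows with a short script. The sketch below builds the nested structure as plain dictionaries and lists, assuming whitespace-separated rows whose first two columns are the test module and the program; serialising it to YAML would need a third-party library and is left out. The helper name `nest_rows` is hypothetical.

```python
from collections import defaultdict


def nest_rows(rows):
    """Group flat testdata rows as {module: {program: [[remaining columns], ...]}}."""
    nested = defaultdict(lambda: defaultdict(list))
    for row in rows:
        if not row.strip():
            continue  # skip blank lines
        module, program, *rest = row.split()
        nested[module][program].append(rest)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {module: dict(programs) for module, programs in nested.items()}


rows = [
    "Basis Psi4 GenericBasisTest basicPsi4-1.7 dvb_sp_rks.out",
    "Basis QChem GenericBasisTest basicQChem5.1 dvb_sp.out",
    "Basis QChem GenericBasisTest basicQChem5.4 dvb_sp.out",
]

print(nest_rows(rows))
```

One consequence of nesting is that the fixed-width table alignment is lost, which is the trade-off mentioned above.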



# The Gaussian log files for this test are a normal restricted calculation,
# is this class misnamed?

Member

Wow, in the 11 years I've been working on this code, I've never noticed that this Gaussian calculation is wrong.

I can't tell from the control or ricc2.out, but it looks like the Turbomole calculation is unrestricted but a paired singlet, so an almost identical problem to the Gaussian calculation.

I could have sworn that I added an unrestricted TDDFT Q-Chem calculation in unit or regression tests, but apparently not.

#1533

Contributor Author

Yeah, I'll be honest, this did confuse me for a sec; funny how things can slip through. I wonder if the intention was originally to do a delta-SCF type calculation (the original file name in G03 was apparently deltasym.log) and maybe this was the ground state? Just speculation.

That Turbomole one was probably me, I'd guess, so whoops there too. Will address in another PR.

filenames = logfile.filename
# For 'normal' log files we use ccopen to parse.
# For pseudo parsers (like PySCF) we use a different mechanism.
if "PySCF" in str(first):

Member

Horrifying. This is another part IMO of the pain from having "parsing" being a separate path from IO and now bridges.

Contributor Author

Yeah, a bit of a mess right now. This 'bridge' does feel quite a lot like a parser, but then again we might want an actual PySCF parser (from the PySCF log files) in the future, in which case this one would need another name or something.

pyproject.toml Outdated
Comment on lines 66 to 69
"pyscf",
"pyscf-properties @ git+https://github.com/pyscf/properties",
"pyberny",
"geometric"

Member

I think, but am not sure, that these don't need to be here if stuff is reworked. We keep them in bridges, so they're only specified once, and then you'd do pip install cclib[dev]. The (naming) problem is that test doesn't cover everything needed to run the tests, only the testing tools themselves, so needing dev instead is confusing.

If we rename test to test-infrastructure (better name welcome), then add

test = ["cclib[bridges,docs,test-infrastructure]"]

it will be more consistent.

Contributor Author

Yeah that makes sense, and have now updated.

As an unfortunate side effect, the docs build then broke because of #1523, which I don't think is something we've considered before. Considering obabel really isn't a dependency in pip terms at the moment, and the wheel build doesn't seem to be functioning, I've removed it from the pyproject for now. That doesn't feel great, so I'd be happy to explore better alternatives...

Member

This is driving me crazy. I would like to push njzjz/openbabel-wheel#6 and openbabel/openbabel#2408 as a solution but not sure I have the energy.

@oliver-s-lee oliver-s-lee force-pushed the pyscf branch 5 times, most recently from a2d37f2 to 6ef862e Compare February 25, 2025 11:40
@berquist (Member) commented Mar 9, 2025

This is good for now, but I am currently rebasing to purge all the pre-commit autofix commits.

@berquist (Member) left a comment

This looks great. Thanks for all your hard work on it.

@berquist berquist merged commit 7b93367 into cclib:master Apr 12, 2025
29 checks passed
@berquist berquist mentioned this pull request Jun 2, 2025