PySCF to cclib bridge support #1481
You can get rid of the F821 warnings by prefixing the variable with |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

    @@            Coverage Diff             @@
    ##           master    #1481      +/-   ##
    ==========================================
    + Coverage   81.46%   81.60%   +0.14%
    ==========================================
      Files          73       73
      Lines       14828    15019     +191
    ==========================================
    + Hits        12079    12256     +177
    - Misses       2749     2763      +14
At the moment this PR parses energies into hartree, but that change is (or at least should be) in a single commit, so it is easy to reverse if need be.
Ok; sorry to leave you hanging for a review on this.
Here is a partial review.
So far I agree with all of your assumptions that I didn't comment on.
cclib/bridge/cclib2pyscf.py
Outdated
    # Taken from pyscf.tdscf.rhf.get_nto()
    #
    # Would appreciate someone checking this makes sense?
    x *= 1.0 / np.linalg.norm(x)
IIRC I ripped this from pyscf many years ago, but it indicates that for CIS/TDA you're correct, but if there are deexcitation vectors you need to normalize them together:
    from typing import Tuple

    import numpy as np

    def norm_xy(z: np.ndarray, nocc: int, nvirt: int) -> Tuple[np.ndarray, np.ndarray]:
        x, y = z.reshape(2, nvirt, nocc)
        norm = 2 * (np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2)
        norm = 1 / np.sqrt(norm)
        return (x * norm).flatten(), (y * norm).flatten()
Ah thanks. TD was originally missed because the DVB calculation wouldn't converge for whatever reason; I've now added a new test for TD with water to test the normalisation with both x and y. The implementation is now:

    def norm_xy(x, y):
        norm_factor = 1.0 / np.sqrt(
            np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2
        )
        return (x * norm_factor, y * norm_factor)

This logic is beyond me really, but the 2 * multiplication factor in your function was giving the wrong results?
This is fine for now but I'm going to try and understand this.
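To make the factor-of-2 question in this thread concrete, here is a minimal sketch (not the code in this PR) that scales x and y by one shared factor so that |x|^2 - |y|^2 equals a chosen value. The function name norm_xy_to and its target parameter are hypothetical, introduced only for illustration. With target=1.0 it matches the two-argument norm_xy above; with target=0.5 it matches the variant containing the 2 * factor. Algebraically, then, the two functions differ only in normalisation convention, and the resulting amplitudes differ by sqrt(2). Which convention is appropriate for a given method is not settled here.

```python
import numpy as np


def norm_xy_to(x, y, target=1.0):
    """Scale x and y together so that |x|^2 - |y|^2 == target.

    `target` is a hypothetical parameter added for this illustration.
    """
    diff = np.linalg.norm(x) ** 2 - np.linalg.norm(y) ** 2
    if diff <= 0:
        raise ValueError("|x|^2 - |y|^2 must be positive")
    factor = np.sqrt(target / diff)
    return x * factor, y * factor


def excess(a, b):
    """Return |a|^2 - |b|^2 (Frobenius norms)."""
    return np.linalg.norm(a) ** 2 - np.linalg.norm(b) ** 2


# Toy amplitudes (made-up numbers, not taken from any calculation).
x = np.arange(1.0, 13.0).reshape(4, 3)
y = 0.01 * x

x_one, y_one = norm_xy_to(x, y, target=1.0)    # same as norm_xy(x, y) above
x_half, y_half = norm_xy_to(x, y, target=0.5)  # same as the "2 *" variant
```

Passing the target explicitly keeps the convention visible at the call site, which may help when comparing against reference values.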
test/testdata
Outdated
    Basis Psi4 GenericBasisTest basicPsi4-1.7 dvb_sp_rks.out
    Basis QChem GenericBasisTest basicQChem5.1 dvb_sp.out
    Basis QChem GenericBasisTest basicQChem5.4 dvb_sp.out
    Basis PySCF GenericBasisTest basicPySCF2.6 dvb_sp.py
Can we put this in alphabetical order for the second column?
(Also, quick poll...should I convert this to JSON/TOML/YAML/... like I did for regressionfiles.yaml?)
Yep no problem.
Personally I'm a YAML fanboy so my default position is yes, but in this case the table format is actually pretty useful and it would be a shame to lose that. Maybe an argument for the condensed list form?
E.g.

    - [Basis, Psi4, GenericBasisTest, basicPsi4-1.7, dvb_sp_rks.out]
    - [Basis, QChem, GenericBasisTest, basicQChem5.1, dvb_sp.out]
    ...

Or maybe could nest? Something like:

    Basis:
      Psi4:
        - [GenericBasisTest, basicPsi4-1.7, dvb_sp_rks.out]
      QChem:
        - [GenericBasisTest, basicQChem5.1, dvb_sp.out]
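As a rough illustration of the nested form, the sketch below groups whitespace-separated rows like the ones quoted from test/testdata by their first two columns and orders the second column alphabetically. It is only a sketch: the helper name nest_rows is hypothetical, and the column handling (first column, second column, then everything else) is assumed from the rows shown in this thread rather than taken from cclib's actual loader.

```python
from collections import defaultdict


def nest_rows(lines):
    """Group rows by first column, then by second column (hypothetical helper)."""
    nested = defaultdict(lambda: defaultdict(list))
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        module, parser, *rest = fields
        nested[module][parser].append(rest)
    # Convert to plain dicts with the second-level keys in alphabetical order.
    return {
        module: {parser: parsers[parser] for parser in sorted(parsers)}
        for module, parsers in nested.items()
    }


rows = [
    "Basis QChem GenericBasisTest basicQChem5.1 dvb_sp.out",
    "Basis Psi4 GenericBasisTest basicPsi4-1.7 dvb_sp_rks.out",
    "Basis PySCF GenericBasisTest basicPySCF2.6 dvb_sp.py",
]
nested = nest_rows(rows)
```

The resulting mapping has the same shape as the nested YAML above, so it could be dumped with any YAML writer if the conversion goes ahead.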
    # The Gaussian log files for this test are a normal restricted calculation,
    # is this class misnamed?
Wow, in the 11 years I've been working on this code, I've never noticed that this Gaussian calculation is wrong.
I can't tell from the control or ricc2.out, but it looks like the Turbomole calculation is unrestricted but a paired singlet, so an almost identical problem to the Gaussian calculation.
I could have sworn that I added an unrestricted TDDFT Q-Chem calculation in unit or regression tests, but apparently not.
Yeah I'll be honest this did confuse me for a sec, funny how things can slip through. I wonder if the intention was originally to do a delta-SCF type calculation (the original file name in G03 was apparently deltasym.log) and maybe this was the ground state? Just speculation.
That Turbomole one was probably me, I'd guess, so whoops there too. Will address in another PR.
    filenames = logfile.filename
    # For 'normal' log files we use ccopen to parse.
    # For pseudo parsers (like PySCF) we use a different mechanism.
    if "PySCF" in str(first):
Horrifying. This is another part IMO of the pain from having "parsing" being a separate path from IO and now bridges.
Yeah, a bit of a mess right now. This 'bridge' does feel quite a lot like a parser, but then again we might want an actual PySCF parser (from the PySCF log files) in the future, in which case this guy would need another name or something.
pyproject.toml
Outdated
    "pyscf",
    "pyscf-properties @ git+https://github.com/pyscf/properties",
    "pyberny",
    "geometric"
I think, but am not sure, that these don't need to be here if stuff is reworked. We keep them in bridges, so they're only specified once, and then you'd do pip install cclib[dev]. The (naming) problem is that test doesn't contain everything needed for testing, only the things needed solely for testing, so needing dev instead is confusing.
If we rename test to test-infrastructure (better name welcome), then add

    test = ["cclib[bridges,docs,test-infrastructure]"]

it will be more consistent.
Yeah that makes sense, and have now updated.
As an unfortunate side effect the docs build then broke because of #1523, which is not something we've considered before, I don't think. Considering obabel really isn't a dependency in pip terms atm, and the wheel build doesn't seem to be functioning, I've removed it from the pyproject for now. Doesn't feel great, so would be happy to explore better alternatives...
This is driving me crazy. I would like to push njzjz/openbabel-wheel#6 and openbabel/openbabel#2408 as a solution but not sure I have the energy.
This is good for now but am currently rebasing to purge all the pre-commit autofix commits.
Co-authored-by: Eric Berquist <[email protected]>
This looks great. Thanks for all your hard work on it.
Ok, this is now working to the point where it's ready for at least an initial review.
The main workhorse is cclib.bridge.cclib2pyscf.cclibfrommethods(), which accepts a number of PySCF method objects (SCF, MP, excited states etc.) and returns a ccData object. cclib.bridge.cclib2pyscf.makecclib() is a slightly nicer wrapper on top that does a good job of guessing what type of PySCF method you pass to it.
Supported so far are:
Not currently supported but should be available with only a bit more work
There are also tests for all of the supported properties (unless I've forgotten of course). The tests use the normal parser tests rather than the ones in bridge. To get this to work, I've slightly modified conftest.py so testdata can specify python files which get imported, run, and the results cached as necessary. Most of the PySCF calcs are extremely quick, but the frequencies on DVB can take a few minutes. Not sure if we want to look at saving the raw PySCF data rather than regenerating each time...