|
| 1 | +Examples |
| 2 | +-------- |
| 3 | + |
| 4 | +Generating representations using the ``Compound`` class |
| 5 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 6 | + |
| 7 | +The following example demonstrates how to generate a representation via |
| 8 | +the ``qml.Compound`` class. |
| 9 | + |
| 10 | +.. code:: python |
| 11 | +
|
| 12 | + # Read in an xyz or cif file. |
| 13 | + water = Compound(xyz="water.xyz") |
| 14 | +
|
| 15 | + # Generate a molecular Coulomb matrix sorted by row norm. |
| 16 | + water.generate_coulomb_matrix(size=5, sorting="row-norm") |
| 17 | +
|
| 18 | + print(water.representation) |
| 19 | +
|
| 20 | +This might print the following representation: |
| 21 | + |
| 22 | +.. code:: |
| 23 | + |
| 24 | + [ 73.51669472 8.3593106 0.5 8.35237809 0.66066557 0.5 |
| 25 | + 0. 0. 0. 0. 0. 0. 0. |
| 26 | + 0. 0. ] |
| 27 | +
|
| 28 | +Generating representations via the ``qml.representations`` module |
| 29 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 30 | + |
| 31 | +.. code:: python |
| 32 | +
|
| 33 | + import numpy as np |
| 34 | + from qml.representations import * |
| 35 | +
|
| 36 | + # Dummy coordinates for a water molecule |
| 37 | + coordinates = np.array([[1.464, 0.707, 1.056], |
| 38 | + [0.878, 1.218, 0.498], |
| 39 | + [2.319, 1.126, 0.952]]) |
| 40 | +
|
| 41 | + # Oxygen, Hydrogen, Hydrogen |
| 42 | + nuclear_charges = np.array([8, 1, 1]) |
| 43 | +
|
| 44 | + # Generate a molecular Coulomb matrix sorted by row norm. |
| 45 | + cm1 = generate_coulomb_matrix(nuclear_charges, coordinates, |
| 46 | + size=5, sorting="row-norm") |
| 47 | + print(cm1) |
| 48 | +
|
| 49 | +
|
| 50 | +The resulting Coulomb-matrix for water: |
| 51 | + |
| 52 | +.. code:: |
| 53 | + |
| 54 | + [ 73.51669472 8.3593106 0.5 8.35237809 0.66066557 0.5 |
| 55 | + 0. 0. 0. 0. 0. 0. 0. |
| 56 | + 0. 0. ] |
| 57 | +
|
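As a rough illustration of what the numbers above contain, the following pure-Python sketch rebuilds the sorted molecular Coulomb matrix from its usual definition: diagonal elements ``0.5 * Z**2.4``, off-diagonal elements ``Z_i * Z_j / |R_i - R_j|``, rows sorted by descending norm, and the lower triangle flattened and zero-padded to ``size``. The name ``coulomb_matrix_sketch`` is hypothetical and not part of QML; the ordering and padding are assumptions inferred from the printed output, not taken from the QML source.

```python
import math


def coulomb_matrix_sketch(nuclear_charges, coordinates, size):
    """Hypothetical helper: flattened, row-norm-sorted Coulomb matrix."""
    n = len(nuclear_charges)

    # Full symmetric matrix, zero-padded up to (size, size).
    full = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            if i == j:
                full[i][j] = 0.5 * nuclear_charges[i] ** 2.4
            else:
                r = math.dist(coordinates[i], coordinates[j])
                full[i][j] = nuclear_charges[i] * nuclear_charges[j] / r

    # Sort atoms by descending row norm (sorted() is stable, so the
    # all-zero padding rows stay at the end).
    norms = [math.sqrt(sum(v * v for v in row)) for row in full]
    order = sorted(range(size), key=lambda k: -norms[k])

    # Flatten the lower triangle row by row.
    return [full[order[i]][order[j]]
            for i in range(size) for j in range(i + 1)]


# Same dummy water molecule as in the example above.
coordinates = [[1.464, 0.707, 1.056],
               [0.878, 1.218, 0.498],
               [2.319, 1.126, 0.952]]
nuclear_charges = [8, 1, 1]

cm = coulomb_matrix_sketch(nuclear_charges, coordinates, size=5)
print(cm)
```

With ``size=5`` the flattened lower triangle has ``5 * 6 / 2 = 15`` elements, which matches the length of the printed representation.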
| 58 | +
|
| 59 | +
|
| 60 | +.. code:: python |
| 61 | +
|
| 62 | + # Generate all atomic Coulomb matrices sorted by distance to |
| 63 | + # query atom. |
| 64 | + cm2 = generate_atomic_coulomb_matrix(nuclear_charges, coordinates, |
| 65 | + size=5, sorting="distance") |
| 66 | + print(cm2) |
| 67 | +
|
| 68 | +.. code:: |
| 69 | +
|
| 70 | + [[ 73.51669472 8.3593106 0.5 8.35237809 0.66066557 0.5 |
| 71 | + 0. 0. 0. 0. 0. 0. |
| 72 | + 0. 0. 0. ] |
| 73 | + [ 0.5 8.3593106 73.51669472 0.66066557 8.35237809 0.5 |
| 74 | + 0. 0. 0. 0. 0. 0. |
| 75 | + 0. 0. 0. ] |
| 76 | + [ 0.5 8.35237809 73.51669472 0.66066557 8.3593106 0.5 |
| 77 | + 0. 0. 0. 0. 0. 0. |
| 78 | + 0. 0. 0. ]] |
| 79 | +
|
| 80 | +
|
| 81 | +Calculating a Gaussian kernel |
| 82 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 83 | + |
| 84 | +The input for most of the kernels in QML is a numpy array, where the first dimension is the number of representations, and the second dimension is the size of each representation. A brief example is presented here, where ``compounds`` is a list of ``Compound()`` objects: |
| 85 | + |
| 86 | +.. code:: python |
| 87 | + |
| 88 | + import numpy as np |
| 89 | + from qml.kernels import gaussian_kernel |
| 90 | +
|
| 91 | + # Generate a numpy-array of the representation |
| 92 | + X = np.array([c.representation for c in compounds]) |
| 93 | +
|
| 94 | + # Kernel-width |
| 95 | + sigma = 100.0 |
| 96 | +
|
| 97 | + # Calculate the kernel-matrix |
| 98 | + K = gaussian_kernel(X, X, sigma) |
| 99 | +
|
| 100 | +
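To make the meaning of the kernel matrix concrete, here is a minimal pure-Python sketch, assuming the conventional Gaussian form ``K[i][j] = exp(-|x_i - x_j|^2 / (2 * sigma^2))``. The function ``gaussian_kernel_sketch`` is a hypothetical stand-in for illustration only; it is not the QML implementation and makes no attempt to match its performance.

```python
import math


def gaussian_kernel_sketch(A, B, sigma):
    """Hypothetical helper: Gaussian kernel between two lists of vectors."""
    inv = -1.0 / (2.0 * sigma ** 2)
    return [[math.exp(inv * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in B]
            for x in A]


# Two toy "representations" at Euclidean distance 5 from each other.
X_toy = [[0.0, 0.0], [3.0, 4.0]]

K_toy = gaussian_kernel_sketch(X_toy, X_toy, sigma=5.0)
print(K_toy)
```

The diagonal elements are 1 because each representation has zero distance to itself, and the matrix is symmetric when both inputs are the same.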
|
| 101 | +Calculating a Gaussian kernel using a local representation |
| 102 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 103 | + |
| 104 | +The easiest way to calculate the kernel matrix using an explicit, local representation is via the wrappers module. Note that ``sigmas`` is a list of kernel widths, and that the result contains one kernel matrix per sigma. The following example currently works with the atomic Coulomb matrix representation and the local SLATM representation: |
| 105 | + |
| 106 | +.. code:: python |
| 107 | +
|
| 108 | + import numpy as np |
| 109 | + from qml.kernels import get_local_kernels_gaussian |
| 110 | +
|
| 111 | + # Assume the QM7 dataset is loaded into a list of Compound() |
| 112 | + for compound in qm7: |
| 113 | +
|
| 114 | + # Generate the desired representation for each compound |
| 115 | + compound.generate_atomic_coulomb_matrix(size=23, sorting="row-norm") |
| 116 | +
|
| 117 | + # Make a big array with all the atomic representations |
| 118 | + X = np.concatenate([mol.representation for mol in qm7]) |
| 119 | +
|
| 120 | + # Make an array with the number of atoms in each compound |
| 121 | + N = np.array([mol.natoms for mol in qm7]) |
| 122 | +
|
| 123 | + # List of kernel-widths |
| 124 | + sigmas = [50.0, 100.0, 200.0] |
| 125 | +
|
| 126 | + # Calculate the kernel-matrix |
| 127 | + K = get_local_kernels_gaussian(X, X, N, N, sigmas) |
| 128 | +
|
| 129 | + print(K.shape) |
| 130 | +
|
| 131 | +.. code:: |
| 132 | +
|
| 133 | + (3, 7101, 7101) |
| 134 | +
|
| 135 | +Note that ``mol.representation`` is just a 1D numpy array. |
| 136 | + |
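The role of the atom counts ``N`` can be sketched as follows, assuming the local kernel between two molecules is the sum of Gaussian kernels over all pairs of their atomic representations. The function ``local_gaussian_kernels_sketch`` is hypothetical and written in plain Python for clarity; it is not the QML implementation.

```python
import math


def local_gaussian_kernels_sketch(X1, X2, N1, N2, sigmas):
    """Hypothetical helper: one local kernel matrix per sigma."""

    def split(X, N):
        # Cut the concatenated atomic representations back into molecules.
        blocks, start = [], 0
        for n in N:
            blocks.append(X[start:start + n])
            start += n
        return blocks

    mols1, mols2 = split(X1, N1), split(X2, N2)

    kernels = []
    for sigma in sigmas:
        inv = -1.0 / (2.0 * sigma ** 2)
        # Sum the atomic Gaussian kernels over all atom pairs.
        K = [[sum(math.exp(inv * sum((p - q) ** 2 for p, q in zip(a, b)))
                  for a in m1 for b in m2)
              for m2 in mols2]
             for m1 in mols1]
        kernels.append(K)
    return kernels


# Toy data: three atomic representations belonging to two "molecules"
# with one and two atoms, respectively.
X_atoms = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
N_atoms = [1, 2]
toy_sigmas = [1.0, 2.0]

K_local = local_gaussian_kernels_sketch(X_atoms, X_atoms,
                                        N_atoms, N_atoms, toy_sigmas)
print(len(K_local), len(K_local[0]), len(K_local[0][0]))
```

The result has one matrix per sigma, and each matrix has one row and one column per molecule rather than per atom, which is the same layout as the ``(3, 7101, 7101)`` shape printed above.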
| 137 | + |
| 138 | +Generating the SLATM representation |
| 139 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 140 | + |
| 141 | +The Spectrum of London and Axilrod-Teller-Muto potential (SLATM) representation requires additional input to reduce the size of the representation. |
| 142 | +This input (the types of many-body terms) is generated via the ``get_slatm_mbtypes()`` function. The function takes a list of the nuclear charges for each molecule in the dataset as input. E.g.: |
| 143 | + |
| 144 | + |
| 145 | +.. code:: python |
| 146 | +
|
| 147 | + from qml.representations import get_slatm_mbtypes |
| 148 | +
|
| 149 | + # Assume 'qm7' is a list of Compound() objects. |
| 150 | + mbtypes = get_slatm_mbtypes([mol.nuclear_charges for mol in qm7]) |
| 151 | +
|
| 152 | + # Assume the QM7 dataset is loaded into a list of Compound() |
| 153 | + for compound in qm7: |
| 154 | +
|
| 155 | + # Generate the desired representation for each compound |
| 156 | + compound.generate_slatm(mbtypes, local=True) |
| 157 | +
|
| 158 | +The ``local`` keyword in this example specifies that a local representation is produced. Alternatively, the SLATM representation can be generated via the ``qml.representations`` module: |
| 159 | + |
| 160 | +.. code:: python |
| 161 | +
|
| 162 | + from qml.representations import generate_slatm, get_slatm_mbtypes |
| 163 | +
|
| 164 | + # Dummy coordinates |
| 165 | + coordinates = ... |
| 166 | +
|
| 167 | + # Dummy nuclear charges |
| 168 | + nuclear_charges = ... |
| 169 | +
|
| 170 | + # Dummy mbtypes |
| 171 | + mbtypes = get_slatm_mbtypes( ... ) |
| 172 | +
|
| 173 | + # Generate one representation |
| 174 | + rep = generate_slatm(coordinates, nuclear_charges, mbtypes) |
| 175 | +
|
| 176 | +Here ``coordinates`` is an Nx3 numpy array, and ``nuclear_charges`` is simply a list of charges. |
| 177 | + |
| 178 | +Generating the FCHL representation |
| 179 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 180 | +The FCHL representation does not have an explicit representation in the form of a vector, and the kernel elements must be calculated analytically in a separate kernel function. |
| 181 | +The syntax is analogous to the explicit representations (e.g. Coulomb matrix, BoB, SLATM, etc), but is handled by kernels from the separate ``qml.fchl`` module. |
| 182 | + |
| 183 | +The code below shows three ways to create the input representations for the FCHL kernel functions. |
| 184 | + |
| 185 | +First using the ``Compound`` class: |
| 186 | + |
| 187 | +.. code:: python |
| 188 | +
|
| 189 | + # Assume the dataset is loaded into a list of Compound() |
| 190 | + for compound in mols: |
| 191 | +
|
| 192 | + # Generate the desired representation for each compound, cut off in angstrom |
| 193 | + compound.generate_fchl_representation(size=23, cut_off=10.0) |
| 194 | + |
| 195 | + # Make a numpy array of the representations, which can be passed to the kernel |
| 196 | + X = np.array([c.representation for c in mols]) |
| 197 | + |
| 198 | +
|
| 199 | +The dimensions of the array should be ``(number_molecules, size, 5, size)``, where ``size`` is the |
| 200 | +size keyword used when generating the representations. |
| 201 | + |
| 202 | +In addition to using the ``Compound`` class, FCHL representations can also be generated via the ``qml.fchl.generate_representation()`` function, using notation similar to that of the functions in the ``qml.representations`` module. |
| 203 | + |
| 204 | + |
| 205 | +.. code:: python |
| 206 | +
|
| 207 | + from qml.fchl import generate_representation |
| 208 | +
|
| 209 | + # Dummy coordinates for a water molecule |
| 210 | + coordinates = np.array([[1.464, 0.707, 1.056], |
| 211 | + [0.878, 1.218, 0.498], |
| 212 | + [2.319, 1.126, 0.952]]) |
| 213 | +
|
| 214 | + # Oxygen, Hydrogen, Hydrogen |
| 215 | + nuclear_charges = np.array([8, 1, 1]) |
| 216 | +
|
| 217 | + rep = generate_representation(coordinates, nuclear_charges) |
| 218 | +
|
| 219 | +To create the representation for a crystal, the notation is as follows: |
| 220 | + |
| 221 | + |
| 222 | +.. code:: python |
| 223 | +
|
| 224 | + from qml.fchl import generate_representation |
| 225 | +
|
| 226 | + # Dummy fractional coordinates |
| 227 | + fractional_coordinates = np.array( |
| 228 | + [[ 0. , 0. , 0. ], |
| 229 | + [ 0.75000042, 0.50000027, 0.25000015], |
| 230 | + [ 0.15115386, 0.81961403, 0.33154037], |
| 231 | + [ 0.51192691, 0.18038651, 0.3315404 ], |
| 232 | + [ 0.08154025, 0.31961376, 0.40115401], |
| 233 | + [ 0.66846017, 0.81961403, 0.48807366], |
| 234 | + [ 0.08154025, 0.68038678, 0.76192703], |
| 235 | + [ 0.66846021, 0.18038651, 0.84884672], |
| 236 | + [ 0.23807355, 0.31961376, 0.91846033], |
| 237 | + [ 0.59884657, 0.68038678, 0.91846033], |
| 238 | + [ 0.50000031, 0. , 0.50000031], |
| 239 | + [ 0.25000015, 0.50000027, 0.75000042]] |
| 240 | + ) |
| 241 | +
|
| 242 | + # Dummy nuclear charges |
| 243 | + nuclear_charges = np.array( |
| 244 | + [58, 58, 8, 8, 8, 8, 8, 8, 8, 8, 23, 23] |
| 245 | + ) |
| 246 | +
|
| 247 | + # Dummy unit cell |
| 248 | + unit_cell = np.array( |
| 249 | + [[ 3.699168, 3.699168, -3.255938], |
| 250 | + [ 3.699168, -3.699168, 3.255938], |
| 251 | + [-3.699168, -3.699168, -3.255938]] |
| 252 | + ) |
| 253 | +
|
| 254 | + # Generate the representation |
| 255 | + rep = generate_representation(fractional_coordinates, nuclear_charges, |
| 256 | + cell=unit_cell, neighbors=100, cut_distance=7.0) |
| 257 | +
|
| 258 | +
|
| 259 | +The ``neighbors`` keyword is the maximum number of atoms within the cut-off distance. |
| 260 | + |
| 261 | +Generating the FCHL kernel |
| 262 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 263 | + |
| 264 | +The following example demonstrates how to calculate the local FCHL kernel elements between FCHL representations. ``X1`` and ``X2`` are numpy arrays with the shape ``(number_compounds, max_size, 5, neighbors)``, as generated in one of the previous examples. The cut-off distance used to generate the representations MUST be the same as, or larger than, the cut-off distance used to calculate the kernel. |
| 265 | + |
| 266 | + |
| 267 | +.. code:: python |
| 268 | +
|
| 269 | + from qml.fchl import get_local_kernels |
| 270 | +
|
| 271 | + # You can get kernels for multiple kernel-widths |
| 272 | + sigmas = [2.5, 5.0, 10.0] |
| 273 | +
|
| 274 | + # Calculate the kernel-matrices for each sigma |
| 275 | + K = get_local_kernels(X1, X2, sigmas, cut_distance=10.0) |
| 276 | +
|
| 277 | + print(K.shape) |
| 278 | +
|
| 279 | +
|
| 280 | +As output you will get a kernel for each kernel-width. |
| 281 | + |
| 282 | +.. code:: |
| 283 | +
|
| 284 | + (3, 100, 200) |
| 285 | +
|
| 286 | +
|
| 287 | +If ``X1`` and ``X2`` are identical, ``K`` will be symmetric. This case is handled by a separate function which exploits the symmetry (thus being twice as fast). |
| 288 | + |
| 289 | +.. code:: python |
| 290 | + |
| 291 | + from qml.fchl import get_local_symmetric_kernels |
| 292 | +
|
| 293 | + # You can get kernels for multiple kernel-widths |
| 294 | + sigmas = [2.5, 5.0, 10.0] |
| 295 | +
|
| 296 | + # Calculate the kernel-matrices for each sigma |
| 297 | + K = get_local_symmetric_kernels(X1, sigmas, cut_distance=10.0) |
| 298 | +
|
| 299 | + print(K.shape) |
| 300 | +
|
| 301 | +
|
| 302 | +.. code:: |
| 303 | +
|
| 304 | + (3, 100, 100) |
| 305 | +
|
| 306 | +In addition to the local kernel, the FCHL module also provides kernels for atomic properties (e.g. chemical shifts, partial charges, etc). These have the name "atomic", rather than "local". |
| 307 | + |
| 308 | +.. code:: python |
| 309 | +
|
| 310 | + from qml.fchl import get_atomic_kernels |
| 311 | + from qml.fchl import get_atomic_symmetric_kernels |
| 312 | +
|
| 313 | +The only difference between the local and atomic kernels is the shape of the input. |
| 314 | +Since the atomic kernel outputs kernels with atomic resolution, the atomic input has the shape ``(number_atoms, 5, size)``. |