3/5: Optimal-superposition-via-quaternions
Optimal superposition of one set of atomic coordinates onto another is best achieved using quaternion algebra. Quaternions have the advantage of rapid computation and immunity from 'gimbal lock', a phenomemon affecting rotation matrices when axes that should be independent happen to become aligned. I first encountered quaternions in the mid-1990s while writing a Fortran program to align protein-crystal structures (program ProtRot) using an elegant implementation by Kearsley (1989). The same approach is used here in the program over-rot.py.
Over-rot.py finds all files in the working directory with extension .xyz and obtains the optimal translation and rotation to superimpose the molecules. In our case, these are metconazole-A.xyz and metconazole-B.xyz, but the method is not limited to just two structures. In later examples we'll see four, eight, and seventeen components. For any superposition, over-rot.py takes the first .xyz file it finds as the 'reference' and then calculates alignment parameters (translation and rotation) for all the other .xyz files present.
Let's start with a recap of what these .xyz files look like:
22 metconazole A Cl1A 11.975713 6.839304 6.630064 O1A 7.326818 0.526737 10.347148 . . lines omitted . C16A 11.490834 5.322523 8.825329 C17A 10.737900 4.418826 9.558406
22 metconazole B Cl1B 15.834993 2.554754 4.061488 O1B 16.832833 8.828427 9.962936 . . lines omitted . C16B 14.925302 3.922900 6.206208 C17B 15.071688 4.842193 7.238759
The snippets above show that coordinates of the two molecules differ substantially - corresponding atoms are not aligned. To perform a meaningful superposition, over-rot.py needs to know which atoms (or subset of atoms) to use for the fit. There are three ways to provide this information:
- 'all' to use all atoms
- a positive integer (e.g., n = 12) to use first n atoms
- an atom list (e.g., C1 C2 N1 N2 N3 O1 S1 etc.)
For convenience, this information may be in a file 'over-rot-atoms.txt' in the working directory, but if this file is absent, over-rot.py prompts the user interactively via the command line. Here's an example of a successful run of over-rot.py :
Using reference structure: metconazole-A.xyz No over-rot-atoms.txt found. Please specify which atoms to fit: Options: - 'all' to use all atoms - A number (e.g., n = 12) to use first n atoms - Atom list (e.g., C1 C2 N1 N2 N3 O1 S1 etc.) Atom specification: Using ALL atoms === Pre-alignment Analysis === Targeted Atoms RMSD Matrix (Å): 1 2 ---------------------------------------------------------------------------- 1 metconazole-A.xyz 0.000 9.472 2 metconazole-B.xyz 9.472 0.000 Global RMSD (pre-fitted atoms): 2.7342 Å === Alignment === Aligned metconazole-B.xyz to reference | Fit RMSD: 0.1355 Å | Rotation: 177.34° Rotation Angles: Index Filename Rotation Angle (°) Inverted ------------------------------------------------------------ 1 metconazole-A.xyz 0.00 No 2 metconazole-B.xyz 177.34 No All aligned structures written to over-rot.rot === Post-alignment Analysis === Rotated Atoms RMSD Matrix (Å): 1 2 ---------------------------------------------------------------------------- 1 metconazole-A.xyz 0.000 0.135 2 metconazole-B.xyz 0.135 0.000 Global RMSD (rotated atoms): 0.0391 Å Improvement: 2.6951 Å
In the above we specified 'all' atoms, but feel free to experiment. The program calculates the RMSD before fitting, performs the quaternion-based alignment, and then gives post-alignment statistics. It constructs a square matrix of RMSDs relative to the reference structure (here there are only two structures, so there's not much to see, later examples are more impressive). It then gives a global RMSD relative to the average of the structures present, defined by: \[\text{Global RMSD} = \sqrt{\frac{1}{N \cdot M} \sum_{i=1}^{N} \sum_{j=1}^{M} \left( x_{i,j} - \bar{x}_j \right)^2 }\] in which:
- \( N \) = Number of structures,
- \( M \) = Number of fitting atoms,
- \( x_{i,j} \) = Coordinates of atom \( j \) in structure \( i \),
- \( \bar{x}_j \) = Centroid coordinates of atom \( j \).
Over-rot.py also checks inverted coordinates for each non-reference structure. If the RMSD is smaller for the inverted coordinates, it asks whether to use the inverted form instead. This stuff is also written to the log file. It would be useful for non-Sohncke space groups that have molecules related by a symmetry operation of the second kind. The program also writes the overall rotation experienced by each fragment.
The last thing that over-rot.py does is to create a combined file with all the translated/rotated coordinates concatenated in a file named 'over-rot.rot' (which can of course be renamed). Here's a snippet of the .rot file that shows the first two atoms from each molecule after fitting:
22 metconazole A Cl1A 11.975713 6.839304 6.630064 O1A 7.326818 0.526737 10.347148 . . lines omitted . 22 metconazole-B Cl1B 12.089721 6.795058 6.779809 O1B 7.311501 0.549803 10.433183 . . lines omitted .
As you can see, coordinates for corresponding atoms are now much closer. The over-rot.rot file has individual aligned structures together in .xyz format. We'll use it to create an html/JavaScript webpage to view the overlayed structures in interactive 3D, from which high-resolution PNG images can be saved.
2: Extract atom coordinates from the CIF(s)
3: Optimal superposition via quaternions
4: 3D interactive graphics
5: Three Sn complexes and their pro-ligand