3/6) Molecule editing - XP

We pick up from the end of part 2 of this tutorial. You should have an xterm window (or equivalent) on your computer screen.

You may notice in the steps below that XP is being run on a Mac from an xterm window entitled "xterm for xprep". This is just a consequence of the way I have X11 configured on my laptop. If you have tried to run XP on a Mac and been frustrated by the slow speed of rotation in proj, there is an excellent workaround that enables smooth and fast proj rotation.

Start XP by typing 'xp sucrose' at the prompt. You should get something like this:

There's not much to look at here! XP needs to be told what to do. The first thing is to build a connectivity table between all the potential atom sites (Q peaks) in the '.res' file. This is done with the FMOL command, which stands for something like 'form molecule'. When you hit <return> you get this:

Again, not much to look at. At this stage you can get a list of all the potential atom sites, along with their peak heights by using the command 'INFO'. Remember that large peak heights generally correspond to heavier atoms (more electrons), but at this stage there can be a lot of incorrect peaks, and some real atoms may be missing. To get a graphical view, or projection, of the structure, type 'proj' and hit <return>. A new window appears, which looks like this:

You may be able to recognize this as being vaguely molecule-like. The projection can be rotated by clicking on the yellow buttons to get different views. As stated above, XP is quite old, and it lacks some of the visual sophistication of more modern programs. For example, you can't rotate the projection by dragging it with the mouse. As you look at the projection, a few things should become clear. There are a couple of rings, one six and one five membered. The six membered ring appears to have a couple of spurious 'atoms' in the middle.

The initial model clearly needs to be edited to remove the incorrect features. We also need to figure out which of the Q peaks are carbon and which are oxygen. At this stage we are not concerned with hydrogen. Hydrogens will be found later once all the heavy atoms are in place and a sensible numbering scheme has been applied.

There are a few ways to edit the 'molecule'. An experienced user might just look at the '.res' file, decide that peaks Q24-Q29 are not real atoms, and simply remove them from the '.res' file using a text editor. Generally this is fine for grossly incorrect stuff, especially in easy structures with just a few non-hydrogen atoms. It can be done without even using XP. Certainly it would be ok to edit this structure like that provided you have sufficient experience, but the point of this tutorial is to introduce some of the features of XP.

Within XP there are a couple of ways to remove false 'atoms'. To remove the two Q peaks in the middle of the six-membered ring, for example, you could simply issue the command 'kill q24 q26' in the XP text window. To do this you need to exit from the graphical XP window by clicking on the yellow EXIT button. When you do this, the graphical window does not disappear, it just becomes inactive, and transfers control to the text window. Back in the XP text window, use the aforementioned kill command and hit <return>.

The program eliminates the two bad 'atoms', tells you so, and waits for the next command. To see what happened, you could get a fresh projection by typing 'proj' <return>, which gives you this:

That looks a whole lot better! You could also KILL the other spurious Q peaks, but there is graphical feature of XP that is very powerful for editing crystal structure models, and that is the PICK command. We'll use PICK in a moment, but first we will try to optimize the orientation of the molecule, as this makes it easier to see everything at once, and that usually makes peak picking easier. You could manually spin the model using the yellow buttons, but we will attempt a short cut. It so happens that a view perpendicular to the mean plane through a molecule is very often close to the best orientation for seeing everything. The XP command to find this mean plane is MPLN. So, we EXIT from the graphical window, type 'mpln' <return> in the text window, and type 'pick' <return> to bring up the interactive molecule editor.

Note that the mpln command made XP write a bunch of stuff to the text window. This output gives the perpendicular distance of each of the 'atom' sites to the mean plane. At this stage, this is not especially useful information. A variant of the mpln command, mpln/n will calculate the mean plane and suppress the output of these numbers. Either way, on typing 'pick' <return> you get the following:

As you can see in the picture above, Q29 is flashing on and off, which means it is active. PICK starts from the bottom of the list of Q peaks and gives you the chance to eliminate, keep, or rename them one by one. Remember, the larger the Q number, the smaller the peak. These weak peaks are more likely to be spurious. The top yellow band gives you some options: the return key <CR> will erase it, <space> will bypass it and move on to the next atom etc. If you make a mistake you can step backwards using the <backspace> (<delete> on a Mac). The bottom yellow band tells you distances between the flashing Q peak and its neighbours, which is often useful when deciding if a peak corresponds to a real atom. If you have a good idea of what the structure is, you can even start to assign atom types here. For this example we will wait until the junk has been removed from the molecule before assigning atom types.

At this stage you may end up erasing a real atom. Do not worry about this, real atoms always come back when the model is refined, whereas false 'atoms' don't. Once you have edited the molecule using pick, you can get back to the text window using the / key.

Back in the text window, the first thing to do is to type FMOL again, which registers all the kill and pick changes. For this structure, we now have a model with all the non-H atom sites identified. The next stage is to assign atom types.

As with removing spurious Q peaks, atom assignment can either be done in the text window, or interactively using pick. Remember that SHELXS assigned the Q labels according to peak height, with the smaller-number Qs being higher peaks. In a structure like sucrose (C12H22O11) we can be pretty sure that the smaller-number Qs will tend to be oxygen and the larger-number Qs will tend to be carbon. In the text window we could issue name commands, like this:

Unfortunately this has to be done one name at a time, which can be a lot of typing. If you were to assign the top eleven Q peaks as oxygen in this way, you could always check it by looking at a projection (proj). You should get something like this:

At first glance this looks ok, but it is not entirely correct. By only looking at Q peak height you do not get to use all your chemical knowledge and insight. See the atom labelled O11 in the above image. Any chemist can tell that 'O11' cannot be an oxygen! The next highest peak from the SHELXS solution, Q12, is clearly also not an oxygen, but it looks as though Q13 could be. For this reason, it is probably best to use pick to assign atom types, especially if the molecule is simple and well known to you, as sucrose should be.

For the next step in this molecular editing process there's a little less hand-holding. You need to EXIT from the graphical window, use pick again and work through the whole molecule assigning atom types. Once you have done this, exit from pick (/ key), type fmol in the text window and get a fresh projection (proj). You should have something like this:

You now have a model with all the non-H atoms identified correctly. One remaining problem is that the numbering scheme does not make much sense. The atom numbers are still ordered in terms of peak heights from the original SHELXS solution. You need to figure out a sensible numbering scheme and apply it using another pass through with pick. Unfortunately, for anything but the simplest structures, it is difficult to come up with a sensible numbering scheme on the fly, especially for a novice structure solver. The best strategy is to draw the molecule on paper and figure out a sensible numbering scheme. Even seasoned crystallographers who have completed hundreds or thousands of structures do this. In the present case, you might draw something like this:

The above is just an example, and the sketch does not have to be very good. If you can think of a better numbering scheme, go ahead and use your own. Some types of molecules (e.g. steroids, porphyrins etc.) have well-defined numbering schemes, and if this is the case you should at least try to follow the accepted convention. With others you have a bit more freedom, but if you work with a lot of similar structures it helps to try and number them in a consistent way. With your well-thought out numbering scheme in hand you can use pick to re-number all your atoms. After this round of pick you should get a projection that looks something like this:

We are not quite finished with this stage of the structure determination yet. Even though the numbering scheme is reasonable, the actual order of the atoms in the atom list is still the same as it was in the '.res' file created by SHELXS. You can see this if you EXIT from proj and type fmol in the text window:

The solution to this problem is to sort the atoms. They can be sorted individually, all at once or in groups. One convenient way of sorting for most structures is to use numerical order. You can do this using 'sort/n' in the text window. If you do this and then type fmol again, you get this:

An alternative might be to sort in terms of atom types, like this:

The best sort order will vary from structure to structure, especially if it is not a routine job. Structures with disorder, for example, require careful attention to the order in which the atoms are sorted. Such details do not concern us yet, as this is a simple structure. Nevertheless, it is wise to get in the habit of keeping your structure models logical and consistent. This is especially true if you expect to get help from a guru if you get stuck.

The model is now complete as far as non-hydrogen atoms are concerned. It needs to be saved as a fresh '.ins' file so that it can be refined. Refinement is done in stages using yet another program, SHELXL. To create a new '.ins' file that contains the model, use the command file in the XP text window, like this:

XP needs to know the name of the file that contains the rest of the instructions for the new '.ins' file. This is usually the previous '.res' file, and the program is smart enough to suggest this as the default. When you hit <return> you are sent back to the XP prompt so that you can exit the program and return to the command line.

You now have a new sucrose.ins file that has overwritten the old version. This will be used for least-squares refinement using SHELXL.

Part 1: Setting up the instructions file - XPREP
Part 2: Structure solution - SHELXS
Part 3: Molecular editing - XP
Part 4: Structure refinement - SHELXL
Part 5: Thermal ellipsoid and packing plots - XP
Part 6: The Crystallographic Information File - CIF