#!/bin/bash

# (c) early 2000---2025/10/01: CIFFIX (S. Parkin, University of Kentucky)
#
# A script that modifies CIFs output by George Sheldrick's SHELXL. It uses grep, 
# sed, awk, cut, rev, and other bash-script tools plus a non-interactive call to 
# Ton Spek's Platon to extract the 'moiety' formula. The combination of tools is 
# partly due to bits of it having been written at different times.  It should be 
# easy to modify for other laboratories (though that might be a bit tedious). In 
# the UKy X-Ray Facility it runs right after SHELXL as the penultimate task of a 
# SHELXL launch script. CIFFIX makes a single pass through a nascent SHELXL CIF.  
# It uses the available information from files present in the working directory, 
# so it requires a few other files to be present in addition to the CIF (*.abs, 
# *.pcf, and others, depending on the particular diffractometer).  All editing 
# performed by CIFFIX is legal and standard fare for the UKy X-Ray Laboratory. 
# 2015/3/04: a bit of customization for the ALS (beamline 11.3.1) has been added 
# but it might be a bit haphazard, so check the resulting CIF carefully if using 
# a non-home laboratory wavelength (i.e., not Mo, Cu, Ag, Ga).
#
# All search and replace tasks, including those that need different boiler-plate 
# blurb for Mo vs Cu X-rays are handled properly.  Placeholders are in place for 
# Ag and Ga X-rays. Some ALS-specific substitutions are in place for synchrotron 
# data. Special cases for temperature and twinning are mostly handled ok for the 
# majority of cases. At the very least, easily edited placeholder text is added. 
# Parsing of the .RES and .CIF to add conditional text for hydrogen atoms is now 
# pretty much complete. It gets it right the majority of the time. Some blurb is 
# added for most riding hydrogen atoms, but a few special cases might still need 
# a bit of post-CIFFIX manual intervention. It even formats poorly arranged text 
# to make it more aesthetically pleasing. 
# 
# This script was written for OS X, but should be easy to convert to Linux. There 
# are a few differences between some scripting tools in Linux vs OS X, mainly to 
# do with the OS X tools being BSD Unix, which often seem to be a bit older than   
# the GNU versions.  For use on Linux for example, the .temp in the sed commands 
# is not needed for in-place search-and-replace tasks. A few other minor changes 
# may also be required. When bug fixes for SHELXL20xx finally peter out I'll fix 
# up the Linux version and post it on the X-Ray Facility website, here:
# http://xray.uky.edu/Resources/resources.html
#
# CIFFIX consists of the following sections:
# ---------------------------------------
#  1) Split the CIF into CIF+RES and HKL parts.
#  2) _chemical_ CIF changes - including the 'moiety formula' from Platon.
#  3) Get machine dependent control information. 
#  4) Determine crystal shape description. 
#  5) Check for constraints used in refinement. 
#  6) Check for restraints used in refinement. 
#  7) Check for twinning. 
#  8) Get hydrogen atom information. 
#  9) _audit_ comment and blurb placeholders. 
# 10) _cell_ CIF changes.
# 11) _exptl_ CIF changes. 
# 12) _diffrn_ CIF changes. 
# 13) _reflns_ CIF changes. 
# 14) _computing_ CIF changes. 
# 15) _refine_, _atom_sites_ and _shelx_res_file CIF changes. 
# 16) _publ_section_references add. 
# 17) _publ_section_exptl_refinement add twinning blurb. 
# 18) _publ_section_exptl_refinement add constraint/restraint blurb. 
# 19) _publ_section_exptl_refinement add hydrogen atom blurb. 
# 20) _publ_section_exptl_refinement add SQUEEZE blurb (& reference).
# 21) Clean up loose ends.
# 22) Timing diagnostics (in normal use, timing is off - see below).
#
# The next few lines turn overall and/or section timing on or off.  It should be 
# obvious how it works.  The line after that gets an initial time (before CIFFIX  
# does anything) in milliseconds. There are similar operations for each section, 
# which enable the individual sections to be timed (see SECTION 22). Install the 
# GNU coreutils, and the GNU date command can be accessed as 'gdate'.  Even when 
# timing is turned off the first and last gdate calls ensure that overall CIFFIX 
# runtime is obtained. These could be commented out, but they only cost a few ms 
# each. NOTE: Syntax to get time in ms is: date +%s%3N ('3' is for milli...).
#
overall_timing="on"
section_timing="off"
time0=$(gdate +%s%3N)
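
# A minimal sketch, not part of the original CIFFIX: a hypothetical helper that 
# returns the time in milliseconds, using 'gdate' when it is installed and plain 
# 'date' otherwise.  It assumes that the fallback 'date' is GNU date (as on most 
# Linux systems); BSD date on OS X does not understand %3N.  The helper is only 
# defined here, nothing in the script calls it.
#
ciffix_now_ms() {
  if command -v gdate > /dev/null 2>&1; then
    gdate +%s%3N      # GNU coreutils installed alongside the BSD tools
  else
    date +%s%3N       # assumption: 'date' is GNU date
  fi
}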

#===============================================================================
#-SECTION 1: SPLIT THE CIF INTO CIF+RES AND HKL PARTS--------------------------- 
#===============================================================================
# 
# CIFs written by SHELXL now have both the RES and HKL files appended along with 
# checksums for each appended file. This is done to ensure that no inappropriate 
# editing of the model, or alteration of refinement statistics is done. Although 
# CIFFIX will only ever edit lines in the main CIF section, by splitting the CIF  
# into 'CIF + RES' & 'HKL' temporary files, CIFFIX runtime is reduced because it 
# negates the need for Platon to waste any time calculating stuff with the 'FCF' 
# part of the file.  The separated parts are recombined in SECTION 21. Note that 
# SHELX-20xx has a utility, 'shredcif', which splits the full CIF into component
# parts, but shredcif isn't very convenient for use within CIFFIX, hence the use 
# of 'csplit' here.
#
# Make a backup copy and split the whole CIF into regular CIF+RES and HKL parts.
#
cp $1.cif $1"-original".cif
csplit -n1 -k -s $1.cif '/_shelx_hkl_file/'

# Give sensible names to the separated parts of the file. This is not necessary, 
# but it makes the script more readable.
#
mv xx0 $1.cif
mv xx1 $1_hkl_file.cif
#===============================================================================

# Get the time in milliseconds after completion of section 1.
#
if [ $section_timing == "on" ]; then
  time1=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 2: CHANGES TO _chemical_ CIF LINES------------------------------------
#===============================================================================
# 
# Use non-interactive Platon to get the 'moiety' formula and put the result into 
# the CIF at line _chemical_formula_moiety.  Also moves the 'sum' formula inline 
# with the _chemical_formula_sum keyword.  NB: On rare occasions with structures 
# having many different atom types, the moiety formula might need to be put on a 
# separate line so that the legacy 80 character per line CIF rule is not broken.  
# NOTE: The moiety formula assigned by Platon is usually ok, but sometimes it is
# not quite right. Careful checking is always a good idea. At the very least, an 
# easily editable placeholder gets inserted by CIFFIX. 
#
# For the non-interactive call to Platon, 'platon -C' is a good deal faster than 
# 'platon -u', presumably because it does not force Platon to do any unnecessary 
# validation tasks. NOTE: Since 141115, the moiety formula is not extracted from 
# the Platon-generated CIF until later (see SECTION 21).  This enables CIFFIX to 
# continue running concurrently with Platon (that's what the & at the end of the 
# platon line is for).  Provided that Platon is finished by the time CIFFIX gets 
# to SECTION 21, this will be ok. There are also a few notes in SECTION 21 about 
# this.  For some disordered structures, 'platon -C' will not write any suitable 
# moiety formula.  In such cases, run 'platon -u' manually and use the formula 
# in the *.chk file (about a dozen lines from the top); it usually suffices.
#
#platon -u $1.cif > /dev/null 2>&1 &
platon -C $1.cif > /dev/null 2>&1 &
platon_pid=$!
#echo " Platon process ID is $platon_pid."

# The next three lines put the _chemical_formula_sum value on its keyword line.
# NOTE: Modify for Linux ?
#
formula=`awk '/_chemical_formula_sum/{getline; print}' $1.cif`
formula_line_number="`grep -n "$formula" $1.cif | cut -f1 -d":"`"
sed -i .temp -e "/_chemical_formula_sum/s/_chemical_formula_sum/_chemical_formula_sum           \ $formula \ /" -e "${formula_line_number}"d $1.cif 
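
# A minimal sketch, not part of the original CIFFIX, of one way to deal with the 
# BSD vs GNU difference in 'sed -i' mentioned at the top of the script.  The name 
# 'sed_inplace' is hypothetical.  It relies on GNU sed accepting '--version' and 
# BSD sed rejecting it.  The helper is only defined here; the sed calls in this 
# script are left as written.
#
sed_inplace() {
  if sed --version > /dev/null 2>&1; then
    sed -i "$@"          # GNU sed: no backup suffix argument needed
  else
    sed -i .temp "$@"    # BSD sed: backup suffix given as a separate argument
  fi
}
# Possible usage:  sed_inplace "s/old text/new text/" $1.cif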

# The next lines add a suitable entry for _chemical_absolute_configuration if it 
# is needed. NOTE: the _chemical_absolute_configuration line in SHELXL-generated 
# CIFs is embedded in the _refine section rather than with the _chemical_ lines, 
# but that is not important for CIFFIX operation.  The Flack x(su) number is the 
# second field on the _refine_ls_abs_structure_Flack line. Extract these numbers 
# with grep and awk. If the $flack variable is empty, then skip the rest of this 
# section.
#
flack=`echo $(grep "_refine_ls_abs_structure_Flack" $1.cif) | awk '{print $2}'`

if [ "$flack" != "" ]; then

# Use awk to extract the Flack parameter's su from in-between the parentheses in 
# the $flack variable.  Then extract the Flack parameter itself in the same way,
# but also strip any leading zeros (bash assumes octal when a number has leading 
# zeros).
#
# The next two lines get the Flack parameter and its SU.
flack_su=`echo $flack | awk -F'[()]' '{print $2}'`
flack=`echo $flack | awk -F'[(]' '{printf "%s", $1}'`
flack=`echo "($flack * 1000 + 0.5)/1" | bc`

# Bash maths is integer only, so CIFFIX uses bc to do floating-point maths. This 
# should allow us to tell if Flack's parameter x(su) has sufficient precision to  
# determine absolute structure. Flack's original recommendation for significance
# was x < 0.05 provided that [x>3su(x)], but is now considered too conservative. 
# CIFFIX uses x <= 0.1 and x>=2su(x), but these can easily be changed.  If it is 
# twinned by inversion, then the assigned _chemical_absolute_configuration value 
# is exchanged in SECTION 17 with a dot ('.'), along with a remark that absolute 
# configuration is not applicable for crystals twinned by inversion. 
# NOTE: This mostly works, but it's best to check it in any case.
#
flack_su_ratio=`echo "scale=0; (($flack+1)/$flack_su)" | bc`
#echo "Flack parameter is: "$flack
#echo "Flack SU value is:  "$flack_su
#echo "Flack:SU ratio is:  "$flack_su_ratio

# Compare Flack x and its SU to see if they meet the criteria.  Here we take the 
# absolute value of Flack's parameter for the second test.  It's only ever going 
# to be marginally negative in any case. This seems to work in all cases tested 
# so far but needs more checking.
# NOTE: SHELXL inexplicably still uses "_chemical_absolute_configuration" rather 
# than something like "_crystal_absolute_structure".
#
if [[ $flack -le 50 ]] && [[ ${flack#-} -ge $flack_su_ratio ]]; then
   sed -i .temp "s/_chemical_absolute_configuration  ?/_chemical_absolute_configuration  ad/" $1.cif
else
   sed -i .temp "s/_chemical_absolute_configuration  ?/_chemical_absolute_configuration  unk/" $1.cif
fi

fi
#===============================================================================
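
# A minimal sketch, not part of the original CIFFIX: in CIF notation the digits 
# in parentheses give the su in units of the last quoted decimal place, so that 
# 0.02(3) means x = 0.02 with su = 0.03.  The hypothetical helper below prints 
# the value and the su on the same scale, which may be handy when checking the 
# Flack test above by hand.  It is only defined here and is never called.
#
flack_value_and_su() {
  # $1 is a CIF-style number such as 0.02(3) or -0.003(11)
  awk -v s="$1" 'BEGIN {
    split(s, part, /[()]/)                 # part[1] = value, part[2] = su digits
    x = part[1]
    dot = index(x, ".")
    decimals = (dot > 0) ? length(x) - dot : 0
    fmt = "%s %." decimals "f\n"           # print su with as many decimals as x
    printf fmt, x, part[2] / (10 ^ decimals)
  }'
}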

# Get the time in milliseconds after completion of section 2.
#
if [ $section_timing == "on" ]; then
  time2=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 3: GET MACHINE DEPENDENT CONTROL INFORMATION-------------------------- 
#===============================================================================
#
# Get the X-ray wavelength so that different blurbs can be written for different 
# diffractometers. Of the five given below, only Mo and Cu are currently used in 
# the X-ray laboratory at UKy. If the wavelength does not match any of the usual 
# laboratory sources, then assume data is from the ALS synchrotron.  Any special 
# cases are going to require manual editing, but this at least will get suitable 
# placeholders inserted in appropriate places.
#
# Grep the wavelength and truncate it to two decimal places. This is used to set 
# a string variable containing the anode type or 'Sy' if it is synchrotron data. 
# Conversion to a string is not strictly needed but it makes reading the machine 
# dependent stuff easier to understand in subsequent sections of the script.
#
anode=`echo $(grep "_diffrn_radiation_wavelength" $1.cif) | awk '{print substr($2,1,4)}'`
#echo $anode

if [ "$anode" == "0.71" ]; then
  anode="Mo"
elif [ "$anode" == "1.54" ]; then
  anode="Cu"
#elif [ "$anode" == "0.56" ]; then
#  anode="Ag"
#elif [ "$anode" == "1.34" ]; then
#  anode="Ga"
#elif [ "$anode" == "0.51" ]; then
#  anode="In"
else
  anode="Sy"
fi
#===============================================================================
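
# A minimal sketch, not part of the original CIFFIX, of the same wavelength to 
# anode lookup written as a 'case' statement.  The function name is hypothetical 
# and, unlike the code above, the Ag, Ga and In entries are switched on here 
# purely for illustration.  It is only defined here and is never called.
#
anode_from_wavelength() {
  # $1 is a wavelength in angstroms, e.g. 0.71073; keep the first 4 characters
  case "${1:0:4}" in
    0.71) echo "Mo" ;;
    1.54) echo "Cu" ;;
    0.56) echo "Ag" ;;
    1.34) echo "Ga" ;;
    0.51) echo "In" ;;
    *)    echo "Sy" ;;    # anything else is assumed to be synchrotron data
  esac
}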

# Get the time in milliseconds after completion of section 3.
#
if [ $section_timing == "on" ]; then
  time3=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 4: DETERMINE CRYSTAL SHAPE DESCRIPTION-------------------------------- 
#===============================================================================
#
# Assign a crystal shape description from crystal dimensions or from information 
# in a REM statement. If the crystal shape is not given in a REM statement, this 
# section calculates a rudimentary description for the shape of the crystal.  It 
# could likely be improved a lot by tweaking some of the parameters.  Since bash 
# arithmetic is integer only, crystal size values are converted from millimetres 
# to microns. Aspect ratios are multiplied by 100 to give better discrimination. 
# The calculation of a shape description should only be needed when no better 
# descriptor has been put in the SHELXL-20xx RES file as a "REM FORM shape" 
# remark, where 'shape' is something like 'block', 'plate', 'needle', etc. 
# Note: Some of the shapecodes used in testing might well be impossible or 
# superfluous, but they don't do any harm. The shape descriptor obtained here is 
# inserted in the CIF in SECTION 11. 
#
# Note:  SHELXL20xx writes crystal dimensions in the CIF to three decimal places 
# but only two of them are likely to be significant unless the crystal is a very 
# thin plate or a skinny needle. If the shape is given in the RES file on a line 
# such as "REM FORM shape", then use that description instead of calculating the 
# shape, as it is more likely to be better.  People are far better at describing 
# shapes than this script could ever hope to be. 
#
shape=`echo $(grep -m1 "REM FORM" $1.cif) | awk '{$1=$2=""; print}' | xargs`

# If the shape description has more than one word, then surround the description 
# with single quotes.  
#
var=$(echo "$shape" | wc -w )
if [ $var -gt "1" ]; then
   shape="'"$shape"'"
fi

# If no shape description was present on a "REM FORM shape" line in the RES file 
# then calculate a shape description using the crystal dimensions.
#
if [ $var == 0 ]; then

# Grep each of the crystal dimensions, and keep only the part after the decimal, 
# and trim away any leading zeros.
#
  max=`grep -m1 "_exptl_crystal_size_max" $1.cif | cut -f2 -d'.' | sed 's/^0*//'`
  mid=`grep -m1 "_exptl_crystal_size_mid" $1.cif | cut -f2 -d'.' | sed 's/^0*//'`
  min=`grep -m1 "_exptl_crystal_size_min" $1.cif | cut -f2 -d'.' | sed 's/^0*//'`

# Determine the aspect ratios of the maximum, middle and minimum cross sections.
#
  maxomid=$((100 * max / mid))
  maxomin=$((100 * max / min))
  midomin=$((100 * mid / min))   
  
# Obtain an aspect ratio for the 2D shape of the projection of the crystal if it 
# were viewed perpendicular to each of the size dimensions specified in the CIF. 
# These could probably be fine-tuned a bit.
#
  if [ $maxomid -le 200 ]; then
     aspect1="1" 
  fi
  if [ $maxomid -gt 200 ]; then
     aspect1="2" 
  fi
  if [ $maxomid -gt 500 ]; then
     aspect1="3"
  fi 
  if [ $maxomin -le 200 ]; then 
     aspect2="1"
  fi
  if [ $maxomin -gt 200 ]; then 
     aspect2="2"
  fi
  if [ $maxomin -gt 500 ]; then 
     aspect2="3"
  fi
  if [ $midomin -le 200 ]; then 
     aspect3="1"
  fi
  if [ $midomin -gt 200 ]; then 
     aspect3="2"
  fi
  if [ $midomin -gt 500 ]; then 
     aspect3="3"
  fi

# Generate a three-digit code to describe the shape of the crystal, and then try 
# to convert that code into a reasonable description.  This *sort of* works much 
# of the time, but still occasionally doesn't get it quite right. Obvious shapes 
# like 'block', 'plate' and 'needle' are assigned alright, but in-between shapes 
# such as 'slab', 'tablet', and 'lath' are sometimes muddled.  Might have been a 
# bit over ambitious in this section. It will likely be better to always include 
# the shape description using a statement such as "REM FORM shape".  Still, this 
# calculated assignment will suffice when a better description is not available. 
# At the very least it adds easily editable placeholder text.  
#
  shapecode=$aspect1$aspect2$aspect3
  if [ "$shapecode" == "111" ]; then
     shape="block"
  elif [ "$shapecode" == "121" ]; then
     shape="slab"
  elif [ "$shapecode" == "122" ]; then
     shape="slab"
  elif [ "$shapecode" == "133" ]; then
     shape="plate"
  elif [ "$shapecode" == "132" ]; then
     shape="plate"
  elif [ "$shapecode" == "222" ]; then
     shape="tablet"
  elif [ "$shapecode" == "223" ]; then
     shape="tablet"
  elif [ "$shapecode" == "232" ]; then
     shape="tablet"
  elif [ "$shapecode" == "233" ]; then
     shape="tablet"
  elif [ "$shapecode" == "231" ]; then
     shape="rod"
  elif [ "$shapecode" == "221" ]; then
     shape="rod"
  elif [ "$shapecode" == "331" ]; then
     shape="needle"
  elif [ "$shapecode" == "332" ]; then
     shape="lath"
  elif [ "$shapecode" == "333" ]; then
     shape="lath"
  fi
fi
shape="_exptl_crystal_description        "$shape
#===============================================================================
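
# A minimal sketch, not part of the original CIFFIX: the nine threshold tests in 
# this section all apply the same rule, so they could be folded into one small 
# helper.  The name 'aspect_class' is hypothetical; the thresholds (200 and 500, 
# i.e. aspect ratios of 2 and 5 multiplied by 100) are those used above.  It is 
# only defined here and is never called.
#
aspect_class() {
  # $1 is an aspect ratio multiplied by 100 (integer)
  if   [ "$1" -gt 500 ]; then echo 3
  elif [ "$1" -gt 200 ]; then echo 2
  else                        echo 1
  fi
}
# Possible usage:  aspect1=$(aspect_class $maxomid)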

# Get the time in milliseconds after completion of section 4.
#
if [ $section_timing == "on" ]; then
  time4=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 5: CHECK FOR CONSTRAINTS USED IN REFINEMENT--------------------------- 
#===============================================================================
#
# Figure out which constraints (if any) were used during the refinement.  CIFFIX 
# puts the information into the string variable $constraints. This gets added to 
# the CIF as part of _publ_section_exptl_refinement in section 18, below.
# 
constraints=""
keyword=$(grep -i -m1 '^EXYZ ' $1.cif)
if [ "$keyword" != "" ]; then
  constraints=$constraints"EXYZ"
fi
keyword=$(grep -i -m1 '^EADP ' $1.cif)
if [ "$keyword" != "" ]; then
  constraints=$constraints"EADP"
fi
if [ "$constraints" == "EXYZEADP" ]; then
  constraints="(SHELXL commands EXYZ and EADP)"
else
  constraints="(SHELXL command "$constraints")"
fi
if [ "$constraints" == "(SHELXL command )" ]; then
  constraints=""
fi
#===============================================================================

# Get the time in milliseconds after completion of section 5.
#
if [ $section_timing == "on" ]; then
  time5=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 6: CHECK FOR RESTRAINTS USED IN REFINEMENT---------------------------- 
#===============================================================================
# 
# Figure out which restraints (if any) were used during the refinement, then put 
# the information into a string variable $restraints. This section tests for the 
# presence of SHELXL keywords for the various restraints and appends them onto a 
# string variable.  It appends them backwards, as that makes it easier to remove 
# a dangling comma. This is added to the CIF as a _publ_section_exptl_refinement 
# entry in SECTION 18.
#
restraints=""
keyword=$(grep -i -m1 '^SAME' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,EMAS"$restraints
fi
keyword=$(grep -i -m1 '^SADI' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,IDAS"$restraints
fi
keyword=$(grep -i -m1 '^DFIX' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,XIFD"$restraints
fi
keyword=$(grep -i -m1 '^DANG' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,GNAD"$restraints
fi
keyword=$(grep -i -m1 '^BUMP' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,PMUB"$restraints
fi
keyword=$(grep -i -m1 '^FLAT' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,TALF"$restraints
fi
keyword=$(grep -i -m1 '^CHIV' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,VIHC"$restraints
fi
keyword=$(grep -i -m1 '^SIMU' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,UMIS"$restraints
fi
keyword=$(grep -i -m1 '^RIGU' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,UGIR"$restraints
fi
keyword=$(grep -i -m1 '^DELU' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,ULED"$restraints
fi
keyword=$(grep -i -m1 '^ISOR' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,ROSI"$restraints
fi
keyword=$(grep -i -m1 '^SUMP' $1.cif)
if [ "$keyword" != "" ]; then
  restraints=" ,PMUS"$restraints
fi

# Next lines remove the dangling "," from the list of restraints, insert " dna", 
# and finally use 'rev'.  This works well but is a bit convoluted. If there is a 
# better way to do it, please tell me.  It has to write 'dna' because of the rev 
# operation. It even includes the Oxford comma (or not - your choice!). If there 
# are only two types of restraint, it separates them with ' and '. In cases with
# only one type of restraint, it removes 's and', which negates the need for any 
# sed search-and-replace fix up later. 
# 
if [ "$restraints" != "" ]; then
  restraints="(SHELXL commands "` echo "${restraints:0:6} dna${restraints:6}" | cut -c 3- | rev`")"
  restraints=`echo ${restraints//s and/}`  

# If only two types of restraint were used, then remove the superfluous comma.
#
if [ "${#restraints}" == "32" ]; then
  restraints=`echo ${restraints//, and/ and}`
fi

# If you do not want the Oxford comma in a longer restraints list, uncomment the 
# next line.
#  restraints=`echo ${restraints//, and/ and}`  
fi
#===============================================================================
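
# A minimal sketch, not part of the original CIFFIX, of one alternative to the 
# reversed-string approach used in this section: a hypothetical helper that 
# joins its arguments as "A", "A and B" or "A, B, and C" (Oxford comma 
# included).  It is only defined here and is never called.
#
join_with_and() {
  local n=$# out="" i=1 word
  for word in "$@"; do
    if [ "$i" -eq 1 ]; then
      out=$word                                  # first item, no separator
    elif [ "$i" -eq "$n" ] && [ "$n" -eq 2 ]; then
      out="$out and $word"                       # exactly two items
    elif [ "$i" -eq "$n" ]; then
      out="$out, and $word"                      # last of three or more
    else
      out="$out, $word"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}
# Possible usage:  echo "(SHELXL commands $(join_with_and DFIX SIMU ISOR))"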

# Get the time in milliseconds after completion of section 6.
#
if [ $section_timing == "on" ]; then
  time6=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 7: CHECK FOR TWINNING------------------------------------------------- 
#===============================================================================
# 
# Test if twinning is present, and if so, put the type of twinning into a string 
# variable $twin_type.  See also SECTION 17, where the actual edits for twinning 
# are made. If "HKLF 5" is present, it assumes twinning by non-merohedry. If the
# TWIN instruction is present, but has no matrix, then it must be inversion. For 
# inversion twinning with no BASF, assume perfect inversion twinning, with equal 
# amounts of each component.  If the TWIN line has a matrix specified, it checks 
# the beta angle.  If beta is an integer, then it assumes twinning by merohedry, 
# otherwise it assumes twinning by pseudo-merohedry. This ought to cover all the 
# most common cases, and most of the uncommon cases.  NOTE:  The beta angle test 
# works by checking if the angle has a standard uncertainty in parentheses (that 
# is all it does). 
#
# Get the HKLF code (treatment is different for 4 vs. 5). NOTE: the caret symbol 
# anchors the grep match to the start of the line.
#
hklf_type=`echo $(grep -i -m1 '^HKLF' $1.cif) | awk '{print $2}'`

# Check whether there is a TWIN matrix specified.  
#
twin_line=$(grep -i -m1 '^TWIN' $1.cif)
twin_type=`echo $twin_line | awk '{print $2}'`

# Check whether there is a BASF specified. 
# 
basf_line=$(grep -i -m1 '^BASF' $1.cif)

# Check whether the beta angle is an integer by testing if the last character on 
# the line is a close parenthesis ")".  It greps the line, reverses it, and then 
# tests the first character. If there's a better way to do this, please tell me.
#
cell_beta_test=`echo $(grep -m1 "_cell_angle_beta " $1.cif) | awk '{print $2}' | rev` 
if [ "${cell_beta_test:0:1}" != ")" ]; then
  cell_beta_test="integer beta"
else
  cell_beta_test="non-integer beta"
fi
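
# A minimal sketch, not part of the original CIFFIX, of another way to do the 
# test above without 'rev': a shell pattern match on the trailing ")".  The name 
# 'has_su' is hypothetical.  It is only defined here and is never called.
#
has_su() {
  # Succeeds if $1 ends in a close parenthesis, e.g. 95.123(4)
  case "$1" in
    *")") return 0 ;;
    *)    return 1 ;;
  esac
}
# Possible usage:  if has_su "$beta"; then cell_beta_test="non-integer beta"; fi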

# If no matrix is given on the TWIN line then assume twinning by inversion, else 
# test the beta angle. If the beta angle is non-integer, then assume twinning by 
# pseudo-merohedry, else assume twinning by merohedry.  This ought to cover most 
# cases properly, but see below for twinning by non-merohedry.  Anything that is 
# not covered is likely to be quite rare and would need manual editing (no cases 
# encountered so far).  If the user has included an inversion matrix, then first 
# set variable $twin_type to an empty string so it gets assigned as inversion in 
# the subsequent few lines.  This is a bit clumsy, but should work well enough.
#
twin_matrix=`echo $twin_line | awk '{print $2,$3,$4,$5,$6,$7,$8,$9,$10}'`
if [ "$twin_matrix" == "-1 0 0 0 -1 0 0 0 -1" ]; then
  twin_type=""
fi
if [ "$twin_type" == "" ]; then
  twin_type="inversion"
elif [ "$cell_beta_test" == "non-integer beta" ]; then
  twin_type="pseudo-merohedry"
else
  twin_type="merohedry"
fi
if [[ "$twin_type" == "inversion" ]] && [[ "$basf_line" == "" ]]; then
  twin_type="perfect inversion"
fi 

# If there was no TWIN line in the CIF, then first assume no twinning. This will 
# then be altered if there is an HKLF code of 5.
#
if [ "$twin_line" == "" ]; then
   twin_type="no twinning"
fi

# If the HKLF type is 5, then assume twinning by non-merohedry. Also specify the 
# absorption, scaling, and merging program as TWINABS. The correct reference for 
# TWINABS will then be swapped for the SADABS reference in section 16. 
#
if [ "$hklf_type" == "5" ]; then
  twin_type="non-merohedry"
  twinabs="TWINABS"
fi
#===============================================================================

# Get the time in milliseconds after completion of section 7.
#
if [ $section_timing == "on" ]; then
  time7=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 8: GET HYDROGEN ATOM INFORMATION-------------------------------------- 
#===============================================================================
# 
# The hydrogen-atom blurbs constructed here are added to the CIF in section 19.
#
# The original plan was to search the embedded *.RES file for AFIX commands that 
# define riding hydrogens, reduce the resulting list to a unique group, and then 
# construct suitable crystal-dependent blurbs from that list, similar to how the 
# restraints are done in section 6.  It turned out to be more straightforward to 
# also grep for "00 . ?" in the bond lengths list and work from there.  The AFIX 
# types need to be grep'd first so that aromatic, terminal alkene and alkyne C-H 
# (AFIX codes 43, 93 & 163 respectively), which all have the same bond distance, 
# get properly described.
# 
# Get temperature by grep'ing "TEMP", taking only the second field and stripping 
# off the decimal point and any fraction by keeping just the integer part.  NOTE: 
# The temperature obtained here is also used in section 11.
#
temperature=`echo $(grep -i -m1 '^TEMP ' $1.cif) | awk '{print int($2)}'`

# The next line greps the CIF for lines beginning with "AFIX", sorts them, 
# removes duplicates and squeezes multiple spaces to a single space.  The result 
# is written to a temporary file .hfix.temp that is used to find whether alkene 
# (including aromatic), terminal alkene and/or alkyne hydrogens are present (used 
# later to decide which unsaturated carbon C---H distances to quote), as well as 
# whether OH and/or methyl groups are present (used to decide what to enter for 
# 1.5 U~eq~).  Actual edits to the CIF are made in section 19.
#
grep -i '^AFIX ' $1.cif | sort -u | tr -s '[:space:]' > .hfix.temp 

# Grep the temporary file for the particular HFIX (AFIX) cases mentioned above.
#
alkene=`grep -m1 "43" .hfix.temp`
terminal_alkene=`grep -m1 "93" .hfix.temp`
alkyne=`grep -m1 "163" .hfix.temp`
hydroxyl147=`grep -m1 "147" .hfix.temp`
hydroxyl83=`grep -m1 "83" .hfix.temp`
methyl137=`grep -m1 "137" .hfix.temp`
methyl33=`grep -m1 "33" .hfix.temp`

# Test for whether either common type of hydroxyl riding model is present. 
# 
if [[ "$hydroxyl147" != "" ]] || [[ "$hydroxyl83" != "" ]]; then 
   hydroxyl="yes"
fi

# Test for whether either common type of methyl riding model is present. 
# 
if [[ "$methyl137" != "" ]] || [[ "$methyl33" != "" ]]; then 
   methyl="yes"
fi

# If either alkene (including aromatic) or terminal alkene hydrogens are present 
# then specify that hydrogen is bonded to an sp2-hybridized carbon.
#
if [[ "$alkene" != "" ]] || [[ "$terminal_alkene" != "" ]]; then
   sp2_carbon="(C~sp2~H)"
fi

# If alkyne hydrogens are present then specify hydrogen on sp-hybridized carbon.
#
if [ "$alkyne" != "" ]; then
   sp_carbon="(C~sp~H)"
fi

# Catenate the alkene and alkyne types ...
#
unsaturated_CH=$sp2_carbon$sp_carbon

# If there are both alkene and alkyne, then stick a comma between them.
#
if [ "$unsaturated_CH" == "(C~sp2~H)(C~sp~H)" ]; then
   unsaturated_CH="(C~sp2~H, C~sp~H)"
fi

# Get all CIF entries that have "00 . ?", take the first character of the first 
# and second fields plus the whole of the third field, sort and remove duplicates 
# and feed the result into a temporary file.
# 
grep "00 . ?" $1.cif | awk '{print substr($1,1,1), substr($2,1,1), $3}' | sort -u > .hydrogens.temp

# Different bond distances for H-atoms are used for different temperatures in an 
# approximate treatment for the effects of libration. These assignments are made 
# by SHELXL-20xx.  Since the alkene and alkyne type hydrogens are given the same 
# bond distance, a placeholder is inserted here so that the right combination of 
# H-atom types are substituted using the "$unsaturated_CH" string variable found 
# above.  The two different entries for B-H distances for each temperature range 
# are presumably necessary to account for different B-H bond types. We don't see 
# many boranes in this laboratory, so BH bonds are beyond our normal experience. 
# The B-H entries here were made by a trial-and-error process, and might need to 
# be revised.  NB: There is still the occasional odd discrepancy between the new 
# SHELXL20xx and the old SHELXL97.  Some might indicate bugs in SHELXL-20xx, but 
# they are easy to deal with in ciffix with further search-and-replace lines.
#
if [ $temperature -lt -70 ]; then
sed -i .temp "s/C H 1.0000/1.00 \\\%A (R~3~CH),\ /;\
s/C H 0.9900/0.99 \\\%A (R~2~CH~2~),\ /;\
s/C H 0.9800/0.98 \\\%A (RCH~3~),\ /;\
s/C H 0.9500/0.95 \\\%A Csp2H,\ /;\
s/N H 0.8800/0.88 \\\%A (N~sp2~H),\ /;\
s/N H 0.9100/0.91 \\\%A (N~sp3~H),\ /;\
s/N H 1.0000/1.00 \\\%A (R~3~NH),\ /;\
s/O H 0.8400/0.84 \\\%A (OH),\ /;\
s/B H 1.1200/1.12 \\\%A (BH),\ /;\
s/B H 1.0000/1.00 \\\%A (BH),\ /" .hydrogens.temp
elif [ $temperature -lt -20 ]; then
sed -i .temp "s/C H 0.9900/0.99 \\\%A (R~3~CH),\ /;\
s/C H 0.9800/0.98 \\\%A (R~2~CH~2~),\ /;\
s/C H 0.9700/0.97 \\\%A (RCH~3~),\ /;\
s/C H 0.9400/0.94 \\\%A Csp2H,\ /;\
s/N H 0.8700/0.87 \\\%A (N~sp2~H),\ /;\
s/N H 0.9000/0.90 \\\%A (N~sp3~H),\ /;\
s/N H 0.9900/0.99 \\\%A (R~3~NH),\ /;\
s/O H 0.8300/0.83 \\\%A (OH),\ /;\
s/B H 1.1100/1.11 \\\%A (BH),\ /;\
s/B H 0.9900/0.99 \\\%A (BH),\ /" .hydrogens.temp
else
sed -i .temp "s/C H 0.9800/0.98 \\\%A (R~3~CH),\ /;\
s/C H 0.9700/0.97 \\\%A (R~2~CH~2~),\ /;\
s/C H 0.9600/0.96 \\\%A (RCH~3~),\ /;\
s/C H 0.9300/0.93 \\\%A Csp2H,\ /;\
s/N H 0.8600/0.86 \\\%A (N~sp2~H),\ /;\
s/N H 0.8900/0.89 \\\%A (N~sp3~H),\ /;\
s/N H 0.9800/0.98 \\\%A (R~3~NH),\ /;\
s/O H 0.8200/0.82 \\\%A (OH),\ /;\
s/B H 1.1000/1.10 \\\%A (BH),\ /;\
s/B H 0.9800/0.98 \\\%A (BH),\ /" .hydrogens.temp
fi
sed -i .temp "s/Csp2H/$unsaturated_CH/" .hydrogens.temp

# There's probably a better way to get the CIF typesetting for the angstrom sign 
# than this! 
#
sed -e 's/%A/\\%A/g' .hydrogens.temp > .hydrogens.temp.tmp
mv .hydrogens.temp.tmp .hydrogens.temp

# Place the contents of file .hydrogens.temp into the string variable hfix_types 
# and append some easily replaceable marker (i.e. "end_of_list" in this case) at 
# the end. Then replace the marker with a full stop, and lastly replace the last 
# comma with "and" using a combination of rev, sed, and then rev again.  Kind of 
# convoluted perhaps, but it is fast and works well.
#
hfix_types=` cat .hydrogens.temp`"end_of_list"
hfix_types=` echo $hfix_types | sed 's/, end_of_list/. /' | rev | sed 's/,/dna /' | rev ` 
#===============================================================================
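
# Sketch (not called anywhere in CIFFIX): the rev/sed/rev trick above, wrapped 
# in a function so that it can be tried on its own. The name join_with_and is 
# hypothetical. It assumes its argument is a list whose items each end in ", " 
# (as the lines of .hydrogens.temp do once joined), and it prints the list with 
# the last comma replaced by " and" and a closing full stop.
#
join_with_and() {
  # Append the marker, turn the trailing ", marker" into ". ", then reverse the 
  # string so that the LAST comma becomes the FIRST one that sed sees.
  printf '%s\n' "${1}end_of_list" | sed 's/, end_of_list/. /' | rev | sed 's/,/dna /' | rev
}
# Example: join_with_and "0.95 A (CH), 0.99 A (CH2), 0.98 A (CH3), "
# should give "0.95 A (CH), 0.99 A (CH2) and 0.98 A (CH3). "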

# Get the time in milliseconds after completion of section 8.
#
if [ $section_timing == "on" ]; then
  time8=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 9: CIFFIX _audit_ COMMENTS AND _publ_ BLURB PLACEHOLDERS--------------
#===============================================================================
# 
# Next line inserts a comment about CIFFIX after _audit_creation_method and adds 
# a bunch of placeholders for other blurbs that go after the newly added _shelx_ 
# version number line, but before the _chemical_name_systematic line.  The place 
# where 'and CIFFIX' is added needs to be improved, otherwise it will need to be 
# fixed every time a new version of SHELXL is put out.  
#
sed -i .temp '/_audit_creation_method/s/SHELXL-2025\/1/SHELXL-2025\/1 and CIFFIX/' $1.cif 

sed -i .temp '/_shelx_SHELXL_version_number/s/_shelx_SHELXL_version_number /# CIF edited on-the-fly by script CIFFIX (S. Parkin, 2000-2025). \
# CIFFIX: https:\/\/xray.uky.edu\/Resources\/scripts\/ciffix \
_audit_update_record              ? \
 \
_shelx_SHELXL_version_number /' $1.cif 

sed -i .temp '/^_chemical_name_systematic/s/_chemical_name_systematic         ?/ \
_publ_section_exptl_refinement\
; \
hydrogen_atoms_blurbconstraints_restraints_blurbtwinning_blurbsqueeze-blurb; \
 \
_publ_section_references\
; \
insert_references_here \
; \
 \
_publ_section_acknowledgements\
; \
insert_acknowledgments_here\
; \
 \
_chemical_name_systematic         ?\ /' $1.cif 
#===============================================================================
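
# Sketch only (not used above): a version-independent way to add 'and CIFFIX' to 
# the _audit_creation_method line, so that the edit need not be revisited for 
# each SHELXL release. It assumes the version string always looks like 
# SHELXL-yyyy/n. The function name is hypothetical. It filters stdin to stdout.
#
tag_creation_method() {
  sed '/_audit_creation_method/s/\(SHELXL-[0-9][0-9]*\/[0-9][0-9]*\)/\1 and CIFFIX/'
}
# Possible usage: tag_creation_method < $1.cif > tmp && mv tmp $1.cif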

# Get the time in milliseconds after completion of section 9.
#
if [ $section_timing == "on" ]; then
  time9=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 10: CHANGES TO _cell_ CIF LINES---------------------------------------
#===============================================================================
#
# On the _cell_ CIF lines, change cell parameters that have 0 for the last digit 
# and a SU of (10) so that the non-significant 0 is removed and the SU is set to
# (1). The reason for making this change is that such entries are invariably due 
# to inappropriate rounding by one program or another.  This change violates the 
# IUCr's 'rule-of-19', but it does effectively suppress an over-zealous checkCIF 
# complaint. Note: Changes are only made to numbers on the _cell lines, so other 
# entries that fit the search pattern by chance (such as coordinates, distances, 
# angles etc.) are left unscathed.  Also, for data collections done at about 90K 
# the CryoIndustries LT2 low-temperature machines are much more precise than the 
# SHELXL-20xx default esd of 2 degrees. Here it is changed to 0.2 degrees but it 
# is actually better than 0.1 degree at 90K.  This has been verified a number of 
# times using referenced Cu-constantan thermocouples and with Si-diodes.
#
# The /^_cell_/ address restricts the SU edit to the _cell_ lines, as described 
# above, and the decimal point in the pattern protects integer values such as 
# 1500(10) from losing a significant zero.
#
sed -i .temp "/^_cell_/s/\(\.[0-9][0-9]*\)0(10)/\1(1)/;\
s/_cell_measurement_temperature     90(2)/_cell_measurement_temperature     90.0(2) /;\
s/_cell_measurement_temperature     100(2)/_cell_measurement_temperature     100.0(2) /" $1.cif 

# The theta_min and theta_max values and the number of reflections used for cell 
# refinement are found in different places for the kappa CCD (nreport.html file) 
# versus the X8 Proteum (*.pcf file).
#
#if [ "$anode" = "Mo" ]; then
#  line=$(grep "<H2>Unit cell</H2>" nreport.html)
#  nref="_cell_measurement_reflns_used     "`echo $line | awk '{print substr($2,10)}'`
#  theta_limits=`echo $line | awk '{print $5}'`
#  thmin="_cell_measurement_theta_min       "${theta_limits:0:4}
#  thmax="_cell_measurement_theta_max       "${theta_limits:22:5}  
#fi
#if [ "$anode" != "Sy" ]; then
#  nref="_cell_measurement_reflns_used     "`echo $(grep -m1 "_cell_measurement_reflns_used" *.pcf) | awk '{print $2}'`
#  thmin="_cell_measurement_theta_min       "`echo $(grep -m1 "_cell_measurement_theta_min" *.pcf) | awk '{print $2}'`
#  thmax="_cell_measurement_theta_max       "`echo $(grep -m1 "_cell_measurement_theta_max" *.pcf) | awk '{print $2}'`
#fi
#if [ "$anode" = "Sy" ]; then
#  nref="_cell_measurement_reflns_used     "`echo $(grep -m1 "_cell_measurement_reflns_used" *.pcf) | awk '{print $2}'`
#  thmin="_cell_measurement_theta_min       "`echo $(grep -m1 "_cell_measurement_theta_min" *.pcf) | awk '{print $2}'`
#  thmax="_cell_measurement_theta_max       "`echo $(grep -m1 "_cell_measurement_theta_max" *.pcf) | awk '{print $2}'`
#fi

nref="_cell_measurement_reflns_used     $(grep -h -m1 "_cell_measurement_reflns_used" *.pcf | awk '{print $2; exit}')"
thmin="_cell_measurement_theta_min       $(grep -h -m1 "_cell_measurement_theta_min" *.pcf | awk '{print $2; exit}')"
thmax="_cell_measurement_theta_max       $(grep -h -m1 "_cell_measurement_theta_max" *.pcf | awk '{print $2; exit}')"

# Make the above substitutions.
#
sed -i .temp "/_cell_measurement_reflns_used/s/_cell_measurement_reflns_used .*$/$nref /;\
/_cell_measurement_theta_min/s/_cell_measurement_theta_min .*$/$thmin /;\
/_cell_measurement_theta_max/s/_cell_measurement_theta_max .*$/$thmax /" $1.cif 
#===============================================================================

# Get the time in milliseconds after completion of section 10.
#
if [ $section_timing == "on" ]; then
  time10=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 11: CHANGES TO _exptl_ CIF LINES-------------------------------------- 
#===============================================================================
#
# Some of these are dependent on the wavelength used (and hence on which machine 
# was used).  NOTE: The crystal temperature for data collection was extracted in 
# section 8, as it is used in constructing H-atom descriptions. Also use grep to 
# get Tmin and Tmax values from the .abs file written by SADABS (if present) and 
# add them to the CIF. If file $1.abs does not exist, nothing is added to either 
# Tmin or Tmax lines in the CIF.  If the structure was twinned by non-merohedry, 
# and data were collected on the X8 diffractometer, then it gets values from the 
# *.abs file written by TWINABS. The wording in TWINABS and SADABS .abs files is 
# different so separate cases are needed.  
# 
# Standard practice in the UKy X-Ray Facility is to include the crystal colour 
# in the RES file so as to ensure that as much useful information about the 
# crystal is retained at all times. Since there is no SHELXL command for crystal 
# colour, it must go in a REM comment (REM COLR). SHELXL embeds the RES file in 
# the CIF, which is why the grep below can read it from $1.cif. If the colour 
# description has more than one word then it needs to be surrounded by single 
# quotes. The following greps the line, uses awk to take all fields except the 
# first two, and uses xargs to strip away leading and trailing spaces. The space 
# trimming could also be done with sed. If no REM COLR line is found, the CIF 
# entry is left as '?'.
#
colour=$(grep -m1 "REM COLR" $1.cif | awk '{$1=$2=""; print}' | xargs)
if [ -z "$colour" ]; then
   colour="?"
elif [ $(echo "$colour" | wc -w) -gt 1 ]; then
   colour="'$colour'"
fi
colour="_exptl_crystal_colour             $colour"

sed -i .temp "s/_exptl_crystal_description        ?/$shape /;\
s/_exptl_crystal_colour             ?/$colour /;\
s/_exptl_absorpt_correction_type   multiscan/_exptl_absorpt_correction_type    multi-scan /;\
s/_exptl_absorpt_correction_type   multi-scan/_exptl_absorpt_correction_type    multi-scan /;\
s/_exptl_absorpt_correction_type    ?/_exptl_absorpt_correction_type    multi-scan /" $1.cif
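
# Sketch (not called above): the quoting rule for multi-word values, written as 
# a reusable function. The name cif_quote is hypothetical. An empty value is 
# turned into the CIF 'unknown' marker (?), a value containing whitespace gets 
# single quotes, and anything else is passed through. It does not try to cope 
# with values that themselves contain quote characters.
#
cif_quote() {
  case "$1" in
    "")            printf '%s\n' "?" ;;
    *[[:space:]]*) printf "'%s'\n" "$1" ;;
    *)             printf '%s\n' "$1" ;;
  esac
}
# Possible usage: colour="_exptl_crystal_colour             $(cif_quote "$colour")"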

# Grep the Tmin and Tmax values from the appropriate *.abs file, if it exists.
#
tminmax=""
if [ -f "$1.abs" ]; then
   if [ "$twinabs" != "" ]; then
      tminmax=$(grep -m1 "Minimum and maximum apparent transmission:" "$1.abs")
   else
      tminmax=$(grep -m1 "Estimated minimum and maximum transmission:" "$1.abs")
   fi
fi

# The required values are the 6th and 7th fields.  Round to three places, and 
# append to the appropriate CIF keyword. When no transmission line was found, 
# the T_min and T_max lines in the CIF are left untouched (awk would otherwise 
# print 0.000 for the missing fields).
#
if [ -n "$tminmax" ]; then
   tmin="_exptl_absorpt_correction_T_min   $(echo $tminmax | awk '{printf("%4.3f", $6)}')"
   tmax="_exptl_absorpt_correction_T_max   $(echo $tminmax | awk '{printf("%4.3f", $7)}')"

   sed -i .temp "/_exptl_absorpt_correction_T_min/s/_exptl_absorpt_correction_T_min .*$/$tmin /;\
/_exptl_absorpt_correction_T_max/s/_exptl_absorpt_correction_T_max .*$/$tmax /" $1.cif
fi

# Substitute the proper citation for SADABS or TWINABS.
#
sed -i .temp "/_exptl_absorpt_process_details/s/_exptl_absorpt_process_details.*/_exptl_absorpt_process_details    '<i>SADABS<\/\i> (Krause <i>et al<\/\i>., 2015)' /" $1.cif
if [ "$twinabs" == "TWINABS" ]; then
   sed -i .temp "/_exptl_absorpt_process_details/s/_exptl_absorpt_process_details    '<i>SADABS<\/\i> (Krause <i>et al<\/\i>., 2015)' /_exptl_absorpt_process_details    '<i>TWINABS<\/\i> (Sheldrick, 2012)' /" $1.cif
fi

# Add something to the exptl_special_details section. Different text is included 
# for different temperatures.  Note: This section ought to be checked by hand if 
# there were any special circumstances, such as a non-typical temperature or the
# like.  Note that versions 2014/4 and later of SHELXL no longer write lines for
# _exptl_special_details to the CIF, but instead write an entry for absorption 
# correction details (_exptl_absorpt_special_details). I always found the 
# _exptl_special_details entry useful, so CIFFIX re-introduces it here below the 
# _exptl_absorpt_special_details entry.  This means that the format adjustment 
# with grep and awk (the five commented-out lines below) is no longer needed. 
# Those lines have not been removed completely because they may very well be 
# useful again for some future release of SHELXL.
#
#var=`grep -n "_exptl_special_details" $1.cif | cut -f1 -d":"`
#var=`expr $var + 1`
#var2=`expr $var + 3`
#awk -v m=$var -v n=$var2 'm <= NR && NR <= n {next} {print}' $1.cif > tmp
#mv tmp $1.cif
#

if [[ $anode != "Sy" ]]; then
sed -i .temp '/_exptl_absorpt_special_details/s/_exptl_absorpt_special_details    ?/_exptl_absorpt_special_details    ? \
_exptl_special_details \
; \
The crystal was mounted using polyisobutene oil on the tip of a fine glass \
fibre, which was fastened in a copper mounting pin with electrical solder. \
It was placed directly into the cold gas stream of a liquid-nitrogen based \
cryostat (Hope, 1994; Parkin \&\ Hope, 1998).\
temperature flag\
; \
 /' $1.cif 
else
sed -i .temp '/_exptl_absorpt_special_details/s/_exptl_absorpt_special_details    ?/_exptl_absorpt_special_details    ? \
_exptl_special_details \
; \
The crystal was mounted using polyisobutene oil on the tip of a polyimide \
scoop, which was fastened in a copper mounting pin. It was placed directly \
into the cold gas stream of a liquid-nitrogen based cryostat (Hope, 1994; \
Parkin \&\ Hope, 1998).\
temperature flag\
; \
 /' $1.cif 
fi

# The cases are mutually exclusive, so they are tested as a single chain. The 
# formatted temperature goes into its own variable so that $temperature stays 
# numeric for any later tests.
#
if [ "$temperature" -eq -183 ]; then
sed -i .temp '/temperature flag/s/temperature flag/\
Diffraction data were collected with the crystal at 90K, which is standard \
practice in this laboratory for the majority of flash-cooled crystals. \ /' $1.cif
elif [ "$temperature" -eq -173 ]; then
#if [ $anode -eq "Sy" ]; then
sed -i .temp '/temperature flag/s/temperature flag/\
Diffraction data were collected with the crystal at 100K. \ /' $1.cif
elif [ "$temperature" -gt 18 ]; then
sed -i .temp '/temperature flag/s/temperature flag/\
Data were collected at room temperature. /' $1.cif
elif [ "$temperature" -gt -173 ]; then
sed -i .temp '/temperature flag/s/temperature flag/\
The crystals appeared to undergo a destructive phase transition when cooled \
to 90K. Visual inspection of crystal integrity and diffraction quality vs \
temperature established a safe temperature for data collection of insert_temperature. \ /' $1.cif
temperature_text=$temperature'\\% C'
sed -i .temp "/insert_temperature/s/insert_temperature/$temperature_text/" $1.cif
fi
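
# Sketch (not called above): the 90K and 100K blurbs are keyed to -183 and -173, 
# so $temperature is taken here to be an integer in degrees Celsius. Under that 
# assumption, a helper like this (hypothetical name) gives the rounded Kelvin 
# value for use in other blurbs.
#
celsius_to_kelvin() {
  echo $(( $1 + 273 ))
}
# Example: celsius_to_kelvin -183 gives 90, and celsius_to_kelvin -173 gives 100.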

# Add something to the _exptl_absorpt_special_details section, but only when the 
# correction type has been set to multi-scan.
#
abs_corr_type=$(grep -m1 "_exptl_absorpt_correction_type " $1.cif | awk '{print $2}')
if [ "$abs_corr_type" = "multi-scan" ]; then
sed -i .temp '/_exptl_absorpt_special_details/s/_exptl_absorpt_special_details    ?/_exptl_absorpt_special_details \
; \
In addition to absorption, multi-scan techniques also correct other slowly \
varying changes in scale factor over reciprocal space, such as those caused \
by X-ray beam inhomogeneity, goniometer imperfection, crystal decomposition, \
crystal longer than the X-ray beam cross section, absorption by the crystal \
mount, scan truncation etc. These account, by and large, for the difference \
between SHELX estimates of Tmin and Tmax and the _exptl_absorpt_correction_ \
values.\
; \
 /' $1.cif 
fi
#===============================================================================

# Get the time in milliseconds after completion of section 11.
#
if [ $section_timing == "on" ]; then
  time11=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 12: CHANGES TO _diffrn_ CIF LINES------------------------------------- 
#===============================================================================
#
# Some of these edits are conditional, dependent on the wavelength (and hence on  
# which machine was used).
# 
sed -i .temp "s/_diffrn_ambient_temperature       90(2)/_diffrn_ambient_temperature       90.0(2) /;\
s/_diffrn_ambient_temperature       100(2)/_diffrn_ambient_temperature       100.0(2) /;\
s/_diffrn_reflns_av_R_equivalents    ?/_diffrn_reflns_av_R_equivalents   ?/;\
s/_diffrn_measured_fraction_theta_max   /_diffrn_measured_fraction_theta_max                /;\
s/_diffrn_measured_fraction_theta_full  /_diffrn_measured_fraction_theta_full               /;\
s/_diffrn_reflns_Laue_measured_fraction_max    /_diffrn_reflns_Laue_measured_fraction_max          /;\
s/_diffrn_reflns_Laue_measured_fraction_full   /_diffrn_reflns_Laue_measured_fraction_full         /" $1.cif
#if [ "$anode" = "Mo" ]; then
#sed -i .temp "s/_diffrn_source                    ?/_diffrn_source                    'fine-focus sealed-tube' /;\
#s/_diffrn_measurement_device_type   ?/_diffrn_measurement_device_type   'Nonius KappaCCD diffractometer' /;\
#s/_diffrn_measurement_method        ?/_diffrn_measurement_method        '\\\f and \\\w scans at fixed \\\c = 55\\\%' /;\
#s/_diffrn_detector_area_resol_mean  ?/_diffrn_detector_area_resol_mean  9.1 /" $1.cif
#fi
if  [ "$anode" != "Sy" ]; then
sed -i .temp "s/_diffrn_source                    ?/_diffrn_source                    'microsource' /;\
s/_diffrn_measurement_device_type   ?/_diffrn_measurement_device_type   'Bruker D8 Venture dual source' /;\
s/_diffrn_measurement_method        ?/_diffrn_measurement_method        '\\\f and \\\w scans' /;\
s/_diffrn_detector_area_resol_mean  ?/_diffrn_detector_area_resol_mean  7.41 /g" $1.cif
fi
if  [ "$anode" = "Sy" ]; then
sed -i .temp 's/_diffrn_radiation_type            ?/_diffrn_radiation_type            synchrotron \
_diffrn_radiation_monochromator   ? /' $1.cif
sed -i .temp "s/_diffrn_source                    ?/_diffrn_source                    \'Advanced Light Source, station 11.3.1\' /;\
s/_diffrn_measurement_device_type   ?/_diffrn_measurement_device_type   \'Bruker D8 with Photon-100 detector\' /;\
s/_diffrn_measurement_method        ?/_diffrn_measurement_method        \'\\\f and \\\w shutterless scans\' /;\
s/_diffrn_detector_area_resol_mean  ?/_diffrn_detector_area_resol_mean  10.42 /g" $1.cif
sed -i .temp "s/_diffrn_radiation_monochromator   ? /_diffrn_radiation_monochromator   'silicon 111' /" $1.cif
fi
#===============================================================================

# Get the time in milliseconds after completion of section 12.
#
if [ $section_timing == "on" ]; then
  time12=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 13: CHANGES TO _reflns_ CIF LINES------------------------------------- 
#===============================================================================
#
# Very trivial stuff - separates _reflns_ lines from _diffrn_ lines, and removes 
# a leading space from the _reflns_special_details blurb so that it is formatted 
# like the other blurbs. Note: the CIF standard does not care about this, but my 
# eyes (and OCD) do.  The "1, " in each sed search and replace term ensures that 
# it only changes the first match.  This is necessary if by chance there are REM 
# statements that also match (e.g., systematic absences).
# 
sed -i .temp '/_reflns_number_total/s/_reflns_number_total/\
_reflns_number_total/' $1.cif
sed -i .temp "1,/ Reflections were merged by /s/ Reflections were merged by /Reflections were merged by /;\
1,/ class for the calculation /s/ class for the calculation /class for the calculation /;\
1,/ _reflns_Friedel_fraction /s/ _reflns_Friedel_fraction /_reflns_Friedel_fraction /;\
1,/ Friedel pairs measured divided /s/ Friedel pairs measured divided /Friedel pairs measured divided /;\
1,/ possible theoretically, ignoring /s/ possible theoretically, ignoring /possible theoretically, ignoring /;\
1,/ systematic absences./s/ systematic absences./systematic absences. /;\
1,/ Structure factors included/s/ Structure factors included/Structure factors included/" $1.cif 

# Remove the blank line immediately before _reflns_special_details in a CIF made 
# by SHELXL-20xx. This fixes the grouping of the _reflns_ lines in a SHELXL CIF. 
# The first example removes the line before _reflns_special_details, but only if 
# it is empty or contains just spaces. The second example removes the line after
# _reflns_Friedel_fraction_full, again only if it is empty or contains just 
# whitespace. Both methods work well on a SHELXL-created CIF. Both lines are 
# given (one is commented out) in case a future release of SHELXL makes one of 
# the solutions more convenient.
#
#sed -n -i .temp '/_reflns_special_details/{x;/^ *$/d;x;};1h;1!{x;p;};${x;p;}' $1.cif 
sed -i .temp '/_reflns_Friedel_fraction_full/{N;s/\n[[:space:]]*$//;}' $1.cif
#=============================================================================== 

# Get the time in milliseconds after completion of section 13.
#
if [ $section_timing == "on" ]; then
  time13=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 14: CHANGES TO _computing_ CIF LINES---------------------------------- 
#===============================================================================
#
# Some of these edits are conditional, dependent on the wavelength (and hence on  
# which machine was used).
# 
#if [ "$anode" = "Mo" ]; then
#sed -i .temp "s/_computing_data_collection        ?/_computing_data_collection        '<i>COLLECT<\/\i> (Nonius, 1998)' /;\
#s/_computing_cell_refinement        ?/_computing_cell_refinement        '<i>SCALEPACK<\/\i> (Otwinowski \&\ Minor, 2006)' /;\
#s/_computing_data_reduction         ?/_computing_data_reduction         '<i>DENZO-SMN<\/\i> (Otwinowski \&\ Minor, 2006)' /" $1.cif
#fi 

if  [ "$anode" != "Sy" ]; then
sed -i .temp "s/_computing_data_collection        ?/_computing_data_collection        '<i>APEX5<\/\i> (Bruker-AXS, 2023)' /;\
s/_computing_cell_refinement        ?/_computing_cell_refinement        '<i>APEX5<\/\i> (Bruker-AXS, 2023)' /;\
s/_computing_data_reduction         ?/_computing_data_reduction         '<i>APEX5<\/\i> (Bruker-AXS, 2023)' /" $1.cif
fi  

if  [ "$anode" = "Sy" ]; then
sed -i .temp "s/_computing_data_collection        ?/_computing_data_collection        '<i>APEX2<\/\i> (Bruker-AXS, 2006)' /;\
s/_computing_cell_refinement        ?/_computing_cell_refinement        '<i>APEX2<\/\i> (Bruker-AXS, 2006)' /;\
s/_computing_data_reduction         ?/_computing_data_reduction         '<i>APEX2<\/\i> (Bruker-AXS, 2006)' /" $1.cif
fi  

sed -i .temp "s/_computing_structure_solution     ?/_computing_structure_solution     '<i>SHELXT<\/\i> (Sheldrick, 2015a)' /;\
s/_computing_molecular_graphics     ?/_computing_molecular_graphics     '<i>XP in SHELXTL<\/\i> (Sheldrick, 2008)' /;\
s/_computing_structure_refinement .*$/_computing_structure_refinement   '<i>SHELXL-2025\/1<\/\i> (Sheldrick, 2015b)' /;\
s/_computing_publication_material   ?/_computing_publication_material   move_to_newline'<i>SHELX<\/\i> (Sheldrick, 2008) and <i>CIFFIX<\/\i> (Parkin, 2013)'move_to_newline/" $1.cif  
sed -i .temp 's/move_to_newline/\
/g' $1.cif
#=============================================================================== 

# Get the time in milliseconds after completion of section 14.
#
if [ $section_timing == "on" ]; then
  time14=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 15: CHANGES TO _refine_, _atom_sites_ and _shelx_res_file CIF LINES--- 
#===============================================================================
#
# Add boilerplate text to the _refine_special_details section. See also comments 
# in the above section concerning _exptl_special_details.  A reference for RT is  
# also added, as (in this lab.) RT is run by the shelxl_launcher script when the 
# ACTA instruction is present in the *.ins file.
#
# NOTE:  The commented out lines immediately below were used for CIFs made prior 
# to SHELXL-2014/6, which had ; ? ; characters on three separate lines.  Version 
# SHELXL-2014/6 has the ? character on the same line as _refine_special_details.
# These lines are left here as comments in case a future version of SHELXL needs 
# them.
# 
#var=`grep -n "_refine_special_details" $1.cif | cut -f1 -d":"`
#var=`expr $var + 1`
#var2=`expr $var + 2`
#awk -v m=$var -v n=$var2 'm <= NR && NR <= n {next} {print}' $1.cif > tmp
#mv tmp $1.cif
sed -i .temp 's/_refine_special_details .*$/_refine_special_details \
; \
Refinement progress was checked using <i>Platon<\/\i> (Spek, 2020) and by \
an <i>R<\/\i>-tensor (Parkin, 2000). The final model was further checked \
with the IUCr utility <i>checkCIF<\/\i>. \
; \ /' $1.cif 
#sed -i .temp "s/_atom_sites_solution_primary      ?/_atom_sites_solution_primary      direct /;\
#s/_atom_sites_solution_secondary    ?/_atom_sites_solution_secondary    difmap /;\
#s/_atom_sites_solution_hydrogens    geom/_atom_sites_solution_hydrogens    difmap /;\
#s/_shelx_res_file/_iucr_refine_instructions_details /;\
#s/ Refined as a/Refined as a/" $1.cif
sed -i .temp "s/_atom_sites_solution_primary      ?/_atom_sites_solution_primary      direct /;\
s/_atom_sites_solution_secondary    ?/_atom_sites_solution_secondary    difmap /;\
s/_atom_sites_solution_hydrogens    geom/_atom_sites_solution_hydrogens    difmap /;\
s/ Refined as a/Refined as a/" $1.cif
#
# The next substitution fixes a bug in SHELXL2014/7 that caused CIFTAB to miss a 
# close parenthesis if an extinction correction was made.  A bug report has been 
# sent to George, so it will surely get fixed sooner or later. The fix here also 
# changes the citation date for now, i.e. until a better reference is available. 
#
sed -i .temp "s/7 (Sheldrick 2017/1 (Sheldrick 2008)/" $1.cif
#
# The next substitution fixes a minor problem in SHELXL-2017/1 that writes a bad 
# citation for extinction.  There is no proper reference, so here it defaults to 
# the latest SHELXL reference.
#
sed -i .temp "s/(Sheldrick 2017)/(Sheldrick, 2015b)/" $1.cif
#===============================================================================

# Get the time in milliseconds after completion of section 15.
#
if [ $section_timing == "on" ]; then
  time15=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 16: ADD REFERENCES TO _publ_section_references SECTION---------------- 
#===============================================================================
#
# Add a few references to the CIF in the _publ_section_references section.  Also 
# adds an acknowledgment for diffractometer purchase. For data from the old 
# kappaCCD or X8 Proteum, an earlier version of this script is needed, because 
# different text was written for the diffractometer software etc.
#

sed -i .temp 's/insert_references_here/Bruker-AXS (2023). \
  <i>APEX5<\/\i> Bruker-AXS Inc., Madison, WI, USA. \
 \
Hope, H. (1994). \
  <i>Prog. Inorg. Chem.<\/\i> <b>41<\/\b>, 1--19. \
sadabs-placeholder\
Parkin, S. \&\ Hope, H. (1998). \
  <i>J. Appl. Cryst.<\/\i> <b>31<\/\b>, 945--953. \
 \
Parkin, S. (2000). \
  <i>Acta Cryst.<\/\i> A<b>56<\/\b>, 157--162. \
 \
Parkin, S. (2013). \
  <i>CIFFIX<\/\i>, https:\/\/xray.uky.edu\/Resources\/scripts\/ciffix \
Twinning-placeholderParsons-quotients-placeholder\
Sheldrick, G.M. (2008). \
  <i>Acta Cryst.<\/\i> A<b>64<\/\b>, 112--122. \
twinabs-placeholder\
Sheldrick, G.M. (2015a). \
  <i>Acta Cryst.<\/\i> A<b>71<\/\b>, 3--8. \
 \
Sheldrick, G.M. (2015b). \
  <i>Acta Cryst.<\/\i> C<b>71<\/\b>, 3--8. \
squeeze-placeholderbypass-placeholder\
Spek, A.L. (2020). \
  <i>Acta Cryst.<\/\i> E<b>76<\/\b>, 1--11. \
 \
Westrip, S.P. (2010). \
  <i>J. Appl. Cryst.<\/\i> <b>43<\/\b>, 920--925.  /' $1.cif 

# Insert different reference for SADABS/TWINABS, then blank out the placeholder.
#
if [ "$twinabs" = "TWINABS" ]; then
   sed -i .temp 's/SADABS/TWINABS/' $1.cif
   sed -i .temp 's/sadabs-placeholder//' $1.cif
   sed -i .temp 's/twinabs-placeholder/\
Sheldrick, G.M. (2012). \
  <i>TWINABS<\/\i> Bruker, Madison, Wisconsin. \
 \ /' $1.cif
else 
   sed -i .temp 's/twinabs-placeholder//' $1.cif
   sed -i .temp 's/sadabs-placeholder/\
Krause, L., Herbst-Irmer, R., Sheldrick, G.M. \&\ Stalke, D. (2015). \
  <i>J. Appl. Cryst.<\/\i> <b>48<\/\b>, 3--10. \
 \ /' $1.cif
fi

# If twinned then add Acta E reference and blank out the placeholder.
#
if [ "$twin_type" == "no twinning" ]; then 
  sed -i .temp 's/Twinning-placeholder//' $1.cif
else 
  sed -i .temp 's/Twinning-placeholder/\
Parkin, S.R. (2021). \
  <i>Acta Cryst.<\/\i> E<b>77<\/\b>, 452--465. \
 \ /' $1.cif
fi

# If Parsons' quotients were used to calculate the Flack 'x' parameter, then add 
# the proper reference, otherwise just blank out the placeholder.
#
quotients=$(grep -i "Parsons, Flack and Wagner, Acta Cryst." $1.cif)
if [ "$quotients" = "" ]; then
   sed -i .temp 's/Parsons-quotients-placeholder//' $1.cif
else
   sed -i .temp 's/Parsons-quotients-placeholder/\
Parsons, S., Flack, H.D. \&\ Wagner, T. (2013). \
  <i>Acta Cryst.<\/\i> B<b>69<\/\b>, 249--259. \
 \ /' $1.cif
   sed -i .temp "s/ Flack x determined/Flack \'x\' determined/" $1.cif
   sed -i .temp 's/ (Parsons, Flack and Wagner, Acta Cryst. B69 (2013) 249-259)/[Parsons, Flack and Wagner (2013)]/' $1.cif
fi

# BYPASS and SQUEEZE references are inserted in section 20 along with a blurb to 
# explain the usage and associated .fab file.

sed -i .temp 's/insert_acknowledgments_here/The D8 Venture diffractometer was funded by the NSF (MRI CHE1625732),\
and by the University of Kentucky. /' $1.cif 
#===============================================================================
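
# Sketch (not called above): since the blurbs are built from placeholder text 
# that later sections replace or blank out, a final check for leftovers can save 
# some proof-reading. The function name is hypothetical, and the marker list is 
# just the set of placeholder strings visible in sections 9 to 18. It prints any 
# offending lines with their line numbers (grep -n), and returns non-zero when 
# the CIF is clean.
#
report_leftover_markers() {
  grep -nE 'temperature flag|move_to_newline|insert_[a-z_]+|_blurb|-blurb|-placeholder' "$1"
}
# Possible usage, after the last section: report_leftover_markers $1.cif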

# Get the time in milliseconds after completion of section 16.
#
if [ $section_timing == "on" ]; then
  time16=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 17: ADD TWINNING BLURB TO _publ_section_exptl_refinement SECTION------ 
#===============================================================================
#
# The tests made in section 7 for HKLF 5 or TWIN instructions are used to decide 
# whether to add some text to describe twinning. If "HKLF 5" is present, then it 
# assumes non-merohedry.  (See also sections 11 and 16, which add a citation and 
# reference for TWINABS if the dataset came from the X8 Proteum.) If the command 
# TWIN is present, but has no matrix, then it assumes inversion twinning. If the 
# TWIN line has a matrix specified, it checks the beta angle. If beta is exactly 
# 90 degrees (i.e. integer), it assumes twinning by merohedry, but if it is non-
# integer 90 then it will assume twinning by pseudo-merohedry.  This should take 
# all the most common situations into account, and should deal with beta exactly 
# 120 degrees properly as well (I have no test cases).  NB: This all works quite 
# well, leaving only trivial aesthetic formatting problems that the CIF standard 
# does not care about. No attempt is made yet to deal with cases where more than 
# one kind of twinning is present.
#
if [ "$twin_type" = "non-merohedry" ]; then
sed -i .temp 's/twinning_blurb/\
The crystal was twinned by non-merohedry, which was handled using regular \
SHELXL methods (HKLF 5 format *.hkl file, the BASF command, and MERG 0). \
For a concise description of different twin types, see Parkin (2021). \
 /' $1.cif
fi

if [ "$twin_type" = "inversion" ]; then
sed -i .temp 's/twinning_blurb/\
The crystal was twinned by inversion, which was dealt with using standard \
SHELXL methods (TWIN and BASF commands). \
For a concise description of twin nomenclature, see Parkin (2021). \
 /' $1.cif
sed -i .temp '/_chemical_absolute_configuration/s/_chemical_absolute_configuration .*$/_chemical_absolute_configuration  . \
# This crystal was twinned by inversion, so "absolute configuration" has \
# no relevance in this case. /' $1.cif
fi

if [ "$twin_type" = "perfect inversion" ]; then
sed -i .temp 's/twinning_blurb/\
The crystal was twinned by inversion with equal-sized twin components, which \
was dealt with using standard SHELXL methods (TWIN command). \
For a concise description of twin nomenclature, see Parkin (2021). \
 /' $1.cif
sed -i .temp '/_chemical_absolute_configuration/s/_chemical_absolute_configuration .*$/_chemical_absolute_configuration  ./' $1.cif
fi

if [ "$twin_type" = "merohedry" ]; then
sed -i .temp 's/twinning_blurb/\
The crystal was twinned by merohedry, which was dealt with using standard \
SHELXL methods (TWIN and BASF commands). \
For a concise description of different types of twinning, see Parkin (2021). \
 /' $1.cif
fi

if [ "$twin_type" = "pseudo-merohedry" ]; then
sed -i .temp 's/twinning_blurb/\
The crystal was twinned by pseudo-merohedry, which was dealt with using \
standard SHELXL methods (TWIN and BASF commands). \
For a concise description of the various twin types, see Parkin (2021). \
 /' $1.cif
fi

if [ "$twin_type" = "no twinning" ]; then
   sed -i .temp 's/twinning_blurb//' $1.cif
fi
#===============================================================================
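
# Sketch (not the section 7 code, which is not reproduced here): the decision 
# rules described at the top of this section, written as a function. The name 
# classify_twin and its three arguments are hypothetical: $1 is "yes" if HKLF 5 
# is present, $2 is the TWIN line from the .res ("" if absent), and $3 is the 
# beta angle as written in the CIF. It does not distinguish "perfect inversion".
#
classify_twin() {
  local hklf5=$1 twin_line=$2 beta=$3
  if [ "$hklf5" = "yes" ]; then echo "non-merohedry"; return; fi
  if [ -z "$twin_line" ]; then echo "no twinning"; return; fi
  set -- $twin_line                        # split the TWIN line into words
  if [ $# -le 1 ]; then echo "inversion"; return; fi
  case "$beta" in
    *[!0-9]*) echo "pseudo-merohedry" ;;   # e.g. 90.012(3): not a bare integer
    *)        echo "merohedry" ;;          # e.g. 90, written without an SU
  esac
}
# Possible usage: twin_type=$(classify_twin "$hklf5_present" "$twin_line" "$beta")
# where the three variable names are again only placeholders.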

# Get the time in milliseconds after completion of section 17.
#
if [ $section_timing == "on" ]; then
  time17=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 18: ADD CONSTRAINTS/RESTRAINTS BLURB TO _publ_section_exptl_refinement 
#===============================================================================
#
# Add blurbage for different combinations of constraints and restraints. It adds 
# different stuff depending on what constraints and/or restraints were used. The  
# resulting CIF might need some cosmetic clean-up to ensure that the line breaks 
# are aesthetically pleasing. NOTE: the CIF specifications obviously do not care 
# about such things, but well-placed line breaks make the CIF much more readable 
# by actual people. 
#
if [[ "$constraints" != "" ]] && [[ "$restraints" != "" ]]; then
   allstraints="constraints and restraints"
fi

# The combined case takes priority, then restraints, then constraints.
#
if [ "$allstraints" != "" ]; then
   control="constraints and restraints"
elif [ "$restraints" != "" ]; then
   control="restraints"
elif [ "$constraints" != "" ]; then
   control="constraints"
fi

# Search and replace different blurbs depending on if there are both constraints 
# and restraints, just constraints, just restraints, or neither.
#
if [ "$control" = "constraints and restraints" ]; then
sed -i .temp 's/constraints_restraints_blurb/ \
To ensure satisfactory refinement for disordered groups in the structure, \
a combination of constraints and restraints was employed. The constraints \
insert_constraints were used to fix overlapping fragments. \
Restraints were used to ensure the integrity of ill-defined or disordered \
groups insert_restraints. \
/' $1.cif
elif [ "$control" = "restraints" ]; then
sed -i .temp 's/constraints_restraints_blurb/ \
Restraints were used to ensure satisfactory refinement of disordered or \
otherwise ill-defined groups insert_restraints. \
/' $1.cif
elif [ "$control" = "constraints" ]; then
sed -i .temp 's/constraints_restraints_blurb/ \
To ensure satisfactory refinement for disordered groups in the structure, \
constraints insert_constraints were used to equalize parameters \
of superimposed groups. /' $1.cif
else
   sed -i .temp 's/constraints_restraints_blurb//' $1.cif
fi

sed -i .temp "s/insert_constraints/$constraints/;\
s/insert_restraints/$restraints/" $1.cif
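
# The two substitutions above paste $constraints and $restraints directly into a
# sed replacement, so a "/", "&" or "\" in either string would upset sed.  As a
# hedged sketch only (the name 'sed_escape_repl' is hypothetical, and it is not
# called anywhere in CIFFIX), such characters could be escaped first.  It assumes
# single-line strings; embedded newlines are not handled.
#
sed_escape_repl() {
   # Prefix each backslash, slash and ampersand with a backslash.
   printf '%s' "$1" | sed 's|[\\/&]|\\&|g'
}
# Possible usage:
#   sed -i .temp "s/insert_constraints/$(sed_escape_repl "$constraints")/" $1.cif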
#===============================================================================

# Get the time in milliseconds after completion of section 18.
#
if [ "$section_timing" == "on" ]; then
  time18=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 19: ADD HYDROGEN ATOM DESCRIPTION TO _publ_section_exptl_refinement--- 
#===============================================================================
#
# Add semi-customized hydrogen atom blurbs to _publ_section_exptl_refinement. It 
# inserts the text immediately after the _audit_creation_method section.  If the 
# word "water" (case insensitive!) is present somewhere in the CIF, then it also 
# adds text for water hydrogens. Note: If water is present in the structure then 
# it is helpful to include a REM statement in the .RES before the oxygen atom of 
# the water. Remember: it is *always* a good idea to annotate your *.RES file so 
# that others can easily decipher the model. 
#
# NOTE: Find a better test for the presence of water ... 
# 
# The string variable hfix_types contains descriptions for all the riding H atom 
# types.  These were figured out in section 8.
#

# If there are no riding hydrogens in the model, put "?" in the CIF here.
# Otherwise add some generic blurb and some placeholders.
#
if [ "$hfix_types" == "end_of_list" ]; then
   sed -i .temp 's/hydrogen_atoms_blurb/ ? \
/' $1.cif
else
   sed -i .temp 's/hydrogen_atoms_blurb/H atoms were found in difference Fourier maps, but subsequently included \
in the refinement using riding models, with constrained distances set to \
riding_hydrogensH2O_hydrogensHUblurb \ /' $1.cif
fi

# Insert the list of riding hydrogen atom descriptions into the CIF, overwriting 
# the place holder inserted earlier. 
#
sed -i .temp "s/riding_hydrogens/$hfix_types/" $1.cif

# Check for presence of "water" in the CIF (which includes the embedded RES).  A
# properly annotated RES ought to include this in a REM comment.  Here, it is
# used as a flag to decide whether to add a separate water hydrogen blurb.
#
waterH=$(grep -i -m1 "water" $1.cif)
if [ "$waterH" != "" ]; then
   sed -i .temp 's/H2O_hydrogens/\
Water hydrogen atoms were refined using 1,2 and 1,3 distance restraints./' $1.cif
else
   sed -i .temp 's/H2O_hydrogens//' $1.cif
fi

# Check whether "-1.20000" or "-1.50000" are present in the embedded RES and use 
# that information to insert semi-custom blurb for Uiso(H) values.  Note:  Could 
# be made more sophisticated, which would enable it to be more specific.  At the 
# moment it is quite limited, but at least easily editable text is added.
#
type12=$(grep -e '-1\.20000' $1.cif | awk '{print substr($7,2,3)}' | sort -u)
type15=$(grep -e '-1\.50000' $1.cif | awk '{print substr($7,2,3)}' | sort -u)
HUtypes=$type12$type15

if [ "$HUtypes" == "1.21.5" ]; then
   sed -i .temp 's/HUblurb/\
U~iso~(H) parameters were set to values of either 1.2U~eq~ or 1.5U~eq~ \
(RCH~3~ and OH only) of the attached atom. \
/' $1.cif
fi

if [ "$HUtypes" == "1.2" ]; then
   sed -i .temp 's/HUblurb/\
U~iso~(H) values were set to 1.2U~eq~ of the attached atom. \
/' $1.cif
fi

if [ "$HUtypes" == "1.5" ]; then
   sed -i .temp 's/HUblurb/\
U~iso~(H) values were set to 1.5U~eq~ of the attached atom. \
/' $1.cif 
fi

if [ "$HUtypes" == "" ]; then
   sed -i .temp 's/HUblurb//' $1.cif   
fi
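
# The grep-based test above matches "-1.20000" or "-1.50000" anywhere on a line.
# A stricter variant, sketched here on the assumption that U(iso) is field 7 of
# an atom line in the embedded RES (as the awk calls above already assume), only
# accepts lines where that field matches exactly.  The name 'uiso_factors' is
# hypothetical and nothing in CIFFIX calls it.
#
uiso_factors() {
   awk '$7 == "-1.20000" || $7 == "-1.50000" {print substr($7,2,3)}' "$1" |
      sort -u | tr -d '\n'
}
# Possible usage: HUtypes=$(uiso_factors $1.cif)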

# Fix up the text for the types of hydrogens that have U~iso~ = 1.5U~eq~.  There 
# are usually only hydroxyl and methyl groups to worry about here.
#
if [ "$hydroxyl" != "yes" ]; then
   sed -i .temp 's/RCH~3~ and OH only/RCH~3~ only/' $1.cif
fi

if [ "$methyl" != "yes" ]; then
   sed -i .temp 's/RCH~3~ and OH only/OH only/' $1.cif
fi
#===============================================================================

# Get the time in milliseconds after completion of section 19.
#
if [ "$section_timing" == "on" ]; then
  time19=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 20: ADD SQUEEZE BLURB & REFERENCE TO _publ_section_exptl_refinement---
#===============================================================================
#
# Add a brief paragraph to describe the use of SQUEEZE for those structures that 
# used it, along with a reference to the original 'BYPASS' procedure. This works 
# by grep'ing for some blurb that SHELXL20xx adds to the _reflns_special_details
# section of the CIF. It could have equally well just tested for the presence of 
# a *.fab file or grep'ed the embedded RES for an ABIN command, but those do not 
# prove the use of SQUEEZE.  Presence of the specific blurb in the CIF, however, 
# does prove use of SQUEEZE.
#
squeeze_query=$(grep -m1 "Structure factors included contributions from the .fab file." $1.cif)
if [ "$squeeze_query" != "" ]; then
sed -i .temp 's/squeeze-blurb/ \
A region of poorly-defined electron density could not be accounted for in \
a satisfactory way by modelling it as solvent.  It was incorporated using \
the SQUEEZE method (van der Sluis \& Spek, 1990), as implemented in Platon \
(Spek, 2015). With SHELXL-20xx, this allows for the contribution from any \
unmodelled electron density to be added to the calculated intensities via \
the ABIN command, which obtains the required information from a file with \
a .fab extension (generated by Platon).\
/' $1.cif
sed -i .temp 's/squeeze-placeholder/ \
Spek, A.L. (2015). \
  <i>Acta Cryst.<\/\i> C<b>71<\/\b>, 9--18. \ /' $1.cif
sed -i .temp 's/bypass-placeholder/ \
 \
Sluis, P. van der \&\ Spek, A.L. (1990). \
  <i>Acta Cryst.<\/\i> A<b>46<\/\b>, 194--201. \
 \ /' $1.cif
else
  sed -i .temp 's/squeeze-blurb//' $1.cif
  sed -i .temp 's/squeeze-placeholder//' $1.cif
  sed -i .temp 's/bypass-placeholder//' $1.cif
fi
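
# For reuse elsewhere, the same SQUEEZE test could be wrapped in a small
# predicate.  This is only a sketch: the name 'used_squeeze' is hypothetical and
# CIFFIX does not call it.  grep -F treats the dots in the search text literally
# and -q suppresses output, so only the exit status is used.
#
used_squeeze() {
   grep -qF "Structure factors included contributions from the .fab file." "$1"
}
# Possible usage: if used_squeeze $1.cif; then ... fi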
#===============================================================================

# Get the time in milliseconds after completion of section 20.
#
if [ "$section_timing" == "on" ]; then
  time20=$(gdate +%s%3N)
fi

#===============================================================================
#-SECTION 21: CLEAN UP LOOSE ENDS----------------------------------------------- 
#===============================================================================
#
# Some of these will not be needed once this script is finished (if it ever gets 
# finished).  The last few lines recombine the edited CIF, RES and HKL parts and 
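# remove temporary files.  Note:  sed on OSX requires an argument after -i (here
# the backup suffix .temp), which is why the .temp files exist. GNU sed on Linux
# would instead take a separate .temp argument as the script, so the calls would
# need changing to plain 'sed -i', after which no .temp files are created.  As
# stated already, sed on OSX is not as good as GNU sed.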
#
# Edit the moiety formula that was extracted by Platon way back in SECTION 2. By
# waiting until the very end to make the insertion, CIFFIX is able to run at the 
# same time as Platon.  This is ok so long as Platon is fast enough that it ends 
# *before* CIFFIX gets to the point where it needs to insert the moiety formula. 
# This effectively cuts about a half second off the overall CIFFIX runtime - yes 
# it is *that* much faster! 
#
# Three different ways to get the moiety formula from the Platon-generated file.
# The one that uses sed is the fastest.
#
#moiety=`awk '/_chemical_formula_moiety/{getline; print}' $1_acc.cif`
#moiety=`echo $(grep -A 1 "_chemical_formula_moiety" $1_acc.cif | tail -n +2)`
#
wait $platon_pid
moiety=`sed -n '/_chemical_formula_moiety/{n;p;}' $1_acc.cif`

sed -i .temp "/_chemical_formula_moiety/s/_chemical_formula_moiety          ?/_chemical_formula_moiety          $moiety/" $1.cif

# A bit of tidy up. Might not be needed anymore, but does no harm.
#  
sed -i .temp 's/  ;/;/;s/ ;/;/' $1.cif
sed -i .temp 's/groups.;/groups. \
; /' $1.cif

# The next few lines tidy the _publ_section_exptl_refinement blurbs line lengths 
# by splitting the working part of the CIF with csplit, and using 'fmt'.  It has 
# to temporarily insert a dot in front of the semi-colon because fmt skips lines 
# that begin with a dot, but not those that begin with a semi-colon. These extra 
# dots are then removed with another sed command.  The annoying blank first line 
# that SHELXL insists on writing is stripped using 'tail'.  The three components 
# are then re-joined using cat. Lastly, the temporary file parts are removed.
#
csplit -n1 -k -s $1.cif '/_publ_section_exptl_refinement/' '/_publ_section_references/'
sed -i .temp 's/;/.;/' xx1
fmt -w74 xx1 > xx1_fmt
sed 's/\.;/;/g' xx1_fmt > xx1
tail -n +2 xx0 > xx0_tail
cat xx0_tail xx1 xx2 > $1.cif
rm xx0 xx0_tail xx1 xx1_fmt xx2

# Make every line end with exactly one space, replacing any trailing blanks.  It
# makes it a little easier to edit the added text blurbs without accidentally
# joining two words together.
#
sed -i .temp 's/[[:blank:]]*$/ /' $1.cif

# Fold lines longer than 80 characters, putting any breaks only at spaces.  This 
# should only be required for the few cases where things in the SHELXL-generated 
# _atom_site_ loops happen to spill over the legacy 80 character per line limit.
# It would not be necessary at all if the authors of certain unnamed CIF-editing 
# programs would update their wares to allow longer lines.  NOTE: Could not find 
# an option for fold-in-place, hence the use of a temporary file here.
#
fold -sw 80 $1.cif > .temp
mv .temp $1.cif

# Reconstruct the full CIF from its separated parts.
#
cat $1.cif $1_hkl_file.cif > new.cif
mv new.cif $1.cif 

# Remove unwanted temporary files.
#
rm $1_hkl_file.cif *.temp .*.temp $1.cif.temp $1_acc.cif $1.chk *.ome *_shelxl.ins check.def 2>/dev/null
#=============================================================================== 

# Get the time in milliseconds after completion of section 21.
#
time21=$(gdate +%s%3N)

#=============================================================================== 
#-SECTION 22: TIMING DIAGNOSTICS------------------------------------------------
#=============================================================================== 
#
# Section timing is for diagnostic purposes only, and would normally not be run. 
# Timing is turned on or off prior to SECTION 1.  NOTE: This section inelegantly
# writes the timing information to the screen in the middle of whatever else has 
# just been written. This could be tidied up, but there's not much point.

if [ "$section_timing" == "off" ]; then
  if [ "$overall_timing" == "on" ]; then
    finish_time=$(date +"%H:%M:%S")
    elapsed_time=`echo "scale=2;("$time21"-"$time0")/1000" | bc`
    if [[ $elapsed_time == .* ]]; then
      elapsed_time="0"$elapsed_time
    fi
    printf "%23s%8s%35s%4s%8s\r" " +  CIFFIX finished at " "$finish_time" " -------------- Total CIFFIX time: " "$elapsed_time" " secs  +"
    exit
  fi
  exit
fi
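
# The per-section lines written below all follow one pattern, so they could be
# generated by a loop instead.  A hedged sketch using bash indirect expansion;
# the name 'print_section_times' is hypothetical and is not called by CIFFIX.
# It assumes time0 ... timeN have all been set (i.e., section timing is on).
#
print_section_times() {
   local last="$1" n prev cur
   for ((n = 1; n <= last; n++)); do
      prev="time$((n - 1))"
      cur="time$n"
      echo " +    Section $n took $(( ${!cur} - ${!prev} )) milliseconds"
   done
}
# Possible usage: print_section_times 21 >> timing.temp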

echo -e " +    Section by section breakdown of CIFFIX runtime:" >> timing.temp
echo -e " +    Section 1 took "$((time1 - time0))" milliseconds" >> timing.temp
echo -e " +    Section 2 took "$((time2 - time1))" milliseconds" >> timing.temp
echo -e " +    Section 3 took "$((time3 - time2))" milliseconds" >> timing.temp 
echo -e " +    Section 4 took "$((time4 - time3))" milliseconds" >> timing.temp 
echo -e " +    Section 5 took "$((time5 - time4))" milliseconds" >> timing.temp
echo -e " +    Section 6 took "$((time6 - time5))" milliseconds" >> timing.temp
echo -e " +    Section 7 took "$((time7 - time6))" milliseconds" >> timing.temp
echo -e " +    Section 8 took "$((time8 - time7))" milliseconds" >> timing.temp
echo -e " +    Section 9 took "$((time9 - time8))" milliseconds" >> timing.temp
echo -e " +    Section 10 took "$((time10 - time9))" milliseconds" >> timing.temp
echo -e " +    Section 11 took "$((time11 - time10))" milliseconds" >> timing.temp
echo -e " +    Section 12 took "$((time12 - time11))" milliseconds" >> timing.temp
echo -e " +    Section 13 took "$((time13 - time12))" milliseconds" >> timing.temp
echo -e " +    Section 14 took "$((time14 - time13))" milliseconds" >> timing.temp
echo -e " +    Section 15 took "$((time15 - time14))" milliseconds" >> timing.temp
echo -e " +    Section 16 took "$((time16 - time15))" milliseconds" >> timing.temp
echo -e " +    Section 17 took "$((time17 - time16))" milliseconds" >> timing.temp
echo -e " +    Section 18 took "$((time18 - time17))" milliseconds" >> timing.temp
echo -e " +    Section 19 took "$((time19 - time18))" milliseconds" >> timing.temp
echo -e " +    Section 20 took "$((time20 - time19))" milliseconds" >> timing.temp
echo -e " +    Section 21 took "$((time21 - time20))" milliseconds" >> timing.temp
echo -e " +    Total CIFFIX time "$((time21 - time0))" milliseconds" >> timing.temp
cat timing.temp
rm timing.temp
exit
#=============================================================================== 

