A Practical Introduction to the C Language for Computational Chemistry. Part 1

Write in C (Let it Be)

When I find my code in tons of trouble,
Friends and colleagues come to me,
Speaking words of wisdom:
“Write in C.”

As the deadline fast approaches,
And bugs are all that I can see,
Somewhere, someone whispers:
“Write in C.”

Write in C, write in C,
Write in C, oh, write in C.
LISP is dead and buried,
Write in C.

I used to write a lot of FORTRAN,
For science, it worked flawlessly.
Try using it for graphics!
Write in C.

If you’ve just spent nearly 30 hours
Debugging some assembly,
Soon you will be glad to
Write in C.

Write in C, write in C,
Write in C, yeah, write in C.
Only wimps use BASIC.
Write in C.

Write in C, write in C,
Write in C, oh, write in C.
Pascal won’t quite cut it.
Write in C.

Write in C, write in C,
Write in C, yeah, write in C.
Don’t even mention COBOL.
Write in C.

Parody Song by 
Brian Marshall

This series of tutorials will provide a short and practical introduction to the C language aiming students with interest in computational or physical chemistry. At the end of this tutorial, you will be able to write simple programs that can read data from files elaborate them and write the results of the calculations in output files. This tutorial is not a course in C programming language, therefore the motivated readers are encouraged to look for more comprehensive introductions to this language.

The tutorial is also based on OS based Unix systems such as Linux or MacOSX. Therefore, I recommed to give a look to my introductions to Unix OS:

In this first part of the tutorial, the following basic features of the C language are explained with progressive examples:

  • Variables, definition, expression calculation.
  • Print out the results of the calculations.
  • Input parameter from the keyboard
  • Conditional loops.
  • Counter loops.

MY FIRST PROGRAM IN C

The short program, described in Figure 1, calculates the Lennard-Jones (LJ) potential that it given by the following equation:

lj

of two argon atoms at a distance x=0.5 nanometers and print out the result on the screen.


/*
NAME: My_first_c_program
DOCUMENTATION: Calculate the LJ function for a given set
of parameter (sigma s and epsilon e) and distance x

AUTHOR: D. Roccatano
REVISION: 0.0
DATE: Feb 2007
*/

#include <stdio.h>;
#include <math.h>;

int main (int *argc, char *argv[] )
{

  float V,s,e,x;
  float C6,C12;
  float s6,ix6;

  s=0.272;
  e=0.3908;
  x=0.5

  s6=pow(s,6.);
  C6=4*e*s6;
  C12=4*e*s6*s6;
  ix6=1/pow(x,6.);

  V = C12*ix6*ix6-C6*ix6;

  printf ("Lennard-Jones parameters:\n");
  printf (" Sigma: %f\n",s);
  printf (" Epsilon: %f\n",e);
  printf ("Distance: %f\n",x);
  printf ( "V(%f)= %f\n",x,V);
}

To run the program you need to copy the program lines in a file (use, for example, the Kate editor) and save it with the name prog1.c. Then you need to compile it using the command (on Linux systems):

gcc prog1.c –o prog1.exe –lm

Finally, runs it from the command line typing: ./prog1.exe

The program comprises 7 sections as indicated in the callouts in Figure 1.

Fig1.png

Figure 1: Example of C program. The program calculates the value of the Lennard-Jones potential for a given distance between the particle.

Section 1. Comments are extremely useful to understand the program. Put always efforts to comment your program! Otherwise, it will be very difficult for another user (and probably also for you in few weeks!) to understand your program.

Section 2. Before compiling (e.g. transforming the human readable C language in binary code) the program is preprocessed to include external files, to declare machine specific instruction or to define macros. Statements for the preprocessor start with the symbol “#”. The two #include statements in our program request to the preprocessor cpp the inclusion of files containing the definition of I/O and math functions, respectively.

Section 3. C language is organized in function and the program itself is a special function called main () that incorporate all the other functions.

Section 4. The values elaborated by the program are stored in variables classified according to types. Some of the variable types defined in the C language are reported in the Appendix. Each type occupies a certain quantity of computer memory. The integer in the range [-232,232] (for a 32-bit processor) are defined by the type int. The float type is used to assign floating point variables. Single precision floating point variables are defined using 4 bytes (32 bits in total) in the following way [1]:

If you need more precision in your calculation then you can use the double precision floating point variables defined using double statement. The char type is used to assign alphanumeric characters (A,b,%,&,$, etc.); it is the 1-byte variable type. Variables are normally elaborated using specific functions and with certain restrictions depending on the type. In the case of C, those limitations are sometimes very fuzzy and this can be a point of strength for an expert c-guru but also a dangerous weakness for absolute beginners!

Section 5. The defined variable needs to be initialized by assigning (using the = symbol) initial numerical and alphanumerical values.

Section 6. Using the numerical functions of the language (see a short list of them in the Appendix); it is possible to calculate complex mathematical expressions. In the case of the LJ function, we try to minimize the use of the function pow () and increase the use of multiplications. This has the advantage to improve the performance of the code in term of computer time.

Caveats

The C language has a case-sensitive syntax, therefore let use lower case for writing the statements of the language. Take your time also to carefully test and debug your programs. Those are some examples of disaster that insidious bug of no fully-tested  program can provoke [2]:

  • The Patriot Missile failure, in Dharan, Saudi Arabia, on February 25 1991, which resulted in 28 deaths, is ultimately attributable to poor handling of rounding errors.
  • The explosion of the Ariane 5 rocket just after lift-off on its maiden voyage off French Guiana, on June 4, 1996, was ultimately the consequence of a simple overflow.
  • The sinking of the Sleipner A offshore platform in Gandsfjorden near Stavanger, Norway, on August 23, 1991, resulted in a loss of nearly one billion dollars. It was found to be the result of inaccurate finite element analysis.

TASK 1

Modify the program 1 to calculate also the first derivative of the LJ potential 

The general form of the force generated by a Lennard-Jones potential interaction and acting on the atom i, is given by

\frac{dV_{LJ}}{dr}= 4\epsilon\left[ -12\frac{\sigma^{12}}{r^{13}}+6\frac{\sigma^{6}}{r^{7}}\right]

For sake of simplicity, let consider only the mono dimentional case of two particles moving along the x-axis.


KEYBOARD INPUT OF PROGRAM PARAMETERS 

Every time you decide to calculate the potential for a new value of x, instead to modify the source code and recompile again, it is more convenient to input the new value from the keyboard. To accomplish this new task you need to use the function scanf (format, list of variables). The function scanf read a formatted input line and assign the line contents to a list of variables. The format specifier is reported in Appendix. This function will input the value of x from the keyboard every time you run the program. To use this new function, replace the assignment x=0.5, in program prog2.c, with the following lines and save the new program as prog3.c. Compile and run it.


TASK 2

Modify the program to input from the keyboard the Lennard-Jones parameters too using the statments


printf ("Insert the value of x:\n");
scanf("%f",&x);
printf("\n");


CONDITIONAL LOOPS

Running every time the program to get a new value of x is also quite inefficient. It would be more convenient if you can calculate different values of the LJ potential without restarting the program. This can be done using a conditional loop. This statement allows you to repeat a program block until a given condition is fulfilled. Two types of conditional loops are available in the C language:

do{  

 

BODY OF THE LOOP

} while ( condition )

while (condition) {

 

BODY OF THE LOOP

}

The do-while loop is called post-conditioned loop since the condition is evaluated after executing instructions in the body of the loop. The while loop is a preconditioned loop, namely, it evaluate the condition before executing the body of loop instructions


TASK 3

Modify the previous program by including the code in the following text box. Move in the body of the loop instructions present in 6th part of Figure 1. 

do {

/* BODY OF THE LOOP */

/* prompt for input */
    printf (“ enter the value of the distance:“);
    scanf(“%f”,&x);
    printf(“\n”);
} while (x > 0);

The program will now prompt for a new x value after calculating the previous one. It will end when the value of x is equal or less than zero (condition x> 0).


TASK 4

 Modify the program by implementing a while loop that performs the same task as the do-while loop.

COUNTING LOOPS

Suppose that you would like to make a graph of an LJ potential V(x). In this case, you need to calculate the function at N different values of x. Performing this task inputting distances by keyboard and recording the results would be a tedious and frustrating task especially when you want to calculate V(x) for hundred points. Fortunately, a computer can perform much better and faster this work by using counting loops. A counting loop can be implemented by modifying the program 1 by using a conditional loop as follows:


/* 1. Add to part 4 of prog1.c in Figure 1. */

int i,N;
float inc
...

/* 2. Add to part 5 of prog1.c in Figure 1. */
N=100;
i=0;
x=0.2; /* initial value of x; this can be input from keyboard */
inc= 0.005; /* increment value of x */

...

/* 3. Add to part 6 of prog1.c in Figure 1. */
while (i <= N) {

...

  printf ( “%f %f\n”, x,V,F);
  i++;
  x += inc;

}

The operator ++ and += are called unary operators since they operate on one variable only. The increment operator ++ is equivalent to the operation i=i+1. The operator += is equivalent to the operation x=x+inc. In the program, the while-loop cycle increments the value of i of one unit, and the value of x of Inc, until the value of i become equal to N, then the program will exit the loop. The new program needs as input: the starting value of the distance, the number of points and the distance increment. The only printf statement that should remain in the loop will print three column containing the value of the position, the potential V, and the force F acting on the particle at distance x.

The previous loop can be expressed in a more compact way by using the for-loop:

for(i=0; i <= N; i++) {
...

x += inc;

}


TASK 5

Try to implement both the loops and include also the possibility to input from keyboard starting distance value, distance increment and numbers of points. Write a modified version that requires as input the initial and final distance and the incremental value.


FURTHER TASKS

  1. Modify the previous programs to calculate the interaction of two different atoms (see Table I in the Appendix). (Hint: use the Berthelot- Lorentz mixing rules to obtain the mixed LJ parameters).
  2. Write a similar program to calculate the potential and the force for the one-dimensional Exp-6 (Buckingham like) and Columb potential (Exp-6 parameters in Table I).

 

APPENDIX

A QUICK REFERENCE FOR C PROGRAMMING

C BASIC DATA TYPES

  • char: character (a single byte)
  • int: integer
  • long: long integer
  • float: single-precision floating point
  • double: double-precision floating point

MATH FUNCTIONS

  • fabs(x) : absolute value
  • sin(x), cos(x), sinh(x), cosh(x): trigonometric functions
  • exp(x): exponential function
  • log(x): natural logarithm
  • pow(x, n): power function (e.g.: 25 = pow(2, 5))
  • floor(x): the greatest integer <x
  • ceil(x): the up integer >x.
  • rand (): return a number value
  • srand (): reseed the rand function.
  • %: modulo operator (division reminder)

PRINTF AND SCANF FORMAT SPECIFIERS

The syntax for a format place-holder is “%[flags][width][.precision][length]type“.

Where type can be any of:

  • ‘d’, ‘i’ : Print an int as a signed decimal number. ‘%d’ and ‘%i’ are synonymous for output, but are different when used with scanf() for input.
  • ‘f’, ‘F’ : Print a double in normal (fixed-point) notation.
  • ‘e’, ‘E’ : Print a real value in standard form ([-]d.ddd e[+/-]ddd).
  • ‘g’, ‘G’ : Print a double in either normal or exponential notation, whichever is more appropriate for its magnitude. ‘g’ uses lower-case letters, ‘G’ uses upper-case letters. This type differs slightly from fixed-point notation in that insignificant zeroes to the right of the decimal point are not included. Also, the decimal point is not included on whole numbers.
  • ‘s’ : Print a character string.
  • ‘c’ : Print a char (character).
  • ‘%’ : Print a literal ‘%’ character (this type doesn’t accept any flags, width, precision or length).

Flags can be omitted or be any of:

  • ‘+’ : Causes printf to always denote the sign ‘+’ or ‘-‘ of a number (the default is to omit the sign for positive numbers). Only applicable to numeric types.
  • ‘-‘ : Causes printf to left-align the output of this placeholder (the default is to right-align the output).
  • ‘#’ : Alternate form. For ‘g’ and ‘G’, trailing zeros are not removed. For ‘f’, ‘F’, ‘e’, ‘E’, ‘g’, ‘G’, the output always contains a decimal point. For ‘o’, ‘x’, and ‘X’, a 0, 0x, and 0X, respectively, is prepended to non-zero numbers.
  • ‘ ‘ : Causes printf to left-pad the output with spaces until the required length of output is attained.
  • ‘0’ : Causes printf use ‘0’ (instead of spaces) to left fill a fixed length field. For example (assume i = 3) printf(“%2d”, i) results in ” 3″, while printf(“%02d”, i) results in “03”

Width can be omitted or be any of:

  • a number : Causes printf to pad the output of this placeholder with spaces until it is at least number characters wide. If number has a leading ‘0’, then padding is done with ‘0’ characters.
  • ‘*’ : Causes printf to pad the output until it is n characters wide, where n is an integer value stored in the a function argument just preceding that represented by the modified type. For example printf(“%*d”, 5, 10) will result in “10” being printed with a width of 5.

.Precision can be omitted or be any of:

  • a number : For non-integral numeric types, causes the decimal portion of the output to be expressed in at least number digits. For the string type, causes the output to be truncated at number characters.

If the precision is zero, nothing is printed for the corresponding argument.

Length can be omitted or be any of:

  • ‘l’ : For integer types, causes printf to expect a long sized integer argument.
  • ‘L’ : For floating point types, causes printf to expect a long double argument.

LOGICAL AND OTHER OPERATORS

  • &&       :           AND
  • ||          :           OR
  • !           :           NOT
  • ==        :           EQUALITY NOTE: Be careful not to confuse EQUALITY a==b with ASSIGNMENT a=b. The latter set a equal to b while the former checks if a is equal to b.
  • !=         :           INEQUALITY

LENNARD-JONES PARAMETERS

TABLE I: LENNARD-JONES PARAMETERS FOR NOBLE GASES. Note that the values of epsilon are divided by the Boltzmann constant and therefore expressed as a temperature (Kelvin).

 Lennard-Jones Potential Exp-6 potential
Substance\sigma/nm\epsilon/K\alpha\sigma/nm\epsilon/K
Ar0.3336141.2180.3302152.0
Kr0.3575191.4180.3551213.9
Xe0.3924257.4220.3823326.0

 EXP-6 potential

The 6-Exp is also know as Buckingham potential and it was named after Richard Buckingham that proposed it as a more accurate representation of the term for the repulsive interaction [3]. The potential use three parameters. The parameter \epsilon is the same as in the LJ potential. The r_m is distance that give the minimum value of the potential (\epsilon), and \alpha is an adjustable dimensionless parameter.

expfunc

REFERENCES

[1] http://en.wikipedia.org/wiki/Single_precision

[2] http://www.ima.umn.edu/~arnold/disasters/disasters.html

[3]  Buckingham, R. A. (1938). “The Classical Equation of State of Gaseous Helium, Neon and Argon”Proceedings of the Royal Society A168 (933): 264–283.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.