What is Linux?
Linux is a free and open-source operating system (OS) modeled on the Unix OS. It was created by Linus Torvalds. Initially it was developed for personal computers, but it now runs on almost every existing computer architecture. The OS is freely available in different software distributions (distros) assembled by commercial or public organizations. These distributions are regularly released as downloadable CD, DVD or bootable memory stick images for easy installation and customization. The reader is encouraged to install her own version of Linux using one of the distributions (see the list at the end of the chapter). For scientific applications, Scientific Linux is recommended, since it is customized for use in academic and research institutions and contains most of the open-source scientific packages.
In this chapter, only a short introduction to the Linux operating system is provided. The reader is encouraged to deepen her knowledge of the topic by consulting the more detailed tutorials and books suggested at the end of the chapter.
The structure of the Linux OS
Like all Unix-like operating systems, Linux is a multitasking, multiuser OS. It is structured in software interface layers that simplify the user's access to the hardware resources of the computer. Three main layers compose the Linux OS: the kernel, the shell, and the programs.
The kernel is the core and engine of Linux. It is a program loaded into RAM when the computer is turned on, and it controls the allocation of the hardware resources used by the computer. The kernel knows what hardware resources are available (the processor(s), the onboard memory, the disk drives, the network interfaces, etc.), and it has the necessary programs to talk to all the devices connected to the computer.
The shell is a program that interfaces the user with the kernel. The shell is accessed by remote login to the Linux PC (for example, using the Windows program PuTTY) or via a terminal window (see Figure 1) in the window manager interface (see below). In the first case, when a user accesses the remote Linux system via the Internet, a login program checks the username and password, and then starts a command-line interpreter program generically called a shell. This program interprets the commands typed by the user and executes them. The commands are either built-in shell instructions or other programs running on the Linux OS (for example, programs for molecular dynamics simulation). These programs can be executed interactively (in which case control of the shell is recovered as soon as the command finishes) or, using the multitasking features of Linux, sent to the background. In the latter case, the user gets back control of the shell immediately and the program continues to run in the background alongside the shell process. The shell is itself a program that can be executed from another shell. The most common Unix shells are the following:
- The Bourne Shell or sh: the original shell still used on UNIX systems and in UNIX-related environments. This is the basic shell, a small program with few features. While this is not the standard shell, it is still available on every Linux system for compatibility with UNIX programs.
- The bash or Bourne Again shell: the standard GNU shell, intuitive and flexible. Probably most advisable for beginning users while being at the same time a powerful tool for the advanced and professional user. On Linux, bash is the standard shell for common users. This shell is a so-called superset of the Bourne shell, a set of add-ons and plug-ins. This means that the bash shell is fully compatible with the Bourne shell: commands that work in sh, also work in bash. However, the reverse is not always the case.
- ksh or the Korn shell is another variant of the Bourne shell that offers standard configurations very useful for beginners.
- csh or C shell: the syntax of this shell resembles that of the C programming language. An enhanced version of this shell is the tcsh or TENEX C shell. This shell combines an enhanced user-friendliness with speed.
At login, a default shell is executed, giving the user access to the file system. From this shell, the user can start another shell with the command /bin/ followed by the name of the shell (e.g. /bin/csh); this can be very useful when running shell scripts.
Window managers. Different, highly configurable graphical interfaces (known as window managers) are available for Linux OS. The most popular desktop environments are KDE (the K Desktop Environment) and GNOME (the GNU Network Object Model Environment). These offer the point-and-click, drag-and-drop functionality associated with other user-friendly environments (for example, in MS Windows or Mac OSX).
Files and processes
The kernel/shell recognizes two main types of objects: files and processes.
A process is an executing program identified by a unique PID (process identifier) that allows the system to keep track of it. The PID of a process can be obtained using the command ps. The output of this command shows currently running processes. An example of the output is the following:
UID   PID  PPID  C   STIME TTY        TIME CMD
501 50854 50707  0 0:00.05 ttys000 0:00.09 bash
  0 54062 50854  0 0:00.00 ttys000 0:00.00 ps -fa
501 51126 51125  0 0:00.07 ttys001 0:00.12 mdrun
Processes can be run from the same shell in interactive mode or in background mode. To run a program in interactive mode, just type the name of the program on the command line. A program that requires a long computational time blocks the shell while it runs interactively, so it is more convenient to run it in the background by adding the character & after the program name. A program running in interactive mode can be moved to the background by typing Ctrl-z (which suspends it) followed by the command bg. Background processes can be listed from the shell using the command jobs. To bring a job back to the foreground (interactive mode), use the command fg %n, where n is the job number shown by the command jobs.
Example (here “frodo>” is a shell prompt but it can be different for your Linux installation):
frodo> mdrun &
frodo> jobs
mdrun
frodo> fg %1
Processes can be managed using the command kill. This command sends a signal that can suspend, resume or terminate a process identified by its PID. To terminate (kill) a process, use the command:
frodo> kill -9 PID
where PID is the number shown using the command ps.
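The background and kill steps described above can be sketched in a short script. This is only an illustration: sleep stands in for a long-running program such as mdrun, and the variable names pid and status are arbitrary. The sketch first uses plain kill (signal TERM), which lets the program exit cleanly; kill -9 would be the fallback if the process ignored the request.

```shell
# sleep stands in for a long-running program such as mdrun.
sleep 300 &
pid=$!                       # $! holds the PID of the most recent background job
echo "started background process $pid"

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is running"
fi

kill "$pid"                  # polite request to terminate (signal TERM)
status=0
wait "$pid" || status=$?     # reap the process and record its exit status
echo "process $pid ended with status $status"
```

A non-zero status from wait indicates that the process did not finish normally, which is expected here because it was terminated by a signal.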
A computer file is a resource for storing information. It can be created with a text editor or by running a program. Files are organized hierarchically in directories, the equivalent of folders in Windows and Mac OS. Each user on a Linux system is assigned a username and a password that give him/her access to a home directory, named after the username, containing directories and files. Directories are organized in a tree structure. The root directory, which contains the whole system, is indicated by a slash (/). Under the root directory you can find the system directories. The home directories of all the users (with the exception of the special user root) are in the system directory called /home or /people. The slash at the beginning of the path indicates that, in this case, the directory sits directly under the root directory. A path indicates a position in the directory tree. For example, the user Mickey Mouse has his home directory in /people/mmouse. A path can be given in full, starting from the root directory (absolute path), or relative to the current directory. In the latter case, the special entries “.” and “..” present in each directory are used to address the current directory and its parent directory, respectively. Therefore, from his home directory, the user Mickey Mouse can access its contents with ./ and the contents of /people with ../.
Basic shell commands
Linux provides extensive online documentation in the form of manual pages for each shell command and program. The manual pages are accessible by typing the command man followed by the name of the program. It is also possible to search by keyword with the option -k: man -k keyword.
The name of the user logged on the current shell can be obtained using the command whoami.
The user can create other subdirectories in her home directory using the command mkdir followed by the name of the new directory.
It is good practice to create directory trees to keep the data of your work in order. Therefore, make separate directories for each tutorial that will be assigned during the course.
To access the contents of a directory, use the command cd (change directory) followed by the destination directory, given as an absolute or a relative path. For example, to move from the directory Exercises to Exercise01 in Figure 2, simply type cd Exercise01. To go back to the parent directory, use the special entry “../” present in each directory (cd ../). Typing cd alone takes you back to your home directory from any location in the directory tree.
1) Move to the Exercise01 directory using an absolute path.
2) Move two directories up (cd ../..).
3) Where am I now? The command pwd shows your current path by printing the current location in the directory tree.
4) Now return to Exercise01. The command cd - returns you to the last visited directory.
5) Finally, take me home: typing cd without arguments returns you to your home directory.
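The five navigation steps can be tried safely in a throw-away directory tree. The sketch below is an assumption-laden illustration: the tree is created under a temporary directory (the variable top) rather than at the location shown in Figure 2, and the variable names back and home_now are arbitrary.

```shell
# Build a small directory tree in a temporary location (hypothetical layout).
top=$(mktemp -d)
mkdir -p "$top/Exercises/Exercise01"

cd "$top/Exercises/Exercise01"   # 1) absolute path to the target directory
cd ../..                         # 2) two directories up
pwd                              # 3) print the current location
cd - > /dev/null                 # 4) back to the last visited directory
back=$(pwd)
echo "returned to: $back"
cd                               # 5) no argument: go to the home directory
home_now=$(pwd)
echo "now in: $home_now"

rm -rf "$top"                    # clean up the temporary tree
```

The output of cd - is redirected because that form of the command prints the directory it switches to.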
A file name usually consists of a root name followed by a dot and a suffix (extension). The root name identifies the file and the suffix specifies the type of data. Unix systems are case sensitive; therefore, two names such as staad.dat and STAAD.dat can indicate different files in the same directory. Some standard file types (extensions) are: .txt (text files in ASCII characters), .f or .for (FORTRAN source files), .c (C source files), .o (object code (compiled) files), .exe (executable files), .com (command script files), .lis or .log (listings of program output), .dat (data files to be used by programs). Some files can be printed or shown on the screen (e.g. .txt, .for, .c, .com, .lis) because they contain ASCII characters; others are in binary format (e.g. .exe) and cannot be viewed or printed with standard text editors. To check which files are in your current directory, type the command ls. Shell commands can have several options that modify the task they perform. For the ls command, the following options are commonly used.
frodo> ls -a
lists all files, including the hidden ones, i.e. files or directories with names starting with a dot character.
frodo> ls -l
lists files with their attributes (for example permissions, owner, size and time of last modification).
Create the directory COMPCHEM in your home directory by typing mkdir COMPCHEM,
then move into it with cd COMPCHEM,
and finally make a subdirectory in the same way and move into it.
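A minimal sketch of this exercise follows. To avoid touching a real home directory, a temporary directory (the variable base) stands in for it; the subdirectory name PRACTICAL1 is taken from the exercise text further on, while PRACTICAL2 is a purely hypothetical name used to show the -p option.

```shell
# A temporary directory stands in for the home directory in this sketch.
base=$(mktemp -d)
cd "$base"

mkdir COMPCHEM          # create the top-level course directory
cd COMPCHEM             # move into it
mkdir PRACTICAL1        # create a subdirectory for the first practical
cd PRACTICAL1           # and move into it
pwd                     # confirm the current location

# mkdir -p creates a whole path in one step and does not complain
# if part of the path already exists.
mkdir -p "$base/COMPCHEM/PRACTICAL2"
```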
You can replace parts of names with the wildcard character * (asterisk), which stands for “anything”. For example, rm *.dat will delete all your files with extension .dat. The command ls MD* will list all files with names beginning with MD.
cp file1 file2
makes a copy of file1 called file2 in the current directory.
cp file1 ../file2
same as above, but the copy is saved one directory up.
cp file1 path/file2
copies file1 into file2 in the given path.
cp *.com ../
copies all files with the suffix ‘.com’ one directory up.
mv file1 file2
renames file1 to file2.
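The copy, rename and wildcard commands can be tried together in a scratch directory. In this sketch every file name (file1, a.com, MD1.dat and so on) and the variable work are made up for the demonstration; only the commands themselves come from the text.

```shell
# Scratch area with a subdirectory, so that "one directory up" is still inside it.
work=$(mktemp -d)
mkdir "$work/run"
cd "$work/run"
echo "sample" > file1
touch a.com b.com MD1.dat MD2.dat

cp file1 file2          # copy inside the current directory
cp file1 ../file2       # copy one directory up
cp *.com ../            # copy every file ending in .com one directory up
mv file2 file3          # rename file2 to file3
ls MD*                  # list the names beginning with MD
rm *.dat                # delete every file ending in .dat
ls . ..                 # show what is left here and one level up
```

Because rm with a wildcard deletes without asking, it is worth running ls with the same pattern first, as done here with ls MD*.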
Move into the directory PRACTICAL1 and take a look at the data directory /people/data/compchem with the command ls /people/data/compchem.
Several directories and files will be listed. Copy the file Acetone.xyz, which contains the 3D structure of acetone, into the current directory using the command:
cp /people/data/compchem/Acetone.xyz ./
The ./ at the end of the cp command indicates the current directory.
To view the content of a file on the terminal screen, type the command cat name.ext (e.g. cat Acetone.xyz). This command simply scrolls the whole content of the file on the screen. If the file is very long and you want to read it, use the command less name.ext instead. The command less shows the file one screen at a time; use the space bar to move to the next block of text.
Viewing File contents
cat name.ext
Displays the whole file on the screen.
more name.ext
Displays the first screenful of lines only; hit return to scroll down.
less name.ext
As above, but the arrow keys can be used for scrolling up and down.
Other useful key commands for the programs more and less:
b : returns one page back,
q : quits the program.
head -n number_of_lines name.ext
Displays the given number of lines (the value after the option -n) from the beginning of the file.
tail -n number_of_lines name.ext
Displays the given number of lines (the value after the option -n) from the end of the file.
Both commands return to the prompt on their own; q is needed only when their output is piped into more or less.
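As a small illustration of head and tail, the sketch below builds a 20-line temporary file and prints its two ends. The file content and the variable names sample and i are invented for the example.

```shell
# Create a temporary file containing 20 numbered lines.
sample=$(mktemp)
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$sample"

head -n 3 "$sample"     # the first three lines of the file
tail -n 2 "$sample"     # the last two lines of the file
```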
To delete files you can use the command rm followed by the file name.
It removes the file from the current directory; only the specified file is deleted. To remove a directory you can use the command rmdir followed by the directory name.
It removes directories (note that the directory must be empty).
The command rm -r followed by the directory name recursively removes the directory and its contents.
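The difference between the three ways of removing things can be seen in a scratch directory. This is a sketch with made-up names (area, notes.txt, empty_dir, full_dir); since deleted files cannot be recovered from the shell, experimenting in a temporary location is the safer choice.

```shell
# Scratch area with one file, one empty directory and one non-empty directory.
area=$(mktemp -d)
cd "$area"
mkdir empty_dir full_dir
touch notes.txt full_dir/data.dat

rm notes.txt            # removes a single file
rmdir empty_dir         # works because the directory is empty

# rmdir refuses to remove a directory that still has contents.
if rmdir full_dir 2>/dev/null; then
    echo "unexpected: full_dir was removed by rmdir"
else
    echo "rmdir did not remove full_dir because it is not empty"
fi

rm -r full_dir          # removes the directory and everything inside it
```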
Each file in Linux has three levels of access: user, group, other. The accessibility is classified as the possibility to read, write (modify) and execute the file. These properties can be visualized using the command ls -l. A typical output is reported below:
-rw-r--r-- 1 droccatano staff 4664 Oct 9 15:08 state.cpt
-rw-r--r-- 1 droccatano staff 4664 Oct 9 15:08 state_prev.cpt
-rw-r--r-- 1 droccatano staff 27052 Oct 9 15:08 topol.tpr
-rw-r--r-- 1 droccatano staff 13992 Oct 9 15:08 traj.trr
-rw-r--r-- 1 droccatano staff 57692 Oct 9 15:08 traj.xtc
The first character indicates the type of file (the letter d indicates a directory; a dash indicates a normal file). The next three groups of three characters indicate the accessibility for the user, the group and others, respectively. Within each group, the three characters indicate the type of access (r for read, w for write, x for execute); a dash indicates that the corresponding permission is not granted.
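The accessibility of files and directories can be changed using the command chmod. For example, the following command adds read and write permission for the user, the group and others (ugo) on file1 and file2: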
chmod ugo+rw file1 file2
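The effect of chmod is easiest to follow by listing the file after each change. The sketch below works on a temporary file (the variables demo and perms are illustrative) and uses the symbolic form shown in the text, plus the = and - operators, which set and remove permissions.

```shell
demo=$(mktemp)                 # temporary file used only for this demonstration

chmod u=rw,go= "$demo"         # start from: owner may read/write, nobody else
ls -l "$demo"
chmod ugo+rw "$demo"           # the form used in the text: add read/write for everyone
ls -l "$demo"
chmod o-w "$demo"              # take write permission away from others again
ls -l "$demo"

# Keep only the first ten characters of the listing: file type plus permissions.
perms=$(ls -l "$demo" | cut -c1-10)
echo "final permissions: $perms"
```

Granting write permission to everyone, as in ugo+rw, is rarely what is wanted on a shared system; restricting the change to u or g is usually the more careful choice.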
Other useful commands are:
diff
Compares the contents of two ASCII files (e.g. diff file1 file2).
grep or egrep
Searches a file for a pattern.
These two examples show how to use the command to extract information from a file.
egrep HETATM file.pdb
Extracts all the lines in the pdb file that contain the keyword HETATM.
egrep -v HETATM file.pdb
Extracts all the lines WITHOUT the keyword HETATM (the option -v inverts the match).
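The pattern search can be tried on a small stand-in file. The four records below are abbreviated, invented lines that only imitate the ATOM/HETATM keywords of a pdb file; they are not a valid structure. Plain grep is used in this sketch because, as far as I know, recent GNU grep releases mark egrep as obsolescent (grep -E is the equivalent).

```shell
# Write a tiny PDB-like file (hypothetical, truncated records).
pdb=$(mktemp)
cat > "$pdb" <<'EOF'
ATOM      1  N   ALA A   1
ATOM      2  CA  ALA A   1
HETATM    3  O   HOH A 101
HETATM    4  O   HOH A 102
EOF

grep HETATM "$pdb"             # lines containing the keyword
grep -v HETATM "$pdb"          # lines NOT containing the keyword
grep -n HETATM "$pdb"          # matching lines prefixed by their line number
hits=$(grep -c HETATM "$pdb")  # number of matching lines
echo "HETATM records: $hits"
```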
The information in a file can be sorted using the program sort, which sorts or merges files.
sort -k 1 -nr numberlist.dat
This command sorts the lines of the file by the numbers in the first column, in reverse (descending) order.
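The role of the -n option becomes clear when the same numbers are sorted with and without it. In this sketch the file content and the variable names list and largest are invented; without -n, sort compares the lines as text, so 10 comes before 2.

```shell
# A temporary stand-in for numberlist.dat with one number per line.
list=$(mktemp)
printf '%s\n' 10 2 33 4 > "$list"

sort "$list"                 # text comparison, character by character
sort -k 1 -n "$list"         # numeric comparison on the first column, ascending
sort -k 1 -nr "$list"        # numeric comparison, reverse (descending) order

largest=$(sort -k 1 -nr "$list" | head -n 1)
echo "largest value: $largest"
```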
The number of lines, words, and characters in a text file is counted by the program wc.
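Counting can be illustrated with wc on a two-line temporary file. The file content and variable names are made up for this sketch; note that the option -c counts bytes, which equals the number of characters for plain ASCII text.

```shell
text=$(mktemp)
printf 'alpha beta\ngamma\n' > "$text"

wc "$text"                                # lines, words and bytes in one call
lines=$(wc -l < "$text" | tr -d ' ')      # lines only
words=$(wc -w < "$text" | tr -d ' ')      # words only
chars=$(wc -c < "$text" | tr -d ' ')      # bytes only
echo "$lines lines, $words words, $chars characters"
```

The tr -d ' ' step strips the padding spaces that some implementations of wc print before the number.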
The command find is a very useful tool to look for a file or directory hidden in the directory tree. The following example searches the current directory and its subdirectories for files with extension “.txt” that have been modified less than 60 days ago, and prints their names.
find ./ -name "*.txt" -type f -mtime -60 -print
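The age filter of find can be checked on a scratch tree in which one file is deliberately backdated. All names in this sketch (tree, new.txt, old.txt, and so on) are hypothetical, and touch -t is used only to give old.txt a modification time in the year 2000.

```shell
# Scratch tree: two recent .txt files, one old .txt file, one non-.txt file.
tree=$(mktemp -d)
mkdir -p "$tree/sub"
touch "$tree/new.txt" "$tree/sub/deep.txt" "$tree/skip.dat"
touch -t 200001010000 "$tree/old.txt"     # backdate to 1 January 2000

# Regular files named *.txt, modified less than 60 days ago.
found=$(find "$tree" -name "*.txt" -type f -mtime -60 -print | sort)
echo "$found"
```

Quoting the pattern "*.txt" matters: it prevents the shell from expanding the wildcard before find sees it.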
File Editing Programs
To create or change a text file, use the command vi (or vim) followed by the name of the file. The named file is either an existing file that you want to change or a new file that you want to create. The vi editor has two working modes: command mode and editing mode. Pressing the ESC key activates the command mode, which is the equivalent of the menu bar of a graphically oriented editor. Commands are given by typing “:” followed by the command. The editor has many editing commands. Some of the commonly used ones are:
d delete present line
i insert text
y copy lines in the buffer
p paste a line
:n1,n2d
Deletes the lines from line n1 to line n2. If n1 is left blank, the current line is used. If n2 is $, the lines from n1 to the end of the file are deleted.
:w namefile
writes the edited contents in a file named namefile.
:s/string-1/string-2/
substitutes string-2 for the first occurrence of string-1 in the current line.
:q!
exits without saving the file.
:syntax on
turns on syntax highlighting, very useful in program editing.
:wq
saves and quits.
An alternative to vim is the editor emacs. This is a programmable editor with many functions and an elaborate interface, and it has a steeper learning curve than vi. However, the time spent mastering this editor can turn out to be very useful if you plan to continue to use Unix as your favorite OS.
Finally, if you have a more mouse-oriented attitude, Linux offers a broad variety of editors with a graphical user interface. A few examples are:
A powerful, freely available program that can be used to prepare professional-level documents (articles, books, presentations) is LaTeX. This text-processing program produces documents of very high quality, and many journals accept LaTeX-formatted manuscripts. It is especially good at typesetting formulas. LaTeX is a sort of text-processing language, since all the instructions to format the text are written together with the text to be processed. It requires a graphical interface only for viewing the final formatted document.
A description of LaTeX would require too much space to be condensed in a short appendix. The interested reader can therefore start from the books and internet links reported in the references at the end of the appendix.
An open-source alternative to LaTeX on Linux is, for example, OpenOffice, a free and compatible alternative to the Microsoft Office package.
Documents in PostScript or PDF format can be viewed with different viewers such as Acrobat, Xpdf, Ghostview and Evince. The availability of the different packages depends on the Linux distribution that you decide to adopt.
There are many Linux distributions, each of them with its own flavor. A good starting point to search for the one that best suits your needs is the website Distrowatch. Here is a list of links to popular Linux distros: