Notes on the Quasi-Gaussian Entropy Theory Applied to Complex Systems

INTRODUCTION

The Quasi-Gaussian Entropy (QGE) theory is a theoretical method based on a novel statistical mechanics reformulation of the free energy distributions. It was originally developed by Dr. Andrea Amadei (University of Rome “Tor Vergata”, Italy) in collaboration with Prof. Herman Berendsen and Dr. Emil Apol (University of Groningen, The Netherlands) and Prof. Alfredo Di Nola (University of Rome “La Sapienza”, Italy). The foundations of the QGE theory are reported in a series of papers collected in the Ph.D. theses of Dr. Amadei and Dr. Apol cited in the bibliography. The theory was later developed further and applied to systems ranging from simple fluids to proteins.

In this brief note, the mathematical basis of the QGE theory for the study of the thermodynamics of proteins in solution is detailed, following Ref. [1].

THE QGE THEORY OF THE FLUID STATE

For a fluid state system of {N} solute molecules at high dilution, the partition function can be expressed as

\displaystyle Q = \frac{(8 \pi^2 V)^N}{N!} \left( \Theta \int^* e^{-\beta {{\cal U}'}} \prod_{j=1}^{n} ( \det\tilde{m}_{j} )^{1/2} ( \det \tilde{M} )^{1/2} d {\mathbf {\xi}} d {\mathbf {x}}\right)^N\ \ \ \ \ (1)

where {{\cal U}'} is the excess energy, basically the potential energy, including the quantum vibrational ground state energy, of the subsystem defined by a single solute molecule and {n} solvent molecules, {V} the overall volume of the system, \mathbf {\xi} the generalized internal coordinates of a single solute molecule with fixed rototranslational coordinates and \mathbf {x} the coordinates of the {n} solvent molecules within the solute molecular volume. Moreover {\tilde{m}_{j}} is the mass tensor of the {j}^{th} solvent molecule, {\tilde{M}} the mass tensor of the solute and {\Theta} a temperature dependent factor including the quantum corrections

\displaystyle \Theta = \frac{(2 \pi k T )^{(d+d_s)/2} (q_{ref,s}^{qm})^n q_{ref}^{qm}}{n! h^{(d+d_s)} (1+\gamma)(1+\gamma_s)^n} \ \ \ \ \ (2)

with {1+\gamma} and {1+\gamma_s} the symmetry coefficients of the solute and the solvent respectively, {d} and {d_s} the number of classical degrees of freedom in the solute and solvent molecules, and {q_{ref,s}^{qm}} and {q_{ref}^{qm}} the solvent and solute molecular quantum vibrational partition functions respectively. Finally, the integral is taken within the solute molecular volume {V_m=V/N}, and the star denotes an integration only over the accessible configurational space. From the previous equations it follows that the whole partition function can be obtained from the solute molecular partition function {Q_m} as { Q=Q_m^N }, with

\displaystyle Q_m = \frac{8 \pi^2 V_m \Theta}{e^{-1}} \int^* e^{-\beta {{\cal U}'}} \prod_{j=1}^{n} ( \det\tilde{m}_{j} )^{1/2}( \det \tilde{M} )^{1/2} d {\mathbf {\xi}} d {\mathbf {x}} \ \ \ \ \ (3)

where we used the approximation {N! \cong N^N e^{-N}}. Hence, the whole thermodynamics is defined by {Q_m} as {A=-NkT \ln Q_m}. This clearly means that if we want to describe the thermodynamics of the same system using the isobaric ensemble we must use a solute molecular isobaric partition function defined as

\displaystyle \Xi_m = \int e^{-\beta p V_m } Q_m(\beta, V_m) \frac{d V_m}{v} \ \ \ \ \ (4)

providing {G = -NkT \ln \Xi_m} (note that {v} is an arbitrary volume constant needed to make {\Xi_m} dimensionless). It is possible to show that

\beta G(\beta) - \beta_0 G(\beta_0) = -N\ln \left\{\frac{\Xi_m(\beta)}{\Xi_m(\beta_0)}\right\} = - N \ln \langle e^{-\Delta \beta {\cal H}} \rangle_{\beta_0} \ \ \ \ \ (5)

with {\Delta \beta=\beta - \beta_0}, the subscript {\beta_0} indicating an average in the {\beta_0} ensemble and {{\cal H} = {\cal U}'+pV_m}. The ensemble average in Eq. 5 can be expressed as

\displaystyle \langle e^{- \Delta \beta \cal H} \rangle_{\beta_0} = \int \rho ({\cal H} ) e^{-\Delta \beta {\cal H} } d {\cal H} \ \ \ \ \ (6)

where {\rho({\cal H})} is the enthalpy probability distribution function.

Instead of using a perturbation expansion, in the QGE theory, the free energy is obtained by modeling the distribution function and hence its moment generating function or Laplace transform, Eq. 6. For homogeneous fluid state systems, it was shown that a rather good model in the isobaric ensemble is the diverging Gamma state model for enthalpy fluctuations.
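As a rough numerical illustration of Eqs. 5 and 6, the sketch below assumes that the enthalpy {\cal H} follows an ordinary Gamma distribution (defined in the note that follows), samples it, and compares the sampled value of {-\ln \langle e^{-\Delta \beta {\cal H}} \rangle} with the analytic Laplace transform of that distribution, {(1+\theta \Delta \beta)^{-k}}. The shape, scale and {\Delta \beta} values are arbitrary, and the plain Gamma distribution only stands in for the diverging Gamma state model, which is not reproduced here.

```python
import math
import random


def free_energy_change_sampled(samples, delta_beta):
    """Per-molecule -ln <exp(-delta_beta * H)> estimated from enthalpy samples."""
    avg = sum(math.exp(-delta_beta * h) for h in samples) / len(samples)
    return -math.log(avg)


def free_energy_change_gamma(k, theta, delta_beta):
    """Same quantity from the Laplace transform of a Gamma(k, theta) density.

    Valid for delta_beta > -1/theta, where the integral in Eq. 6 converges.
    """
    return k * math.log(1.0 + theta * delta_beta)


# Hypothetical parameters in arbitrary, mutually consistent units.
K_SHAPE, THETA, DELTA_BETA = 4.0, 1.5, 0.2

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [rng.gammavariate(K_SHAPE, THETA) for _ in range(200_000)]

sampled = free_energy_change_sampled(samples, DELTA_BETA)
analytic = free_energy_change_gamma(K_SHAPE, THETA, DELTA_BETA)
print(f"sampled  : {sampled:.4f}")
print(f"analytic : {analytic:.4f}")
```

The two numbers should agree to within the sampling noise, which is the sense in which modeling the distribution gives the free energy change without a perturbation expansion.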


NOTE: The Gamma Distribution

The gamma distribution is a two-parameter family of continuous probability distributions. The exponential distribution and the chi-squared distribution are special cases of the gamma distribution.

For the two parameters {k} and {\theta}, it is defined as

f(x;k,\theta)=\frac{x^{k-1}e^{-\frac{x}{\theta}}}{\theta^k \Gamma(k)} for x>0 and k,\theta >0

where {\Gamma(k)} is the gamma function

\Gamma(k) =\int^{\infty}_0 x^{k-1}e^{-x}dx.

[Figure: plots of the gamma distribution for different values of {k} and {\theta} (left) and of the gamma function {\Gamma(k)}.]
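The definition in this note can be checked numerically with a few lines of Python. The sketch below evaluates the density through logarithms (using the standard library function `math.lgamma`), verifies that {k=1} gives the exponential density, and integrates the density with a simple midpoint rule to confirm the normalization and the mean {k\theta}. The parameter values and the integration cutoff are arbitrary choices for illustration.

```python
import math


def gamma_pdf(x, k, theta):
    """Gamma density f(x; k, theta) for x > 0 and k, theta > 0."""
    if x <= 0.0:
        return 0.0
    log_f = (k - 1.0) * math.log(x) - x / theta - k * math.log(theta) - math.lgamma(k)
    return math.exp(log_f)


def integrate(f, a, b, n=20_000):
    """Composite midpoint rule; adequate for this smooth integrand."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))


# k = 1 reduces to the exponential density exp(-x/theta)/theta.
x, theta = 0.7, 2.0
exp_from_gamma = gamma_pdf(x, 1.0, theta)
exp_direct = math.exp(-x / theta) / theta
print(exp_from_gamma, exp_direct)

# Normalization and mean (k * theta) for arbitrary illustrative parameters.
k = 3.0
norm = integrate(lambda t: gamma_pdf(t, k, theta), 0.0, 80.0)
mean = integrate(lambda t: t * gamma_pdf(t, k, theta), 0.0, 80.0)
print(norm, mean)
```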


When we deal with a very complex system involving a macromolecule, it is likely that we need more sophisticated models. For the canonical ensemble, it was shown that the use of mixing distributions for Gamma state models provides a very powerful method to obtain more sophisticated and accurate models for fluid state systems. We can use a similar approach in the present case, assuming that the solute molecular configurational space of the internal coordinates can be partitioned into a set of {L} subspaces, each one defining a solute-solvent system exactly described by a “local” diverging Gamma state (note that the thermodynamics of pure water is well described over a wide temperature range by a single diverging Gamma state). We can rewrite the total free energy change as

\beta G(\beta) - \beta_0 G(\beta_0) = -N \ln \left\{ \sum_{i=1}^{L} \epsilon_i e^{- [n \Delta (\beta \mu_s) + \Delta (\beta \mu_i) ]} \right\}

\Delta (\beta \mu) = \beta \mu(\beta) - \beta_0 \mu(\beta_0)

where {n \mu_s + \mu_i} is the Gibbs free energy of the system defined by the solute molecular volume which contains {n} solvent molecules and a single solute molecule in the {i}th conformation ({i}th subspace), {\mu_s} is the chemical potential of the solvent and clearly {\mu_i} is the chemical potential of the solute in the {i}^{th} conformation. Finally,

\displaystyle \epsilon_i=\frac{\Xi_{m,i}(\beta_0)}{\Xi_{m}(\beta_0)} = e^{-\beta_0 (\mu_{0,i}-\mu_0)} \ \ \ \ \ (7)

with {\mu_{0,i}=\mu_i(\beta_0), \mu_0=\mu(\beta_0)}, and {\Xi_{m,i}(\beta_0)} the partition function corresponding to the ith conformation at {\beta_0}. At high dilution the solvent molecular partial properties are identical to the pure solvent ones (hence independent of the solute). Within the assumption that each {n \mu_s + \mu_i} can be well modeled by a “local” diverging Gamma state, at infinite solute dilution we have

\displaystyle \mu_i = h_{0,i} - T_0 c_{p0,i} + T(c_{p0,i} - s_{0,i}) + T c_{p0,i}\ln\frac{T_0}{T} \ \ \ \ \ (9)

with {h_{0,i}}, {c_{p0,i}} and {s_{0,i}} the partial molecular enthalpy, heat capacity and entropy of the solute in the {i}^{th} conformation at the reference temperature {T_0=\beta_0^{-1}/k}. From the other general equations of the diverging Gamma state properties we can also obtain all the other partial molecular properties of the solute, e.g. the enthalpy {h_i} and heat capacity {c_{p,i}}

h_i = \left(\frac{\partial \beta \mu_i}{\partial \beta}\right)_{p,n} = h_{0,i} + (T -T_0) c_{p0,i}

c_{p,i} = \left(\frac{\partial h_i}{\partial T}\right)_{p,n} = c_{p0,i}
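The step from Eq. 9 to the linear enthalpy can be verified numerically. The sketch below implements Eq. 9 and differentiates {\beta \mu_i} with respect to {\beta} by central finite differences, then compares the result with {h_{0,i} + (T-T_0) c_{p0,i}}. The parameter values are hypothetical, and molar units are assumed, so the molar gas constant plays the role of {k}.

```python
import math

K_B = 0.0083145  # kJ mol^-1 K^-1; molar gas constant standing in for k (assumption)


def mu_gamma_state(T, h0, s0, cp0, T0):
    """Chemical potential of one conformation, following Eq. 9."""
    return h0 - T0 * cp0 + T * (cp0 - s0) + T * cp0 * math.log(T0 / T)


def enthalpy_numeric(T, h0, s0, cp0, T0, rel_step=1e-5):
    """h = d(beta * mu) / d(beta), by central finite differences in beta."""
    beta = 1.0 / (K_B * T)
    d_beta = rel_step * beta

    def beta_mu(b):
        return b * mu_gamma_state(1.0 / (K_B * b), h0, s0, cp0, T0)

    return (beta_mu(beta + d_beta) - beta_mu(beta - d_beta)) / (2.0 * d_beta)


# Hypothetical parameters: kJ/mol for h0, kJ/(mol K) for s0 and cp0, K for T0.
H0, S0, CP0, T0 = 100.0, 0.3, 5.0, 300.0
T = 330.0

h_closed = H0 + (T - T0) * CP0
h_numeric = enthalpy_numeric(T, H0, S0, CP0, T0)
mu_ref = mu_gamma_state(T0, H0, S0, CP0, T0)  # reduces to h0 - T0 * s0 at T = T0

print(h_closed, h_numeric)
print(mu_ref, H0 - T0 * S0)
```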

Here the zero subscript indicates the value at the reference temperature {T_0}. In the present case, where we deal with complex macromolecules like proteins at fixed pressure, the summation over the {L} conformations in the free energy change above is likely to involve a very large number of Gamma states corresponding to different protein configurational subspaces (conformations). Hence, in order to keep the mathematical derivations, and especially the model application, tractable, we must use drastic simplifications. We will first assume that we can decompose the huge number of Gamma states into two subgroups: one associated with the folded state of the protein and one with the unfolded state. Moreover, we will also assume that within each subgroup the partial molecular heat capacity is the same for the different Gamma states and {h_{0,i} \cong h_0^{0} + j \delta}, with {j} an integer index labeling the Gamma states of the subgroup and {h_0^{0}} and {\delta} the overall “ground state” enthalpy (at {T_0}) and enthalpy gap of the subgroup. Hence, using the subscripts {f} and {u} to define the folded and unfolded state properties respectively, we obtain

\Delta(\beta \mu) = \Delta\left( \beta\frac{G(T)}{N}\right) - n \Delta(\beta\mu_s) \cong - \ln \left\{e^{-(\beta \mu_f-\beta_0 \mu_0)} + e^{-(\beta \mu_u - \beta_0 \mu_0)} \right\}

e^{-(\beta \mu_f-\beta_0 \mu_0)} = e^{-\Delta(\beta \mu^0_f)} \epsilon_f \langle e^{-\Delta \beta \delta_f j} \rangle_f

e^{-(\beta \mu_u - \beta_0 \mu_0)} = e^{-\Delta (\beta \mu^0_u)} \epsilon_u \langle e^{-\Delta \beta \delta_u j} \rangle_u

\langle e^{-\Delta \beta \delta_f j} \rangle_f = \sum_{j=0}^{L_f-1} e^{-\Delta \beta \delta_f j} w_f(j)

\langle e^{-\Delta \beta \delta_u j} \rangle_u = \sum_{j=0}^{L_u-1} e^{-\Delta \beta \delta_u j} w_u(j)

w_f(j) = \frac{\epsilon_{f,j}}{\epsilon_f}

w_u(j) = \frac{\epsilon_{u,j}}{\epsilon_u}

where {\epsilon_f}, {\epsilon_u}, {\mu_f}, {\mu_u} are the total fractions and chemical potentials of the folded and unfolded subgroups and

\Delta (\beta \mu^{0}_f) = h^{0}_{0,f} \Delta \beta - c_{p0,f}T_0 \Delta \beta - \frac{c_{p0,f}}{k} \ln \frac{T}{T_0}

\Delta (\beta \mu^{0}_u) = h^{0}_{0,u} \Delta \beta - c_{p0,u}T_0 \Delta \beta - \frac{c_{p0,u}}{k} \ln \frac{T}{T_0}

From the previous equations we readily obtain the other partial molecular properties, i.e. enthalpy, entropy and heat capacity,

h = \left(\frac{\partial \beta \mu}{\partial \beta}\right)_{n,p} = h_f +\chi (h_u - h_f)

\chi = \frac{e^{-\beta (\mu_u - \mu_f)}}{1+ e^{-\beta (\mu_u -\mu_f)}}

s = \frac{h -\mu}{T}

c_{p} = \left(\frac{\partial h}{\partial T}\right)_{p,n} = c_{p,u} - (1-\chi) (c_{p,u} - c_{p,f}) + (1 - \chi) \chi \frac{(h_u -h_f)^2}{kT^2}

where obviously

h_f = \left(\frac{\partial \beta \mu_f}{\partial \beta}\right)_{p,n}

 h_u = \left(\frac{\partial \beta \mu_u}{\partial \beta}\right)_{p,n}

c_{p,f} = \left(\frac{\partial h_f}{\partial T}\right)_{p,n}

 c_{p,u} = \left(\frac{\partial h_u}{\partial T}\right)_{p,n}

are the corresponding partial molecular properties in the folded or unfolded state. Note that in the limit of infinitesimal {\delta_f} and {\delta_u}, a continuous partition of the solute intramolecular phase space into Gamma states is involved.
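The two-state expressions for {\chi}, {h} and {c_p} can be sketched and checked numerically. In the example below, the folded and unfolded states are each represented by a single Gamma state of the form of Eq. 9, which is a simplifying assumption made only for this illustration; all parameter values are hypothetical and in molar units. The heat capacity from the closed expression is compared with a finite-difference derivative of {h} with respect to {T}.

```python
import math

K_B = 0.0083145  # kJ mol^-1 K^-1; molar gas constant standing in for k (assumption)


def gamma_state(T, h0, s0, cp0, T0):
    """(mu, h, c_p) of a single Gamma state, following Eq. 9."""
    mu = h0 - T0 * cp0 + T * (cp0 - s0) + T * cp0 * math.log(T0 / T)
    return mu, h0 + (T - T0) * cp0, cp0


def two_state(T, folded, unfolded, T0):
    """Unfolded fraction chi, enthalpy h and heat capacity c_p of the mixture."""
    mu_f, h_f, cp_f = gamma_state(T, *folded, T0)
    mu_u, h_u, cp_u = gamma_state(T, *unfolded, T0)
    # chi = e^{-x} / (1 + e^{-x}) with x = beta * (mu_u - mu_f)
    chi = 1.0 / (1.0 + math.exp((mu_u - mu_f) / (K_B * T)))
    h = h_f + chi * (h_u - h_f)
    cp = (cp_u - (1.0 - chi) * (cp_u - cp_f)
          + (1.0 - chi) * chi * (h_u - h_f) ** 2 / (K_B * T * T))
    return chi, h, cp


# Hypothetical parameters (h0, s0, cp0), chosen so that mu_u = mu_f at T0.
T0 = 330.0
FOLDED = (0.0, 0.0, 10.0)
UNFOLDED = (200.0, 200.0 / T0, 15.0)

chi0, h0_mix, cp0_mix = two_state(T0, FOLDED, UNFOLDED, T0)

# Finite-difference check of c_p = dh/dT at T0.
dT = 1e-3
cp_numeric = (two_state(T0 + dT, FOLDED, UNFOLDED, T0)[1]
              - two_state(T0 - dT, FOLDED, UNFOLDED, T0)[1]) / (2.0 * dT)

print(chi0, h0_mix)
print(cp0_mix, cp_numeric)
```

The last term of {c_p}, proportional to {\chi (1-\chi)}, is largest where the two populations are equal, which is what produces the heat capacity peak at the transition.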

For describing the state of complex systems such as biomacromolecules, the discrete-like diverging Gamma state partition seems to provide a better general description. The use of a discrete-like diverging Gamma states partition implies that the thermodynamics of a solvated macromolecule should be a complex mixture between a typical fluid state behavior and a discrete-like “energy” fluctuation. In order to proceed further we must model the discrete probability distribution {w(j)}. A simple discrete distribution, which is physically acceptable and has proved successful in modeling the quantum solid state, is the negative binomial distribution, providing

\langle e^{-\Delta \beta \delta_f j} \rangle_f = \left\{ \frac{1-q_f}{1 -q_f e^{-\Delta \beta \delta_f }} \right\}^{Z_f}

\langle e^{-\Delta \beta \delta_u j} \rangle_u = \left\{ \frac{1-q_u}{1 -q_u e^{-\Delta \beta \delta_u }} \right\}^{Z_u}

where {q} and {Z} are two pure numbers characteristic of the negative binomial distribution. With these last equations we can express the partial molecular properties of the folded and unfolded states as

\beta \mu_f - \beta_0 \mu_0 = h_{0,f}^{0} \Delta \beta - c_{p0,f} T_0 \Delta \beta - \frac{c_{p0,f}}{k} \ln \frac{T}{T_0} - \ln \epsilon_f - Z_f \ln \left\{\frac{1-q_f}{1 -q_f e^{-\Delta \beta \delta_f }}\right\}

\beta \mu_u - \beta_0 \mu_0 = h_{0,u}^{0} \Delta \beta - c_{p0,u} T_0 \Delta \beta - \frac{c_{p0,u}}{k} \ln \frac{T}{T_0} -\ln \epsilon_u - Z_u \ln \left\{\frac{1-q_u}{1 -q_u e^{-\Delta \beta \delta_u }} \right\}

h_f = h_{0,f}^{0} + (T -T_0) c_{p0,f} + \frac{ Z_f q_f \delta_f}{e^{\Delta \beta \delta_f } - q_f}

h_u = h_{0,u}^{0} + (T -T_0) c_{p0,u} + \frac{ Z_u q_u \delta_u}{e^{\Delta \beta \delta_u } - q_u}

and so

c_{p,f} = c_{p0,f} + \frac{ Z_f q_f k (\delta_f \beta)^2 e^{- \Delta \beta \delta_f } } {(1- q_f e^{- \Delta \beta \delta_f })^2}

c_{p,u} = c_{p0,u} + \frac{ Z_u q_u k (\delta_u \beta)^2 e^{- \Delta \beta \delta_u } } {(1- q_u e^{- \Delta \beta \delta_u })^2}

s_f  = \frac{h_f - \mu_f}{T}

s_u  = \frac{h_u - \mu_u}{T}
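The closed form of the negative binomial average used above can be checked by direct summation. The sketch below assumes the standard negative binomial weights {w(j) = \frac{\Gamma(Z+j)}{\Gamma(Z) j!} (1-q)^Z q^j} for {j = 0, 1, 2, \ldots} and writes {a} for the dimensionless factor multiplying {j} in the exponent. The closed form corresponds to the unbounded sum, so the finite upper limits {L_f-1} and {L_u-1} are treated here as effectively infinite; the values of {q}, {Z} and {a} are arbitrary.

```python
import math


def neg_binomial_weight(j, q, Z):
    """w(j) = Gamma(Z + j) / (Gamma(Z) j!) * (1 - q)^Z * q^j, for j = 0, 1, 2, ..."""
    log_w = (math.lgamma(Z + j) - math.lgamma(Z) - math.lgamma(j + 1)
             + Z * math.log(1.0 - q) + j * math.log(q))
    return math.exp(log_w)


def average_by_summation(a, q, Z, n_terms=2000):
    """Truncated sum over j of w(j) * exp(-a * j)."""
    return sum(neg_binomial_weight(j, q, Z) * math.exp(-a * j) for j in range(n_terms))


def average_closed_form(a, q, Z):
    """{(1 - q) / (1 - q e^{-a})}^Z, valid when q * exp(-a) < 1."""
    return ((1.0 - q) / (1.0 - q * math.exp(-a))) ** Z


# Arbitrary illustrative values; Z does not need to be an integer.
Q, Z = 0.6, 12.5
results = {a: (average_by_summation(a, Q, Z), average_closed_form(a, Q, Z))
           for a in (0.3, -0.2)}
for a, (summed, closed) in results.items():
    print(a, summed, closed)
```

A negative {a} corresponds to temperatures above the reference temperature, where the average grows and eventually diverges as {q e^{-a}} approaches one.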

We can simplify the model further by assuming that {q_f = q_u = q} and {\delta_f = \delta_u = \delta}, and by taking the reference temperature {T_0} as the equilibrium temperature, i.e., {\epsilon_f/\epsilon_u=1}.

With these simplifications we obtain

\beta (\mu_u - \mu_f) = (h_{0,u}^{0} - h_{0,f}^{0}) \Delta \beta - (c_{p0,u} - c_{p0,f}) T_0 \Delta \beta - \frac{(c_{p0,u} - c_{p0,f})}{k} \ln \frac{T}{T_0}- (Z_u-Z_f)\ln \left\{\frac{1-q}{1 -q e^{-\Delta \beta \delta }}\right\}

h_{u} - h_{f} = h_{0,u}^{0} - h_{0,f}^{0} + (c_{p0,u} - c_{p0,f}) (T-T_0) + \frac{(Z_u-Z_f) q \delta}{e^{\Delta \beta \delta} - q}

c_{p,u} - c_{p,f} = c_{p0,u} - c_{p0,f} + \frac{(Z_u-Z_f)q k (\delta \beta)^2 } {(e^{\Delta \beta \delta} -q)^2} e^{\Delta \beta \delta}

which can be used to obtain the solute partial molecular properties, e.g., the partial molecular heat capacity via the expression for {c_p} given above.
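A minimal sketch of the simplified model is given below. It assumes that the negative binomial factor depends on {\Delta \beta \delta}, as in the averages {\langle e^{-\Delta \beta \delta j} \rangle} defined earlier, and it checks the closed expressions for {h_u - h_f} and {c_{p,u} - c_{p,f}} against finite-difference derivatives. All parameter values are hypothetical, chosen only to give a transition near {T_0}, and molar units are assumed.

```python
import math

K_B = 0.0083145  # kJ mol^-1 K^-1; molar gas constant standing in for k (assumption)

# Hypothetical model parameters (differences between unfolded and folded states).
T0 = 330.0     # reference (equilibrium) temperature, K
DH0 = 150.0    # h0_u - h0_f, kJ/mol
DCP0 = 4.0     # cp0_u - cp0_f, kJ/(mol K)
DZ = 20.0      # Z_u - Z_f
Q = 0.5        # common q
DELTA = 5.0    # common enthalpy gap delta, kJ/mol


def _delta_beta(T):
    return 1.0 / (K_B * T) - 1.0 / (K_B * T0)


def beta_delta_mu(T):
    """beta * (mu_u - mu_f) with epsilon_f = epsilon_u."""
    d_beta = _delta_beta(T)
    nb_log = math.log((1.0 - Q) / (1.0 - Q * math.exp(-d_beta * DELTA)))
    return (DH0 * d_beta - DCP0 * T0 * d_beta
            - DCP0 / K_B * math.log(T / T0) - DZ * nb_log)


def delta_h(T):
    """h_u - h_f."""
    e = math.exp(_delta_beta(T) * DELTA)
    return DH0 + DCP0 * (T - T0) + DZ * Q * DELTA / (e - Q)


def delta_cp(T):
    """c_{p,u} - c_{p,f}."""
    beta = 1.0 / (K_B * T)
    e = math.exp(_delta_beta(T) * DELTA)
    return DCP0 + DZ * Q * K_B * (DELTA * beta) ** 2 * e / (e - Q) ** 2


def unfolded_fraction(T):
    """chi = 1 / (1 + exp(beta * (mu_u - mu_f)))."""
    return 1.0 / (1.0 + math.exp(beta_delta_mu(T)))


def d_dbeta(f, T, rel_step=1e-5):
    """Central finite difference of f(T(beta)) with respect to beta."""
    beta = 1.0 / (K_B * T)
    db = rel_step * beta
    return (f(1.0 / (K_B * (beta + db))) - f(1.0 / (K_B * (beta - db)))) / (2.0 * db)


for T in (310.0, 330.0, 350.0):
    dcp_numeric = (delta_h(T + 1e-3) - delta_h(T - 1e-3)) / 2e-3
    print(T, unfolded_fraction(T), delta_h(T), d_dbeta(beta_delta_mu, T),
          delta_cp(T), dcp_numeric)
```

With these illustrative numbers the unfolded fraction is exactly one half at {T_0} and rises with temperature; the finite-difference columns should match the closed expressions.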

 NOT YET FINISHED!

BIBLIOGRAPHY

  1. D. Roccatano, A. Di Nola, A. Amadei. A theoretical model for the folding/unfolding thermodynamics of single-domain proteins, based on the quasi-Gaussian entropy theory. J. Phys. Chem. B, 108, 5756-5762 (2004).
  2. M.E.F. Apol. The quasi-Gaussian entropy theory: Temperature dependence of thermodynamic properties using distribution functions. Ph.D. Thesis, Groningen (The Netherlands), 1997.
  3. A. Amadei. Theoretical models for fluid thermodynamics based on the quasi-Gaussian entropy theory. Ph.D. Thesis, Groningen (The Netherlands), 1998.
 

About Danilo Roccatano

I have a Doctorate in chemistry from the University of Roma “La Sapienza”. I have led educational and research activities at different universities in Italy, The Netherlands, Germany and now in the UK. I am fascinated by the study of nature with theoretical and computational models. For years, my scientific research has focused on the study of molecular systems of biological interest using the technique of Molecular Dynamics simulation. I have developed a server (the link is in one of my posts) for statistical analysis at the amino acid level of the effect of random mutations induced by random mutagenesis methods. I am also very active in teaching physical chemistry, computational chemistry, and molecular modeling. I have several other interests and hobbies, such as video/photography, robotics, computer vision, electronics, programming, microscopy, entomology, recreational mathematics and computational linguistics.