Mean-field Method

Mean-field theory is a concept in statistical mechanics for treating the many-body problem approximately. It is an effective theory.

Paramagnetic Material

In paramagnetic materials, the “polarized” spins $ \{s_i\}$ determine the overall “magnetization” of the material. When the spins are all aligned, we get the maximum “magnetization”. If the directions of the spins are completely random, the material is not “magnetized”. In other words, we can use a quantity $ m=\langle s \rangle$, called the magnetization, to quantify the overall “magnetization” of the material.

For simplicity, we assume that the spins can only be up ($ s=1$) or down ($ s=-1$).

The idea of mean-field theory is to decompose each spin into a mean field $ m$ plus fluctuations around this mean field. Mathematically speaking,

$$ s_i = m + \delta s_i, $$

where $ \delta s_i = s_i -m$ is the so-called fluctuation. This treatment requires the material to be homogeneous. For a homogeneous material, we can regard all the spins as equivalent. Thus we can look at one spin $ s_\alpha$, while all other spins become a background environment for this particular spin. This background environment is the mean field, which provides a potential for the spin to interact with.

On the other hand, the average of spins $ \langle s \rangle$ is equivalent to the ensemble average of this particular spin $ \langle s_\alpha \rangle$

$$ \langle s \rangle = \langle s_\alpha \rangle_{ensemble} = \sum_{s_\alpha} P(E_\alpha) s_\alpha, $$

where $ P(E_\alpha)$ is calculated by the Boltzmann distribution,

$$ P(E_\alpha) = \frac{e^{-\beta E_\alpha} }{ \sum_{s_\alpha} e^{-\beta E_\alpha} }. $$

So we need to calculate the energy of the spin first. Using the Hamiltonian for spins

$$ E_\alpha = - s_\alpha H_{total}, $$

where $ H_{total}$ is the total magnetic field including the mean-field. This energy finally becomes

$$ \begin{align} E_\alpha &= - s_\alpha \left( H_{ext} + J \sum_{i, near \alpha} s_i \right) \\ &= -s_\alpha \left( H_{ext} + J \sum_{i, near \alpha} ( m + (s_i - m) ) \right) \\ & = -s_\alpha \left( H_{ext} + J z m + J \sum_{i, near \alpha} ( s_i - m ) \right), \end{align} $$

where $ z$ is the number of spins interacting with $ s_\alpha$.

Since mean-field theory assumes a homogeneous background, the summation of fluctuations, $ \sum_{i, near \alpha} ( s_i - m)$, should be 0.

Then the energy is calculated as

$$ E_\alpha = - s_\alpha (z J m + H_{ext}). $$

Using this we get the ensemble average of spin $ \alpha$,

$$ \begin{align} \langle s_\alpha \rangle &= \sum_{s_\alpha} s_\alpha \frac{e^{-\beta E_\alpha} }{ \sum_{s_\alpha} e^{-\beta E_\alpha} } \\ &= \frac{ e^{\beta (z J m + H_{ext})} - e^{-\beta (z J m + H_{ext})} }{ e^{\beta (z J m + H_{ext})} + e^{-\beta (z J m + H_{ext})} } \\ &= \tanh \left( \beta (z J m + H_{ext}) \right). \end{align} $$

By definition, we know that

$$ m = \langle s \rangle = \langle s_\alpha \rangle_{ensemble} = \tanh \left( \beta (z J m + H_{ext}) \right). $$

This transcendental self-consistency equation determines the magnetization of the material.
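As a quick numerical check, the self-consistency equation can be solved by fixed-point iteration. The sketch below is my own illustration; the values of `z`, `J`, `H`, and the starting point are arbitrary choices.

```python
import numpy as np

def magnetization(beta, z=4, J=1.0, H=0.0, m0=0.5, tol=1e-10, max_iter=10_000):
    """Solve m = tanh(beta * (z * J * m + H)) by fixed-point iteration."""
    m = m0
    for _ in range(max_iter):
        m_new = np.tanh(beta * (z * J * m + H))
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

# With H = 0 the mean-field critical point is beta * z * J = 1 (here beta_c = 0.25):
# below it the only solution is m = 0, above it a nonzero magnetization appears.
for beta in (0.1, 0.24, 0.26, 0.5):
    print(f"beta = {beta}: m = {magnetization(beta):.4f}")
```

For $ H=0$, a nonzero solution appears only when $ \beta z J > 1$, which is the mean-field critical point.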

Neuroscience

Mean-field theory is also applied to networks of neurons. Since it is an effective theory, once the mean-field approximation is applied we can only talk about the statistical features of the network.

This method is explained in Deco20081. This section is an interpretation of it, if not a paraphrase.

Two Phase Spaces

Similar to the $ \mu$ space (6 D) and the $ \Gamma$ space (6N D) in statistical mechanics, it takes many dimensions to describe a neuron network as a single point in a phase space (the $ \Gamma$ space of the network).

To see this, consider a network of $ N$ neurons, where each neuron requires $ \zeta$ variables $ \{ \xi_1, \xi_2,\cdots, \xi_\zeta\}$ for a complete description (say, PSP $ V$, current $ I$, $ \mathcal T$, as mentioned in Deco20081). If we describe the network using these $ \zeta$ variables, it is represented as a bunch of points in such a phase space, which we name the $ \mu$ space. On the other hand, we can construct an $ N\zeta$-dimensional phase space, where each dimension is one variable of a certain neuron. Suppose the neurons are given ids $ i$. The dimensions are then $ \xi_{j,i}$, which stands for the variable $ \xi_j$ of neuron $ i$.

We could imagine how convoluted the new phase space ($ \Gamma$ space) has become.

In any case, the evolution of the network is mapped to the motion of a point in $ \Gamma$ space or motion of $ N$ points in $ \mu$ space.

Equation of Motion

The dynamics of the system is described using the Fokker-Planck equation,

$$ \partial_t p = - \nabla\cdot (f-D \nabla) p, $$

where $ p$ is the density of points in phase space. This is a statistical formalism.

Drift and Diffusion

The two terms on the RHS are dubbed the drift term and the diffusion term, after the physics they describe.

Leaky Integrate-and-fire Model

LIF is a model that uses a capacitor and a resistor to describe the evolution of the membrane potential. The capacitor integrates the input current, and when the potential reaches a threshold the neuron emits a spike and the potential is reset.

The EoM is

$$ \tau \frac{d}{dt} V = - (V - V_L) + IR, $$

where $ I$ is the total synaptic current input of the neuron, $ V_L$ is the leak (resting) potential, $ R$ is the membrane resistance, and $ \tau$ is the membrane time constant.

For a neuron $ i$ in a network, $ I$ comes from all the synaptic delta pulse inputs of other neurons $ j$,

$$ I_i R = \tau \sum_{j=1}^N J_{ij} \sum_k \delta(t-t_j^{(k)}), $$

where $ t_j^{(k)}$ denotes the time of the $ k$-th spike of neuron $ j$.

The equation can be solved using Green’s function method, which gives

$$ V_i = V_L + \sum_{j=1}^N J_{ij} \sum_k e^{-(t-t_j^{(k)})/\tau}\, \Theta(t-t_j^{(k)}), $$

where $ \Theta$ is the Heaviside function. The complication of this model comes from the fact that all neurons are potentially coupled.
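As a rough illustration of the model (not code from Deco20081), the sketch below integrates a single LIF neuron driven by Poisson delta-pulse input, with a hard threshold and reset. All parameter values and the Poisson-input assumption are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (not taken from Deco2008)
tau, V_L = 20e-3, -70e-3          # membrane time constant (s), leak potential (V)
V_thr, V_reset = -50e-3, -70e-3   # firing threshold and reset potential (V)
J, rate_in = 0.5e-3, 2500.0       # synaptic jump per spike (V), total input rate (Hz)
dt, T = 0.1e-3, 1.0               # time step and simulated duration (s)

V = V_L
spike_times = []
for step in range(int(T / dt)):
    n_in = rng.poisson(rate_in * dt)          # delta pulses arriving in [t, t + dt)
    V += -(V - V_L) / tau * dt + J * n_in     # leak decay plus synaptic jumps
    if V >= V_thr:                            # fire and reset
        spike_times.append(step * dt)
        V = V_reset

print(f"output rate ~ {len(spike_times) / T:.1f} Hz")
```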

Following the idea of phase space, the authors of Deco20081 derived a Fokker-Planck equation for spiking neuron models. We could imagine that the $ \Gamma$ space for LIF models would be an $ N$-dimensional one, $ \{V_i\}$, where $ N$ is the number of neurons. On the other hand, the $ \mu$ space description requires only 1 D, namely the potential. In $ \mu$ space, we define a probability density $ p(v,t)$, which gives the probability that a neuron falls in the potential range $ [v,v+dv]$ when multiplied by the size of the phase-space element $ dv$.

The Chapman-Kolmogorov equation describes the evolution of such a density function

$$ p(v,t+dt) = \int_{-\infty}^\infty p(v-\epsilon,t)\rho(\epsilon\vert v-\epsilon) d\epsilon, $$

where $ \rho(\epsilon\vert v-\epsilon)$ is the probability that a neuron has potential $ v$ at $ t+dt$ given that its potential at $ t$ was $ v-\epsilon$. But this integral form is not always useful. The Kramers-Moyal expansion of it gives us the differential form

$$ \begin{equation} \partial_t p(v,t) = \sum_{k=1}^\infty \frac{ (-1)^k }{k!} \frac{\partial^k}{\partial v^k} \left( p(v,t) \lim_{dt\to 0} \frac{ \langle \epsilon^k\rangle_v }{dt} \right), \label{eqn-prob-density-equation-with-kramers-moyal-expansion} \end{equation} $$

where

$$ \langle \epsilon^k\rangle_v = \int_{-\infty}^\infty \epsilon^k\rho(\epsilon\vert v) d\epsilon, $$

which is an average.

The plan is to find a method to calculate $ \langle \epsilon^k \rangle_v$, which is an average conditioned on $ v$.
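To make $ \langle \epsilon^k \rangle_v$ concrete, here is a small sketch that estimates the first two moments from simulated increments $ dV$ of many neurons all sitting at the same potential $ v$. The drift-diffusion form of the increment and all the numbers are assumptions for illustration; they anticipate the coefficients $ \mu$ and $ \sigma$ introduced below.

```python
import numpy as np

rng = np.random.default_rng(1)

tau, V_L = 20e-3, -70e-3     # membrane time constant (s), leak potential (V)
mu, sigma = 15e-3, 5e-3      # assumed drift and diffusion amplitudes (V)
v, dt = -60e-3, 1e-5         # common starting potential and a small time step

# Sample many increments dV for neurons at potential v (drift-diffusion form assumed)
eps = (mu - (v - V_L)) / tau * dt + sigma * np.sqrt(dt / tau) * rng.standard_normal(1_000_000)

M1 = eps.mean() / dt          # should approach (mu - (v - V_L)) / tau
M2 = (eps**2).mean() / dt     # should approach sigma**2 / tau
print(M1, (mu - (v - V_L)) / tau)
print(M2, sigma**2 / tau)
```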

Mean-field Approximation

By assuming that the neurons in this network are identical, we face the same situation as in the paramagnetic material. In other words, we assume that the time average (the system average, in paramagnetic materials) is the same as the ensemble average, which treats all other neurons as a background and considers only one neuron.

The key to this practice is that $ \langle \epsilon^k \rangle_v = \langle \epsilon^k \rangle_{ensemble}$. On the other hand, the increment $ \epsilon$ is just the change of the membrane potential $ dV$ within $ dt$, so we only need to find this $ dV$.

We need to find a quantity that is useful for describing the network, analogous to the magnetization $ m$. The choice can be either the average potential of each neuron $ V_{avg}$ or the total membrane potential $ V=NV_{avg}$. The authors of Deco20081 choose the average membrane potential.

We denote the mean firing rate by $ A$. The equation for $ V$ is

$$ dV = \langle J\rangle N A dt - \frac{V-V_L}{\tau} dt, $$

where $ \langle J\rangle$ is the average synaptic weight, and the mean firing rate, or population activity, is

$$ A = \lim_{dt\to 0} \frac{ \text{number of spikes within time } [t,t+dt] }{dt} \frac{1}{N}. $$
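A small sketch (my own, not from Deco20081) of how the population activity $ A$ could be estimated from a spike raster with a finite time bin:

```python
import numpy as np

def population_activity(spike_times_per_neuron, t, dt):
    """Spikes emitted in [t, t + dt) by the whole population, per neuron per second."""
    n_spikes = sum(
        np.count_nonzero((times >= t) & (times < t + dt))
        for times in spike_times_per_neuron
    )
    return n_spikes / (len(spike_times_per_neuron) * dt)

# Toy raster: 3 neurons with hand-picked spike times (seconds)
raster = [np.array([0.010, 0.032]), np.array([0.011]), np.array([0.030, 0.033])]
print(population_activity(raster, t=0.030, dt=0.005))   # 3 spikes / (3 * 5 ms) = 200 Hz
```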

Since $ \epsilon=dV$, we plug it into Eq. \ref{eqn-prob-density-equation-with-kramers-moyal-expansion}. To save keystrokes, we calculate the moments first.

$$ \begin{align} M^{(k)} &= \lim_{dt\to 0} \frac{1}{dt}\langle \epsilon^k\rangle_v \\ & = \lim_{dt\to 0} \frac{1}{dt}\langle \epsilon^k\rangle_{ensemble} \\ & = \lim_{dt\to 0} \frac{1}{dt} \langle (dV)^k \rangle \\ & = \cdots \\ & = \langle J^k \rangle N A - \frac{v - V_L}{\tau}\,\delta_{k,1}. \end{align} $$

In the derivation we dropped all terms of higher order in the infinitesimal $ dt$, i.e., $ \mathcal O (dt^2)$.

The diffusion approximation allows us to drop the moments with $ k>2$, i.e., the Kramers-Moyal expansion is truncated at second order.

A Simple Argument for Diffusion Approximation?

There should be one.

Whether we have a deeper justification or not, the first two moments

$$ \begin{align} M^{(1)} &= \frac{\mu(t)}{\tau} - \frac{ v-V_L }{\tau} \\ M^{(2)} &= \frac{\sigma^2(t)}{\tau}, \end{align} $$

are related to the drift coefficient $ \mu(t)$ and the diffusion coefficient $ \sigma(t)$. Comparing with the moments computed above, $ \mu(t) = \tau \langle J \rangle N A$ and $ \sigma^2(t) = \tau \langle J^2 \rangle N A$.

Then we obtain the Fokker-Planck equation

$$ \partial_t p(v,t) = \frac{1}{2\tau} \sigma^2(t) \partial^2_v p(v,t) + \partial_v \left[ \frac{ v - V_L - \mu(t) }{\tau} p(v,t) \right], $$

where $ \sigma(t)$ is the diffusion coefficient and $ \mu(t)$ is the drift coefficient.
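To get a feeling for this equation, here is a minimal finite-difference sketch: forward Euler in time, central differences in $ v$, and no firing threshold or reset. The grid, the constant $ \mu$ and $ \sigma$, and the initial condition are all illustrative assumptions.

```python
import numpy as np

tau, V_L = 20e-3, -70e-3       # membrane time constant (s), leak potential (V)
mu, sigma = 10e-3, 4e-3        # assumed constant drift / diffusion amplitudes (V)

v = np.linspace(-90e-3, -40e-3, 501)        # membrane-potential grid
dv = v[1] - v[0]
dt = 0.2 * dv**2 / (sigma**2 / tau)         # small step, keeps forward Euler stable

p = np.exp(-0.5 * ((v - V_L) / 2e-3) ** 2)  # initial density: narrow peak at V_L
p /= p.sum() * dv

drift = (v - V_L - mu) / tau                # coefficient inside the dv-derivative term
for _ in range(20_000):
    diffusion = 0.5 * sigma**2 / tau * np.gradient(np.gradient(p, dv), dv)
    advection = np.gradient(drift * p, dv)  # \partial_v [ (v - V_L - mu)/tau * p ]
    p = p + dt * (diffusion + advection)
    p = np.clip(p, 0.0, None)
    p /= p.sum() * dv                       # re-normalize (boundaries are only approximate)

print("mean potential:", (v * p).sum() * dv)   # drifts toward V_L + mu = -60 mV
```

Without a threshold, the density simply relaxes toward a Gaussian centered at $ V_L + \mu$; the absorbing boundary at the firing threshold and the reset are what make the real problem interesting.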

References and Notes


  1. Deco, G., Jirsa, V. K., Robinson, P. A., Breakspear, M., & Friston, K. (2008). The Dynamic Brain: From Spiking Neurons to Neural Masses and Cortical Fields. PLoS Computational Biology, 4(8). The paper uses the leaky integrate-and-fire model to find the evolution of the membrane potential $ V_i$ of neuron $ i$; the solution is quoted in the Leaky Integrate-and-fire Model section above.

