Physics LibreTexts

1.3: Entropy and Probability

Entropy and Information Theory

It was shown in the classic 1948 work of Claude Shannon that entropy is in fact a measure of information\(^5\). Suppose we observe that a particular event occurs with probability \(p\). We associate with this observation an amount of information \(I(p)\). The information \(I(p)\) should satisfy certain desiderata:

  • Information is non-negative: \(I(p)\ge 0\).
  • If two events occur independently, so their joint probability is \(p_1\,p_2\), then their information is additive: \(I(p_1 p_2) = I(p_1) + I(p_2)\).
  • \(I(p)\) is a continuous function of \(p\).
  • There is no information content to an event which is always observed: \(I(1)=0\).

From these four properties, it is easy to show that the only possible function \(I(p)\) is
\[ I(p) = -A\,\ln p\ , \]
where \(A\) is an arbitrary constant that can be absorbed into the base of the logarithm, since \(\log_b x = \ln x/\ln b\). We will take \(A=1\) and use \(e\) as the base, so \(I(p) = -\ln p\). Another common choice is to take the base of the logarithm to be 2, so \(I(p) = -\log_2 p\). In this latter case, the units of information are known as bits. Note that \(I(0) = \infty\). This means that the observation of an extremely rare event carries a great deal of information.\(^6\)
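The desiderata above can be checked numerically. The following is a minimal sketch in Python; the helper name `info_bits` is my own, not from the text.

```python
import math

def info_bits(p):
    """Information content I(p) = -log2(p) of an event with probability p, in bits."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)

# A certain event (p = 1) carries no information.
assert info_bits(1.0) == 0.0

# A fair coin flip carries exactly one bit.
assert info_bits(0.5) == 1.0

# Additivity for independent events: I(p1*p2) = I(p1) + I(p2).
p1, p2 = 0.3, 0.7
assert math.isclose(info_bits(p1 * p2), info_bits(p1) + info_bits(p2))
```

Non-negativity follows since \(p\le 1\) implies \(-\log_2 p \ge 0\), and rare events (small \(p\)) give large values, illustrating \(I(0)=\infty\).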

Now suppose we have a set of events labeled by an integer \(n\), which occur with probabilities \(\{p_n\}\). What is the expected amount of information in \(N\) observations? Since event \(n\) occurs an average of \(Np_n\) times, and the information content in \(p_n\) is \(-\ln p_n\), we have that the average information per observation is
\[ S = \frac{\langle I_N\rangle}{N} = -\sum_n p_n \ln p_n\ , \]
which is known as the entropy of the distribution. Thus, maximizing \(S\) is equivalent to maximizing the information content per observation.
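The entropy formula is easy to evaluate directly. Here is a sketch (the function name `entropy` and its `base` keyword are my own conventions):

```python
import math

def entropy(probs, base=math.e):
    """Shannon entropy S = -sum_n p_n log(p_n) of a discrete distribution."""
    assert math.isclose(sum(probs), 1.0), "distribution must be normalized"
    # Terms with p = 0 contribute nothing (lim p->0 of -p log p is 0).
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A uniform distribution over Gamma = 4 outcomes has S = ln(4) ...
uniform = [0.25] * 4
assert math.isclose(entropy(uniform), math.log(4))

# ... and any non-uniform distribution over the same outcomes has less entropy.
skewed = [0.7, 0.1, 0.1, 0.1]
assert entropy(skewed) < entropy(uniform)
```

With `base=2` the result is in bits, matching the grade-counting examples below.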

Consider, for example, the information content of course grades. As we shall see, if the only constraint on the probability distribution is that of overall normalization, then \(S\) is maximized when all the probabilities \(p_n\) are equal, i.e. \(p_n = 1/\Gamma\), where \(\Gamma\) is the number of possible grades. The binary entropy is then \(S = \log_2\Gamma\). Thus, for pass/fail grading, the maximum average information per grade is \(-\log_2\!\big(\tfrac{1}{2}\big) = \log_2 2 = 1\) bit. If only A, B, C, D, and F grades are assigned, then the maximum average information per grade is \(\log_2 5 = 2.32\) bits. If we expand the grade options to include {A+, A, A-, B+, B, B-, C+, C, C-, D, F}, then the maximum average information per grade is \(\log_2 11 = 3.46\) bits.

Equivalently, consider, following the discussion in vol. 1 of Kardar, a random sequence \(\{n_1, n_2, \ldots, n_N\}\), where each element \(n_j\) takes one of \(K\) possible values. There are then \(K^N\) such possible sequences, and to specify one of them requires \(\log_2(K^N) = N\log_2 K\) bits of information. However, if the value \(n\) occurs with probability \(p_n\), then on average it will occur \(N_n = Np_n\) times in a sequence of length \(N\), and the total number of such sequences will be
\[ g(N) = \frac{N!}{\prod_{n=1}^K N_n!}\ . \]
In general, this is far less than the total possible number \(K^N\), and the number of bits necessary to specify one from among these \(g(N)\) possibilities is
\[ \log_2 g(N) = \log_2(N!) - \sum_{n=1}^K \log_2(N_n!) \approx -N\sum_{n=1}^K p_n \log_2 p_n\ , \]
up to terms of order unity. Here we have invoked Stirling’s approximation. If the distribution is uniform, then we have \(p_n = \tfrac{1}{K}\) for all \(n \in \{1,\ldots,K\}\), and \(\log_2 g(N) = N\log_2 K\).
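The Stirling estimate above can be tested against the exact multiplicity using log-gamma functions, which avoid computing enormous factorials. This is a sketch with my own helper name `log2_multiplicity`:

```python
import math

def log2_multiplicity(counts):
    """log2 of g(N) = N! / prod(N_n!) via lgamma, avoiding huge factorials."""
    n_total = sum(counts)
    log_g = math.lgamma(n_total + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_g / math.log(2)

# A length-N binary sequence (K = 2) with each symbol appearing N/2 times:
N = 10_000
counts = [5_000, 5_000]
exact = log2_multiplicity(counts)

# Stirling estimate: -N * sum p_n log2 p_n = N bits for the uniform case.
stirling = -N * sum((c / N) * math.log2(c / N) for c in counts)

# The two agree up to sub-extensive (O(log N)) corrections.
assert abs(exact - stirling) / N < 1e-3
```

The residual discrepancy is the "terms of order unity" (here \(O(\log N)\)) mentioned in the text.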

Probability distributions from maximum entropy

We have shown how one can proceed from a probability distribution and compute various averages. We now seek to go in the other direction, and determine the full probability distribution based on a knowledge of certain averages.

At first, this seems impossible. Suppose we want to reproduce the full probability distribution for an \(N\)-step random walk from knowledge of the average \(\langle X\rangle = (2p-1)N\), where \(p\) is the probability of moving to the right at each step (see §1 above). The problem seems ridiculously underdetermined, since there are \(2^N\) possible configurations for an \(N\)-step random walk: \(\sigma_j = \pm 1\) for \(j = 1,\ldots,N\). Overall normalization requires
\[ \sum_{\{\sigma_j\}} P(\sigma_1,\ldots,\sigma_N) = 1\ , \]
but this imposes only one constraint on the \(2^N\) probabilities \(P(\sigma_1,\ldots,\sigma_N)\), leaving \(2^N - 1\) free parameters. What principle allows us to reconstruct the full probability distribution
\[ P(\sigma_1,\ldots,\sigma_N) = \prod_{j=1}^N \big(p\,\delta_{\sigma_j,1} + q\,\delta_{\sigma_j,-1}\big) = \prod_{j=1}^N p^{(1+\sigma_j)/2}\, q^{(1-\sigma_j)/2}\ , \]
corresponding to \(N\) independent steps?

The principle of maximum entropy

The entropy of a discrete probability distribution \(\{p_n\}\) is defined as
\[ S = -\sum_n p_n \ln p_n\ , \]
where here we take \(e\) as the base of the logarithm. The entropy may therefore be regarded as a function of the probability distribution: \(S = S(\{p_n\})\). One special property of the entropy is the following. Suppose we have two independent normalized distributions \(\{p^A_a\}\) and \(\{p^B_b\}\). The joint probability for events \(a\) and \(b\) is then \(P_{a,b} = p^A_a\, p^B_b\). The entropy of the joint distribution is then
\[ \begin{aligned}
S &= -\sum_a \sum_b P_{a,b} \ln P_{a,b} = -\sum_a \sum_b p^A_a\, p^B_b \ln\!\big(p^A_a\, p^B_b\big) = -\sum_a \sum_b p^A_a\, p^B_b \big(\ln p^A_a + \ln p^B_b\big) \\
&= -\sum_a p^A_a \ln p^A_a \sum_b p^B_b \;-\; \sum_b p^B_b \ln p^B_b \sum_a p^A_a = -\sum_a p^A_a \ln p^A_a - \sum_b p^B_b \ln p^B_b \\
&= S^A + S^B\ .
\end{aligned} \]
Thus, the entropy of a joint distribution formed from two independent distributions is additive.

Suppose all we knew about \(\{p_n\}\) was that it was normalized: \(\sum_n p_n = 1\). This is a constraint on the values \(\{p_n\}\). Let us now extremize the entropy \(S\) with respect to the distribution \(\{p_n\}\), subject to the normalization constraint. We do this using Lagrange’s method of undetermined multipliers. We define
\[ S^*\big(\{p_n\},\lambda\big) = -\sum_n p_n \ln p_n - \lambda\Big(\sum_n p_n - 1\Big) \]
and we freely extremize \(S^*\) over all its arguments. Thus, for all \(n\) we have
\[ 0 = \frac{\partial S^*}{\partial p_n} = -\big(\ln p_n + 1 + \lambda\big) \qquad,\qquad 0 = \frac{\partial S^*}{\partial\lambda} = -\Big(\sum_n p_n - 1\Big)\ . \]
From the first of these equations, we obtain \(p_n = e^{-(1+\lambda)}\), and from the second we obtain
\[ \sum_n p_n = e^{-(1+\lambda)} \sum_n 1 = \Gamma\, e^{-(1+\lambda)}\ , \]
where \(\Gamma \equiv \sum_n 1\) is the total number of possible events. Thus, \(p_n = 1/\Gamma\), which says that all events are equally probable.

Now suppose we know one other piece of information, which is the average value
\[ \langle X\rangle = \sum_n X_n\, p_n \]
of some quantity. We now extremize \(S\) subject to two constraints, and so we define
\[ S^*\big(\{p_n\},\lambda_0,\lambda_1\big) = -\sum_n p_n \ln p_n - \lambda_0\Big(\sum_n p_n - 1\Big) - \lambda_1\Big(\sum_n X_n\, p_n - \langle X\rangle\Big)\ . \]
We then have
\[ \frac{\partial S^*}{\partial p_n} = -\big(\ln p_n + 1 + \lambda_0 + \lambda_1 X_n\big) = 0\ , \]
which yields the two-parameter distribution
\[ p_n = e^{-(1+\lambda_0)}\, e^{-\lambda_1 X_n}\ . \]
To fully determine the distribution \(\{p_n\}\) we need to invoke the two equations \(\sum_n p_n = 1\) and \(\sum_n X_n\, p_n = \langle X\rangle\), which come from extremizing \(S^*\) with respect to \(\lambda_0\) and \(\lambda_1\), respectively:
\[ 1 = e^{-(1+\lambda_0)} \sum_n e^{-\lambda_1 X_n} \qquad,\qquad \langle X\rangle = e^{-(1+\lambda_0)} \sum_n X_n\, e^{-\lambda_1 X_n}\ . \]
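In practice the multiplier \(\lambda_1\) must usually be found numerically from the \(\langle X\rangle\) constraint. Since the constrained mean is monotonic in \(\lambda_1\), simple bisection suffices. This is a sketch under my own naming (`maxent_distribution`), not code from the text:

```python
import math

def maxent_distribution(x_vals, x_mean, lam_lo=-50.0, lam_hi=50.0, tol=1e-12):
    """Find lambda_1 in p_n ~ exp(-lambda_1 * X_n) by bisection so that
    sum_n X_n p_n equals the prescribed mean; return the normalized p_n."""
    def mean_at(lam):
        weights = [math.exp(-lam * x) for x in x_vals]
        z = sum(weights)                      # this is Z * exp(1 + lambda_0)
        return sum(x * w for x, w in zip(x_vals, weights)) / z

    # mean_at(lam) decreases monotonically with lam, so bisect on it.
    lo, hi = lam_lo, lam_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_at(mid) > x_mean:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    weights = [math.exp(-lam * x) for x in x_vals]
    z = sum(weights)
    return [w / z for w in weights]

# Four events with X_n = 0,1,2,3 and prescribed <X> = 1.5: by symmetry
# the solution is lambda_1 = 0, i.e. the uniform distribution.
p = maxent_distribution([0, 1, 2, 3], 1.5)
assert all(math.isclose(pi, 0.25, abs_tol=1e-9) for pi in p)
```

Normalization fixes \(\lambda_0\) implicitly through the division by `z`, so only \(\lambda_1\) needs a root search.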

General formulation

The generalization to \(K\) extra pieces of information (plus normalization) is immediately apparent. We have
\[ \langle X^a\rangle = \sum_n X^a_n\, p_n\ , \]
and therefore we define
\[ S^*\big(\{p_n\},\{\lambda_a\}\big) = -\sum_n p_n \ln p_n - \sum_{a=0}^K \lambda_a\Big(\sum_n X^a_n\, p_n - \langle X^a\rangle\Big)\ , \]
with \(X^{(a=0)}_n \equiv \langle X^{(a=0)}\rangle \equiv 1\) accounting for normalization. Then the optimal distribution which extremizes \(S\) subject to the \(K+1\) constraints is
\[ p_n = \exp\!\Big\{-1 - \sum_{a=0}^K \lambda_a\, X^a_n\Big\} = \frac{1}{Z}\,\exp\!\Big\{-\sum_{a=1}^K \lambda_a\, X^a_n\Big\}\ , \]
where \(Z = e^{1+\lambda_0}\) is determined by normalization: \(\sum_n p_n = 1\). This is a \((K+1)\)-parameter distribution, with \(\{\lambda_0,\lambda_1,\ldots,\lambda_K\}\) determined by the \(K+1\) constraint equations above.

Example

As an example, consider the random walk problem. We have two pieces of information:
\[ \sum_{\sigma_1}\!\cdots\!\sum_{\sigma_N} P(\sigma_1,\ldots,\sigma_N) = 1 \qquad,\qquad \sum_{\sigma_1}\!\cdots\!\sum_{\sigma_N} P(\sigma_1,\ldots,\sigma_N)\,\sum_{j=1}^N \sigma_j = \langle X\rangle\ . \]
Here the discrete label \(n\) from §3.2 ranges over \(2^N\) possible values, and may be written as an \(N\)-digit binary number \(r_N \cdots r_1\), where \(r_j = \frac{1}{2}(1+\sigma_j)\) is 0 or 1. Extremizing \(S\) subject to these constraints, we obtain
\[ P(\sigma_1,\ldots,\sigma_N) = C\,\exp\!\Big\{-\lambda \sum_j \sigma_j\Big\} = C \prod_{j=1}^N e^{-\lambda\,\sigma_j}\ , \]
where \(C \equiv e^{-(1+\lambda_0)}\) and \(\lambda \equiv \lambda_1\). Normalization then requires
\[ \mathrm{Tr}\,P \equiv \sum_{\{\sigma_j\}} P(\sigma_1,\ldots,\sigma_N) = C\,\big(e^\lambda + e^{-\lambda}\big)^N\ , \]
hence \(C = (2\cosh\lambda)^{-N}\). We then have
\[ P(\sigma_1,\ldots,\sigma_N) = \prod_{j=1}^N \frac{e^{-\lambda\,\sigma_j}}{e^\lambda + e^{-\lambda}} = \prod_{j=1}^N \big(p\,\delta_{\sigma_j,1} + q\,\delta_{\sigma_j,-1}\big)\ , \]
where
\[ p = \frac{e^{-\lambda}}{e^\lambda + e^{-\lambda}} \qquad,\qquad q = 1 - p = \frac{e^{\lambda}}{e^\lambda + e^{-\lambda}}\ . \]
We then have \(\langle X\rangle = (2p-1)N\), which determines \(p = \frac{1}{2N}\big(N + \langle X\rangle\big)\), and we have recovered the Bernoulli distribution.
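The relations just derived can be verified numerically. The following sketch (helper name `walk_stats` is mine) checks that \(p + q = 1\) and that \(p\) is recovered from \(\langle X\rangle\):

```python
import math

def walk_stats(lam, N):
    """Single-step probabilities and <X> for the max-ent random walk
    with multiplier lam (lambda_1 in the text)."""
    p = math.exp(-lam) / (2.0 * math.cosh(lam))   # P(sigma = +1)
    q = math.exp(+lam) / (2.0 * math.cosh(lam))   # P(sigma = -1)
    x_mean = (2.0 * p - 1.0) * N
    return p, q, x_mean

p, q, x_mean = walk_stats(lam=0.5, N=100)
assert math.isclose(p + q, 1.0)

# <X> = (2p - 1) N, so p = (N + <X>) / (2N):
assert math.isclose(p, (100 + x_mean) / 200.0)
```

Note that \(\lambda = 0\) gives the unbiased walk \(p = q = \tfrac12\), \(\langle X\rangle = 0\).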

Of course there are no miracles\(^7\), and there is an infinite family of distributions for which \(\langle X\rangle = (2p-1)N\) that are not Bernoulli. For example, we could have imposed another constraint, such as \(E = \big\langle \sum_{j=1}^{N-1} \sigma_j\,\sigma_{j+1} \big\rangle\). This would result in the distribution
\[ P(\sigma_1,\ldots,\sigma_N) = \frac{1}{Z}\,\exp\!\bigg\{-\lambda_1 \sum_{j=1}^N \sigma_j - \lambda_2 \sum_{j=1}^{N-1} \sigma_j\,\sigma_{j+1}\bigg\}\ , \]
with \(Z(\lambda_1,\lambda_2)\) determined by normalization: \(\sum_{\{\sigma\}} P(\sigma) = 1\). This is the one-dimensional Ising chain of classical equilibrium statistical physics. Defining the transfer matrix
\[ R_{ss'} = e^{-\lambda_1 (s+s')/2}\, e^{-\lambda_2\, s s'} \quad\text{with}\quad s, s' = \pm 1\ , \]
i.e.
\[ R = \begin{pmatrix} e^{-\lambda_1 - \lambda_2} & e^{\lambda_2} \\ e^{\lambda_2} & e^{\lambda_1 - \lambda_2} \end{pmatrix} = e^{-\lambda_2}\cosh\lambda_1\,\mathbb{I} + e^{\lambda_2}\,\tau^x - e^{-\lambda_2}\sinh\lambda_1\,\tau^z\ , \]
where \(\tau^x\) and \(\tau^z\) are Pauli matrices, we have that
\[ Z_{\rm ring} = \mathrm{Tr}\big(R^N\big) \qquad,\qquad Z_{\rm chain} = \mathrm{Tr}\big(R^{N-1} S\big)\ , \]
where \(S_{ss'} = e^{-\lambda_1 (s+s')/2}\), i.e.
\[ S = \begin{pmatrix} e^{-\lambda_1} & 1 \\ 1 & e^{\lambda_1} \end{pmatrix} = \cosh\lambda_1\,\mathbb{I} + \tau^x - \sinh\lambda_1\,\tau^z\ . \]
The appropriate case here is that of the chain, but in the thermodynamic limit \(N\to\infty\) both chain and ring yield identical results, so we will examine here the results for the ring, which are somewhat easier to obtain. Clearly \(Z_{\rm ring} = \zeta_+^N + \zeta_-^N\), where \(\zeta_\pm\) are the eigenvalues of \(R\):
\[ \zeta_\pm = e^{-\lambda_2}\cosh\lambda_1 \pm \sqrt{e^{-2\lambda_2}\sinh^2\!\lambda_1 + e^{2\lambda_2}}\ . \]
In the thermodynamic limit, the \(\zeta_+\) eigenvalue dominates, and \(Z_{\rm ring} \simeq \zeta_+^N\). We now have
\[ \langle X\rangle = \Big\langle \sum_{j=1}^N \sigma_j \Big\rangle = -\frac{\partial \ln Z}{\partial\lambda_1} = -\frac{N \sinh\lambda_1}{\sqrt{\sinh^2\!\lambda_1 + e^{4\lambda_2}}}\ . \]
We also have \(E = -\partial\ln Z/\partial\lambda_2\). These two equations determine the Lagrange multipliers \(\lambda_1(X,E,N)\) and \(\lambda_2(X,E,N)\). In the thermodynamic limit, we have \(\lambda_i = \lambda_i(X/N,\, E/N)\). Thus, if we fix \(X/N = 2p-1\) alone, there is a continuous one-parameter family of distributions, parametrized by \(\varepsilon = E/N\), which satisfy the constraint on \(X\).
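The transfer-matrix identity \(Z_{\rm ring} = \mathrm{Tr}(R^N) = \zeta_+^N + \zeta_-^N\) is easy to check numerically for small \(N\). This sketch uses plain 2×2 matrix arithmetic; all function names are my own:

```python
import math

def transfer_matrix(l1, l2):
    """2x2 transfer matrix R_{ss'} = exp(-l1 (s+s')/2) exp(-l2 s s'), s = +/-1."""
    return [[math.exp(-l1 - l2), math.exp(l2)],
            [math.exp(l2),       math.exp(l1 - l2)]]

def zeta_pm(l1, l2):
    """Eigenvalues zeta_+/- of R."""
    a = math.exp(-l2) * math.cosh(l1)
    b = math.sqrt(math.exp(-2 * l2) * math.sinh(l1) ** 2 + math.exp(2 * l2))
    return a + b, a - b

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

l1, l2 = 0.3, -0.2
R = transfer_matrix(l1, l2)
zp, zm = zeta_pm(l1, l2)

# Z_ring = Tr(R^N) should equal zeta_+^N + zeta_-^N; check for N = 4.
RN = R
for _ in range(3):
    RN = matmul2(RN, R)
z_ring = RN[0][0] + RN[1][1]
assert math.isclose(z_ring, zp ** 4 + zm ** 4, rel_tol=1e-12)
```

For large \(N\) the \(\zeta_+^N\) term dominates, which is the thermodynamic-limit simplification used in the text.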

So what is it about the maximum entropy approach that is so compelling? Maximum entropy gives us a calculable distribution which is consistent with maximum ignorance given our known constraints. In that sense, it is as unbiased as possible, from an information theoretic point of view. As a starting point, a maximum entropy distribution may be improved upon, using Bayesian methods for example (see §5.2 below).

Continuous probability distributions

Suppose we have a continuous probability density \(P(\boldsymbol{\varphi})\) defined over some set \(\Omega\). We have observables
\[ \langle X^a\rangle = \int_\Omega\! d\mu\; X^a(\boldsymbol{\varphi})\, P(\boldsymbol{\varphi})\ , \]
where \(d\mu\) is the appropriate integration measure. We assume \(d\mu = \prod_{j=1}^D d\varphi_j\), where \(D\) is the dimension of \(\Omega\). Then we extremize the functional
\[ S^*\big[P(\boldsymbol{\varphi}), \{\lambda_a\}\big] = -\int_\Omega\! d\mu\; P(\boldsymbol{\varphi}) \ln P(\boldsymbol{\varphi}) - \sum_{a=0}^K \lambda_a \bigg(\int_\Omega\! d\mu\; P(\boldsymbol{\varphi})\, X^a(\boldsymbol{\varphi}) - \langle X^a\rangle\bigg) \]
with respect to \(P(\boldsymbol{\varphi})\) and with respect to \(\{\lambda_a\}\). Again, \(X^0(\boldsymbol{\varphi}) \equiv \langle X^0\rangle \equiv 1\). This yields the following result:
\[ \ln P(\boldsymbol{\varphi}) = -1 - \sum_{a=0}^K \lambda_a\, X^a(\boldsymbol{\varphi})\ . \]
The \(K+1\) Lagrange multipliers \(\{\lambda_a\}\) are then determined from the \(K+1\) constraint equations above.

As an example, consider a distribution \(P(x)\) over the real numbers \(\mathbb{R}\). We constrain
\[ \int_{-\infty}^\infty\! dx\; P(x) = 1 \quad,\quad \int_{-\infty}^\infty\! dx\; x\, P(x) = \mu \quad,\quad \int_{-\infty}^\infty\! dx\; x^2\, P(x) = \mu^2 + \sigma^2\ . \]
Extremizing the entropy, we then obtain
\[ P(x) = C\, e^{-\lambda_1 x - \lambda_2 x^2}\ , \]
where \(C = e^{-(1+\lambda_0)}\). We already know the answer:
\[ P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}\ . \]
In other words, \(\lambda_1 = -\mu/\sigma^2\) and \(\lambda_2 = 1/2\sigma^2\), with \(C = (2\pi\sigma^2)^{-1/2}\exp\!\big(-\mu^2/2\sigma^2\big)\).
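Completing the square confirms that these multipliers reproduce the Gaussian; the identity can also be checked pointwise. A sketch (function names mine):

```python
import math

def gaussian_maxent_params(mu, sigma):
    """Lagrange multipliers and normalization constant C for the max-ent
    distribution with fixed mean mu and variance sigma^2."""
    lam1 = -mu / sigma ** 2
    lam2 = 1.0 / (2.0 * sigma ** 2)
    C = math.exp(-mu ** 2 / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)
    return lam1, lam2, C

def p_maxent(x, mu, sigma):
    """Max-ent form: C * exp(-lam1 x - lam2 x^2)."""
    lam1, lam2, C = gaussian_maxent_params(mu, sigma)
    return C * math.exp(-lam1 * x - lam2 * x ** 2)

def p_gauss(x, mu, sigma):
    """Standard Gaussian density."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

# The two forms agree pointwise.
for x in (-2.0, 0.0, 0.7, 3.5):
    assert math.isclose(p_maxent(x, mu=1.0, sigma=0.5), p_gauss(x, mu=1.0, sigma=0.5))
```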


This page titled 1.3: Entropy and Probability is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Daniel Arovas.
