Common problems on Shannon entropy in information theory.
Theorem: Deterministic Distribution Minimizes The Entropy
What is the minimum value of $H(p_1, \dots, p_n) = -\sum_{i=1}^n p_i \log p_i$ as $p = (p_1, \dots, p_n)$ ranges over the set of $n$-dimensional probability vectors? Find all $p$'s which achieve this minimum.
Solution: We wish to find all probability vectors $p = (p_1, \dots, p_n)$ which minimize

$$H(p) = -\sum_i p_i \log p_i .$$

Now $-p_i \log p_i \ge 0$, with equality iff $p_i = 0$ or $1$. Besides, we know that $\sum_i p_i = 1$.
Hence the only possible probability vectors which minimize $H(p)$ are those with $p_i = 1$ for some $i$ and $p_j = 0$ for all $j \neq i$. There are $n$ such vectors, i.e., $(1, 0, \dots, 0), (0, 1, 0, \dots, 0), \dots, (0, 0, \dots, 1)$, and the minimum value of $H(p)$ is $0$.
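As a quick numerical check, here is a small Python sketch (the `entropy` helper and the example vectors are our own, using the convention $0 \log 0 = 0$): deterministic vectors give $H = 0$, while any vector with two or more nonzero entries gives $H > 0$.

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i log p_i (nats), with 0 log 0 := 0."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))

# Deterministic vectors achieve the minimum H(p) = 0 ...
print(entropy([1.0, 0.0, 0.0]))    # -> 0.0

# ... while any probability vector with two or more nonzero entries
# has strictly positive entropy.
rng = np.random.default_rng(0)
for _ in range(3):
    p = rng.dirichlet(np.ones(4))  # random 4-dim probability vector
    print(entropy(p) > 0)          # -> True
```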
Source:
- Why is Entropy maximised when the probability distribution is uniform?
- Why does $p \log p \to 0$ when $p \to 0^+$?
Proof
Heuristically, the probability density function on $\{x_1, \dots, x_n\}$ with maximum entropy turns out to be the one that corresponds to the least amount of knowledge of $X$, in other words the uniform distribution.
Now, for a more formal proof, consider the following: a probability density function on $\{x_1, \dots, x_n\}$ is a set of nonnegative real numbers $p_1, \dots, p_n$ that add up to $1$. Entropy is a continuous function of the $n$-tuples $(p_1, \dots, p_n)$, and these points lie in a compact subset of $\mathbb{R}^n$, so there is an $n$-tuple where entropy is maximized. We want to show this occurs at $(1/n, \dots, 1/n)$ and nowhere else.
Suppose the $p_j$ are not all equal, say $p_1 < p_2$. (Clearly $n \neq 1$.) We will find a new probability density with higher entropy. It then follows, since entropy is maximized at some $n$-tuple, that entropy is uniquely maximized at the $n$-tuple with $p_i = 1/n$ for all $i$.
Since $p_1 < p_2$, for small positive $\varepsilon$ we have $p_1 + \varepsilon < p_2 - \varepsilon$. The entropy of $\{p_1 + \varepsilon, p_2 - \varepsilon, p_3, \dots, p_n\}$ minus the entropy of $\{p_1, p_2, p_3, \dots, p_n\}$ equals

$$-p_1 \log\frac{p_1 + \varepsilon}{p_1} - \varepsilon \log(p_1 + \varepsilon) - p_2 \log\frac{p_2 - \varepsilon}{p_2} + \varepsilon \log(p_2 - \varepsilon)$$

To complete the proof, we want to show this is positive for small enough $\varepsilon$. Rewrite the above expression as

$$-p_1 \log\left(1 + \frac{\varepsilon}{p_1}\right) - p_2 \log\left(1 - \frac{\varepsilon}{p_2}\right) + \varepsilon \log\frac{p_2 - \varepsilon}{p_1 + \varepsilon}$$

Recalling that $\log(1 + x) = x + O(x^2)$ for small $x$, the above expression is

$$-\varepsilon + \varepsilon + \varepsilon \log\frac{p_2 - \varepsilon}{p_1 + \varepsilon} + O(\varepsilon^2) = \varepsilon \log\frac{p_2 - \varepsilon}{p_1 + \varepsilon} + O(\varepsilon^2),$$

which is positive when $\varepsilon$ is small enough, since $p_1 < p_2$.
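To see the perturbation argument numerically, here is a short Python sketch (the distribution and the $\varepsilon$ values are our own illustrative choices); it compares the exact entropy gain against the first-order term $\varepsilon \log\frac{p_2 - \varepsilon}{p_1 + \varepsilon}$:

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))

p = np.array([0.1, 0.5, 0.4])            # p1 = 0.1 < p2 = 0.5
for eps in [0.1, 0.01, 0.001]:
    q = p + np.array([eps, -eps, 0.0])   # move eps of mass from p2 to p1
    gain = entropy(q) - entropy(p)       # exact entropy increase
    first_order = eps * np.log((p[1] - eps) / (p[0] + eps))
    print(f"eps={eps}: gain={gain:.6f}, first-order term={first_order:.6f}")
```

As $\varepsilon$ shrinks, the exact gain approaches the first-order term, and both stay positive, as the proof predicts.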
Simpler Proof
Recall that $H(X) \le \log|\mathcal{X}|$, with equality if and only if $X$ has a uniform distribution over $\mathcal{X}$.
Then for any r.v. $X$ with alphabet $\mathcal{X}$, its maximum entropy is $\log|\mathcal{X}|$, and it is achieved by the uniform distribution over $\mathcal{X}$.
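A minimal numerical check of this bound, with an assumed alphabet size $n = 6$ and randomly sampled distributions:

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))

n = 6                                    # assumed alphabet size
print(f"log|X|  = {np.log(n):.4f}")      # upper bound, in nats
print(f"uniform = {entropy(np.full(n, 1 / n)):.4f}")   # attains the bound

rng = np.random.default_rng(1)
for _ in range(3):
    p = rng.dirichlet(np.ones(n))        # random distribution on n symbols
    print(f"random  = {entropy(p):.4f}") # strictly below log|X|
```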
Drawing with and without replacement
An urn contains $r$ red, $w$ white, and $b$ black balls. Which has higher entropy, drawing $k \ge 2$ balls from the urn with replacement or without replacement?
Solution:
Let random variable $X_i$ denote the color of the $i$-th ball drawn. The alphabet of every $X_i$ is the same, $\mathcal{X} = \{1, 2, 3\}$, where
- $X_i = 1$ if the $i$-th ball is red,
- $X_i = 2$ if the $i$-th ball is white,
- $X_i = 3$ if the $i$-th ball is black.
Let $p_{X_i}$ denote the PMF of $X_i$.
For the case with replacement:
For all $i$,

$$p_{X_i} = \left(\frac{r}{r+w+b}, \frac{w}{r+w+b}, \frac{b}{r+w+b}\right).$$

Since the numbers of red, white, and black balls ($r$, $w$, and $b$) do not change during the drawing process, all $X_i$ are independent (because of the replacement).
Due to Theorem: Entropy is additive for independent r.v.,

$$H(X_1, X_2, \dots, X_k) = \sum_{i=1}^k H(X_i).$$

For the case without replacement:
The picks are not independent anymore. Due to the chain rule for joint entropy, we have

$$H(X_1, X_2, \dots, X_k) = \sum_{i=1}^k H(X_i \mid X_{i-1}, \dots, X_1).$$
Meanwhile, due to Theorem: Conditioning reduces entropy, we have

$$H(X_i \mid X_{i-1}, \dots, X_1) \le H(X_i),$$

and by symmetry the marginal distribution of each $X_i$ without replacement is the same as with replacement, so the $H(X_i)$ terms coincide in the two cases. Hence, the sample with replacement has higher entropy.
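The conclusion can be checked by brute force. The following Python sketch (the urn sizes $r, w, b$ and draw count $k$ are hypothetical choices) computes $k\,H(X_1)$ for the with-replacement case and the exact joint entropy for the without-replacement case by enumerating all color sequences:

```python
import numpy as np
from itertools import product

def entropy(p):
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))

r, w, b, k = 2, 1, 1, 3                  # hypothetical urn and draw count
counts = {"red": r, "white": w, "black": b}
n = r + w + b

# With replacement: the X_i are i.i.d., so H(X_1, ..., X_k) = k * H(X_1).
h_with = k * entropy(np.array([r, w, b]) / n)

# Without replacement: enumerate every color sequence of length k and
# accumulate its probability under sequential draws.
probs = []
for seq in product(counts, repeat=k):
    c, total, p = dict(counts), n, 1.0
    for color in seq:
        if c[color] == 0:                # color exhausted: impossible sequence
            p = 0.0
            break
        p *= c[color] / total
        c[color] -= 1
        total -= 1
    if p > 0:
        probs.append(p)
h_without = entropy(np.array(probs))

print(f"H with replacement    = {h_with:.4f} nats")
print(f"H without replacement = {h_without:.4f} nats")  # strictly smaller
```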