Study on Alias Method
@miloyip has published a post recently which motioned the Alias Method to generate a discrete random variable in O(1). After some research, I find out that it is a neat and clever algorithm. Following are some notes of my study on it.
What is Alias Method
Alias method is an efficient algorithm to generate a discrete random variable with specified probability mass function using a uniformly distributed random variable.
Let Z be the discrete random variable which has n possible outcomes z0, z1,
…, zn-1. To make the discussion below simple, we study another variable
Y, where P{Y = i} = P{Z = zi}. And when Y takes on value i, let Z be
zi. So Z can be generated from Y.
Random variable X is uniformly distributed in (0, n), which probability
density function is
Now generate a variable Y' that
A(i) is the alias function. When x falls in range [i, i + 1) (i is an integer), y has the probability F(i) to be i, and probability 1 - F(i) to be A(i). Because x is uniformly distributed,
Let's denote the set of values j that satisfies A(j) = i as A-1(i). The
generated variable Y' has following probability mass function:
Alias method is the algorithm to construct A and F so that P{Y' = i} equals to P{Y = i} for all i. Because the domain of both A and F are integers 0, 1, …, n - 1, they can be stored in array and values can be looked up in O(1), where the space efficiency is in O(n).
In miloyip's implementation, A and F are stored in std::vector<AliasItem> mAliasTable, where A's values are stored in AliasItem::index and F's values
are AliasItem::prob.
Algorithm
Construct Steps
Initialize the set S to be {0, 1, …, n - 1} and n variables pi that with values:
Denote the number of elements in S as ||S||. We have a important invariant that
At the begin of the algorithm, the invariant holds because the sum of all probabilities must equal to 1.
The algorithm is performed using following steps.
- If there is an element i in set S such that pi < 1/n, there must be a j in set S such that pj > 1/n.1 Let A(i) = j and F(i) = pi / (1/n) = pi * n. Remove i from S and subtract n/1 - pi from pj. It is easy to verify that the invariant still holds after these changes.2
- Repeat step 1 until S is empty or there is no more elements i in S that pi < 1/n. If S is empty, the algorithm finishes. Otherwise for all remaining i in S, we must have pi = 1/n.3 Let A(i) = i and F(i) = pi * n = 1 for all remaining i, and remove them from the set S.
The algorithm finishes when S becomes empty, and an element is removed only when its corresponding A and F has been determined, so all values of A and F has been generated.
In miloyip's implementation, pi is stored in AliasItem::prob before i is
removed from the set. When i is removed from the set, AliasItem::prob is set
to F(i).
Correctness
The invariant holds at the beginning and at the end of each step, it guarantees that the algorithm can finish. It is easy to prove it using mathematical induction. So we only need to prove P{Y'=i} = P{Y=i} for any i, i.e.,
Denote p'i as the value of pi when i is removed from set S. Check the construction steps again, we get following properties:
- No pi can increase. Thus pi <= P{Y=i} in all steps and p'i <= P{X=i}.
- pi decreases only when its initial value P{Y=i} > 1/n. So if P{Y=i} <= 1/n, pi = P{Y=i} throughout the algorithm and p'i = P{Y=i}.
- F(i) = p'i * n
- i is removed only when pi <= 1/n, i.e., p'i <= 1/n, thus F(i) = p'i * n <= 1.
- A(j) is set to a value i != j only if pi > 1/n (see step 1), i.e., P{Y=i} > 1/n.
Now consider value i when P{Y=i} < 1/n, P{Y=i} = 1/n and P{Y=i} > 1/n.
P{Y=i} < 1/n
If P{Y=i} < 1/n,
From property 2 and property 3, F(i) = p'i * n = P{Y=i} * n
Apparently A-1(i) = {}, because A is either set to value j where pj > 1/n in step 1 or k where pk = 1/n in step 2.
Thus
which completes the proof.
Intuitive Presentation
The algorithm can be presented in intuitive meaning. The range (0, n] is split into n consecutive sub ranges (i, i + 1] for i = 0, 1, …, n - 1. The probability of X falls into any range is (i + 1 - i) * 1/n = 1/n.
For P{Y=i} = 1/n, we can allocate the whole slot i to it. Let Y = i when x falls in (i, i + 1] which has the probability 1/n.
If P{Y=i} < 1/n, we can allocate the starting part (i, i + n*P{Y=i}] in (i, i + 1]. Let Y = i when x falls in (i, i + n*P{Y=i}], where the probability is n*P{Y=i}*(1/n) = P{Y=i}.
If P{Y=i} > 1/n, we can allocate unused ranges in (j + n*P{Y=j}, j + 1] for any j that P{Y=j} < 1/n. However, unused range is not allowed to be split again.
See the figure below, which demonstrates how to generate Y with probability mass function (n = 5)
- P{Y=0} = 0.16
- P{Y=1} = 0.1
- P{Y=2} = 0.32
- P{Y=3} = 0.22
- P{Y=4} = 0.2
P{Y=4} = 1/n, so let Y = 4 only when x falls in (4, 5], which probability is (5-4)*0.2 = 0.2.
P{Y=0} = 0.16 < 0.2, so let Y = 0 only when x falls in (0, 0.16*5], i.e., (0, 0.8], which probability is (0.8-0)*0.2 = 0.16. (0.8, 1] is unused.
P{Y=1} is the same. (1, 1.5] is allocated and (1.5, 2] is unused.
P{Y=2} = 0.32 > 0.2, it needs ranges with total length 0.32 * 5 = 1.6. We allocates the range (0.8, 1] and (1.5, 2]. The remaining length 1.6 - 0.2 - 0.5 = 0.9 < 1, then we can allocate a part of its own slot. Finally, three ranges have been allocated, (0.8, 1], (1.5, 2] and (2, 2.9]. (2.9, 3] is unused.
Follow the same step to handle Y=3. The final allocation is depicted in D. The allocation is not unique, F depicts another solution.
Footnotes:
1 If all j except i that pj <= 1/n, Sum up both end of the
inequalities for all j and pi < 1/n, we can get
which is conflict with the invariant.
2 The right side has decreased 1/n because ||S|| has decreased 1. The left side has decreased pi + (n/1 - pi) = 1/n, because i is removed from the set and (n/1 - pi) is subtracted from pj. Thus both side decrease the same amount, the equality still holds.
3 Because no pi < 1/n, then pi >= 1/n. To satisfy the invariant, no pi can be larger then 1/n. Thus for all i in S, pi = 1/n.
source of this post.
No related posts.
