A.10 Dirichlet distributions
A.10.1 Beta distribution
Suppose we have a collection of agents of various types in a model, and we know the total number of types, K. Then we often describe the demographic composition of the collection of agents by its empirical distribution, that is, by a K-dimensional vector of the fractions of agents of each type.
The fractions are distributed on a finite-dimensional simplex. The simplest nontrivial probability distribution on it is the Dirichlet distribution. This distribution arises naturally whenever we deal with models of agents of several types or with a finite number of choices.6 Here we proceed without going into the reasons. With only two types or choices, the distribution is particularly simple, because the vector of fractions is (p, 1 - p), where p has a Beta distribution.
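For reference, the Beta density presumably intended here, with parameters a, b > 0, is

```latex
f(p) = \frac{\Gamma(a+b)}{\Gamma(a)\,\Gamma(b)}\, p^{\,a-1} (1-p)^{\,b-1},
\qquad 0 < p < 1 .
```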
A.10.2 Dirichlet distribution
With K ≥ 2, the simplest probability distribution on the simplex ∆K is defined by the density
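The density referred to is presumably the symmetric Dirichlet density on ∆K, which in the notation D(α, K) used below reads

```latex
f(p_1,\dots,p_{K-1})
= \frac{\Gamma(K\alpha)}{\Gamma(\alpha)^K}\,
\prod_{j=1}^{K} p_j^{\,\alpha-1},
\qquad p_K = 1 - p_1 - \cdots - p_{K-1}.
```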
6 There are actually deeper reasons than technical convenience to use the Dirichlet distributions, as discussed in Zabell (1992), for example, in the case of exchangeable partitions induced by agents.
We can directly manipulate the joint density expression as follows: first, write the product of the densities for Yj = yj, j = 1,..., K, with S = s = y1 + ∙ ∙ ∙ + yK; next, write the product of the y's in terms of the p's and s, and note that the expression separates into the product of two factors, one involving only s and the other only the p's.
Then, making use of the Jacobian of the transformation, we end up with
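Assuming, as the surrounding text suggests, that the Yj are i.i.d. Gamma(α) variables with Pj = Yj/S, the computation sketched above is presumably

```latex
\prod_{j=1}^{K} \frac{y_j^{\,\alpha-1} e^{-y_j}}{\Gamma(\alpha)}
\;\xrightarrow{\;y_j = s\,p_j,\ \text{Jacobian } s^{K-1}\;}\;
\underbrace{\frac{s^{\,K\alpha-1} e^{-s}}{\Gamma(K\alpha)}}_{\text{Gamma}(K\alpha)}
\times
\underbrace{\frac{\Gamma(K\alpha)}{\Gamma(\alpha)^K}\,
\prod_{j=1}^{K} p_j^{\,\alpha-1}}_{D(\alpha,\,K)} .
```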
This factored form shows that the sum S of the K i.i.d. Gamma random variables is also a Gamma random variable, with parameter Kα, and that the fractions Pj are independent of S and have the density called the symmetric Dirichlet distribution D(α, K), where the first argument denotes the parameter and the second indicates that it is the density for p1,..., pK, where we use pK = 1 - p1 - ∙ ∙ ∙ - pK-1.
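As a numerical sketch of this gamma-normalization construction, one can sample D(α, K) by normalizing i.i.d. Gamma(α) draws; the parameter values below are illustrative choices, not taken from the text.

```python
import random

def dirichlet_sample(alpha, K, rng=random):
    """Draw one point of the simplex from the symmetric Dirichlet D(alpha, K)
    by normalizing K i.i.d. Gamma(alpha) variables."""
    y = [rng.gammavariate(alpha, 1.0) for _ in range(K)]
    s = sum(y)
    return [yj / s for yj in y]

random.seed(0)
alpha, K, n = 2.0, 4, 20000  # illustrative values
samples = [dirichlet_sample(alpha, K) for _ in range(n)]

# By symmetry, each fraction P_j has mean 1/K.
mean_p1 = sum(p[0] for p in samples) / n
print(round(mean_p1, 2))
```

The draw sums to one by construction, and the empirical mean of each component is close to 1/K, as the symmetry of D(α, K) requires.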
We also note that the Laplace transform (or the moment generating function) of the gamma distribution is
for θ > -1. This shows that the Gamma distribution is infinitely divisible, because (1 + θ)^{-αt} is the Laplace transform of the Gamma distribution with parameter αt for every positive t.
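Explicitly, for Y with the Gamma(α) density, the transform and the divisibility property presumably read

```latex
E\!\left[e^{-\theta Y}\right] = (1+\theta)^{-\alpha},
\qquad \theta > -1,
\qquad\text{so that}\qquad
(1+\theta)^{-\alpha} = \left[(1+\theta)^{-\alpha/n}\right]^{n}
\quad\text{for every } n \ge 1 .
```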
Hence, we have the Lévy-Khinchin representation
This identifies γ(dz) = z^{-1}e^{-z} dz as the measure for the gamma process.
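The representation in question is presumably the Frullani-integral identity

```latex
(1+\theta)^{-\alpha}
= \exp\!\left( -\alpha \int_0^{\infty}
\left(1 - e^{-\theta z}\right) z^{-1} e^{-z}\, dz \right),
```

since the inner integral equals \(\log(1+\theta)\).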
Let Π be a Poisson process on the real half-line S = (0, ∞). The count function is defined as
for every measurable A ⊂ S. This function is such that
for disjoint Aj in S. This is a completely random measure with integer values.
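In symbols, the count function and its additivity are presumably

```latex
N(A) = \#\{\Pi \cap A\},
\qquad
N\Big(\bigcup_{j} A_j\Big) = \sum_{j} N(A_j),
```

with the \(N(A_j)\) mutually independent for disjoint \(A_j\).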
These normalized components, j = 1,..., K, define the K components of a random vector in ∆K that has the density of the Dirichlet distribution D(α, K).
A.10.3 Marginal Dirichlet distributions
To obtain the marginal distributions, take the joint density and integrate qK-1 out. What is left is the Dirichlet distribution D(a1, a2,..., aK-2, aK-1 + aK).
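This merging property can be checked numerically through the gamma representation; here we only verify the implied mean E[qK-1 + qK] = (aK-1 + aK)/(a1 + ∙ ∙ ∙ + aK), with illustrative parameter values not taken from the text.

```python
import random

def dirichlet(a, rng=random):
    """Draw from the general Dirichlet D(a_1,...,a_K) via Gamma normalization."""
    y = [rng.gammavariate(ai, 1.0) for ai in a]
    s = sum(y)
    return [yi / s for yi in y]

random.seed(1)
a = [1.0, 2.0, 3.0, 4.0]  # illustrative parameters
n = 20000

# Mean of the merged last two coordinates under D(a_1,...,a_K).
merged_mean = sum(q[-2] + q[-1] for q in (dirichlet(a) for _ in range(n))) / n
expected = (a[-2] + a[-1]) / sum(a)
print(round(merged_mean, 2), expected)
```

The empirical mean of qK-1 + qK matches the mean of the merged coordinate of D(a1,..., aK-2, aK-1 + aK), as the marginalization property predicts.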
A.10.4 Poisson-Dirichlet distribution
Given a Poisson process with the above rate function, the Poisson-Dirichlet process can be constructed as shown by Kingman (1993, Chap. 9).
A.10.5 Size-biased sampling
We follow Kingman (1993, p. 98) to show that size-biased samples of Dirichlet distributions have the same distribution as those due to the residual allocation process. See also Pitman (1996).
Suppose that a vector p = (p1, p2,..., pK) with exchangeable components has the symmetric Dirichlet distribution D(a, K). Let ν be a random variable on {1, 2,..., K} with the conditional probabilities P(ν = j | p) = pj.
In sampling from a population of agents of K types with fractions pj, j = 1,..., K, type j will be drawn with probability pj; that is, the first sample is of type j with probability pj. For this reason pν is said to be obtained by size-biased sampling. If all agents of the sampled type are removed, then the remaining agents have fractions p1,..., pν-1, pν+1,..., pK. We can renormalize the fractions by dividing the components of this vector by 1 - pν. Denote the renormalized vector by q(1).
The joint density for (pν, p1,..., pν-1, pν+1,..., pK) is D(a + 1, a, a,..., a).
This can be seen by noting that, by symmetry, the vector (pj, p1,..., pj-1, pj+1,..., pK) has the same density as the original Dirichlet distribution, and the event ν = j occurs with probability pj. Hence, the density of (pν, p1,..., pν-1, pν+1,..., pK) is Kp1 times the density of the original Dirichlet distribution, which is the density for the distribution D(a + 1, a,..., a), where a is repeated K - 1 times. From this, the marginal density for pν is seen to be
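Given the form D(a + 1, a,..., a), the marginal in question is presumably the Beta(a + 1, (K - 1)a) density

```latex
f(p_\nu)
= \frac{\Gamma(Ka+1)}{\Gamma(a+1)\,\Gamma\big((K-1)a\big)}\,
p_\nu^{\,a}\,(1-p_\nu)^{(K-1)a-1},
\qquad 0 < p_\nu < 1 .
```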
If we let a go to zero, while letting Ka approach θ, then the marginal density approaches Beta(1, θ).
Also, given pν, the sum of the remaining components is p1 + ∙ ∙ ∙ + pν-1 + pν+1 + ∙ ∙ ∙ + pK = 1 - pν. Conditional on pν, the renormalized vector q(1) of the remaining components has the same distribution as p(1), where p(1) has the (K - 1)-dimensional Dirichlet distribution D(a, K - 1). Now, apply size-biased sampling to q(1) to produce q(2), and so on.
At the end of this process, we obtain the rearranged vector q of p, such that
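Presumably the intended display is the residual-allocation (stick-breaking) form

```latex
q_1 = v_1,
\qquad
q_j = v_j \prod_{i=1}^{j-1} (1 - v_i),
\qquad j = 2, \dots, K,
```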
where the v's are independent, and vj has the density of B(a + 1, (K - j)a). This density approaches B(1, θ) as a approaches zero and K approaches infinity in such a way that Ka approaches θ.
This is the reverse of starting from random variables distributed as B(1, θ), forming the q's by residual allocation, and denoting the kth largest of the q's by pk. The latter process produces (p1, p2,...), which has the Poisson-Dirichlet distribution with parameter θ as its limiting distribution. See Kingman (1993).
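The limiting residual-allocation process can be sketched numerically: break a unit stick with independent Beta(1, θ) proportions and sort the pieces in decreasing order to approximate a Poisson-Dirichlet draw. The value of θ and the truncation level below are illustrative choices, not from the text.

```python
import random

def residual_allocation(theta, n_sticks, rng=random):
    """Truncated residual-allocation vector with independent B(1, theta) breaks:
    q_j = v_j * prod_{i<j} (1 - v_i)."""
    q, remaining = [], 1.0
    for _ in range(n_sticks):
        v = rng.betavariate(1.0, theta)
        q.append(v * remaining)
        remaining *= 1.0 - v
    return q

random.seed(2)
theta = 1.5  # illustrative parameter
q = residual_allocation(theta, 200)

# Sorting in decreasing order gives an approximate Poisson-Dirichlet draw.
p = sorted(q, reverse=True)
print(round(sum(q), 6))  # the pieces nearly exhaust the unit stick
```

With 200 sticks the leftover mass is negligible, so (p1, p2,...) approximates a draw from the Poisson-Dirichlet distribution with parameter θ.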