Dynamics of clustering processes
An organization in category i may move up or down to category j in some small time interval. Transition rates w(i, j) specify the probability intensity of such a move. Once such transition rates are specified for all possible moves, all the machinery in connection with the master-equation solutions can, in principle, be employed to calculate probability densities for stationary or nonstationary accumulation or decumulation processes.
This is how one would address growth problems. Examples of growth processes described in terms of the partition vectors are found in Kelly (1979, Chap. 8). Some basic elements of this approach have already been explained. We add further detail and explanations next.10.5.1 Examples of clustering-process distributions
The first example is in Kelly (1979). Instead of the biological interpretation he gives, we think of it as a model of a sector of the economy in which an opportunity for a submarket opens up and a firm of size one enters. This is modeled by a transition rate
with positive constants n and λ, that is, all entrants are scaled to be of unit size. In addition, an organization of size j may grow by a unit to become size j + 1. The probability intensity of this event is modeled as
where we define
because an agent joining a cluster of size
j causes the number of clusters of size j + 1 to increase by one, and that of size j to decrease by one. The vector u1 is the same as e1.
The organization also may become smaller by a unit. Assume that
for j ≥ 1, where e0 = 0.
The equilibrium probability distribution is obtained, as usual, by the application of the detailed-balance conditions to be
where
The identity due to Kendall (1967),
is used again to calculate the normalizing constant. This expression shows that the random variables a1, a2,... are independent, each with Poisson distribution with mean β j.
Instead of organizations growing or decaying by unit amounts, we may model the process of merger, acquition, or breakups of large firms more directly. Here, we refer to a cluster of size r as an r-cluster.
For the conditional-probability setup, we follow Costantini and Garibaldi (1979, 1989) and posit
where we assume that all additive parameters fi above are the same f, and set θ — Kf. We note as before that
and for the composite event
The detailed-balance conditions yield the equilibrium distribution as
This is the justly famous Ewens sampling formula (Ewens 1972, 1990), taken up in the next subsection.
Next, we adapt some examples from Kelly (1979) and Watterson (1976). There are n basic units partitioned into distinct clusters or collections, with ai being the number of groups consisting of i units. Recall that we mean by units some basic building blocks from which objects that cluster are made up.
The transition rate w(a, a + ei) — f represents the process in which a basic unit, or singleton (called an isolate in Kelly (1979, Chap. 8)) joins a group of size i at rate f, i — 1, 2,.... The transition rate w (a, a - ei) — β refers to the rate at which a unit leaves that group to become an isolate. Call a cluster of size i an i -cluster.
is the transition rate for one «-cluster breaking up into one r-cluster and one λ-cluster. In the simple example described above, we have
and
for i ≥ 2.
where B is a normalizing constant, provided there exist positive numbers c1, c2,... satisfying
This is seen to be true by verifying the detailed-balance conditions
8 We use ai rather than mi used by Kelly because we use mi to denote the number of structures later. See Arratia and Tavare (1994).
We can easily verify that the detailed-balance conditions are satisfied. In the simple example, we note that
satisfies the detailed-balance conditions.
For a closed model with n fixed, the equilibrium distribution
is an example of the assemblies analyzed by Arratia and Tavare (1994). Here mi = β∕α serves as the number of labeled structures on a set of size i, that is, the number of labeled structures in this example is independent of the size of the block.
In the more general development that follows the example, if we set
then mr is the number of labeled structures on a set of size r. 
To open this model, that is, to treat n as a random variable, let an entrant be of unit size,
and
Then, if there exist positive numbers c1, c2,... such that
and
id="Picutre 580" class="lazyload" data-src="/files/uch_group31/uch_pgroup304/uch_uch7235/image/image580.jpg">
then
This shows that a is reversible in equilibrium, and the partition variables a1, a2,... are independent Poisson random variables with mean cr. Again, we see the truth of this by verifying the detailed-balance conditions π (a)υ = π ( a + e1)μ(a1 + 1). These are Kelly's Theorems 8.1 and 8.2.
Suppose that we have u = r + s, and abbreviate λrsu by λrs and similarly μrsu by μrs. Suppose also that r clusters enter or leave, instead of firms of unit size. This is modeled by
and
Example. Suppose that
The latter probability
intensity reflects the number of ways a firm of size r + s may break up into r- and s-clusters.
From the entry we must have
in addition to the usual
Define the positive constants cr by
10.5.2 Example: Ewens sampling formula
We have seen that the number of selection structures with a given partition vector a is given by
Let p(n) be the sum of the above over all possible a subject to the constraint JZ iai = n, and assign probability N(n, a)∣p(n) to this a.
As Kelly, Ewens, and others point out, letting K go to infinity causes the probability that any given type is present in the model (samples of agents) to tend to zero. One way to avoid this is to work with order statistics, to which we return later. Here, we restate the Ewens formula in terms of the partition vector before letting K go to infinity.
To express (10.1) in terms of the a's, we note that among n1, n2,..., nn, a number ai are equal to i, i = 1, 2,..., n. Put differently, the product n1n2 ∙ ∙ ∙ nn = 1α12α2 ∙ ∙ ∙ nan. Also, there are K!∣a1!a2! ∙ ∙ ∙ an! many ways of satisfying the constraint. We rewrite the equilibrium distribution (10.1), which is written in terms of the state vector n, in terms of the partition vector. For simplicity, take fk = f for all k. Thus, we have9
[1] If the sum of a’s is k < K, then K! in the numerator is replaced with K !∣(K - k)!, which is the number of ways of selecting k types out of K.
and compare the coefficients of xn on both sides.
See Section 11.2, and Hoppe (1987) as well. He uses the device of urn models. In particular, after sampling n agents, the probability that the next draw is a new type, that is, the entry rate for an agent of a type not represented in the sample, is θ∕(θ + n - 1). Costantini and Garibaldi (1989, 2000) have a natural way of deriving this. Their approach is sketched in Appendix A.3.1. Kelly (1977) has derived that in a sequential sample of size n, the probability that the type of the first-drawn agent has representative i in this sample of size n is
Hoppe observes that the first agent observed carries label (type) 1. Its future occurrence in the sample can be described by a two-type Polya urn of binomial type with success probability Beta(1, θ). Hence
10.5.3 Dynamics of partition patterns: Example
Here is a simple Markovian dynamics, expressed in terms of partition vectors in Kelly (1979). See also Aoki (1996a, Sec. 4.5.5) for details. Recall that a partition vector defined by
indicates that aj is the number of subgroups with exactly j agents in each of them. Only a finite number of elements of a are nonzero, as in Shepp and Lloyd (1966), and thus the set of all partition vectors is countable. Assume that the transition rates are specified by
and
where υ is a constant and λ and υ are functions of n. Note that when one agent leaves a group with j people in it to join another subgroup with j people, the net effect is to reduce the number of subgroups with j agents by one and increase the number of subgroups with j + 1 agents in each of them. It has been suggested in the last part of Chapter 9 that this scheme may be applied to modify the model of Kiyotaki and Wright (1993) to allow money traders to hold several units of money.
In this example, n is also a random variable and has the transition rates
and
These are obtained from the basic transition rates specified above by appropriate additions.
In this case there is an equilibrium probability distribution for n verifying the detailed-balance condition
where we assume that
The equilibrium probability distribution for the partition vector also exists in a similar form:
where we use the relation
That this is a bona fide probability
distribution is verified by the sum becoming one. To see this use the Kendall- Kelly identity. The detailed-balance conditions are easily verified to show the existence of the equilibrium distribution.
Note that this shows that the structure is an assembly with
id="Picutre 607" class="lazyload" data-src="/files/uch_group31/uch_pgroup304/uch_uch7235/image/image607.jpg">members.
This probability has the form of the product of Poisson random variables with mean
The products of Poisson distri
butions are not as special as they may at first appear. Some models in the next section have this type of equilibrium distributions. Actually, distributions of Poisson random variables, conditional on their weighted sums, arise in a fairly general context, as shown by Arratia and Tavare (1992).
10.6