The usage of cluster algorithms for
applications started in
the late 1970's. A number of different approaches were proposed
[Bab80], see review in [Mor98]. Of these, we will here
only discuss those based on binary joining. In this kind of
approach, initially each final-state particle is considered to be a
cluster. Using some distance measure, the two nearest clusters are
found. If their distance is smaller than some cut-off value,
the two clusters are joined into one. In this new configuration, the
two clusters that are now nearest are found and joined, and so on until
all clusters are separated by a distance larger than the cut-off.
The clusters remaining at the end are often also called jets. Note that,
in this approach, each single particle belongs to exactly one
cluster. Also note that the resulting jet picture explicitly depends
on the cut-off value used. Normally the number of clusters is allowed to
vary from event to event, but occasionally it is more useful to have
the cluster algorithm find a predetermined number of jets (like 3).
The obvious choice for a distance measure is to use squared invariant
mass, i.e. for two clusters
and
to define the
distance to be
| (290) |
Another instability may be seen by considering the clustering in a simple 2-jet event. By the time that clustering has reached the level of three clusters, the `best' the clustering algorithm can possibly have achieved, in terms of finding three low-mass clusters, is to have one fast cluster around each jet, plus a third slow cluster in the middle. In the last step this third cluster would be joined with one of the fast ones, to produce two final asymmetric clusters: one cluster would contain all the slow particles, also those that visually look like belonging to the opposite jet. A simple binary joining process, with no possibility to reassign particles between clusters, is therefore not likely to be optimal.
The solution adopted [Sjö83] is to reject invariant
mass as distance measure. Instead a jet is defined as a collection
of particles which have a limited transverse momentum with respect to
a common jet axis, and hence also with respect to each other. This
picture is clearly inspired by the standard fragmentation picture,
e.g. in string fragmentation. A distance measure
between two
particles (or clusters) with momenta
and
should
thus not depend critically on the longitudinal momenta but only on the
relative transverse momentum. A number of such measures were tried,
and the one eventually selected was
![]() |
(291) |
For small relative angle
, where
and
, this measure reduces to
![]() |
(292) |
The basic scheme is of the binary joining type, i.e. initially each
particle is assumed to be a cluster by itself. Then the two clusters
with smallest relative distance
are found and, if
, with
some predetermined distance, the two clusters are joined
to one, i.e. their four-momenta are added vectorially to give the
energy and momentum of the new cluster. This is repeated until the
distance between any two clusters is
. The number
and momenta of these final clusters then represent our reconstruction
of the initial jet configuration, and each particle is assigned to one
of the clusters.
To make this scheme workable, two further ingredients are necessary,
however. Firstly, after two clusters have been joined, some particles
belonging to the new cluster may actually be closer to
another cluster. Hence, after each joining, all particles in the event
are reassigned to the closest of the clusters. For particle
, this
means that the distance
to all clusters
in the event has
to be evaluated and compared. After all particles
have been considered, and only then, are cluster momenta recalculated
to take into account any reassignments. To save time, the assignment
procedure is not iterated until a stable configuration is reached,
but, since all particles are reassigned at each step, such an
iteration is effectively taking place in parallel with the cluster
joining. Only at the very end, when all
, is
the reassignment procedure iterated to convergence -- still with the
possibility to continue the cluster joining if some
should
drop below
due to the reassignment.
Occasionally, it may occur that the reassignment step leads to an empty
cluster, i.e. one to which no particles are assigned. Since such a
cluster has a distance
to any other cluster, it is
automatically removed in the next cluster joining. However, it is
possible to run the program in a mode where a minimum number of jets
is to be reconstructed. If this minimum is reached with one cluster
empty, the particle is found which has largest distance to the cluster
it belongs to. That cluster is then split into two, namely the
large-distance particle and a remainder. Thereafter the reassignment
procedure is continued as before.
Secondly, the large multiplicities normally encountered means that,
if each particle initially is to be treated as a separate cluster, the
program will become very slow. Therefore a smaller number of clusters,
for a normal
event typically 8-12, is constructed as a starting
point for the iteration above, as follows. The particle with the
highest momentum is found, and thereafter all particles within a
distance
from it, where
.
Together these are allowed to form a single cluster. For the
remaining particles, not assigned to this cluster, the procedure is
iterated, until all particles have been used up. Particles in the
central momentum region,
are treated
separately; if their vectorial momentum sum is above
they are allowed to form one cluster, otherwise they are left
unassigned in the initial configuration. The value of
,
as long as reasonably small, has no physical importance, in that the
same final cluster configuration will be found as if each particle
initially is assumed to be a cluster by itself: the particles
clustered at this step are so nearby anyway that they almost
inevitably must enter the same jet; additionally the reassignment
procedure allows any possible `mistake' to be corrected in later
steps of the iteration.
Thus the jet reconstruction depends on one single parameter,
, with a clearcut physical meaning of a transverse
momentum `jet-resolution power'. Neglecting smearing from
fragmentation,
between two clusters of equal energy
corresponds to half the invariant mass of the two original partons.
If one only wishes to reconstruct well separated jets, a large
should be chosen, while a small
would allow the separation of close jets, at the
cost of sometimes artificially dividing a single jet into two. In
particular,
quark jets may here be a nuisance. The value of
to use for a fixed jet-resolution power in principle
should be independent of the c.m. energy of events, although
fragmentation effects may give a contamination of spurious extra jets
that increases slowly with
for fixed
.
Therefore a
GeV was acceptable at PETRA/PEP,
while 3-4 GeV may be better for applications at LEP and beyond.
This completes the description of the main option of the PYCLUS routine. Variations are possible. One such is to skip the reassignment step, i.e. to make use only of the simple binary joining procedure, without any possibility to reassign particles between jets. (This option is included mainly as a reference, to check how important reassignment really is.) The other main alternative is to replace the distance measure used above with the one used in the JADE algorithm [JAD86].
The JADE cluster algorithm is an attempt to save the invariant mass
measure. The distance measure is defined to be
![]() |
(293) |
The
measure is very closely related to the squared mass
distance measure: the two coincide (up to the difference in
normalization) if
. However, consider a pair of
particles or clusters with non-vanishing individual masses and a fixed
pair mass. Then, the larger the net momentum of the pair, the
smaller the
measure. This somewhat tends to favour clustering
of fast particles, and makes the algorithm less unstable than the one
based on true invariant mass.
The successes of the JADE algorithm are well known: one obtains a good
agreement between the number of partons generated on the matrix-element
(or parton-shower) level and the number of clusters reconstructed from
the hadrons, such that QCD aspects like the running of
can
be studied with a small dependence on fragmentation effects. Of
course, the insensitivity to fragmentation effects depends on the choice
of fragmentation model. Fragmentation effects are small in the string
model, but not necessarily in independent fragmentation scenarios.
Although independent fragmentation in itself is not credible,
this may be seen as a signal for caution.
One should note that the JADE measure still suffers from some of the
diseases of the simple mass measure (without reassignments), namely
that particles which go in opposite directions may well be joined
into the same cluster. Therefore, while the JADE algorithm is a good
way to find the number of jets, it is inferior to the standard
measure for a determination of jet directions and energies
[Bet92]. The
measure also gives narrower jets, which
agree better with the visual impression of jet structure.
Later, the `Durham algorithm' was introduced [Cat91],
which works as the JADE one but with a distance measure
![]() |
(294) |
The main difference therefore is that the standard PYCLUS option allows reassignments, while the Durham algorithm does not. The latter is therefore more easily calculable on the perturbative parton level. This point is sometimes overstressed, and one could give counterexamples why reassignments in fact may bring better agreement with the underlying perturbative level. In particular, without reassignments, one will make the recombination that seems the `best' in the current step, even when that forces you to make `worse' choices in subsequent steps. With reassignments, it is possible to correct for mistakes due to the too local sensitivity of a simple binary joining scheme.