Measuring Biological Diversity
Anne Magurran's book, Measuring Biological
Diversity, published in 2004 by Blackwell Publishing,
is destined to become an important reference for students of
ecology. It is a useful compendium of information and citations
for diversity analysis, and a good summary of how ecologists
think about diversity analysis. A critique of this book is therefore
not so much a critique of the author's thinking as a critique
of the field as a whole.
I will concentrate on Chapter 4, "An index
of diversity...", Chapter 5, "Comparative studies
of diversity", and Chapter 6, "Diversity in space
(and time)". These chapters are the core of the book. Some
background material related to this criticism can be found in
my Oikos paper, Entropy
and diversity.
Magurran Chapter 4,
"An index of diversity...":
Box 4.1 (p. 101):
Every student wants to know what index to use
for a particular problem, so Magurran's Box 4.1, "How to
choose a diversity index", will be popular. Unfortunately
this box contains misleading information. The criticisms below
are organized according to the item numbers in Magurran's box.
1. Magurran wisely encourages students to think
about the aspects of diversity they need to measure, and she
correctly argues that it is not good to calculate many different
diversity indices and then pick the one that gives the most
attractive answer. Nevertheless, different indices of diversity
place different emphases on the common or rare species of a
sample, and it is therefore legitimate (even recommendable)
to present more than one diversity. Best is to present a continuous
graph of the Hill numbers (see Magurran p. 148-149) if sample
size is large enough to allow the accurate estimation of the
quantities involved. If sample size is small, it is best to
present just the diversities of order 0, 1, and 2 (N0, N1, and
N2 in Magurran's notation) and to calculate each of these using
Chao estimators (discussed for N0, which is species richness,
in Magurran p. 87; for N1, which is the exponential of Shannon
entropy, take the exponential of the estimator given in Chao
and Shen 2003; for N2, which is the reciprocal of Simpson concentration,
follow the method outlined in Chao and Shen 2003). It may be
possible to derive a continuous estimator of the Hill numbers;
Anne Chao is currently working on this problem with me. If we
are able to solve it, then it would be possible and desirable
to plot a continuous graph of Hill numbers even when sample
size is small. Such a graph contains complete information about
the diversity of the system under study; the value of any standard
diversity index can be calculated from such a graph.
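As a rough illustration of the diversities of order 0, 1, and 2 and of the continuous graph of Hill numbers described in item 1, here is a minimal Python sketch. The function name hill_number and the sample counts are invented for illustration, and the values are plain plug-in estimates without the Chao-type small-sample corrections recommended above.

```python
import math

def hill_number(abundances, q):
    """Diversity of order q (Hill number) from raw species counts."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-9:
        # Limit as q -> 1: the exponential of Shannon entropy (N1).
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

sample = [50, 25, 15, 5, 3, 1, 1]   # hypothetical counts, one entry per species

# N0 (species richness), N1 (exp of Shannon entropy), N2 (inverse Simpson).
for q in (0, 1, 2):
    print(q, round(hill_number(sample, q), 2))

# Points for a diversity profile: Hill number as a function of q from 0 to 3.
profile = [(q / 4, hill_number(sample, q / 4)) for q in range(0, 13)]
```

Plotting the pairs in profile gives the kind of graph discussed above; the curve never rises as q increases, and a flat curve would indicate equal species frequencies.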
2. Magurran's second recommendation, that sample
size be large enough to support the planned analysis, is of
course correct.
3.
So is her third recommendation, that samples be replicated if
possible.
4.
Her fourth point is that an estimate of species richness will
often be the most appropriate measure of diversity. This is
a common opinion in ecology but in fact species richness should
always be a last resort, unless presented alongside frequency-based
measures as described above. As Lande 1996 shows (and Magurran
herself shows in the first chapters of the book), species richness
is the diversity measure that is slowest to converge to a definite
value as sample size increases, and indeed it often does not
approach an asymptote at all (see graph below). Also, when repeated
samples are drawn from the same ecosystem, species richness
shows more variability than any other measure of diversity.
More important, estimates of species richness are the least
ecologically meaningful measures of diversity, because they
give vagrants and very rare residents the same weight as the
dominant species. An ecosystem with ten equally common species
is much more diverse in terms of ecological interactions than
a system with one dominant species and nine vagrant species,
yet species richness assigns both ecosystems the same diversity.
Measures which take into account the frequencies of the species
are much more meaningful ecologically than species richness.
Many
ecologists are suspicious of frequency-based indices of diversity
because these seem difficult to interpret. Conversion of these
indices to effective number of species, as described elsewhere
on this website, gives them many of the same intuitive properties
as species richness, and should help in breaking down this prejudice.
5.
Magurran's fifth recommendation is to consider the log-series
parameter alpha or the Simpson index if a frequency-based measure
of diversity is justified (as explained in the preceding paragraph,
a frequency-based measure is always justified if
species frequencies are available). This is a common opinion
but it is not good advice. All Simpson indices emphasize the
most common species disproportionately compared to their frequencies
in the sample. When they are converted to N2, the Hill number
or diversity of order 2, they are useful to include alongside
N1 and N0. But they should not be the sole diversity index used
in a study unless the focus is particularly on the most dominant
species, or on the probabilities of interspecific encounters.
Another argument against Simpson measures is that they cannot
generally be decomposed into meaningful independent alpha and
beta components.
Her
recommendation that the log-series parameter alpha be used as
a diversity index even when the species do not follow a log-series
distribution seems unwise (though she is not alone in making
this recommendation). There seem to be strong reasons to avoid
this index for general use. The log-series alpha depends only
on n (the number of individuals sampled) and S (the number of
species in the sample). When the data is not log-series distributed
this index throws away almost all the information in the sample
(since it depends only on the sample size and the number of
species in the sample, not the actual species frequencies) and
gives counterintuitive results. For example, a sample containing
ten species with abundances
[91, 1, 1, 1, 1, 1, 1, 1, 1, 1]
has the same diversity, according to this index, as a sample
containing ten species with abundances
[10, 10, 10, 10, 10, 10, 10, 10, 10, 10], whereas ecologically
and functionally the second community is much more diverse than
the first.
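The example above can be checked numerically. The sketch below assumes the usual log-series relation S = alpha * ln(1 + n/alpha) and solves it for alpha by bisection; the function names are hypothetical, and the exponential of Shannon entropy is computed as a plug-in value for comparison.

```python
import math

def fisher_alpha(n, s):
    """Solve s = alpha * ln(1 + n / alpha) for alpha by bisection (needs s < n)."""
    lo, hi = 1e-9, 1e9
    for _ in range(200):
        mid = (lo + hi) / 2
        # The left-hand side grows with alpha, so move the bracket accordingly.
        if mid * math.log1p(n / mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exp_shannon(abundances):
    """Exponential of Shannon entropy (effective number of species, order 1)."""
    total = sum(abundances)
    return math.exp(-sum(a / total * math.log(a / total) for a in abundances if a > 0))

dominated = [91] + [1] * 9
even = [10] * 10

for sample in (dominated, even):
    # alpha sees only n and S, so it is identical for both samples.
    print(fisher_alpha(sum(sample), len(sample)), exp_shannon(sample))
```

Because fisher_alpha receives only the totals n = 100 and S = 10, it cannot distinguish the two samples, whereas the exponential of Shannon entropy is 10 for the even sample and below 2 for the dominated one.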
6.
Magurran's recommendation to avoid Shannon measures reveals
a common prejudice shared by many ecologists. The arguments
presented are not logical. Magurran says "Given its sensitivity
to sample size there appear to be few reasons for choosing it
over species richness." Yet Shannon measures (whether the
entropy or its exponential) are much less sensitive
to sample size than the measure she recommends in her #4, species
richness (and she spent much of the first part of the book pointing
out how sensitive species richness is to sample size!) See the
graph below, taken from Kempton 1979. The sensitivity of Shannon
measures to sample size can be corrected using the same techniques
that Magurran discusses for correcting species richness; see
Chao and Shen 2003 for comparisons of different methods.
[Graph] From Kempton, R. 1979. The structure of species abundance
and measurement of diversity, Biometrics. An unbiased measure
would have values close to 100%. Species richness is the
most biased measure and the slowest to level off as sample
size increases. Even with very large sample sizes, species
richness is less than 50% of its asymptotic or true value
in this simulation.
Magurran
goes on to say that converting Shannon entropy to effective
number of species (the exponential of Shannon entropy, which
is N1, the diversity or Hill number of order 1) "does not
overcome the fundamental problems of this measure". She
misses the point of taking the exponential. As I have shown
in my Oikos paper, Entropy
and diversity, and as Hill showed in his 1973 paper
and as MacArthur showed in his 1965 article, taking the exponential
of Shannon entropy makes it behave intuitively in comparisons
and ratios. Please see Effective
numbers of species for additional explanation on
this point. (Converting any diversity index
to effective number of species makes it behave intuitively in
comparisons and ratios.)
It
is especially unfair for Magurran to recommend the log-series
parameter alpha over the exponential of Shannon entropy. I have
explained above that the log-series parameter alpha can only
be given a reasonable interpretation when the species distribution
is log-series distributed. Even in that case, when a species
distribution does follow a log series, the log-series parameter
alpha has no advantage over the exponential of Shannon entropy--
in that case the log-series alpha and the exponential of Shannon
entropy are related by a simple transformation. See Bulmer 1974.
It
is worth pointing out that Shannon entropy and its exponential
are the near-universal measures of uncertainty and diversity
in physics, chemistry, information theory, and computer science.
They are the only measures of diversity that weigh all species
proportionately to their frequencies in the sample, rather than
favoring common or rare species as Simpson indices or species
richness do. This alone is reason enough to select them
as the best general-purpose diversity measures. Shannon measures
are also the only measures which can be decomposed
into meaningful independent alpha and beta diversities when
applied to a region with multiple unequal-sized communities.
No other diversity measures can be used to measure regional
alpha and beta diversity. This is also reason enough
to select them as the best measures for general use. (Lande's
1996 decomposition of non-Shannon measures into "alpha"
and "beta" does not yield a beta that is independent
of alpha; see the counterexample in Diversity
and similarity.) If just one number must be chosen
to characterize the diversity of an ecosystem, the exponential
of Shannon entropy is the most reasonable choice by far. Incidentally,
the Shannon measures do not need to be borrowed from other disciplines
but can be derived within biology by a careful consideration
of the properties required of an ideal diversity measure. (I
show this in my upcoming paper on the mathematics of beta diversity.)
7.
The Berger-Parker index throws away all frequency information
except that of the single most abundant species, so it can give
misleading ideas of dominance when there are two or three almost-equally
dominant species. Dominance can be nicely read from the shape
of a graph of Hill numbers as recommended in #1 above. A horizontal
graph indicates complete evenness. A good index of evenness
is the ratio of Hill numbers N1/N0, N2/N1, or N2/N0. A value
of unity indicates complete equitability and a value near zero
indicates high dominance.
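To make the contrast in item 7 concrete, the following sketch compares the Berger-Parker index with a Hill-number evenness ratio on two invented samples. The function names and the counts are assumptions chosen for illustration only.

```python
import math

def hill_number(abundances, q):
    """Diversity of order q (Hill number) from raw species counts."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if q == 1:
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1 / (1 - q))

def berger_parker(abundances):
    """Proportion of the sample belonging to the single most abundant species."""
    return max(abundances) / sum(abundances)

def evenness(abundances, q_high=2, q_low=0):
    """Ratio of Hill numbers, e.g. N2/N0; 1 means complete equitability."""
    return hill_number(abundances, q_high) / hill_number(abundances, q_low)

two_codominants = [40, 39] + [1] * 21   # two near-equal dominants, many rare species
one_dominant = [40] + [3] * 20          # one dominant, the rest evenly spread

for sample in (two_codominants, one_dominant):
    print(berger_parker(sample), round(evenness(sample), 3))
```

Both samples have the same Berger-Parker value (0.4), yet the N2/N0 ratio separates them, because it uses the frequencies of all species rather than only the largest one.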
8 and 9. No comments.
Parametric measures of
diversity (p. 102-106):
Magurran's treatment of parametric measures of
diversity gives the impression that these are generally useful
measures. However, log series alpha and log normal lambda cannot
generally be interpreted if the species distributions are not
log series or log normally distributed. The work she cites in
support of these measures, that of Kempton and collaborators,
involves data that are nearly log-series distributed, and their
goal is slightly different from that of most investigators using
diversity indices. Kempton et al. do not so much want to describe
the diversity of a particular sample but rather want to characterize
their study sites in a way that does not vary across years.
They therefore search for parameters that emphasize a particular
time-invariant aspect of their samples. While it is interesting
to search for characteristics of a sample that are invariant
with respect to year-to-year variation, a general-purpose diversity
index should neutrally describe the sample at hand without making
implicit hypotheses about the underlying structure of the sample.
If an objective characterization of the diversity of a particular
sample is the goal, nonparametric measures (corrected for small-sample
bias) are superior. They are capable of interpretation even
when the distributions of species are not log-series or log
normal.
The Q statistic is actually a nonparametric index,
as Magurran notes.
Nonparametric measures
of diversity (p. 102-106):
For most users of Magurran's book this is an important
section. The organization of the section (and the book as a
whole) reflects the common view that there are many unrelated
diversity indices in ecology. Yet the standard diversity indices
(those based on sums of powers of the species frequencies, or
limits of such sums) are all closely related and vary only in
their emphasis on common or rare species. This set of indices
includes species richness, Simpson concentration, inverse Simpson
concentration, the Gini-Simpson index, the Hurlbert-Smith-Grassle
or NESS (normalized expected species sampled) index for m =
2, all Renyi entropies, all HCDT (Havrda-Charvat-Daroczy-Tsallis)
or “Tsallis” entropies, all Hill numbers, Shannon
entropy (the limit of both Renyi and HCDT entropies as q approaches
unity), Patil and Taillie’s average rarity index (a version
of HCDT entropy; see Ricotta 2003), Varma entropy, Arimoto entropy,
Sharma and Mittal entropy, and others (see Taneja 1989). All
of these, in spite of their apparent differences, lead to the
same expression for the effective number of species, as explained
in Entropy
and diversity. This expression, which unites all
these diversity measures, has a single free parameter, q, which
determines its sensitivity to common or rare species. This section
would have been more profound had this unity been used as an
organizing principle.
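A small numerical check of this unity, for two of the families listed: the sketch below computes Renyi and HCDT entropies of order q (for q not equal to 1) and converts each to an effective number of species. The conversion formulas follow directly from the definitions used in the code; the function names and the frequency vector are hypothetical.

```python
import math

def renyi_entropy(p, q):
    """Renyi entropy of order q (q != 1)."""
    return math.log(sum(x ** q for x in p)) / (1 - q)

def hcdt_entropy(p, q):
    """HCDT ("Tsallis") entropy of order q (q != 1)."""
    return (1 - sum(x ** q for x in p)) / (q - 1)

def effective_from_renyi(h):
    return math.exp(h)

def effective_from_hcdt(h, q):
    return (1 - (q - 1) * h) ** (1 / (1 - q))

p = [0.5, 0.3, 0.15, 0.05]   # hypothetical species frequencies

for q in (0.5, 2, 3):
    # The common expression: (sum of p_i^q) raised to the power 1/(1-q).
    direct = sum(x ** q for x in p) ** (1 / (1 - q))
    print(q, direct,
          effective_from_renyi(renyi_entropy(p, q)),
          effective_from_hcdt(hcdt_entropy(p, q), q))
```

For each q the three printed values agree up to rounding, even though the raw Renyi and HCDT entropies themselves differ.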
Information statistics (p. 106-114):
As I have already explained in my review of Magurran's
Box 4.1, her treatment of Shannon measures is prejudiced. She
cites some critics of Shannon measures in support of her opinion,
but the articles cited are not very well thought out. Many of
these criticisms focus on the fact that for small samples, Shannon
measures show a consistent negative bias. As I mention above,
this bias is much smaller than that of the commonly-recommended
species richness index, and this bias can be almost completely
removed by using the methods suggested in Chao and Shen 2003.
In any case, sampling properties should not be the primary criteria
for choosing a measure. It does no good to have an unbiased,
rapidly-converging estimator if that estimator doesn’t
measure what one needs to measure. More important than sampling
properties is a measure’s ability to correctly capture
the theoretical concept being studied. Shannon measures are
the only standard diversity indices that do not disproportionately
favor either common or rare species, and are the only measures
that correctly capture the concepts of alpha and beta diversity
when community weights are unequal.
Magurran mentions the various logarithmic bases
that have been used with the Shannon index. Base 2 is used not
just for historical reasons; it has some interesting advantages.
When the Shannon entropy is expressed in logs to the base 2,
it gives the mean depth of the maximally-efficient key to the
species of the ecosystem being studied. Still, in general the
base is unimportant, as Magurran says. The really meaningful
number is not the value of the Shannon entropy, which depends
on the base used, but rather the value of the exponential of
the entropy, and since the exponential is taken to the same
base as the logarithm, the two cancel out and the final diversity
is independent of the choice of base. The most convenient base
is the base of the natural logarithm, e.
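The cancellation of the base can be verified with a short sketch. The frequencies are invented; shannon_entropy is a hypothetical helper that takes the logarithm base as a parameter.

```python
import math

def shannon_entropy(p, base):
    """Shannon entropy of frequencies p, with logarithms taken to the given base."""
    return -sum(x * math.log(x, base) for x in p if x > 0)

p = [0.4, 0.3, 0.2, 0.1]   # hypothetical species frequencies

for base in (2, math.e, 10):
    h = shannon_entropy(p, base)
    # The entropy changes with the base; base ** h does not.
    print(base, h, base ** h)
```

The second printed column differs for each base, while the third column is the same effective number of species in every row, up to floating-point rounding.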
On p. 108 Magurran notes correctly that it is
very hard to know if the difference between a Shannon H of 2.35
and a Shannon H of 2.47 is small or substantial. She then notes
that some investigators "sidestep the problem" (her
words) by taking the exponential of H, but she thinks this still
does not shed much light on the question. Like most ecologists,
she has not appreciated MacArthur's 1965 and Hill's 1973 insight
that conversion to effective number of species (e.g. the exponential
of Shannon entropy) makes it possible to accurately judge the
differences between two or more diversities. This is the key
"blind spot" that burdens the field of diversity analysis.
It is possible to prove mathematically that effective numbers
of species (regardless of the index on which they are based)
possess a reasonable and intuitive doubling property, so that
the proportional difference between two effective numbers of
species really reflects the proportional difference in an intuitively-defined
diversity. This is explained in Hill 1973 and elsewhere on my
website: Effective
number of species, Entropy
and diversity, Diversity
of a single community, Comparing
diversities of two or more communities.
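The doubling property can be illustrated with a minimal sketch: pooling a community with an equally large community that has the same frequency distribution but no shared species should double the diversity. The counts and function names below are assumptions for illustration; the last lines apply the same conversion to the two Shannon values quoted from p. 108.

```python
import math

def exp_shannon(abundances):
    """Exponential of Shannon entropy (effective number of species)."""
    total = sum(abundances)
    h = -sum(a / total * math.log(a / total) for a in abundances if a > 0)
    return math.exp(h)

def gini_simpson(abundances):
    """Gini-Simpson index, 1 minus the sum of squared frequencies."""
    total = sum(abundances)
    return 1 - sum((a / total) ** 2 for a in abundances)

one = [50, 30, 20]
two = one + one   # six species: two equal-sized communities sharing no species

print(exp_shannon(two) / exp_shannon(one))     # the effective number doubles
print(gini_simpson(two) / gini_simpson(one))   # the raw index does not

# Shannon H of 2.35 versus 2.47, converted to effective numbers of species.
ratio = math.exp(2.47) / math.exp(2.35)
print(ratio)   # roughly 1.13
```

Read this way, the difference between H = 2.35 and H = 2.47 corresponds to roughly 13% more effective species, which is a magnitude that can be judged directly.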
Magurran then makes another error that is widespread
in ecology. Her original question was whether the two values
of Shannon H were "substantially different", but she
then treats that question as if it could be answered by employing
a statistical test. The statistical significance of the difference
has little to do with the magnitude or biological significance
of the difference. To make this clearer, it is useful to consider
an example of tossing a coin to see if it is biased. A good
measure of the magnitude of the bias is the
mean proportion of heads obtained after tossing the coin N times.
A good measure of the statistical significance
of the bias is the p-value obtained (using the binomial distribution
with mean 0.50) after tossing the coin N times. If N is large
enough, even the most minuscule deviation from a mean of 0.50,
say 0.50001, can result in a highly significant p-value, say
1/100000. This highly significant p-value proves that the coin
is biased but says nothing about the magnitude or practical
importance of the bias. The measure that reveals the actual
magnitude of the bias is the mean proportion of heads. Ecologists
are often guilty of stopping after obtaining a significant p-value,
when the more interesting question is the real magnitude of
the effect being measured, the equivalent of the mean proportion
of heads in the coin toss. The statistical methods described
by Magurran are unobjectionable, but if the p-value is significant,
then the experimenter must ask: What is the real magnitude of
the effect? The effective number of species permits the calculation
of the real magnitude, and hence permits ecologists to judge
the absolute sizes of effects, as explained elsewhere on this
site.
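The coin example can be sketched numerically. The code below uses a normal approximation to the binomial test, which is an assumption made here for simplicity; the observed proportion is held fixed at 0.50001 while the number of tosses grows.

```python
import math

def two_sided_p_value(heads_proportion, n_tosses):
    """Two-sided p-value for a fair coin, normal approximation to the binomial."""
    z = (heads_proportion - 0.5) / math.sqrt(0.25 / n_tosses)
    return math.erfc(abs(z) / math.sqrt(2))

observed = 0.50001   # hypothetical observed proportion of heads

for n in (10**6, 10**9, 5 * 10**10):
    magnitude = observed - 0.5   # size of the bias: the same in every row
    print(n, magnitude, two_sided_p_value(observed, n))
```

The magnitude of the bias stays at about 0.00001 throughout, while the p-value falls from nearly 1 to below 0.0001 as the number of tosses increases. The p-value therefore tracks sample size as much as it tracks the effect.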
Perhaps ecologists' confusion of statistical significance
with the actual magnitude of an effect has contributed to
the failure to appreciate the
importance of effective number of species. As long as ecologists
are only interested in the statistical significance of a result,
any index with known statistical properties is good enough.
Only if we want to move beyond mere statistical significance
does it become important to find measures that behave intuitively
and whose changes in magnitude can be properly judged. That's
where effective numbers of species come into play.
There
is more to be said about some of the other evenness measures
Magurran treats in this section, but that will have to wait
until I have more time.
Dominance and evenness measures (p. 114-121):
The
main theme here is the Simpson index D, with its variants 1/D,
1-D, and -ln(D). Much space is devoted to the relative merits
of these alternatives. It is unfortunate that a broader mathematical
perspective has not been taken in this book and in the diversity
literature; all these alternatives (and any other monotonic
function of D) yield the same effective number of species and
are therefore equivalent.
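A short sketch of this equivalence: each Simpson variant is computed from the same sample and then converted back to an effective number of species. The logarithmic variant is taken with the sign convention -ln(D), which is an assumption here, and the function names and counts are invented.

```python
import math

def simpson_d(abundances):
    """Simpson concentration D, the sum of squared frequencies."""
    total = sum(abundances)
    return sum((a / total) ** 2 for a in abundances)

def effective_numbers(abundances):
    d = simpson_d(abundances)
    inverse, gini, neg_log = 1 / d, 1 - d, -math.log(d)
    # Invert each transformation of D to recover the effective number 1/D.
    return {
        "from 1/D": inverse,
        "from 1-D": 1 / (1 - gini),
        "from -ln(D)": math.exp(neg_log),
    }

sample = [60, 20, 10, 5, 5]   # hypothetical counts
print(effective_numbers(sample))
```

All three entries coincide with 1/D up to rounding, so the choice among the variants affects only the scale on which the same information is reported.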
Magurran
closes her treatment of Simpson indices with the comment "Simpson's
index remains inexplicably less popular than the Shannon index."
This is not so inexplicable when one realizes that all Simpson
indices disproportionately emphasize the most common species
in the sample, making them insensitive to changes of diversity
that affect only the nondominant species. Studies of the discriminating
power of indices have consistently shown that Simpson measures
are less discriminating than Shannon measures. Furthermore,
Simpson indices cannot be decomposed into meaningful independent
alpha and beta components when applied to unequally weighted
communities.
The
other measures of evenness discussed here are of little importance.
The
review of the rest of this chapter must wait until I have more
time.
Magurran
Chapter 5, "Comparative studies of diversity" (p.
131-161)
For now I include just a few comments on the most
important issues raised here. On p. 134 Magurran says that the
Simpson index is less sensitive to sample size than Shannon
measures, and she therefore recommends its use. Simpson measures
are less sensitive to sample size only because they overemphasize
the most common species in the sample. This is not a good reason
to use them. As mentioned earlier, the methods of Chao and Shen
2003 virtually eliminate the small-sample bias of Shannon measures.
On p. 148 Magurran repeats her advice to use the
log-series parameter alpha even when the species distributions
do not follow a log-series. This is odd advice: alpha is uninterpretable
in this situation.
Also on p. 148 she superficially introduces the
Hill numbers. She does not explain their value in providing
the answer to the question of how distinct are two different
diversity measurements. See elsewhere on this website for examples
of their use.
Magurran
Chapter 6, "Diversity in space (and time)" (p. 162-184)
Again, just a few quick comments for now on the
most important issues.
Most of this section is devoted to measuring beta
diversity using presence-absence indices. This
reflects their widespread but biologically indefensible use in
ecology. Ecologically significant differences between communities
involve differences in frequency, not mere presence or absence.
A northern Wisconsin pine forest is a very different community
than a northern hardwood forest, but usually there are one or
two maples in a given pine forest, and vice-versa. Presence-absence
indices of beta treat both communities as identical. Frequency
measures are always preferable when frequency data is available.
(Frequency measures can also be constructed from stratified
presence-absence data.)
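The pine and hardwood example can be sketched with invented counts. The code below compares the Jaccard index on presence-absence data with the Morisita-Horn index on relative frequencies; the counts and function names are hypothetical, and the Morisita-Horn form used is the standard relative-frequency version.

```python
def jaccard(a, b):
    """Presence-absence similarity: shared species / species found in either."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    return len(present_a & present_b) / len(present_a | present_b)

def morisita_horn(a, b):
    """Frequency-based overlap computed from relative abundances."""
    pa = [x / sum(a) for x in a]
    pb = [x / sum(b) for x in b]
    cross = sum(x * y for x, y in zip(pa, pb))
    return 2 * cross / (sum(x * x for x in pa) + sum(y * y for y in pb))

pine_forest = [98, 2]       # hypothetical counts: [pines, maples]
hardwood_forest = [2, 98]

print(jaccard(pine_forest, hardwood_forest))         # 1.0: same species list
print(morisita_horn(pine_forest, hardwood_forest))   # close to 0
```

With these counts the presence-absence index reports the two forests as identical, while the frequency-based index reports almost no overlap.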
One
reason for the emphasis on presence-absence measures in the
literature is the lack of good general definitions of beta for
frequency-based diversity indices. It turns out that a general,
easy to interpret definition of beta can be derived from first
principles, and the new beta which results gives the effective
number of distinct communities in the region, for any given
diversity index. Interestingly, the most popular indices of
similarity and overlap (Jaccard, Sorensen, Horn, and Morisita-Horn)
are all monotonic transformations of this new beta. The derivation
of this new beta is the subject of my paper, Partitioning
diversity into independent alpha and beta components,
in press in Ecology ("Concepts and Synthesis"
section). When communities are equally-weighted (as for
example when they are being compared in the abstract) the beta
diversity for any index is necessarily the effective number
of species of the gamma index divided by the effective number
of species of the alpha index. (This definition is not invented
but rather derived mathematically.) When community weights are
unequal, the surprising conclusion of my paper is that the only
logically consistent measure of beta diversity is:
Beta diversity = exp(Shannon H_gamma) / exp(Shannon H_alpha)
When
community weights are unequal, all other decompositions of diversity
indices into alpha and beta components proposed in the literature
either result in a beta that is not independent of alpha, or
an alpha which could exceed gamma. Both these outcomes are generally
regarded as unacceptable (Wilson and Shmida 1984 for independence,
Lande 1996 for alpha not to exceed gamma).
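A minimal sketch of the Shannon beta given above, under two assumptions that the text does not spell out: H_alpha is taken as the weight-averaged Shannon entropy of the individual communities, and H_gamma as the Shannon entropy of the weighted pooled frequencies. Function names and data are invented for illustration.

```python
import math

def shannon(p):
    """Shannon entropy (natural log) of a frequency vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

def shannon_beta(communities, weights):
    """exp(H_gamma) / exp(H_alpha) for weighted communities.

    communities: frequency vectors over the same ordered species list.
    weights: community weights summing to 1.
    """
    # Assumed alpha entropy: weighted mean of the community entropies.
    h_alpha = sum(w * shannon(p) for w, p in zip(weights, communities))
    # Assumed gamma entropy: entropy of the weighted pooled frequencies.
    n_species = len(communities[0])
    pooled = [sum(w * p[i] for w, p in zip(weights, communities))
              for i in range(n_species)]
    return math.exp(shannon(pooled)) / math.exp(h_alpha)

a = [0.5, 0.5, 0.0, 0.0]
b = [0.0, 0.0, 0.5, 0.5]

print(shannon_beta([a, b], [0.5, 0.5]))   # two distinct, equally weighted communities
print(shannon_beta([a, a], [0.5, 0.5]))   # identical communities
print(shannon_beta([a, b], [0.9, 0.1]))   # distinct but unequally weighted
```

Under these assumptions the first case gives a beta of 2 and the second a beta of 1, matching the reading of beta as the effective number of distinct communities; the unequally weighted case falls between 1 and 2.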