Review of Magurran's "Measuring Biological Diversity"

Measuring Biological Diversity

Anne Magurran's book, Measuring Biological Diversity, published in 2004 by Blackwell Publishing, is destined to become an important reference for students of ecology. It is a useful compendium of information and citations for diversity analysis, and a good summary of how ecologists think about diversity analysis. A critique of this book is therefore not so much a critique of the author's thinking but a critique of the field as a whole.

I will concentrate on Chapter 4, "An index of diversity...", Chapter 5, "Comparative studies of diversity", and Chapter 6, "Diversity in space (and time)". These chapters are the core of the book. Some background material related to this criticism can be found in my Oikos paper, Entropy and diversity.

Magurran Chapter 4, "An index of diversity...":

Box 4.1 (p. 101):

Every student wants to know what index to use for a particular problem, so Magurran's Box 4.1, "How to choose a diversity index", will be popular. Unfortunately this box contains misleading information. The criticisms below are organized according to the item numbers in Magurran's box.

1. Magurran wisely encourages students to think about the aspects of diversity they need to measure, and she correctly argues that it is not good to calculate many different diversity indices and then pick the one that gives the most attractive answer. Nevertheless, different indices of diversity place different emphases on the common or rare species of a sample, and it is therefore legitimate (even recommendable) to present more than one diversity. The best approach is to present a continuous graph of the Hill numbers (see Magurran p. 148-149) if sample size is large enough to allow the accurate estimation of the quantities involved. If sample size is small, it is best to present just the diversities of order 0, 1, and 2 (N0, N1, and N2 in Magurran's notation) and to calculate each of these using Chao estimators (discussed for N0, which is species richness, in Magurran p. 87; for N1, which is the exponential of Shannon entropy, take the exponential of the estimator given in Chao and Shen 2003; for N2, which is the reciprocal of Simpson concentration, follow the method outlined in Chao and Shen 2003). It may be possible to derive a continuous estimator of the Hill numbers; Anne Chao is currently working on this problem with me. If we are able to solve it, then it would be possible and desirable to plot a continuous graph of Hill numbers even when sample size is small. Such a graph contains complete information about the diversity of the system under study; the value of any standard diversity index can be calculated from such a graph.
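As a rough illustration of the quantities named in this item, the following sketch computes N0, N1, and N2 and a coarse Hill-number profile directly from sample frequencies. The abundances are hypothetical, and these are plain plug-in values; the sketch does not implement the Chao or Chao and Shen estimators recommended above for small samples.

```python
import math

def hill_number(abundances, q):
    """Plug-in diversity of order q (Hill number) from raw abundances."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit as q -> 1: the exponential of Shannon entropy
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

# Hypothetical sample, for illustration only
sample = [50, 30, 10, 5, 3, 1, 1]

n0 = hill_number(sample, 0)  # species richness
n1 = hill_number(sample, 1)  # exponential of Shannon entropy
n2 = hill_number(sample, 2)  # reciprocal of Simpson concentration

# Coarse profile for q = 0, 0.25, ..., 3; plotting these pairs gives
# an approximation to the continuous graph of Hill numbers.
profile = [(k / 4, hill_number(sample, k / 4)) for k in range(13)]
```

The profile never increases with q; how steeply it falls reflects how unevenly the individuals are spread among the species.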

2. Magurran's second recommendation, that sample size be large enough to support the planned analysis, is of course correct.

3. So is her third recommendation, that samples be replicated if possible.

4. Her fourth point is that an estimate of species richness will often be the most appropriate measure of diversity. This is a common opinion in ecology but in fact species richness should always be a last resort, unless presented alongside frequency-based measures as described above. As Lande 1996 shows (and Magurran herself shows in the first chapters of the book), species richness is the diversity measure that is slowest to converge to a definite value as sample size increases, and indeed it often does not approach an asymptote at all (see graph below). Also, when repeated samples are drawn from the same ecosystem, species richness shows more variability than any other measure of diversity. More important, estimates of species richness are the least ecologically meaningful measures of diversity, because they give vagrants and very rare residents the same weight as the dominant species. An ecosystem with ten equally common species is much more diverse in terms of ecological interactions than a system with one dominant species and nine vagrant species, yet species richness assigns both ecosystems the same diversity. Measures which take into account the frequencies of the species are much more meaningful ecologically than species richness.

Many ecologists are suspicious of frequency-based indices of diversity because these seem difficult to interpret. Conversion of these indices to effective number of species, as described elsewhere on this website, gives them many of the same intuitive properties as species richness, and should help in breaking down this prejudice.

5. Magurran's fifth recommendation is to consider the log-series parameter alpha or the Simpson index if a frequency-based measure of diversity is justified (as explained in the preceding paragraph, a frequency-based measure is always justified if species frequencies are available). This is a common opinion but it is not good advice. All Simpson indices emphasize the most common species disproportionately compared to their frequencies in the sample. When they are converted to N2, the Hill number or diversity of order 2, they are useful to include alongside N1 and N0. But they should not be the sole diversity index used in a study unless the focus is particularly on the most dominant species, or on the probabilities of interspecific encounters. Another argument against Simpson measures is that they cannot generally be decomposed into meaningful independent alpha and beta components.

Her recommendation that the log-series parameter alpha be used as a diversity index even when the species do not follow a log-series distribution seems unwise (though she is not alone in making this recommendation). There seem to be strong reasons to avoid this index for general use. The log-series alpha depends only on n (the number of individuals sampled) and S (the number of species in the sample). When the data is not log-series distributed this index throws away almost all the information in the sample (since it depends only on the sample size and the number of species in the sample, not the actual species frequencies) and gives counterintuitive results. For example, a sample containing ten species with abundances
[91, 1, 1, 1, 1, 1, 1, 1, 1, 1]
has the same diversity, according to this index, as a sample containing ten species with abundances
[10, 10, 10, 10, 10, 10, 10, 10, 10, 10], whereas ecologically and functionally the second community is much more diverse than the first.
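The point can be checked numerically. The sketch below assumes the usual defining relation for the log-series parameter, S = alpha * ln(1 + n/alpha), and solves it by bisection; the function names are illustrative. Both samples have n = 100 and S = 10, so they receive the same alpha, while the exponential of Shannon entropy separates them clearly.

```python
import math

def log_series_alpha(n_individuals, n_species):
    """Solve S = alpha * ln(1 + n/alpha) for alpha by bisection (requires S < n)."""
    def excess(alpha):
        return alpha * math.log(1.0 + n_individuals / alpha) - n_species
    lo, hi = 1e-9, 1e6  # excess(lo) < 0 < excess(hi) for ordinary samples
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exp_shannon(abundances):
    """Exponential of Shannon entropy (Hill number of order 1)."""
    total = sum(abundances)
    return math.exp(-sum((a / total) * math.log(a / total) for a in abundances if a > 0))

dominated = [91, 1, 1, 1, 1, 1, 1, 1, 1, 1]
even = [10] * 10

# alpha sees only (n, S), so the two samples are indistinguishable to it
alpha_dominated = log_series_alpha(sum(dominated), len(dominated))
alpha_even = log_series_alpha(sum(even), len(even))

# The frequency-based measure distinguishes them
n1_dominated = exp_shannon(dominated)
n1_even = exp_shannon(even)
```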

6. Magurran's recommendation to avoid Shannon measures reveals a common prejudice shared by many ecologists. The arguments presented are not logical. Magurran says "Given its sensitivity to sample size there appear to be few reasons for choosing it over species richness." Yet Shannon measures (whether the entropy or its exponential) are much less sensitive to sample size than the measure she recommends in her #4, species richness (and she spent much of the first part of the book pointing out how sensitive species richness is to sample size!). See the graph below, taken from Kempton 1979. The sensitivity of Shannon measures to sample size can be corrected using the same techniques that Magurran discusses for correcting species richness; see Chao and Shen 2003 for comparisons of different methods.

From Kempton, R. 1979. The structure of species abundance and measurement of diversity, Biometrics. An unbiased measure would have values close to 100%. Species richness is the most biased measure and the slowest to level off as sample size increases. Even with very large sample sizes, species richness is less than 50% of its asymptotic or true value in this simulation.

Magurran goes on to say that converting Shannon entropy to effective number of species (the exponential of Shannon entropy, which is N1, the diversity or Hill number of order 1) "does not overcome the fundamental problems of this measure". She misses the point of taking the exponential. As I have shown in my Oikos paper, Entropy and diversity, and as Hill showed in his 1973 paper and as MacArthur showed in his 1965 article, taking the exponential of Shannon entropy makes it behave intuitively in comparisons and ratios. Please see Effective numbers of species for additional explanation on this point. (Converting any diversity index to effective number of species makes it behave intuitively in comparisons and ratios.)

It is especially unfair for Magurran to recommend the log-series parameter alpha over the exponential of Shannon entropy. I have explained above that the log-series parameter alpha can only be given a reasonable interpretation when the species distribution is log-series distributed. Even when a species distribution does follow a log series, the log-series parameter alpha has no advantage over the exponential of Shannon entropy; in that case the two are related by a simple transformation. See Bulmer 1974.

It is worth pointing out that Shannon entropy and its exponential are the near-universal measures of uncertainty and diversity in physics, chemistry, information theory, and computer science. They are the only measures of diversity that weigh all species proportionately to their frequencies in the sample, rather than favoring common or rare species as Simpson indices or species richness do. This alone is reason enough to select them as the best general-purpose diversity measures. Shannon measures are also the only measures which can be decomposed into meaningful independent alpha and beta diversities when applied to a region with multiple unequal-sized communities. No other diversity measures can be used to measure regional alpha and beta diversity. This is also reason enough to select them as the best measures for general use. (Lande's 1996 decomposition of non-Shannon measures into "alpha" and "beta" does not yield a beta that is independent of alpha; see the counterexample in Diversity and similarity.) If just one number must be chosen to characterize the diversity of an ecosystem, the exponential of Shannon entropy is the most reasonable choice by far. Incidentally, the Shannon measures do not need to be borrowed from other disciplines but can be derived within biology by a careful consideration of the properties required of an ideal diversity measure. (I show this in my upcoming paper on the mathematics of beta diversity.)

7. The Berger-Parker index throws away all frequency information except that of the single most abundant species, so it can give misleading ideas of dominance when there are two or three almost-equally dominant species. Dominance can be nicely read from the shape of a graph of Hill numbers as recommended in #1 above. A horizontal graph indicates complete evenness. A good index of evenness is the ratio of Hill numbers N1/N0, N2/N1, or N2/N0. A value of unity indicates complete equitability and a value near zero indicates high dominance.
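To make the evenness ratios concrete, here is a small sketch that computes N1/N0, N2/N1, and N2/N0 for a perfectly even sample and for a strongly dominated one. The two samples are illustrative, and the values are uncorrected plug-in estimates.

```python
import math

def hill_number(abundances, q):
    """Plug-in Hill number of order q from raw abundances."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

def evenness_ratios(abundances):
    """Return the three Hill-number ratios suggested as evenness indices."""
    n0, n1, n2 = (hill_number(abundances, q) for q in (0, 1, 2))
    return {"N1/N0": n1 / n0, "N2/N1": n2 / n1, "N2/N0": n2 / n0}

even_sample = [10] * 10                 # complete equitability
dominated_sample = [91] + [1] * 9       # one dominant, nine rare species

even_ratios = evenness_ratios(even_sample)            # all ratios equal 1
dominated_ratios = evenness_ratios(dominated_sample)  # N1/N0 and N2/N0 fall well below 1
```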

8 and 9. No comments.

Parametric measures of diversity (p. 102-106):

Magurran's treatment of parametric measures of diversity gives the impression that these are generally useful measures. However, log series alpha and log normal lambda cannot generally be interpreted if the species distributions are not log series or log normally distributed. The work she cites in support of these measures, that of Kempton and collaborators, involves data that are nearly log-series distributed, and their goal is slightly different from that of most investigators using diversity indices. Kempton et al. do not so much want to describe the diversity of a particular sample but rather want to characterize their study sites in a way that does not vary across years. They therefore search for parameters that emphasize a particular time-invariant aspect of their samples. While it is interesting to search for characteristics of a sample that are invariant with respect to year-to-year variation, a general-purpose diversity index should neutrally describe the sample at hand without making implicit hypotheses about the underlying structure of the sample. If an objective characterization of the diversity of a particular sample is the goal, nonparametric measures (corrected for small-sample bias) are superior. They are capable of interpretation even when the distributions of species are not log-series or log normal.

The Q statistic is actually a nonparametric index, as Magurran notes.

Nonparametric measures of diversity (p. 102-106):

For most users of Magurran's book this is an important section. The organization of the section (and the book as a whole) reflects the common view that there are many unrelated diversity indices in ecology. Yet the standard diversity indices (those based on sums of powers of the species frequencies, or limits of such sums) are all closely related and vary only in their emphasis on common or rare species. This set of indices includes species richness, Simpson concentration, inverse Simpson concentration, the Gini-Simpson index, the Hurlbert-Smith-Grassle or NESS (normalized expected species sampled) index for m = 2, all Renyi entropies, all HCDT (Havrda-Charvat-Daroczy-Tsallis) or “Tsallis” entropies, all Hill numbers, Shannon entropy (the limit of both Renyi and HCDT entropies as q approaches unity), Patil and Taillie’s average rarity index (a version of HCDT entropy; see Ricotta 2003), Varma entropy, Arimoto entropy, Sharma and Mittal entropy, and others (see Taneja 1989). All of these, in spite of their apparent differences, lead to the same expression for the effective number of species, as explained in Entropy and diversity. This expression, which unites all these diversity measures, has a single free parameter, q, which determines its sensitivity to common or rare species. This section would have been more profound had this unity been used as an organizing principle.
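A short numerical check of this unity, using two of the families listed above. The sketch assumes the standard forms of the Renyi entropy, ln(sum p^q)/(1-q), and the HCDT entropy, (1 - sum p^q)/(q-1); the frequencies are hypothetical. Each entropy is converted to its effective number of species and compared with the Hill number of the same order.

```python
import math

def power_sum(p, q):
    return sum(pi ** q for pi in p)

def renyi_entropy(p, q):
    return math.log(power_sum(p, q)) / (1.0 - q)

def hcdt_entropy(p, q):
    return (1.0 - power_sum(p, q)) / (q - 1.0)

def effective_from_renyi(h):
    return math.exp(h)

def effective_from_hcdt(h, q):
    return (1.0 - (q - 1.0) * h) ** (1.0 / (1.0 - q))

def hill_number(p, q):
    return power_sum(p, q) ** (1.0 / (1.0 - q))

def shannon_entropy(p):
    return -sum(pi * math.log(pi) for pi in p)

# Hypothetical relative frequencies (they sum to 1)
p = [0.5, 0.25, 0.15, 0.07, 0.03]

# For each order q (q != 1), all three routes give the same number
results = {
    q: (
        effective_from_renyi(renyi_entropy(p, q)),
        effective_from_hcdt(hcdt_entropy(p, q), q),
        hill_number(p, q),
    )
    for q in (0.5, 2.0, 3.0)
}

# Near q = 1 the Renyi entropy approaches Shannon entropy
near_one = renyi_entropy(p, 1.000001)
```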

Information statistics (p. 106-114):

As I have already explained in my review of Magurran's Box 4.1, her treatment of Shannon measures is prejudiced. She cites some critics of Shannon measures in support of her opinion, but the articles cited are not very well thought out. Many of these criticisms focus on the fact that for small samples, Shannon measures show a consistent negative bias. As I mention above, this bias is much smaller than that of the commonly-recommended species richness index, and this bias can be almost completely removed by using the methods suggested in Chao and Shen 2003. In any case, sampling properties should not be the primary criteria for choosing a measure. It does no good to have an unbiased, rapidly-converging estimator if that estimator doesn’t measure what one needs to measure. More important than sampling properties is a measure’s ability to correctly capture the theoretical concept being studied. Shannon measures are the only standard diversity indices that do not disproportionately favor either common or rare species, and are the only measures that correctly capture the concepts of alpha and beta diversity when community weights are unequal.

Magurran mentions the various logarithmic bases that have been used with the Shannon index. Base 2 is used not just for historical reasons; it has some interesting advantages. When the Shannon entropy is expressed in logs to the base 2, it gives the mean depth of the maximally-efficient key to the species of the ecosystem being studied. Still, in general the base is unimportant, as Magurran says. The really meaningful number is not the value of the Shannon entropy, which depends on the base used, but rather the value of the exponential of the entropy, and since the exponential is taken to the same base as the logarithm, the two cancel out and the final diversity is independent of the choice of base. The most convenient base is the base of the natural logarithm, e.
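The cancellation of the base is easy to verify. In this sketch (hypothetical abundances), Shannon entropy is computed in bases 2, e, and 10, and each value is then exponentiated in its own base.

```python
import math

def shannon_entropy(abundances, base):
    """Shannon entropy with logarithms taken to the given base."""
    total = sum(abundances)
    return -sum((a / total) * math.log(a / total, base) for a in abundances if a > 0)

# Hypothetical sample, for illustration only
sample = [40, 25, 20, 10, 5]

entropies = {base: shannon_entropy(sample, base) for base in (2, math.e, 10)}

# Raising each base to its own entropy gives the same effective number of species
effective_numbers = {base: base ** h for base, h in entropies.items()}
```

The three entropies differ from one another, but the three effective numbers agree up to floating-point rounding.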

On p. 108 Magurran notes correctly that it is very hard to know if the difference between a Shannon H of 2.35 and a Shannon H of 2.47 is small or substantial. She then notes that some investigators "sidestep the problem" (her words) by taking the exponential of H, but she thinks this still does not shed much light on the question. Like most ecologists, she has not appreciated MacArthur's 1965 and Hill's 1973 insight that conversion to effective number of species (e.g. the exponential of Shannon entropy) makes it possible to accurately judge the differences between two or more diversities. This is the key "blind spot" that burdens the field of diversity analysis. It is possible to prove mathematically that effective numbers of species (regardless of the index on which they are based) possess a reasonable and intuitive doubling property, so that the proportional difference between two effective numbers of species really reflects the proportional difference in an intuitively-defined diversity. This is explained in Hill 1973 and elsewhere on my website: Effective number of species, Entropy and diversity, Diversity of a single community, Comparing diversities of two or more communities.
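Two small checks may help here. The first converts the two entropies quoted above to effective numbers, assuming they are in natural-log units (the text does not state the base). The second illustrates the doubling property with a hypothetical community: pooling two equally large communities that have the same structure but no species in common doubles the Hill number of every order tested.

```python
import math

def hill_number(abundances, q):
    """Plug-in Hill number of order q from raw abundances."""
    total = sum(abundances)
    p = [a / total for a in abundances if a > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

# Part 1: the two Shannon H values from the text (natural logs assumed)
d_low = math.exp(2.35)
d_high = math.exp(2.47)
ratio = d_high / d_low  # equals exp(0.12), roughly 1.13

# Part 2: doubling property with a hypothetical community
community = [40, 25, 20, 10, 5]
pooled = community + community  # same structure, entirely different species
doubling = {q: hill_number(pooled, q) / hill_number(community, q) for q in (0, 1, 2)}
```

On this reading, the community with H = 2.47 has roughly 13% more effective species than the one with H = 2.35, which is a statement about magnitude that the raw entropies do not make obvious.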

Magurran then makes another error that is widespread in ecology. Her original question was whether the two values of Shannon H were "substantially different", but she then treats that question as if it could be answered by employing a statistical test. The statistical significance of the difference has little to do with the magnitude or biological significance of the difference. To make this clearer, it is useful to consider an example of tossing a coin to see if it is biased. A good measure of the magnitude of the bias is the mean proportion of heads obtained after tossing the coin N times. A good measure of the statistical significance of the bias is the p-value obtained (using the binomial distribution with mean 0.50) after tossing the coin N times. If N is large enough, even the most minuscule deviation from a mean of 0.50, say 0.50001, can result in a highly significant p-value, say 1/100000. This highly significant p-value proves that the coin is biased but says nothing about the magnitude or practical importance of the bias. The measure that reveals the actual magnitude of the bias is the mean proportion of heads. Ecologists are often guilty of stopping after obtaining a significant p-value, when the more interesting question is the real magnitude of the effect being measured, the equivalent of the mean proportion of heads in the coin toss. The statistical methods described by Magurran are unobjectionable, but if the p-value is significant, then the experimenter must ask: What is the real magnitude of the effect? The effective number of species permits the calculation of the real magnitude, and hence permits ecologists to judge the absolute sizes of effects, as explained elsewhere on this site.
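The coin example can be put into numbers. This sketch substitutes a normal approximation to the binomial test (an assumption made for simplicity, reasonable at these sample sizes), and the numbers of tosses are hypothetical. It shows the same tiny bias being overwhelmingly significant with enough tosses and invisible with few, while the effect size stays the same.

```python
import math

def two_sided_p_value(heads_proportion, n_tosses):
    """Normal-approximation test of the null hypothesis that P(heads) = 0.5."""
    standard_error = math.sqrt(0.25 / n_tosses)
    z = (heads_proportion - 0.5) / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))

observed_proportion = 0.50001
effect_size = observed_proportion - 0.5  # the magnitude of the bias: 0.00001

# Hypothetical numbers of tosses
p_huge_sample = two_sided_p_value(observed_proportion, 5e10)   # highly significant
p_small_sample = two_sided_p_value(observed_proportion, 1000)  # nowhere near significant
```

The p-value changes by orders of magnitude between the two cases, yet the bias itself, one part in a hundred thousand, is identical and practically negligible in both.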

Perhaps ecologists' confusion of statistical significance versus the actual magnitude of an effect has contributed to the failure to appreciate the importance of effective number of species. As long as ecologists are only interested in the statistical significance of a result, any index with known statistical properties is good enough. Only if we want to move beyond mere statistical significance does it become important to find measures that behave intuitively and whose changes in magnitude can be properly judged. That's where effective numbers of species come into play.

There is more to be said about some of the other evenness measures Magurran treats in this section, but that will have to wait until I have more time.

Dominance and evenness measures (p. 114-121):

The main theme here is the Simpson index D, with its variants 1/D, 1-D, and ln(D). Much space is devoted to the relative merits of these alternatives. It is unfortunate that a broader mathematical perspective has not been taken in this book and in the diversity literature; all these alternatives (and any other monotonic function of D) yield the same effective number of species and are therefore equivalent.
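The equivalence can be seen by inverting each variant back to D. In the sketch below (hypothetical abundances), each of the variants named above is converted to its effective number of species with the inverse of its own transformation; all of them return 1/D.

```python
import math

def simpson_concentration(abundances):
    """Simpson's D: the sum of squared relative frequencies."""
    total = sum(abundances)
    return sum((a / total) ** 2 for a in abundances)

# Hypothetical sample, for illustration only
sample = [40, 30, 20, 10]
D = simpson_concentration(sample)

# The variants discussed in the text
inverse_simpson = 1.0 / D
gini_simpson = 1.0 - D
log_simpson = math.log(D)

# Each variant, inverted through its own transformation, gives the same effective number
effective_from_inverse = inverse_simpson
effective_from_gini = 1.0 / (1.0 - gini_simpson)
effective_from_log = math.exp(-log_simpson)
```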

Magurran closes her treatment of Simpson indices with the comment "Simpson's index remains inexplicably less popular than the Shannon index." This is not so inexplicable when one realizes that all Simpson indices disproportionately emphasize the most common species in the sample, making them insensitive to changes of diversity that affect only the nondominant species. Studies of the discriminating power of indices have consistently shown that Simpson measures are less discriminating than Shannon measures. Furthermore, Simpson indices cannot be decomposed into meaningful independent alpha and beta components when applied to unequally weighted communities.

The other measures of evenness discussed here are of little importance.

The review of the rest of this chapter must wait until I have more time.

Magurran Chapter 5, "Comparative studies of diversity" (p. 131-161)

For now I include just a few comments on the most important issues raised here. On p. 134 Magurran says that the Simpson index is less sensitive to sample size than Shannon measures, and she therefore recommends its use. Simpson measures are less sensitive to sample size only because they overemphasize the most common species in the sample. This is not a good reason to use them. As mentioned earlier, the methods of Chao and Shen 2003 virtually eliminate the small-sample bias of Shannon measures.

On p. 148 Magurran repeats her advice to use the log-series parameter alpha even when the species distributions do not follow a log-series. This is odd advice: alpha is uninterpretable in this situation.

Also on p. 148 she superficially introduces the Hill numbers. She does not explain their value in answering the question of how distinct two diversity measurements are. See elsewhere on this website for examples of their use.

Magurran Chapter 6, "Diversity in space (and time)" (p. 162-184)

Again, just a few quick comments for now on the most important issues.

Most of this section is devoted to measuring beta diversity using presence-absence indices. This reflects their widespread but biologically indefensible use in ecology. Ecologically significant differences between communities involve differences in frequency, not mere presence or absence. A northern Wisconsin pine forest is a very different community than a northern hardwood forest, but usually there are one or two maples in a given pine forest, and vice-versa. Presence-absence indices of beta treat both communities as identical. Frequency measures are always preferable when frequency data is available. (Frequency measures can also be constructed from stratified presence-absence data.)
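A toy version of the forest example shows the contrast. The stem counts are invented for illustration, and the Morisita-Horn index is written in its usual relative-frequency form, 2*sum(p_i*q_i) / (sum(p_i^2) + sum(q_i^2)).

```python
def jaccard_similarity(counts_a, counts_b):
    """Presence-absence Jaccard similarity: shared species / total species."""
    present_a = {sp for sp, n in counts_a.items() if n > 0}
    present_b = {sp for sp, n in counts_b.items() if n > 0}
    return len(present_a & present_b) / len(present_a | present_b)

def morisita_horn(counts_a, counts_b):
    """Frequency-based Morisita-Horn similarity."""
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    species = set(counts_a) | set(counts_b)
    p = {sp: counts_a.get(sp, 0) / total_a for sp in species}
    q = {sp: counts_b.get(sp, 0) / total_b for sp in species}
    cross = sum(p[sp] * q[sp] for sp in species)
    return 2.0 * cross / (sum(v ** 2 for v in p.values()) + sum(v ** 2 for v in q.values()))

# Invented stem counts: each forest contains a couple of the other's dominant tree
pine_forest = {"pine": 98, "maple": 2}
hardwood_forest = {"pine": 2, "maple": 98}

jaccard = jaccard_similarity(pine_forest, hardwood_forest)  # 1.0: judged identical
mh = morisita_horn(pine_forest, hardwood_forest)            # close to 0: very different
```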

One reason for the emphasis on presence-absence measures in the literature is the lack of good general definitions of beta for frequency-based diversity indices. It turns out that a general, easy to interpret definition of beta can be derived from first principles, and the new beta which results gives the effective number of distinct communities in the region, for any given diversity index. Interestingly, the most popular indices of similarity and overlap (Jaccard, Sorensen, Horn, and Morisita-Horn) are all monotonic transformations of this new beta. The derivation of this new beta is the subject of my paper, Partitioning diversity into independent alpha and beta components, in press in Ecology ("Concepts and Synthesis" section). When communities are equally-weighted (as for example when they are being compared in the abstract) the beta diversity for any index is necessarily the effective number of species of the gamma index divided by the effective number of species of the alpha index. (This definition is not invented but rather derived mathematically.) When community weights are unequal, the surprising conclusion of my paper is that the only logically consistent measure of beta diversity is:

Beta diversity = exp(Shannon H_gamma)/exp(Shannon H_alpha)

When community weights are unequal, all other decompositions of diversity indices into alpha and beta components proposed in the literature either result in a beta that is not independent of alpha, or an alpha which could exceed gamma. Both these outcomes are generally regarded as unacceptable (Wilson and Shmida 1984 for independence, Lande 1996 for alpha not to exceed gamma).
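The Shannon beta formula above can be sketched numerically. This sketch assumes, as details not spelled out in this review, that the alpha entropy is the weight-averaged entropy of the individual communities and that the gamma entropy is the entropy of the weight-averaged pooled frequencies. The communities and weights are hypothetical.

```python
import math

def shannon_entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def shannon_beta(communities, weights):
    """exp(H_gamma) / exp(H_alpha) for weighted communities.

    communities: lists of relative frequencies over the same species order
    weights: community weights summing to 1
    """
    h_alpha = sum(w * shannon_entropy(p) for w, p in zip(weights, communities))
    n_species = len(communities[0])
    pooled = [sum(w * p[i] for w, p in zip(weights, communities)) for i in range(n_species)]
    h_gamma = shannon_entropy(pooled)
    return math.exp(h_gamma) / math.exp(h_alpha)

# Two hypothetical communities with no species in common
community_a = [0.5, 0.5, 0.0, 0.0]
community_b = [0.0, 0.0, 0.5, 0.5]

beta_identical = shannon_beta([community_a, community_a], [0.5, 0.5])       # 1 community
beta_distinct_equal = shannon_beta([community_a, community_b], [0.5, 0.5])  # 2 communities
beta_distinct_unequal = shannon_beta([community_a, community_b], [0.8, 0.2])
```

With equal weights, two completely distinct communities give a beta of 2 and two identical ones give 1, matching the reading of beta as an effective number of distinct communities. With weights of 0.8 and 0.2 the value falls between 1 and 2, because the smaller community contributes less to the region.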