From the Narrow Desert: Diversity

Tuesday, March 30, 2021

Synchronicity: Diversity in forests

Last Friday, I posted "Calculating beta diversity," in which I explored different types of diversity by considering various hypothetical forests and the tree species in them.

The next day, Saturday, a student had some questions about an article on an English reading comprehension test he had taken. The article was called "What Is a Community?" and began thus:

The Black Hills forest, the prairie riparian forest, and other forests of the western United States can be separated by the distinctly different combinations of species they comprise. It is easy to distinguish between prairie riparian forest and Black Hills forest -- one is a broad-leaved forest of ash and cottonwood trees, the other is a coniferous forest of ponderosa pine and spruce trees.

Not only is that an example of diversity in forests, it is specifically the beta diversity I focused on in my post -- that is, one forest differing from another in terms of its species composition (as opposed to alpha diversity among trees within a single forest).

Incidentally, Kevin McCall (who, unlike me, is a trained mathematician) has taken up the quest for a formula interrelating the various types of diversity. Check out his "Summary and discussion of ecological formulas" if you're interested.

Friday, March 26, 2021

Calculating beta diversity

Diversity? Comme au courant! Well, you know I like to cover all the bases.

⁂

There are (or were when I was in college) three types of ecological diversity: alpha, beta, and gamma. Let's say we're talking about a territory in which there are a number of separate forests, each of which contains a number of trees which may be classified into discrete species.

Gamma diversity is the total diversity of trees within the territory. If we select two of the territory's trees at random, what is the probability that they will be of different species? That number (a "diversity index") would be a quantification of the territory's gamma diversity. You can think of the gamma as standing for global; we're looking at the diversity of individuals (trees) in the entire territory, without considering any of the smaller subgroups (forests) among which those individuals are distributed.

Alpha diversity is the diversity of trees within each forest. If we randomly select two trees from the same forest, what is the probability that they will be of different species? This is an index of alpha diversity. Think of alpha as representing the article a -- the internal diversity within a single forest. We can calculate a diversity index for each of the forests in the territory, and the mean of these numbers will be the alpha diversity of the territory as a whole.

For maximum simplicity, let's just look at territories that have only two forests (Forest 1 and Forest 2), which each have the same number of trees, and only two tree species (redwoods and bluewoods).

Calculating diversity indices is quite straightforward. You take the percentages for each species in the population (for example, the trees of Charliestan are 75% redwood and 25% bluewood). For each species, the probability of randomly selecting two members of that species is its percentage squared -- so the sum of the squares of all the percentages is the probability of selecting two trees of the same species. For Charliestan, that probability is .75² + .25² = .625; the diversity index (the probability of selecting two trees of different species) is 1 minus that number, or 37.5%.
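As a minimal sketch of the calculation just described (the function name `diversity_index` is made up for illustration), the index is one minus the sum of squared proportions. This quantity is often called the Gini-Simpson index, and squaring the percentages amounts to assuming the two trees are drawn with replacement.

```python
def diversity_index(proportions):
    """Probability that two randomly drawn trees belong to different species.

    `proportions` lists each species' share of the population and must sum to 1.
    """
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    # Sum of squares = chance of drawing the same species twice.
    return 1.0 - sum(p * p for p in proportions)


# Charliestan: 75% redwood, 25% bluewood
print(diversity_index([0.75, 0.25]))  # 0.375
```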

Note that when there are only two species, the highest possible diversity index is 50%. Note also that gamma diversity sets a cap for alpha diversity. The two measures can be equal (as in Bakerstan and Charliestan), or gamma can be higher (as in Ablestan and Dogstan), but alpha can never be higher than gamma.
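The relationship between alpha and gamma can be checked numerically. The sketch below assumes equal-sized forests, and the forest compositions are not quoted from the original charts: they are inferred from the figures given elsewhere in the post (for example, Dogstan is assumed to be 75/25 in one forest and 25/75 in the other). The function names are illustrative.

```python
def diversity_index(proportions):
    return 1.0 - sum(p * p for p in proportions)


def alpha_diversity(forests):
    """Mean of the per-forest diversity indices."""
    return sum(diversity_index(f) for f in forests) / len(forests)


def gamma_diversity(forests):
    """Index of the pooled territory, assuming all forests are the same size."""
    pooled = [sum(column) / len(forests) for column in zip(*forests)]
    return diversity_index(pooled)


# Each forest is (redwood share, bluewood share); compositions are assumptions.
territories = {
    "Ablestan": [[1.0, 0.0], [0.0, 1.0]],
    "Bakerstan": [[0.5, 0.5], [0.5, 0.5]],
    "Charliestan": [[0.75, 0.25], [0.75, 0.25]],
    "Dogstan": [[0.75, 0.25], [0.25, 0.75]],
}

for name, forests in territories.items():
    a, g = alpha_diversity(forests), gamma_diversity(forests)
    print(f"{name}: alpha={a:.3f} gamma={g:.3f}")
    assert a <= g + 1e-12  # alpha never exceeds gamma
```

Under these assumed compositions, alpha equals gamma for Bakerstan and Charliestan and falls below it for Ablestan and Dogstan.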

When there is a difference between gamma (the diversity of trees in the territory) and alpha (the diversity of trees within the forests of the territory), that difference must be accounted for by beta diversity: diversity between forests. (Think of beta as standing for between -- though of course you really ought to say among if there are more than two.) The remainder of this post will discuss the relative merits of various ways of calculating beta diversity.

⁂

Approach 1: Forests as units

Can we calculate beta diversity as a diversity index of the sort we have used for alpha and gamma? Well, we could, but that would mean treating entire forests the way we have been treating trees -- as unanalyzable units to be classified into a finite number of discrete "species." For example, if a territory had 10 spruce-fir forests, 5 oak-hickory forests, and 5 maple-beech-birch forests, we could calculate its beta diversity as 1 - (.5² + .25² + .25²) = .625.
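The forests-as-units arithmetic can be reproduced with a small sketch that counts labels rather than taking percentages directly. The helper name is hypothetical.

```python
from collections import Counter


def diversity_index_from_labels(labels):
    """Diversity index over a list of category labels (here, forest types)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


forest_types = (
    ["spruce-fir"] * 10
    + ["oak-hickory"] * 5
    + ["maple-beech-birch"] * 5
)
print(diversity_index_from_labels(forest_types))  # 0.625
```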

The obvious problem with this is that forests just aren't unanalyzable units, and classifying them qualitatively seems the wrong way to go about things. Forests can be more or less similar in their species profile; it's not a binary same/different question. Imagine a spruce-fir forest that is pretty much just spruce and fir, and a maple-beech-birch forest that is also pretty much just what it says on the tin. Now imagine a different country where the spruce-fir forest also has significant numbers of maple and birch trees, and where both the spruce-fir and the maple-beech-birch forests have plenty of hemlocks. This latter country obviously has less beta diversity -- that is, its forests are more similar to one another -- but this approach can't see that.

(Incidentally, this objection also applies to some extent to alpha and gamma diversity. Doesn't a white-red-jack forest, where all the major species are species of pine, have less diversity than an oak-gum-cypress forest? Isn't a neighborhood that's half black and half white more diverse than one that's half German and half Austrian?)

⁂

Approach 2: All the gamma that's not alpha

The logic is simple: gamma diversity is total diversity; some of it is accounted for by alpha diversity; all the rest must be beta diversity.

Robert Whittaker's original equation for beta diversity was β = γ/α, which is obviously suboptimal. It would make 1 the minimum figure for beta diversity, when it is the hypothetical maximum for alpha and gamma, making it incommensurable with the other two types of diversity. It is also unable to deal with countries like Ablestan, which have 0 alpha diversity and thus cause a divide-by-zero error.

Later ecologists (perhaps for the reasons I mention) decided to subtract rather than divide, making the new formula β = γ - α. Let's look at our four territories again (reproduced here so you don't have to scroll up).

Using the subtractive formula, we get 0 beta diversity for Bakerstan and Charliestan -- which is correct, since in each of those countries the two forests are identical in terms of species profile -- 12.5% for Dogstan, and 50% for Ablestan. But, wait, isn't that a little strange? The two forests of Ablestan are 100% different -- not a single tree in Forest 1 is the same species as any tree in Forest 2 -- so shouldn't the beta diversity be 100%?
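Here is a rough sketch comparing the ratio formula and the subtractive formula as they are applied in this post to diversity indices. It assumes equal-sized forests, and the compositions are inferred from the numbers quoted in the text rather than taken from the charts. All function names are illustrative.

```python
def diversity_index(proportions):
    return 1.0 - sum(p * p for p in proportions)


def alpha_gamma(forests):
    """Return (alpha, gamma) for a territory of equal-sized forests."""
    alpha = sum(diversity_index(f) for f in forests) / len(forests)
    pooled = [sum(column) / len(forests) for column in zip(*forests)]
    return alpha, diversity_index(pooled)


def beta_ratio(forests):
    alpha, gamma = alpha_gamma(forests)
    # The ratio is undefined when alpha is 0, so return None instead of dividing.
    return gamma / alpha if alpha > 0 else None


def beta_subtractive(forests):
    alpha, gamma = alpha_gamma(forests)
    return gamma - alpha


# (redwood share, bluewood share) per forest; compositions are assumptions.
territories = {
    "Ablestan": [[1.0, 0.0], [0.0, 1.0]],
    "Bakerstan": [[0.5, 0.5], [0.5, 0.5]],
    "Charliestan": [[0.75, 0.25], [0.75, 0.25]],
    "Dogstan": [[0.75, 0.25], [0.25, 0.75]],
}

for name, forests in territories.items():
    print(name, beta_ratio(forests), beta_subtractive(forests))
```

With these inputs the subtractive formula gives 0.5 for Ablestan, 0 for Bakerstan and Charliestan, and 0.125 for Dogstan, while the ratio formula has no value for Ablestan.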

Compare Ablestan to Easystan -- which, unlike the territories we have looked at so far, has yellowwoods.

Both alpha and gamma are higher for Easystan, which makes sense. It has greater global (gamma) diversity, and Forest 2 has greater internal (alpha) diversity. But shouldn't its beta diversity -- the difference between the two forests -- be exactly the same as Ablestan's? In both territories, the trees in Forest 1 are 100% different from those in Forest 2. But the subtractive formula gives us a beta of only 37.5% for Easystan, lower than Ablestan's 50%. Obviously this formula is not capturing the intuitive meaning of beta diversity.

Or consider Foxstan, which differs from Ablestan only in that its forests are not the same size; 75% of its trees are in Forest 1.

Both Ablestan and Foxstan have an alpha of 0, which is correct because there is no internal diversity within their forests at all. Ablestan has a higher gamma because it is half redwoods and half bluewoods -- the maximum diversity possible when there are only two species. In Foxstan, redwoods are a solid majority, making it less diverse.

What about beta? In each territory, how different is one forest from the other? Well, it seems obvious that both Ablestan and Foxstan have equal, because maximal, beta diversity. In both countries, the trees in Forest 1 are 100% different from the trees in Forest 2. If anything, we might even say that the two forests differ more in Foxstan than in Ablestan, because they differ in size as well as in species profile. But if we use the formula β = γ - α, and alpha is 0, each territory's beta is equal to its gamma, which means Foxstan has less beta diversity than Ablestan. This seems clearly wrong.
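The Easystan and Foxstan objections can be reproduced with a sketch that weights each forest by its size when pooling for gamma. The species assignments below are assumptions consistent with the figures in the text: Easystan's Forest 1 is taken to be all one species and its Forest 2 an even split of the other two, and Foxstan's larger forest is taken to be the redwood one.

```python
def diversity_index(proportions):
    return 1.0 - sum(p * p for p in proportions)


def alpha_diversity(forests):
    return sum(diversity_index(f) for f in forests) / len(forests)


def gamma_diversity(forests, sizes):
    """Pool the forests, weighting each by its relative size."""
    total = sum(sizes)
    pooled = [
        sum(size * share for size, share in zip(sizes, column)) / total
        for column in zip(*forests)
    ]
    return diversity_index(pooled)


def beta_subtractive(forests, sizes):
    return gamma_diversity(forests, sizes) - alpha_diversity(forests)


# Shares are (redwood, bluewood, yellowwood); sizes are relative tree counts.
ablestan = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [1, 1])
easystan = ([[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]], [1, 1])
foxstan = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [3, 1])

for name, (forests, sizes) in [
    ("Ablestan", ablestan),
    ("Easystan", easystan),
    ("Foxstan", foxstan),
]:
    print(name, beta_subtractive(forests, sizes))
```

All three territories have completely disjoint forests, yet the subtractive beta comes out as 0.5, 0.375, and 0.375 respectively.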

⁂

Approach 3: An outgroup diversity index

Both gamma and alpha are calculated by means of a diversity index -- the probability that two randomly selected trees will be of different species. For gamma, the figure is for any two trees in the territory; for alpha, it is for any two trees that are in the same forest. So can't we get beta by calculating a diversity index for any two trees that are not in the same forest?

No, this doesn't work, either. Consider the case of Bakerstan and Charliestan.

Both of these territories should have a beta of 0, because each has two identical forests. But -- precisely because the two forests are identical -- comparing two random trees from different forests is the same as comparing two from the same forest, or from the territory as a whole, so β = α = γ. This method gives Bakerstan a beta of 50%, when it ought to be 0. That's a pretty serious error!

So maybe we should say beta diversity is outgroup diversity (call it xi) minus ingroup diversity (alpha): β = ξ - α. That would give us the desired 0 beta value for Bakerstan and Charliestan. Does it work more generally? No. It fails the Easystan test.

In Ablestan, xi is 1 and alpha is 0, so beta is also 1. This is correct, since the two forests are maximally different from each other.

In Easystan, the two forests are also maximally different from one another -- not a single tree in Forest 1 is the same species as any tree in Forest 2 -- so its xi is 1, and its beta ought to be 1 as well. But because it has an alpha of 25%, its beta is only 75%.
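A small sketch of the outgroup index for a pair of forests follows. It takes xi to be the probability that one tree drawn from each forest will differ in species, which is one reading of the description above. The compositions are again assumed from the figures in the text, and the function names are illustrative.

```python
def ingroup_index(p):
    """Chance that two trees from the same forest differ in species."""
    return 1.0 - sum(x * x for x in p)


def outgroup_index(p, q):
    """Chance that a tree from forest p and a tree from forest q differ."""
    return 1.0 - sum(a * b for a, b in zip(p, q))


def beta_xi_minus_alpha(p, q):
    alpha = (ingroup_index(p) + ingroup_index(q)) / 2
    return outgroup_index(p, q) - alpha


# Shares are (redwood, bluewood, yellowwood); compositions are assumptions.
bakerstan = ([0.5, 0.5, 0.0], [0.5, 0.5, 0.0])
ablestan = ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
easystan = ([1.0, 0.0, 0.0], [0.0, 0.5, 0.5])

for name, (p, q) in [
    ("Bakerstan", bakerstan),
    ("Ablestan", ablestan),
    ("Easystan", easystan),
]:
    print(name, outgroup_index(p, q), beta_xi_minus_alpha(p, q))
```

Under these assumptions, xi alone gives Bakerstan 0.5 where 0 is wanted, and xi minus alpha gives Easystan 0.75 where 1 is wanted.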

⁂

Approach 4: Slice-matching

And now we come to my final answer! I assume I'm not the first to have thought of it, but I'm much too lazy and unprofessional to play the "literature review" game when it's so much more fun to just reinvent the wheel. I do hope I'm not making an original contribution to diversitology here because, you know, that would just be sad. (Alas, my experience with astronomy does not fill me with optimism.)

This method yields the correct values of 1 for Ablestan, Easystan, and Foxstan; and 0 for Bakerstan and Charliestan. Of the territories we have looked at so far, only Dogstan has a non-trivial beta value, so we will look at it first to demonstrate how the slice-matching method works.

You take the two forests' pie charts and remove all matching slices. That is, you can cut a slice out of a pie and remove it if and only if you can remove a slice of the same size and color from the other pie chart. You keep doing this until you can't do it anymore, and the percentage of the pies remaining is your beta diversity. (When I talk about the "size" of a slice, I mean its relative size as a percentage of its pie; beta diversity is not affected by differences in absolute size among forests.)

For Dogstan, we can remove a 25%-sized slice of blue from each pie, then a 25%-sized slice of red, and then we're done. We still have 50% of each pie left, so Dogstan's beta diversity is 50%.
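Slice-matching for one pair of forests can be sketched in a few lines: for each species, the matched slice is the smaller of the two shares, and beta is whatever remains. The function name is hypothetical and the compositions are assumed from the numbers in the text. The result works out to be the same as half the sum of the absolute differences between the two sets of shares.

```python
def slice_matching_beta(p, q):
    """Share of each pie left over after removing all matching slices."""
    matched = sum(min(a, b) for a, b in zip(p, q))
    return 1.0 - matched


# Shares are (redwood, bluewood, yellowwood); compositions are assumptions.
pairs = {
    "Ablestan": ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
    "Bakerstan": ([0.5, 0.5, 0.0], [0.5, 0.5, 0.0]),
    "Dogstan": ([0.75, 0.25, 0.0], [0.25, 0.75, 0.0]),
    "Easystan": ([1.0, 0.0, 0.0], [0.0, 0.5, 0.5]),
}

for name, (p, q) in pairs.items():
    beta = slice_matching_beta(p, q)
    # Cross-check against half the sum of absolute differences.
    assert abs(beta - 0.5 * sum(abs(a - b) for a, b in zip(p, q))) < 1e-12
    print(name, beta)
```

Because only relative shares enter the calculation, Foxstan (same shares as Ablestan, different forest sizes) gets the same value as Ablestan.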

What if there are more than two forests in the territory? Do we remove only those slices that can be removed from every forest? No, that clearly won't work. Imagine a territory with 4 all-redwood forests and 1 all-bluewood forest; no slices could be removed, and thus the beta would be 1 -- maximal beta diversity, despite the fact that four of the five forests are identical. No, slice-matching can only be done between a pair of forests, and the beta diversity of the whole territory is calculated by taking the mean beta of all possible pairs of forests. In our example, there are 5 forests and thus 10 possible pairs of forests. Of these, 6 are red-red pairs with beta of 0, and 4 are red-blue pairs with beta of 1. The mean beta diversity for the whole territory would thus be 40%.
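The pairwise averaging can be sketched as follows, using the five-forest example from the paragraph above. The function names are illustrative.

```python
from itertools import combinations


def slice_matching_beta(p, q):
    return 1.0 - sum(min(a, b) for a, b in zip(p, q))


def territory_beta(forests):
    """Mean slice-matching beta over every unordered pair of forests."""
    pairs = list(combinations(forests, 2))
    return sum(slice_matching_beta(p, q) for p, q in pairs) / len(pairs)


# Four all-redwood forests and one all-bluewood forest: (redwood, bluewood)
forests = [[1.0, 0.0]] * 4 + [[0.0, 1.0]]
print(len(list(combinations(forests, 2))))  # 10 pairs
print(territory_beta(forests))              # 0.4
```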

As a further illustration of how this works, let's take a look at Georgestan and Howstan -- territories which each have four different forests and four different tree species.

The bottom row of pie charts shows the species distribution for each of the forests. I have so designed these distributions as to give the two territories identical gamma diversity, but Georgestan's diversity is more of the alpha variety (each forest is internally diverse), while Howstan's is more beta (each forest is different from the other forests). I've limited myself to pie slices that are multiples of 12.5%, so as not to overtax my MSPaint skillz.

The pyramid of pie charts above each bottom row shows the "slice-matching" results for each pair of forests. Go diagonally down to the left and to the right to see which two forests each chart is comparing. For example, the pie at the apex of the Georgestan pyramid is comparing the forests F1 and F4, which are highly similar. Slice-matching rules allow us to remove quarter slices of red, yellow, and blue, and an eighth slice of green, from each forest. What is left -- the slices that cannot be matched -- is shaded black and represents the beta diversity between those two forests, which in this case is 12.5%. Looking at the corresponding pie at the apex of the Howstan pyramid, we can see that its F1 and F4 are very different, with 50% beta diversity. Beta diversity for a whole territory is simply the mean beta diversity of all possible forest pairs.

The diversity figures for the two territories, then, are as follows:

Georgestan

gamma = 75%
alpha = 72.7%
beta = 18.8%

Howstan

gamma = 75%
alpha = 57.8%
beta = 52.1%

We can compare these to the extreme cases of Itemstan (all four forests look like Georgestan's F1) and Jigstan (the forests are all red, all yellow, all green, and all blue, respectively).

Itemstan

gamma = 75%
alpha = 75%
beta = 0%

Jigstan

gamma = 75%
alpha = 0%
beta = 100%
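The two extreme cases can be checked with a short sketch that puts the three measures together. It assumes equal-sized forests, and it assumes that Georgestan's F1 is an even four-way split, which is the only composition that gives a four-species forest a 75% index. The function names are illustrative.

```python
from itertools import combinations


def diversity_index(p):
    return 1.0 - sum(x * x for x in p)


def alpha(forests):
    return sum(diversity_index(f) for f in forests) / len(forests)


def gamma(forests):
    pooled = [sum(column) / len(forests) for column in zip(*forests)]
    return diversity_index(pooled)


def beta(forests):
    """Mean pairwise slice-matching beta."""
    pairs = list(combinations(forests, 2))
    leftover = [1.0 - sum(min(a, b) for a, b in zip(p, q)) for p, q in pairs]
    return sum(leftover) / len(pairs)


# Shares are (red, yellow, green, blue).
itemstan = [[0.25, 0.25, 0.25, 0.25] for _ in range(4)]
jigstan = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

for name, forests in [("Itemstan", itemstan), ("Jigstan", jigstan)]:
    print(name, gamma(forests), alpha(forests), beta(forests))
```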

⁂

Is there a formula?

Robert Whittaker had a simple formula -- β = γ/α -- which we have found inadequate. Can the slice-matching approach to beta diversity also be reduced to a formula? This much seems intuitively obvious:

If gamma is held constant, increasing alpha causes beta to decrease and vice versa. This seems to imply that we should be able to derive alpha if we know beta and gamma, or derive beta if we know alpha and gamma. (Seems. I haven't fully thought this through yet.)

We clearly cannot derive gamma if we know alpha and beta. Jigstan and Ablestan both have an alpha of 0 and a beta of 1, but their gamma is different. This is only possible because Ablestan has two forests but Jigstan has four, so perhaps a fourth variable -- the number of forests -- has to be included in the formula. My hunch (just a hunch) is that any one of those variables should be derivable from the other three, hopefully in a tolerably elegant manner.
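To illustrate the point about the number of forests, here is a sketch of a made-up family of territories: k equal-sized forests, each consisting of a single species found nowhere else. Ablestan is the k = 2 case and Jigstan the k = 4 case. Alpha stays at 0 and slice-matching beta at 1 throughout, while gamma works out to 1 - 1/k. Note that in this family the number of species grows along with the number of forests, so it does not separate those two effects.

```python
from itertools import combinations


def diversity_index(p):
    return 1.0 - sum(x * x for x in p)


def measures(forests):
    """Return (alpha, beta, gamma) for equal-sized forests."""
    n = len(forests)
    alpha = sum(diversity_index(f) for f in forests) / n
    pairs = list(combinations(forests, 2))
    beta = sum(
        1.0 - sum(min(a, b) for a, b in zip(p, q)) for p, q in pairs
    ) / len(pairs)
    gamma = diversity_index([sum(column) / n for column in zip(*forests)])
    return alpha, beta, gamma


def monoculture_territory(k):
    """k forests, each 100% one species that appears in no other forest."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


for k in range(2, 6):
    alpha, beta, gamma = measures(monoculture_territory(k))
    assert alpha == 0.0 and abs(beta - 1.0) < 1e-12
    assert abs(gamma - (1 - 1 / k)) < 1e-12
    print(k, alpha, beta, round(gamma, 3))
```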

Perhaps some of my more mathematically gifted readers (you know who you are!) would like to give it a shot.
