Friday, September 23, 2016

Book review: Statistics Done Wrong: the Woefully Complete Guide



How to Lie with Statistics came out in 1954. It has long been considered a classic with over a half-million copies sold.

The second edition of Statistics Done Wrong might be a true successor to the classic. The first edition, recently released, is still a good read for any scientist.

The author, Alex Reinhart, covers some basics about statistics and then empirical cases where statistics have been used incorrectly.

It's a good book. I learned a few things while reading it and was impressed to see that important examples from recent news were included as cautionary tales. I think most scientists should spend the time to read through this. If they don't learn anything, they should at least feel good about that. My guess is that they would.

That said, the difficulty with the book is that it isn't mature yet. The section on p-values is probably the most important, but I finished the section not quite sure what the author wanted to impress upon the reader. The reader is admonished to use ranges instead of p-values, but it just isn't clear why. As the book matures, the author should have better examples to get his points across. In contrast, his examples for base rate fallacies were mature. They were poignant and the reader should have a clear idea how to calculate what percentage of positives are likely false.
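For readers who want to see the base rate arithmetic spelled out, here is a minimal sketch. The function name and the numbers (10% of tested hypotheses true, 80% power, a 0.05 threshold) are my own illustrative assumptions, not an example taken from the book.

```python
def false_positive_share(base_rate, power, alpha):
    """Expected fraction of 'significant' results that are false positives.

    base_rate: share of tested hypotheses that are actually true
    power:     chance that a true effect produces a significant result
    alpha:     chance that a null effect produces a significant result
    """
    true_positives = base_rate * power
    false_positives = (1.0 - base_rate) * alpha
    return false_positives / (true_positives + false_positives)


# Hypothetical inputs, chosen only for illustration.
share = false_positive_share(base_rate=0.10, power=0.80, alpha=0.05)
print(f"{share:.0%} of positives are expected to be false")
```

With these assumed inputs, about a third of the positives would be false even though the threshold was 0.05, which is the point of the base rate examples.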

Another problem with any statistics book is that statistics is too broad to be covered in any thin volume, any more than a single book could cover how to lie with language.

At any rate, Statistics Done Wrong is an admirable effort. At the very least it should set off alarm bells with researchers to ask questions about their statistics a little more deeply.

I'm looking forward to seeing what the second edition holds.


Monday, September 19, 2016

Tuesday, September 13, 2016

The trajectory of nitrogen availability

It is a simple fact that N availability is rising throughout the world, likely causing a planetary boundary to be exceeded. Considering that humans have doubled global N2 fixation, it's impossible that it hasn't.

It is also a simple fact that CO2 concentrations have been rising, which likely should be causing N to become progressively more limiting.

It is also a simple fact that no one has taken the time to comprehensively address whether N availability has been increasing or decreasing in the ecosystems of the world. There are almost no time series of direct measurements of N supplies or availability to test whether N availability is going up or down.

As a result, it is unresolved as to whether N availability is increasing or decreasing in ecosystems across the world.

Andrew Elmore and Dave Nelson (with a little help from me) report in the latest issue of Nature Plants new data that looks at whether N availability is increasing or decreasing in US eastern deciduous forests.

Short answer: N availability looks to be decreasing.

Using ratios of N isotopes in wood as a proxy for N availability, Elmore et al. show that N availability has been declining in the forests they examined for some time.

That's a pretty big result.

Not only do they show this, but they also show that the declines are tied to spring phenology. Years with warmer springs have the lowest N availability.

Mechanistically, one link between phenology and N availability is that years with warmer springs have greater increases in plant demand for N than any increases in N supplies, leading to declines in N availability.

One question that arises from this work...if N availability is declining in these forests, how sure are we that we have crossed a planetary boundary for N? Are the world's terrestrial ecosystems really eutrophying?

Elmore, A. J., D. M. Nelson, and J. M. Craine. 2016. Earlier springs are causing reduced nitrogen availability in North American eastern deciduous forests. Nature Plants 2:16133.



Monday, September 5, 2016

Study on old trees


The oldest trees in the world are often in the most stressful environments, or so it seems. Yet, there has never been a quantitative attempt to assess tree longevity.

Di Filippo et al. make a first attempt at this by analyzing tree-ring data for broad-leaved deciduous trees in the Northern Hemisphere.

Given the massive impact of humans on old-growth forests, any study like this will have caveats, but the data are interesting.

For example, they report that 300-400 years is a good baseline for tree lifespan (if that concept even applies to trees). They also report a maximum longevity of 600-700 years for deciduous trees in general.

They also show that the really old trees spent a long time growing slowly. The idea is that mortality rates increase with size, so staying small is a good way to avoid mortal blows like wind throw.

The relationship they show with maximum age for Fagus was interesting. Essentially, in warm places, the maximum age of Fagus was a lot less than in cold places. They cannot answer whether this is a direct or indirect effect, but they did not find the same relationship for Quercus species.

The authors don't believe the evidence assembled indicates a biological limitation to longevity in trees, such as meristems senescing after a certain amount of time.

Instead, trees can only roll the dice so many times. And it's hard to roll the dice for more than a few hundred years and not lose.


Di Filippo, A., N. Pederson, M. Baliva, M. Brunetti, A. Dinella, K. Kitamura, H. D. Knapp, B. Schirone, and G. Piovesan. 2015. The longevity of broadleaf deciduous trees in Northern Hemisphere temperate forests: insights from tree-ring series. Frontiers in Ecology and Evolution 3.

Thursday, September 1, 2016

Quantifying cattle diet across broad gradients

Just a quick note on a new paper that we just published.

I've worked before with Texas A&M's GANLab to assess patterns across the US of forage quality for cattle. In that paper, we saw that cattle in cool, wet regions had the highest forage quality, which suggested that warming would reduce forage quality. It was an important paper for understanding how global warming would affect the ability of grassland to sustain grazers.

Although that work showed geographic patterns of forage quality, we couldn't tell how the species that the cattle consumed might be changing.

Just this month, we published a new paper where we sequenced the plant DNA in fecal samples of cattle across the US to answer that question.

Those results are pretty interesting, too.

In short, cattle in warmer grasslands are relying more on forbs than cattle in cooler grasslands. That suggests warming will shift the diet of cattle, potentially to compensate for lower forage quality. This is pretty similar to what we saw for bison.

The specific results are important, but the general approach is even more interesting. This is the first time the diet of an herbivore was quantified over such a large spatial scale and with such specificity. For example, we could see the species of grasses shift as one moved south, and the unique diet of cattle in southern Texas (a fair amount of live oak there).






Plant productivity and climate: a back and forth

The process of science is one we do not talk about much. There are reams of studies on statistical tests for a given data set, and meta-analyses have moved science forward by bringing together different data sets to test an idea.

But how does science decide the "truth" when there are different assumptions between different studies? What process gets used when words are used in different ways? No statistical test or meta-analysis can bridge that gap.

Here's an example...

In August of 2014, Michaletz et al. published a paper that analyzed data on plant production for over a thousand forests across the world. It has long been understood that production is greatest in warm, wet forests (think tropical rain forests) and least in cold, dry forests (think bristlecone pine). When we warm or irrigate forests, they grow more, too. Seems like the role of climate is pretty well settled.

In their analysis, the authors found that, indeed, production correlated with temperature and precipitation, but according to them, this was too simple. When viewed through metabolic scaling theory, climate had only an indirect effect on production. The authors asserted that "age and biomass explained most of the variation in production whereas temperature and precipitation explained almost none". In short, warm, wet forests are more productive only because the forests there tend to be older and larger, not because warm or wet conditions promote growth. By this idea, if you compared two forests of equal size and age, one in a cold, dry environment and the other in a warm, wet environment, there would be no difference in their production.

The authors have published many excellent papers on metabolic scaling, really developing a line of thought to begin to unify some fractured thought on how plants work. If this result held, it would be a coup de grace in many ways.

So how did the authors rule out that climate directly affected production?

The authors calculated a rate of monthly production by dividing production by the length of the growing season. This removed the influence of differences in the length of the growing season to compare forests across the world more equally, essentially asking if forests in warm, wet places grow more each month than ones in cold, dry places. When they did this, they found that "In contrast to results for NPP, average growing season temperature,...mean annual precipitation, and mean growing season precipitation explained little to no variation in global [monthly production]."

And with that result, the authors move on to test other factors, such as stand age and biomass, independent of climate, finding that "A large proportion of variation in NPP...was explained by just two variables: stand biomass and plant age."

The Michaletz paper was published in Nature, which often publishes some of the most important results in our discipline only after intense scrutiny. It seemed like that question was settled. Climate only affects how big forests get and how old they are; it doesn't make a given forest grow any faster per se.

Well, I guess it can be said that one person's assumption is another person's legerdemain.

This past January a new paper was published in Global Change Biology. Chu et al. reanalyze the Michaletz data and start with the title "Does climate directly influence NPP globally?" The authors assert that the Michaletz study had "flaws that affected that study’s conclusions". They also "present novel analyses to disentangle the effects of stand variables and climate in determining NPP."

In short, the authors state that ruling out climate's direct effect by calculating monthly production was erroneous. Growing season length and mean climate are highly correlated. In their view, it was incorrect to rule out the direct effect of climate by dividing production by growing season length and then examining the resultant metric against climate variables.**

**This debate, in part, is the Knops-Vitousek debate all over again...
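To make the concern about dividing by growing season length concrete, here is a toy simulation. It is not the analysis from either paper, and every number in it is invented: temperature sets the length of the growing season, each forest gets a monthly growth rate that ignores temperature, and annual production is simply rate times season length.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=1000, seed=1):
    """Return (r of temperature vs annual NPP, r of temperature vs monthly NPP)."""
    rng = random.Random(seed)
    temps, annual, monthly = [], [], []
    for _ in range(n):
        temp = rng.uniform(5.0, 25.0)                      # invented temperature range
        season = 2.0 + 0.4 * temp + rng.gauss(0.0, 0.5)    # months; driven by temperature
        rate = rng.gauss(100.0, 10.0)                      # monthly growth, blind to temperature
        npp = rate * season                                # annual production
        temps.append(temp)
        annual.append(npp)
        monthly.append(npp / season)                       # the "monthly production" metric
    return pearson_r(temps, annual), pearson_r(temps, monthly)


r_annual, r_monthly = simulate()
print(f"temperature vs annual production:  r = {r_annual:.2f}")
print(f"temperature vs monthly production: r = {r_monthly:.2f}")
```

In this made-up world, warmer forests produce more each year entirely because climate gives them a longer season, yet the monthly metric shows essentially no correlation with temperature. Whether season length counts as a direct effect of climate is exactly the kind of definitional question the two papers answer differently; the sketch only shows that the monthly metric alone cannot settle it.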

Instead, using different analytic techniques, Chu et al. simultaneously test the roles of growing season length and other climate variables on production.

Their conclusion? Climate does directly affect production.

At this point, I'm not writing about this to weigh in on which side is right or more right or right under specific conditions.

I only pose this question.

Now what?

How does our discipline resolve the tension here?

Were the assumptions by Michaletz right? Are the two camps' differences semantic? Which conclusion should be accepted? Does climate directly affect production or only indirectly?

At this point, if it is convenient for a scientist's argument that climate does not affect production, they can just cite the Michaletz paper. If the contrary holds, they can just cite the Chu paper.

In the legal world, when different circuit courts come to different conclusions, this can lead to "forum shopping" where a plaintiff can simply go to the circuit that is most favorable to their case. That shouldn't be if the goal is to have one set of laws to govern a nation.

As in the legal world, being able to cite either one of two opposing ideas does not seem sustainable for science.

It is interesting to note that in the US federal court system, two contrary ideas existing at the same time would be the equivalent of a "circuit split" where two circuit courts come to two different conclusions about how to interpret the law. This tension would often be resolved at the next higher court, the Supreme Court. And the decision of the Supreme Court would resolve the differences of opinion.

All I note here is that science doesn't have that. We have no formalized process for resolving a split. Split conclusions can theoretically last indefinitely. And scientists can cite whichever side they believe in more or find most convenient.

I think that is fascinating.