Climate correlations with tree rings

A workflow for systematically screening climate-growth correlations

Last updated on Nov 23, 2020 8 min read 0 Comments Statistics

Image credit: R. Peters

CLIMCOR Workflow

1. Background

This tutorial is about systematically screening different season lengths when performing climate-growth relationship analyses on tree-ring width measurements. Trees do not care about months or seasons as we define them. Therefore, it is essential to check all possible season lengths and ending months.

Because we like working with the time-series class (ts()) we need to change the class() of all R objects (tree-ring and climate data) to that of a time series. Additionally we need some custom-built functions to detrend.

2. Set-up steps

First, we need to load some required libraries that the code is based on.

#install package if necessary
packages <- c("dplR","MASS","dyn","lattice","latticeExtra","zoo","pspline")
try(install.packages(setdiff(packages, rownames(installed.packages()))))
library(dplR)
library(MASS)
library(dyn)
library(lattice)
library(latticeExtra)
library(zoo)
library(pspline)

3. Import data

Below one will load an example tree-ring series and climate data.

#load tree-ring data
TRW<-read.rwl("https://raw.githubusercontent.com/deep-org/materials/main/data/general_statistics/01_climate_cor_trw/saviese.txt")
## Attempting to automatically detect format.
## Assuming a Tucson format file.
## There does not appear to be a header in the rwl file
## There are 32 series
## 1    	10S01A  	 1921	 2005	0.01
## 2    	10S01B  	 1926	 2005	0.01
## 3    	10S03A  	 1928	 2005	0.01
## 4    	10S03B  	 1920	 2005	0.01
## 5    	10S04A  	 1924	 2005	0.01
## 6    	10S04B  	 1924	 2005	0.01
## 7    	10S05A  	 1931	 2005	0.01
## 8    	10S05B  	 1928	 2005	0.01
## 9    	10S08A  	 1923	 2005	0.01
## 10   	10S08B  	 1924	 2001	0.01
## 11   	10S20A  	 1924	 2005	0.01
## 12   	10S20B  	 1926	 2005	0.01
## 13   	10S21A  	 1926	 2005	0.01
## 14   	10S21B  	 1926	 2005	0.01
## 15   	10S22A  	 1926	 2005	0.01
## 16   	10S22B  	 1926	 2005	0.01
## 17   	10S23A  	 1926	 2005	0.01
## 18   	10S23B  	 1926	 2005	0.01
## 19   	10S24A  	 1925	 2005	0.01
## 20   	10S24B  	 1924	 2005	0.01
## 21   	10S25A  	 1893	 2005	0.01
## 22   	10S25B  	 1887	 2005	0.01
## 23   	10S26B  	 1943	 2005	0.01
## 24   	10S28A  	 1820	 2005	0.01
## 25   	10S28B  	 1819	 2005	0.01
## 26   	10S29A  	 1902	 2005	0.01
## 27   	10S50A  	 1872	 2005	0.01
## 28   	10S50B  	 1894	 2005	0.01
## 29   	10S51A  	 1887	 2005	0.01
## 30   	10S51B  	 1898	 2005	0.01
## 31   	10S57A  	 1841	 2005	0.01
## 32   	10S58A  	 1872	 2005	0.01

#apply detrending
TRW<-detrend(TRW,method="Spline",nyrs=50)

#transform the data.frame to a time series object starting in the first year of the original data.frame
TRW<-ts(apply(TRW,1,tbrm),start=as.numeric(rownames(TRW))[1])

The first climate data we will work with are precipitation time series from the E-OBS dataset, you can download from the KNMI climate explorer.

#loading climate data in the CRU format with 13 columns (1= year, 2:13= months)
climate<-read.table("https://raw.githubusercontent.com/deep-org/materials/main/data/general_statistics/01_climate_cor_trw/precip.txt",header=F)
climate[climate==-999.9]<-NA #transform -999.9 to NAs

Note, E-OBS reports mean daily precipitation per month, so the values have to be multiplied by the number of days per month. With CRU and most other products, this step can be skipped.

#correct for leap years for February (%%4==0 means: which year divided by 4 has the rest = 0, i.e. the module of division by 4 equaling 0)
month.length<-c(31,28,31,30,31,30,31,31,30,31,30,31)
for(i in 2:13){
  if(i==3){
    climate[,3]<-climate[,3]*(28+1*(climate[,1]%%4==0))
  }else{
    climate[,i]<-climate[,i]*month.length[i-1]
  }
}

4. Data and function preparation

Next, we want to aggregate the monthly climate data to seasons of differing lengths. Maximum season length is by default set to 12 months. The function should work on longer time scales but is currently not tested beyond 12 months. Consider including `lag=2 into runningclimate function.

#load required functions
runningclimate_new<-function(climate,stat=c("mean","sum"),max.length=12){
  x.ts<-ts(climate[,2:13],start=climate[1,1]) #transform CRU style climate data into time-series format
  x.lag<-stats::lag(x.ts,k=-1) #create lag1 series
  x1<-ts.union(x.lag,x.ts) #combine lagged and original series
  max.length<-max.length  #default to 12 months
  stat<-stat #no default is given
  seasons<-rep(list(x1),max.length)
  for(i in 2:max.length){
    seasons[[i]]<-ts(cbind(matrix(rep(rep(NA,nrow(x1)),i-1),ncol=i-1),t(rollapply(t(seasons[[i]]),i,stat,na.rm=T,align='right'))),start=climate[1,1])
    colnames(seasons[[i]])<-c(paste0("p",month.abb),month.abb) }
  output<-seasons
}

Climatesite<-runningclimate_new(climate,stat="sum") #use "sum" when working with precipitation, use "mean" when working temperature

This next function fits cubic smoothing splines to a ts-class object and is needed for the other two functions below. Courtesy of David C. Frank.

#load required functions
spline.det <- function(x,smoothing) {
require(pspline)
  spline.p.for.smooth.Pspline<-function(year)  {
    p <- .5/((6*(cos(2*pi/year)-1)^2/(cos(2*pi/year)+2)))
    return(p)
  }
   fy <- function(object) {
    fyos<- function(x) {
      L <- !is.na(x)
      idx <- c(NA, which(L))[cumsum(L) + 1]
      fy <- min(idx,na.rm=TRUE)
      return(fy)
    }

    ifelse (is.mts(object),
            rel.years <- apply(object,2,fyos),
            rel.years <- fyos(object))

    calyears <- rel.years+start(object)[1]-1
    return(calyears)
  }

  ly <- function(object) {

    fyos<- function(x) {
      L <- !is.na(x)
      idx <- c(NA, which(L))[cumsum(L) + 1]
      fy <- max(idx,na.rm=TRUE)
    }

    ifelse (is.mts(object),
            rel.years <- apply(object,2,fyos),
            rel.years <- fyos(object))

    calyears <- rel.years+start(object)[1]-1
    return(calyears)
  }

  ifelse ((is.null(ncol(x))),x2 <- ts.union(x,x), x2 <- x)
  fyarray <- start(x2)[1]
  lyarray <- end(x2)[1]
  begin <- fy(x2)
  end <- ly(x2)
  smoothedarray <- x2

  spline.p <- spline.p.for.smooth.Pspline(smoothing)

  for (i in 1:ncol(x2)) {
    temp <- smooth.Pspline((begin[i]:end[i]),x2[(begin[i]-fyarray+1):(end[i]-fyarray+1),i],spar=spline.p,method=1)
    smoothedarray[(begin[i]-fyarray+1):(end[i]-fyarray+1),i]  <- temp$ysmth
  }

  if (is.null(ncol(x))) {smoothedarray <- smoothedarray[,-1]}
  residualarray <- x-smoothedarray
  ratioarray <- x/smoothedarray

  output <- list(x=x,smoothedarray=smoothedarray,residualarray=residualarray,ratioarray=ratioarray)
  return(output)
}

The function automatically returns four objects. 1) The original time series, 2) the smoothing function, 3) the residual array (x-smoothing), and 4) the ratio array (x/smoothing). For climate data (especially temperature) we need the residual array of that detrending.

Next up is the systematic correlation of the tree-ring chronology against all aggregated climate series. The output graph is similar to the output of the R package SPEI ( Beguería & Vicente-Serrano, 2017) and the CLIMTREG program ( Beck et al., 2013).

#load required functions
Climateplot_new<-function(TRW,Climatesite,fyr=tsp((Climatesite[[1]]))[1],lyr=tsp((Climatesite[[1]]))[2],detrended=c("Yes","No"),spline.length){   ####Climatesite=your climate data run through runningclimate_new(),TRW=Tree ring chronology with time series class,fyr=first year, lyr=last year
  climatecor<-NULL

  if(detrended=="Yes"){
    tsstart<-tsp((Climatesite[[1]]))[1]
    for (i in 1:length(Climatesite)){
      Climatesite[[i]]<-ts(apply(Climatesite[[i]],2,function(x)if(sum(complete.cases(x))>3){spline.det(ts(x),spline.length)$residualarray}else{rep(NA,length(x))}),start=tsstart)
      climatecor[[i]]<-cor(ts.union(Climatesite[[i]],window(TRW,fyr,lyr)),use="p")
    }
    levelplotmatrix<-matrix(NA,length(Climatesite),22)
    for (i in 1:length(Climatesite)){
      levelplotmatrix[i,]<-climatecor[[i]][1:22,25]
    }
  }else{
    for (i in 1:length(Climatesite)){
      climatecor[[i]]<-cor(ts.union(Climatesite[[i]],window(TRW,fyr,lyr)),use="p")
    }
    levelplotmatrix<-matrix(NA,length(Climatesite),22)
    for (i in 1:length(Climatesite)){
      levelplotmatrix[i,]<-climatecor[[i]][1:22,25]
    }
  }
  rownames(levelplotmatrix)<-names(1:length(Climatesite))
  colnames(levelplotmatrix)<-c(paste0("p",month.abb),month.abb[1:10])
  return(levelplotmatrix)
}

5. Climate correlations

If your chronology is spline-detrended, you should spline-detrend the climate data as well. Splines remove a fixed amount of variance at a given frequency. Keeping those lower frequencies (and trends) in the climate data while removing them from tree-ring data will likely yield distorted correlation coefficients. Thus, we strongly recommend using the same spline stiffness for both the climate and tree-ring data.

#generating the correlation plots
silly_cor<-Climateplot_new(TRW,Climatesite,1950,2005,"Yes",50)
P1<-contourplot(t(silly_cor),region=T,contour=F,lwd=0.4,col.regions=colorRampPalette(c("red","yellow","white","green3","blue")),at=c(seq(-0.825,0.825,0.05)),xlab="Months",ylab="Window length")
P2<-contourplot(t(silly_cor),labels=T,col.regions=F,region=F,lwd=0.4,at=c(seq(-0.80,0.80,0.05)))
PP<-P1+P2
print(PP)

This graph shows climate-growth correlations for different end months (along the x-axis) and different season lengths (y-axis). Positive correlations are shown in green and blue, and negative correlations in yellow to redish colors.

Figure 1: This graph shows climate-growth correlations for different end months (along the x-axis) and different season lengths (y-axis). Positive correlations are shown in green and blue, and negative correlations in yellow to redish colors.

Combining P1 and P2 allows for more flexibility for the plotting. One could add a third layer, specifying a certain threshold value, i.e. where p<0.01 via specifying that threshold (e.g., at=0.3) and increasing line width (e.g., lwd=1).

Correlation is the standardized regression coefficient, where both standard deviations of the tree-ring data and the climate data are set to 1. The correct interpretation is thus primarily noise-related. A high (absolute) correlation coefficient means there is little noise in a relationship. But it doesn’t allow users to draw conclusions on the actual magnitude of impact on radial tree growth. This is especially true, when one wants to compare climate impacts at two different sites, or for two species growing at the same site. Correlation is not directly related to climate sensitivity, at least not on its own. It’s only the fraction of signal explained.

To quantify actual climate impact, we need the unstandardized regression coefficient. In the example with precipitation, we multiply by 100, to return a relative growth change in response to a 100mm change in precipitation in the respective season (a bit more realistic than retrieving the impact for 1mm precipitation change).

First, the function:

#load required functions
Climateslopeplot_new<-function(TRW,Climatesite,fyr=tsp((Climatesite[[1]]))[1],lyr=tsp((Climatesite[[1]]))[2],detrended=c("Yes","No"),spline.length,weighted=TRUE){ #Climatesite=your climate data, TRW=Tree-ring chronology, fyr=first year, lyr=last year
   climatecor<-matrix(NA,nrow=length(Climatesite),ncol=22)
  if(detrended=="Yes"){
    tsstart<-tsp(Climatesite[[1]])[1]
    for (i in 1:length(Climatesite)){
      Climatesite[[i]]<-ts(apply(Climatesite[[i]],2,function(x)if(sum(complete.cases(x))>3){spline.det(ts(x),spline.length)$residualarray}else{rep(NA,length(x))}),start=tsstart)
      for(k in 1:22){
        if(sum(complete.cases(Climatesite[[i]][,k]))>3){
          if(weighted==T){
            w<-rlm(window(TRW,fyr,lyr)~window(Climatesite[[i]][,k],fyr,lyr))$w
            if(k<=12){
              climatecor[i,k]<-coef(dyn$lm(window(TRW,fyr+1,lyr)~window(Climatesite[[i]][,k],fyr+1,lyr),weights=w))[2]
            }else{
              climatecor[i,k]<-coef(dyn$lm(window(TRW,fyr,lyr)~window(Climatesite[[i]][,k],fyr,lyr),weights=w))[2]
            }
          }else{
            climatecor[i,k]<-coef(dyn$lm(window(TRW,fyr,lyr)~window(Climatesite[[i]][,k],fyr,lyr)))[2]
          }
        }else{
          next
        }
      }
    }
    levelplotmatrix<-climatecor
  }else{
    for (i in 1:length(Climatesite)){
      for(k in 1:22){
        if(sum(complete.cases(Climatesite[[i]][,k]))>3){
          if(weighted==T){
            w<-rlm(window(TRW,fyr,lyr)~window(Climatesite[[i]][,k],fyr,lyr))$w
            if(k<=12){
              climatecor[i,k]<-coef(dyn$lm(window(TRW,fyr+1,lyr)~window(Climatesite[[i]][,k],fyr+1,lyr),weights=w))[2]
            }else{
              climatecor[i,k]<-coef(dyn$lm(window(TRW,fyr,lyr)~window(Climatesite[[i]][,k],fyr,lyr),weights=w))[2]
            }
          }else{
            climatecor[i,k]<-coef(lm(window(TRW,fyr,lyr)~window(Climatesite[[i]][,k],fyr,lyr)))[2]
          }
        }else{
          next
        }}}
    levelplotmatrix<-climatecor
  }
  rownames(levelplotmatrix)<-names(1:12)
  colnames(levelplotmatrix)<-c(paste0("p",month.abb),month.abb[1:10])
  return(levelplotmatrix)
}

You can choose between ordinary least square regression, or the robust linear regression that down-weights outliers.

#generating the new contourplot
silly_slope<-Climateslopeplot_new(TRW,Climatesite,1950,2005,"Yes",50,weighted=F)
P1<-contourplot(t(silly_slope*100),region=T,contour=F,lwd=0.4,col.regions=colorRampPalette(c("red","yellow","white","green3","blue")),at=c(seq(-0.305,0.305,0.01)),xlab="Months",ylab="Window length")
P2<-contourplot(t(silly_slope*100),labels=T,col.regions=F,region=F,lwd=0.4,at=c(seq(-0.30,0.30,0.01)))
PP<-P1+P2
print(PP)

Because we sum up individual months, the comparison is actually a bit unfair and biased towards yielding strongest impacts on short season lengths. The standard deviation of the annual time series (aggregated over 12 months) is ~6 times higher than at monthly periods, so 100 mm change at annual scale is not really a lot, while it is more than a standard deviation on the monthly scale.

#standard deviation of May precipitation
sd(Climatesite[[1]][,17],na.rm=T)
## [1] 44.72265

#standard deviation of Mar-Jun precipitation
sd(Climatesite[[4]][,18],na.rm=T)
## [1] 121.7759

#standard deviation of pSep-Aug precipation
sd(Climatesite[[12]][,20],na.rm=T)
## [1] 275.4319

#generating the new contourplot
P1<-contourplot(t(silly_slope*c(1:12)*100),region=T,contour=F,lwd=0.4,col.regions=colorRampPalette(c("red","yellow","white","green3","blue")),at=c(seq(-0.425,0.425,0.05)),xlab="Months",ylab="Window length")
P2<-contourplot(t(silly_slope*c(1:12)*100),labels=T,col.regions=F,region=F,lwd=0.4,at=c(seq(-0.40,0.40,0.05)))
PP<-P1+P2
print(PP)

This yields very similar results compared to the aggregation via stat="mean" instead of stat="sum" and shows the results if every month in that season had 100 mm more precipitation. This nicely shows the effect of using sum or mean for precipitation aggregation and the derivation of regression coefficients.

Now let’s see what happens when we normalize each climate time series by its standard deviation.

#compute standard deviation per aggregation size ( sapply() ) and per each column in each aggregation step (function(x)apply(x,2,sd))
climate_sd<-sapply(Climatesite,function(x)apply(window(x,1950,2005),2,sd,na.rm=T))
P1<-contourplot(t(silly_slope*t(climate_sd)[,1:22]),region=T,contour=F,lwd=0.4,col.regions=colorRampPalette(c("red","yellow","white","green3","blue")),at=c(seq(-0.105,0.105,0.01)),xlab="Months",ylab="Window length")
P2<-contourplot(t(silly_slope*t(climate_sd)[,1:22]),labels=T,col.regions=F,region=F,lwd=0.4,at=c(seq(-0.10,0.10,0.01)))
PP<-P1+P2
print(PP)

The graph looks pretty similar in its pattern to the correlation graph from above. But now we know how much relative growth change a standard deviation change in the respective climate time series invokes.

Let’s have a look at the same analysis using temperature data.

#upload temperature data
climate<-read.table("https://raw.githubusercontent.com/deep-org/materials/main/data/general_statistics/01_climate_cor_trw/tmean.txt",header=F)
climate[climate==-999.9]<-NA #transform -999.9 to NAs

#perform the analysis
Climatesite<-runningclimate_new(climate,stat="mean")
silly_cor<-Climateplot_new(TRW,Climatesite,1950,2005,"Yes",50)
P1<-contourplot(t(silly_cor),region=T,contour=F,lwd=0.4,col.regions=colorRampPalette(c("red","yellow","white","green3","blue")),at=c(seq(-0.825,0.825,0.05)),xlab="Months",ylab="Window length")
P2<-contourplot(t(silly_cor),labels=T,col.regions=F,region=F,lwd=0.4,at=c(seq(-0.80,0.80,0.05)))
PP<-P1+P2
print(PP)


silly_slope<-Climateslopeplot_new(TRW,Climatesite,1950,2005,"Yes",50,weighted=F)
P1<-contourplot(t(silly_slope),region=T,contour=F,lwd=0.4,col.regions=colorRampPalette(c("red","yellow","white","green3","blue")),at=c(seq(-0.105,0.105,0.01)),xlab="Months",ylab="Window length",main=title)
P2<-contourplot(t(silly_slope),labels=T,col.regions=F,region=F,lwd=0.4,at=c(seq(-0.10,0.10,0.01)))
PP<-P1+P2
print(PP)

This graph shows how much the relative tree growth changes for a 1°C increase in mean temperature (0.07=7%).

#perform the analysis
climate_sd<-sapply(Climatesite,function(x)apply(x,2,sd,na.rm=T))
P1<-contourplot(t(silly_slope*t(climate_sd)[,1:22]),region=T,contour=F,lwd=0.4,col.regions=colorRampPalette(c("red","yellow","white","green3","blue")),at=c(seq(-0.105,0.105,0.01)),xlab="Months",ylab="Window length",main=title)
P2<-contourplot(t(silly_slope*t(climate_sd)[,1:22]),labels=T,col.regions=F,region=F,lwd=0.4,at=c(seq(-0.10,0.10,0.01)))
PP<-P1+P2
print(PP)

Again, this graph shows the same pattern as in the correlation plot, but for normalized climate data and the un-normalized ring-width index, a percentage tree growth change per standard deviation change in temperature.

6. Some afterthoughts

The R package dendroTools from Jevšenak and Levanič (2018) does something very similar but uses daily climate data.

tutorial