Spatial statistics of lightning strikes 4

Continuing from part three where I plotted lightning strikes over Alaska in June and July of 2012 color coded by strike multiplicity and strike strength, in this post I begin the transition from descriptive statistics to inferential statistics. This is still exploratory data analysis, however, in the sense that I know nothing arrived at here will be conclusive. Statistical modeling in general, including the spatial statistics performed on the lightning data here using variogram modeling and kriging, is a process that gradually transitions from exploratory to confirmatory in nature.

The first step is to move beyond considering only locations (part two: kernel density estimation) and strictly descriptive analysis of covariates (part three: visualizing strike multiplicity and strength) by exploring some basic, initial variogram models. Below I fit univariate variograms to strike multiplicity and strike strength separately. I know that this is questionable at this stage, but let’s get our feet wet with the basic R code. For one thing, my initial attempt at fitting a cross-variogram that combines both multiplicity and strength is invalid, yielding a negative semi-variance. So all these results are meaningless, nor would I expect them to mean much even if the model were valid. But again, all we want to accomplish for now is to build some variogram models. However, since we are transitioning into spatial statistics, it’s assumed the reader is versed in the subject. Otherwise I would stick to EDA and visualizations (posts one through three).

#### Kriging of lightning strike multiplicity and lightning strike strength
#### Code would look something like this, based on previous posts:

library(gstat)
library(raster)

## create a grid onto which we will interpolate, where 'b' is our raster layer:
grd <- expand.grid(x=seq(from=xmin(b), to=xmax(b), length=1374), y=seq(from=ymin(b), to=ymax(b), length=1450) )
## convert to SpatialPixel class
coordinates(grd) <- ~ x+y
gridded(grd) <- TRUE

## make gstat object and variogram:
g <- gstat(id="multiplicity", formula=multiplicity ~ 1, data=d, nmax=5, maxdist=100000)
g <- gstat(g, id="strength", formula=strength ~ 1, data=d)
v <- variogram(g)

## another approach:
# create directional variograms at 0, 45, 90, 135 degrees from north (y-axis)
v <- variogram(g, alpha=c(0,45,90,135))

## Get your bearings/Fit 'by eye'
plot(v)

## More options:
g <- gstat(id="multiplicity", model = vgm(0.3, nugget=0.5, model='Exp', range=10000, anis=c(0, 0.5)))
g <- gstat(id="strength", formula=strength ~ 1, data=d, model = vgm(6000, nugget=18000, model='Exp' , range=100000, anis=c(0, 0.5)))
g <- gstat(g, id = c("multiplicity", "strength"), model = vgm(40, model='Exp', range=100000, anis=c(0, 0.5)))

plot(v, model = g$model, main = "models fitted by eye")

## I'm going to do it this way:
z.m <- krige(formula=multiplicity ~ 1, locations=d, newdata=grd, model = vgm(0.3, nugget=0.5, model='Exp', range=10000, anis=c(0, 0.5)), nmax=10)
z.s <- krige(formula=strength ~ 1, locations=d, newdata=grd, model = vgm(6000, nugget=18000, model='Exp' , range=100000, anis=c(0, 0.5)), nmax=10)

z.m <- as.data.frame(z.m)
z.s <- as.data.frame(z.s)
z.e <- as.data.frame(z.e)
names(z.m)[3:4] <- c("multiplicity.pred","multiplicity.var")
names(z.s)[3:4] <- c("strength.pred","strength.var")
names(z.e)[3:4] <- c("elevation.pred","elevation.var")

This is not reproducible code. It’s merely intended to give you an idea of some ways to use gstat very basically. And as in previous posts, here is a basic example of what some R code would look like to accomplish the plot styles shown below. As before, I iterated over the files/months of years in a j loop, this time to plot kriged lightning strike multiplicity and strength for each j.

bk <- brick(b, values=F, nl=2)
bk <- setValues(bk, z.m$multiplicity.pred, layer=1)
bk <- setValues(bk, z.s$strength.pred, layer=2)
rk.multiplicity <- flip(raster(bk, 1), 2)
rk.multiplicity <- mask(rk.multiplicity, r) # r: raster layer with AK state boundary
rk.strength <- flip(raster(bk, 2), 2)
rk.strength <- mask(rk.strength, r)

library(rasterVis)

for(j in 1:length(files)){

#Create a PNG for month j
rtheme1 <- streamTheme(region=rev(brewer.pal(9,'YlOrRd')), panel.background=list(col='gray20'), pch=20, col="orange", cex=1.5)
colkey.at <- seq(1,summary(rk.multiplicity)[5],length.out=length(rtheme1$regions$col)-1)
colkey.col <- rtheme1$regions$col
png(file.path(plotDir,paste0("lightning_multiplicity_kriged_",yrs[j],"_",mos[j],".png")), width=2400, height=2400, res=300)
levelplot(rk.multiplicity, par.settings=modifyList(rtheme1,list(layout.heights=list(top.padding=0, bottom.padding=-1))),
    cuts=19, maxpixels=2e6, margin=F, contour=F,
    main=list(paste(month.abb[as.numeric(mos[j])],yrs[j],"Lightning strike multiplicity")), scales=list(draw=FALSE),
    colorkey=list(space="right", at=colkey.at, col=colkey.col)
)
dev.off()

rtheme1 <- streamTheme(region=rev(brewer.pal(9,'YlOrRd')), panel.background=list(col='gray20'), pch=20, col="orange", cex=1.5)
colkey.at <- seq(summary(rk.strength)[1], summary(rk.strength)[5], length.out=length(rtheme1$regions$col)-1)
colkey.col <- rtheme1$regions$col
png(file.path(plotDir,paste0("lightning_strength_kriged_",yrs[j],"_",mos[j],".png")), width=2400, height=2400, res=300)
levelplot(rk.strength, par.settings=modifyList(rtheme1,list(layout.heights=list(top.padding=0, bottom.padding=-1))),
    cuts=19, maxpixels=2e6, margin=F, contour=F,
    main=list(paste(month.abb[as.numeric(mos[j])],yrs[j],"Lightning strike strength")), scales=list(draw=FALSE),
    colorkey=list(space="right", at=colkey.at, col=colkey.col)
)
dev.off()

They’re pretty. Pretty useless. But pretty. Anyway, this gives you a sense of the process, facilitated in R with packages like gstat, raster, lattice, and rasterVis. One reason I’ve used completely unexciting data here is that this often happens in everyday analysis and this is just the seedling of a real project. Realistically it would be more interesting and informative to compare lightning activity to other variables such as elevation. With a statistical model accounting for such a relationship between, say, likelihood of lightning strike and other non-lightning factors, a spatial variogram could be fitted to the residuals of the model for example. Another reason is that I can’t simply put a full-fledged analysis on the blog. Some things are more suited for other information channels.

lightning_multiplicity_kriged_2012_06_800x800

lightning_multiplicity_kriged_2012_07_800x800

lightning_strength_kriged_2012_06_800x800

lightning_strength_kriged_2012_07_800x800

This entry was posted by Matt Leonawicz.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: