Summary and Keywords
The climate system consists of interactions between physical, biological, chemical, and human processes across a wide range of spatial and temporal scales. Characterizing the behavior of components of this system is crucial for scientists and decision makers. Substantial uncertainty is associated with observations of this system, as well as with our understanding of its components and their interactions. Thus, inference and prediction in climate science should accommodate uncertainty in order to facilitate the decision-making process. Statistical science is designed to provide the tools to perform inference and prediction in the presence of uncertainty. In particular, the field of spatial statistics considers inference and prediction for uncertain processes that exhibit dependence in space and/or time. Traditionally, this is done descriptively by characterizing the first two moments of the process: the first expresses the mean structure, and the second accounts for dependence through covariability.
Historically, there are three primary areas of methodological development in spatial statistics: geostatistics, which considers processes that vary continuously over space; areal or lattice processes, which are defined on a countable discrete domain (e.g., political units); and spatial point patterns (or point processes), which consider the locations of events in space to be a random process. All of these methods have been used in the climate sciences, but the most prominent has been the geostatistical methodology. This methodology was developed simultaneously in geology and in meteorology; it provides a way to do optimal prediction (interpolation) in space and can facilitate parameter inference for spatial data. Geostatistical methods rely strongly on Gaussian process theory, which is increasingly of interest in machine learning. They are common in the spatial statistics literature, but much development is still being done to accommodate more complex processes and “big data” applications. Newer approaches are based on restricting models to neighbor-based representations or reformulating the random spatial process in terms of a basis expansion. These approaches offer computational and flexibility advantages, depending on the specific implementation. Complexity is also increasingly being accommodated through the use of the hierarchical modeling paradigm, which provides a probabilistically consistent way to decompose the data, process, and parameters corresponding to the spatial or spatio-temporal process.
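As a rough illustration of geostatistical prediction from the first two moments, the following minimal sketch implements simple kriging with NumPy. It assumes a known constant mean and an exponential covariance function; the function names (`exp_cov`, `simple_krige`), the parameter values, and the four observations are hypothetical choices made for this example and do not come from the article.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, range_=1.0):
    """Exponential covariance between two sets of locations (rows are points)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def simple_krige(obs_xy, obs_z, new_xy, mean=0.0, sill=1.0, range_=1.0, nugget=1e-8):
    """Simple kriging: best linear unbiased prediction with a known mean.

    Assumes (for illustration only) an exponential covariance with known
    sill and range; the small nugget only stabilizes the linear solve.
    """
    # Covariance among observations, and between observations and targets.
    C = exp_cov(obs_xy, obs_xy, sill, range_) + nugget * np.eye(len(obs_xy))
    c0 = exp_cov(obs_xy, new_xy, sill, range_)      # shape (n_obs, n_new)
    w = np.linalg.solve(C, c0)                      # kriging weights
    pred = mean + w.T @ (obs_z - mean)              # predicted values
    var = sill - np.sum(c0 * w, axis=0)             # prediction variances
    return pred, var

# Hypothetical observations on the corners of a unit square.
obs_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
obs_z = np.array([1.0, 2.0, 0.5, 1.5])

# Predict at an observed corner and at the center of the square.
new_xy = np.array([[0.0, 0.0], [0.5, 0.5]])
pred, var = simple_krige(obs_xy, obs_z, new_xy, mean=1.25)
print(pred, var)
```

In this sketch the prediction at an observed location essentially reproduces the observation with near-zero variance, while the prediction at the center carries positive variance, which is how the method quantifies interpolation uncertainty. In practice the mean and covariance parameters would be estimated from the data rather than fixed in advance.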
Perhaps the biggest challenge in modern applications of spatial and spatio-temporal statistics is to develop methods that are flexible, account for the complex dependencies within and across processes, account for uncertainty in all aspects of the problem, and remain computationally tractable. These are daunting challenges, yet this is a very active area of research, and new solutions are constantly being developed. New methods are also being rapidly developed in the machine learning community, and these methods are increasingly applicable to dependent processes. The interaction and cross-fertilization between the machine learning and spatial statistics communities is growing, which will likely lead to a new generation of spatial statistical methods that are applicable to climate science.