Issue 3, March 2004
Physical Sciences & Mathematics
Combining Weather Data for a Dataset Sufficient for Generating
High-Resolution Weather Prediction Models
Jared Fox
Arizona State University
Advisor:
Steven Ghan, Ph.D.
Pacific Northwest National Laboratory
Abstract
Assessments of the effects of climate change typically require
information at scales of 10 km or less. In regions with complex
terrain, much of the spatial variability in climate (temperature,
precipitation, and snow water) occurs on scales below 10 km. Since
the typical global climate model simulation’s grid size is
more than 200 km, it is necessary to develop models with much higher
resolution. Unfortunately, no dataset currently produced is both
highly accurate and sufficiently high in resolution.
As a result, current global climate models are forced to ignore
the important climate variations that occur below the 200 km scale.
This predicament prompted the creation of a global hybrid dataset
with information for precipitation, temperature, and relative humidity.
The resulting dataset illustrates the importance of having high-resolution
datasets and gives clear evidence that regions with complex terrain
require a fine resolution grid to give an accurate representation
of their climatology. For example, the Andes Mountains in Chile
cause a temperature shift of more than 25° C within the same
area as a single 2.5° grid cell from the NCEP dataset. Fortunately,
the CRU, U.D., GPCP, and NCEP datasets, when hybridized, are able
to provide both precision and satisfactory resolution with global
coverage. This composite will enable the development of both high-resolution
models and quality empirical downscaling methods — both of
which are necessary for scientists to more accurately predict the
effects of global climate change. Without accurate long-term forecasts,
climatologists and policy makers will not have the tools they need
to effectively reduce the negative effects human activity has on
the earth.
Introduction
Assessments of the effects of climate change typically require
information at scales of 10 km or less. In regions with complex
terrain, much of the spatial variability in climate (temperature,
precipitation, and snow water) occurs on scales below 10 km (Daly
et al. 1994; Ghan et al. 2000; Gyalistras et al. 1998; Leung et
al. 1996). Current global climate model simulations typically use
grid sizes over 200 km due to limits in computational power (Ghan
et al. 2000; Delworth et al. 2000; Flato et al. 2000; Emori et al.
1999; Gordon et al. 2000; Russell et al. 1999; Zhang et al. 2000;
Washington et al. 2000). Since halving the grid size requires about eight
times more computing power, and Moore’s Law suggests that each such
eightfold increase in processing speed takes about six years, reducing
the grid size from 200 km to 10 km would take about 25 years. This
computational limitation has prompted climatologists to develop
downscaling techniques, such as “empirical downscaling,”
that can produce high-resolution climatological predictions without
requiring such a massive increase in computational power. One method
of downscaling global climate models adds highly detailed information
about local geography (mountains are the most important added feature)
to the model. As demonstrated later in this paper, mountainous geography
has a very significant effect on weather conditions such as temperature
and precipitation. Since these two items are among the most essential
features to accurately predict, downscaling techniques may provide
great insight into future climatology. However, in order to create
and validate models that use these techniques, it is necessary to
first have a high-resolution, global dataset. Using this dataset,
work in this project will continue by creating a model that generates
its data via an empirical downscaling method.
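The 25-year figure quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes that processing power doubles every two years (which is what an eightfold increase in six years implies), and the function and parameter names are hypothetical.

```python
import math

def years_to_reach(start_km, target_km, cost_per_halving=8.0, doubling_years=2.0):
    """Estimate the years of hardware progress needed to shrink the grid size.

    cost_per_halving: factor of extra computing power per halving of grid size
                      (eight, as stated in the text).
    doubling_years:   assumed time for processing power to double.
    """
    # Number of times the grid size must be halved (need not be an integer).
    halvings = math.log2(start_km / target_km)
    # An eightfold increase is three doublings, so six years at two years each.
    years_per_halving = math.log2(cost_per_halving) * doubling_years
    return halvings * years_per_halving

estimate = years_to_reach(200.0, 10.0)
print(f"{estimate:.1f} years")
```

Going from 200 km to 10 km takes a little more than four halvings, so at six years per halving the estimate lands close to the "about 25 years" given in the text.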
Unfortunately, no dataset currently produced is both highly accurate
and sufficiently high in resolution. This predicament
prompted the author to create a global hybrid data set with information
for precipitation, temperature, and relative humidity. Since no
current dataset met the project’s requirements, the new dataset
is a hybrid — one that takes the best possible information
available from several different datasets and combines them in order
to make a highly accurate compilation. One valuable side benefit
is that the accuracy of the hybrid dataset is actually increased
because of the diversification of data sources — each dataset
acts as independent validation of the other datasets. The global
hybrid dataset discussed in this paper will be used as the base
for models based on empirical downscaling techniques.
One of the largest concerns about global climate change is its effect
on rainfall and snow pack. These two sources of fresh water are
essential to life, both for drinking and for irrigating crops. Exactly
where this precipitation falls is very important. If snow
falls on a different side of a mountain, millions of people could
be suddenly without water. If consistent rainfall over agricultural
regions is moved only 50 km, millions of acres of farmland could
become barren. Accurate assessments of the impacts of climate change
typically require information at scales of 10 km or less (Ghan et
al. 2000). This composite will enable the development of both high-resolution
models and quality empirical downscaling methods — both of
which are necessary for scientists to more accurately predict the
effects of global climate change. Without accurate long-term forecasts,
climatologists and policy makers will not have the tools they need
to effectively reduce the negative impacts that human activity has
on the earth.
Materials and Methods
To make the hybrid dataset easy to distribute and use, the data were
published in Network Common Data Form (netCDF).
NetCDF is an interface for array-oriented data access and a library
that provides an implementation of the interface. The netCDF library
also defines a machine-independent format for representing scientific
data. Together, the interface, library, and format support the creation,
access, and sharing of scientific data. The netCDF software was
developed at the Unidata Program Center in Boulder, Colorado and
is freely available from the Unidata website (unidata.ucar.edu).
The first dataset used is a global, monthly, 0.5° (about 55 km) precipitation
and air temperature dataset covering 1950-1999 (land data only), developed
at the University of Delaware (U.D.) by Cort Willmott, Kenji Matsuura,
and David Legates from an analysis of station data. This analysis
adjusts only air temperature to account for the influence of topography
(Willmott and Matsuura 1995; Willmott et al. 1985). The second global
land dataset is the Climatic Research Unit (CRU) 10-minute (about 18 km)
1961-1990 climatology, also developed from station data (New et
al. 2002). The CRU has more topographic sensitivity than the U.D.
method and provides information for precipitation, surface air temperature,
relative humidity, and water vapor pressure, but only one climatological
value for each month.
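The approximate grid sizes quoted above follow from converting degrees of arc to distance along a great circle. The sketch below assumes a mean Earth radius of 6371 km (a value not given in the article) and measures distance along a meridian; east-west spacing shrinks toward the poles. All names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not from the article

def degrees_to_km(degrees):
    """Great-circle distance spanned by an angle, measured along a meridian."""
    return math.radians(degrees) * EARTH_RADIUS_KM

UD_CELL_DEG = 0.5           # U.D. grid spacing
CRU_CELL_DEG = 10.0 / 60.0  # CRU grid spacing: 10 arc-minutes

ud_km = degrees_to_km(UD_CELL_DEG)
cru_km = degrees_to_km(CRU_CELL_DEG)

# Number of CRU cells that fit inside a single U.D. cell (3 per side).
cru_cells_per_ud_cell = round(UD_CELL_DEG / CRU_CELL_DEG) ** 2

print(f"U.D. cell: {ud_km:.1f} km, CRU cell: {cru_km:.1f} km")
print(f"CRU cells per U.D. cell: {cru_cells_per_ud_cell}")
```

Under these assumptions, each U.D. cell covers the same area as nine CRU cells, which is why the CRU climatology can supply spatial detail that the U.D. data cannot.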
Since the two datasets cover different time periods and have different
spatial resolutions, the CRU and U.D. analyses were
combined by using the U.D. data to provide the temporal variability
and the CRU data to provide the spatial variability. Specifically, for
each month of each year, the climatological value in each 10-minute CRU
grid cell was scaled by a ratio taken from the U.D. grid cell closest
to that CRU cell: the U.D. value for that month and year divided by
the U.D. climatological mean for that month. For example:
Hybrid(May 1977) = CRUClim(May) * U.D.(May 1977) / U.D.Clim(May)
The closest U.D. grid cell to a specific CRU grid cell is at most
0.25° away in both latitude and longitude. The U.D. climatological mean
was determined from the average of the U.D. data over the CRU period,
1961-1990.
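The scaling step can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual Java code: the function names are hypothetical, the numbers are made up, the grid-center convention is assumed, and the handling of a zero climatological mean is an assumption, since the article does not say how that case was treated.

```python
import math

UD_CELL_DEG = 0.5  # U.D. grid spacing in degrees (from the article)

def nearest_ud_center(coord_deg):
    """Center of the 0.5-degree U.D. cell containing coord_deg.

    Assumes cell centers sit at ..., -0.25, 0.25, 0.75, ... degrees.
    """
    return math.floor(coord_deg / UD_CELL_DEG) * UD_CELL_DEG + UD_CELL_DEG / 2

def climatological_mean(values_by_year, first_year=1961, last_year=1990):
    """Average one calendar month's values over the CRU period."""
    years = range(first_year, last_year + 1)
    return sum(values_by_year[y] for y in years) / len(years)

def hybrid_value(cru_clim, ud_value, ud_clim):
    """Scale the CRU climatology by the U.D. anomaly ratio."""
    if ud_clim == 0:
        # Assumption: fall back to the unscaled climatology when the
        # ratio is undefined. The article does not describe this case.
        return cru_clim
    return cru_clim * ud_value / ud_clim

# Made-up May precipitation values (mm) for one U.D. cell, 1950-1999.
ud_may = {year: 50.0 for year in range(1950, 2000)}
ud_may[1977] = 60.0
ud_may[1978] = 40.0

ud_clim_may = climatological_mean(ud_may)
hybrid_may_1977 = hybrid_value(80.0, ud_may[1977], ud_clim_may)

# A CRU cell center and the U.D. cell center it maps to.
cru_lon = -70.0 + 1.0 / 12.0
ud_lon = nearest_ud_center(cru_lon)

print(hybrid_may_1977, ud_lon)
```

In this example the U.D. value for May 1977 is 20% above its climatological mean, so the CRU climatological value of 80 is raised by the same 20%.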
The next step was to add data for the oceanic regions. Although coarser
in resolution than the land data, the National Centers for Environmental
Prediction’s (NCEP) 2.5° temperature and relative humidity climatologies
(covering 1948-1998) provided sufficient resolution for the aforementioned
purposes. For each missing value of the CRU/U.D. hybrid dataset, the data
from the nearest NCEP grid cell was inserted into the hybrid. For
any hybrid grid cell, the nearest NCEP grid cell is at most 1.25° away in
both latitude and longitude, except at the extreme north and south poles.
Finally, oceanic precipitation coverage is provided by the Global Precipitation
Climatology Project’s (GPCP) 2.5° V2 monthly precipitation dataset, covering
1979-2002. Again, although less precise than the land data, 2.5° is
sufficient resolution for oceanic coverage. For each missing
value of the CRU/U.D. hybrid dataset, the data from the nearest
GPCP grid cell was inserted into the hybrid. For any hybrid grid
cell, the nearest GPCP grid cell is at most 1.25° away in both latitude
and longitude.
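The gap-filling step used for both the NCEP and GPCP data can be sketched as a nearest-cell lookup. The example below is an illustration under stated assumptions: grids are represented as plain dictionaries keyed by (latitude, longitude) of the cell center, missing values are marked with None, and all coordinates and values are made up.

```python
def nearest(value, candidates):
    """Return the candidate coordinate closest to value."""
    return min(candidates, key=lambda c: abs(c - value))

def fill_missing(fine, coarse):
    """Fill missing fine-grid values from the nearest coarse-grid cell.

    fine:   dict mapping (lat, lon) -> value, or None where data is missing
    coarse: dict mapping (lat, lon) -> value on the 2.5-degree grid
    """
    coarse_lats = sorted({lat for lat, _ in coarse})
    coarse_lons = sorted({lon for _, lon in coarse})
    filled = {}
    for (lat, lon), value in fine.items():
        if value is None:
            # Latitude and longitude are matched independently, which is
            # valid because the coarse grid is regular.
            key = (nearest(lat, coarse_lats), nearest(lon, coarse_lons))
            value = coarse[key]
        filled[(lat, lon)] = value
    return filled

# One land cell with data and one ocean cell without (made-up values).
fine_grid = {(-33.25, -70.58): 12.5, (-33.25, -72.42): None}
coarse_grid = {(-33.75, -71.25): 14.0, (-33.75, -73.75): 16.0}

filled_grid = fill_missing(fine_grid, coarse_grid)
print(filled_grid)
```

Existing land values are left untouched, so the coarse data only ever supplies cells that the CRU/U.D. hybrid could not.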
The end result is a highly accurate, high-resolution, global hybrid
dataset. The dataset contains temperature, precipitation, and relative
humidity information on a 10-minute (about 18 km) grid; over the
oceans, the values come from the 2.5° source data.
All of the data was processed on a dual-Pentium machine with 2 GB of RAM
running Red Hat Linux. The netCDF files were created using the netCDF
Java Interface Version 2 and the Java 1.4 specification.
Results
Plots of the hybrid data best show the success of the new dataset.
The plots illustrate the importance of high-resolution data, and
the close correlation of the source datasets to each other. Figure
1 illustrates just how large the difference in resolution is between
older models and the current one. Figure 2 shows that the datasets
used are indeed quite compatible. Figures 3, 4, and 5 illustrate
that fine resolution datasets are essential to truly capture the
details of weather patterns.
Figure 1. Temperature climatology plot. The most obvious
item to notice here is the difference in grid cell resolution.
The sections over land have data on a 10-minute scale (about
18 km), while the sections over the oceans are on a 2.5°
(about 275 km) scale.
Figure 2. Climate trends over portions of Europe, Russia,
Africa, and the Middle East. This view demonstrates that the
general climate trends for regions of land alongside the oceans
continue out into the oceanic regions. This trait indicates
that the various datasets used in the hybrid are in fact compatible,
despite their differences in resolution.
Figure 3. Climate over the Andes Mountains of Chile. This
figure illustrates that regions with complex terrain require
a fine resolution grid to give an accurate representation
of their climatologies. The Andes Mountains in Chile cause
a temperature shift of over 25° C within the same area
as a single 2.5° grid cell from the NCEP dataset. The
CRU/U.D. hybrid dataset with 10-minute grid cells makes the
distinction easily recognized.
Figure 4. Climate over the Andes Mountains of Chile. This
figure also illustrates the effect of complex terrain on a
region’s climatology. In Figure 3, the Andes Mountains
in Chile caused the temperature in that region to reside well
below the temperature of the surrounding areas. In this case,
the Andes are preventing large amounts of rainfall on top
of them, and instead are helping to trap large amounts of
moisture in the surrounding areas.
Figure 5. Climate over the United States. This figure highlights
the importance of fine-resolution grid cells. Many of the
high-precipitation areas on the land in this image would be
easily hidden if they were contained in one of the massive
grid cells just off the eastern coast of the United States.
Discussion
As stated previously, it is essential that the grid cells be small enough to
accurately represent the underlying climate conditions. Without
sufficiently fine resolution, many important details will be omitted,
which can introduce inaccuracies into any prediction
models based upon such a dataset. Additionally, in the case of empirical
downscaling methods, a highly detailed and highly accurate dataset
is required. Regrettably, none of the currently available datasets
can provide both suitably detailed resolution and dependable accuracy.
Fortunately, the CRU, U.D., GPCP, and NCEP datasets, when hybridized,
are able to provide both precision and satisfactory resolution with
global coverage. This composite will enable the development of both
high-resolution models and quality empirical downscaling methods.
The hybridization process was designed specifically to make the
resulting data as accurate as possible. Using multiple data sources
enhances the reliability of the data since each data source provides
independent confirmation that the other data sources are accurate.
While rounding errors are a potential concern for many computationally
generated datasets, the hybridization was done with floating-point
numbers and algorithms that minimized the effects of round-off errors
by paying careful attention to the order of the calculations. Additionally,
after the hybrid was completed, it was compared to the original
data sources and no significant differences were observed. Data
for the oceanic regions were not available at high resolution, but
climatological differences over water take place over much greater
areas than on land, so the oceanic resolution is sufficient for
our purposes. It is important to note that translating low-resolution
data into its high-resolution equivalent loses no information. In
fact, the accuracy is increased because this new data is scaled
by an independent high-resolution data source. All of these factors
combined to yield a very accurate, high-resolution, global, hybrid
dataset.
The topics of global climate change and its most famous component, global
warming, are heavily charged issues in both the scientific and political
communities. The EPA has estimated that if humans continue to release
large amounts of greenhouse gases, the US alone will incur damages
in the hundreds of billions of dollars annually, roughly
the equivalent of $1 per gallon of oil burned (Titus 1992). This
estimate only refers to the cost of crops lost to drought, increased
irrigation and desalinization, etc. On the other hand, major corporations
make hundreds of billions of dollars annually by continuing to sell
products that release these same greenhouse gases. Research has
shown recent changes in the global climate and predicts that these
trends will continue, but the results have not been conclusive enough
to convince corporations and policy makers to look beyond their
short-term special interests. The climate prediction models based
upon this hybrid dataset have the potential to provide a much clearer
picture of our future. If the delicate balance of life-sustaining
conditions on Earth is being disrupted, it is vital that humans
have as much information about these changes as possible. Without
accurate information, we are severely handicapped in our efforts
to combat any serious consequences.
Acknowledgements
I would like to thank the U.S. Department of Energy and the National
Science Foundation for giving me the opportunity to participate
in the Student Undergraduate Laboratory Internships program and
the chance to have an incredible learning experience at Pacific
Northwest National Laboratory. Special thanks go to my
mentor, Dr. Steven Ghan, for giving me such significant work to do
and for providing positive feedback. Thanks to everyone for letting
me work on such an enjoyable and meaningful project.
References
Daly, C et al. (1994) A statistical-topographic model for mapping climatological
precipitation over mountainous terrain. Journal of Applied Meteorology
33: 140-158.
Delworth, TL and TR Knutson (2000) Simulation of early 20th century
global warming. Science 287: 2246-2250.
Flato, GM Boer et al. (2000) The Canadian Centre for Climate Modelling
and Analysis global coupled model and its climate. Climate Dynamics
16: 451-467.
Emori, S et al. (1999) Coupled ocean-atmosphere model experiments
of future climate change with an explicit representation of sulfate
aerosol scattering. Journal of the Meteorological Society of Japan 77: 1299-1307.
Ghan, SJ et al. (2000) The thermodynamic influence of subgrid orography
in a global climate model. Climate Dynamics 20: 31-44.
Gordon, C et al. (2000) The simulation of SST, sea ice extents
and ocean heat transports in a version of the Hadley Centre coupled
model without flux adjustments. Climate Dynamics 16: 147-168.
Gyalistras, D et al. (1998) Future Alpine climate. In: Cebon, P
et al. (eds) Views from the Alps. Regional perspectives on climate
change. Cambridge, Massachusetts: MIT Press.
Leung, LR and SJ Ghan (1995) A subgrid parametrization of orographic
precipitation. Theoretical and Applied Climatology 52: 95-118.
Leung, LR et al. (1996) Application of a subgrid orographic precipitation/surface
hydrology scheme to a mountain watershed. Journal of Geophysical
Research 101: 12,803-12,817.
Leung, LR and SJ Ghan (1998) Parametrizing subgrid orographic precipitation
and surface cover in climate models. Monthly Weather Review 126:
3271-3291.
New, M et al. (2002) A high-resolution data set of surface climate
over global land areas. Climate Research 21: 1-25.
Russell, GL and D Rind (1999) Response to CO2 transient increase
in the GISS coupled model: regional coolings in a warming climate.
Journal of Climate 12: 531-539.
Titus, JG (1992) The Costs of Climate Change to the United States.
Global Climate Change: Implications, Challenges, and Mitigation
Measures, Pennsylvania Academy of Sciences, 1992.
Washington, WM et al. (2000) Parallel climate model (PCM) control
and transient simulations. Climate Dynamics 16: 755-774.
Willmott, CJ et al. (1985) Small-scale climate maps: A sensitivity
analysis of some common assumptions associated with grid-point interpolation
and contouring. American Cartographer 12: 5-16.
Willmott, CJ and K Matsuura (1995) Smart interpolation of annually
averaged air temperature in the United States. Journal of Applied
Meteorology 34: 2577-2586.
Zhang, XH et al. (eds) (2000) IAP global atmosphere-land system
model. Science Press. Beijing, China.
Journal of Young Investigators. 2004. Volume Ten.
Copyright © 2004 by Jared Fox and JYI. All rights reserved.