This document describes the content and structure of the gridded
precipitation files for the Greater Alpine Region (GAR) developed by the
Climatic Research Unit of UEA in the framework of the ALP-IMP project.


Dataset Summary

------------------------------------------------------------------------
File name       - Quantity           - Units        - Dataset dimensions
------------------------------------------------------------------------
alp_pre_dat.txt   precipitation        mm/month       12 x 205 x 90 x 36
alp_pre_acc.txt   explained variance   unitless (%)   12 x 205 x 90 x 36

The four dimensions are months, years, longitude points and latitude
points.


Gridding Method
---------------

The gridding is based on angular-distance-weighted interpolation of
observed (HISTALP database) and reconstructed (by EOF analysis of
observations) monthly precipitation totals. The interpolation was applied
month by month (i.e. from January 1800 to December 2003) to multiplicative
anomalies (i.e. expressed in per cent) relative to the 1971-1990 monthly
climatology field.

For each grid point, the 3 nearest station series that were significantly
inter-correlated were selected for interpolation. For years before 1927,
when the station network gradually becomes sparser, the selected station
series also had to be correlated with the gridded precipitation series of
the reference period (1931-2000). No interpolation was attempted if these
conditions were not met (as happens in the early part of the 19th century,
particularly for summer months), which resulted in missing values in the
dataset.

The gridded anomalies were finally merged with a gridded climatology
(1971-1990) provided by ETH (Zurich). Note that the ETH climatology was
geographically extended to cover central Italy and the eastern part of the
GAR by analysing an ensemble of climatic data from a station network less
dense than that used for developing the original ETH climatology.

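As a rough illustration of the final merging step, the sketch below converts
between monthly totals and multiplicative anomalies. It assumes that an
anomaly is the monthly total expressed as a percentage of the climatological
value (so that 100 means "equal to the 1971-1990 mean"); the exact convention
is not stated in this document, and the function names are illustrative only.

```python
def percent_of_climatology(total_mm, climatology_mm):
    """Express a monthly total as a multiplicative anomaly in per cent.

    Assumption: anomaly = 100 * total / climatology.
    """
    return 100.0 * total_mm / climatology_mm


def anomaly_to_total(anomaly_percent, climatology_mm):
    """Merge a gridded anomaly (per cent) with the climatology (mm/month)."""
    return climatology_mm * anomaly_percent / 100.0


# Example with made-up numbers: an anomaly of 150 % over a climatological
# value of 80 mm/month corresponds to a total of 120 mm/month.
print(anomaly_to_total(150, 80.0))
```
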

Precipitation Dataset Structure
-------------------------------

The area covered by the file alp_pre_dat.txt ranges from 4deg to 19deg East
and from 43deg to 49deg North. The grid resolution is 10 min, i.e. there
are 6 grid cells within any 1-deg interval, both in longitude and latitude,
which are centred at 5min, 15min, 25min, 35min, 45min and 55min. Thus there
are 90 samples per zonal row, from 4deg 5min to 18deg 55min, and 36 samples
per meridional row, from 43deg 5min to 48deg 55min.

Each sample value in the file is a monthly precipitation total. For every
grid point the monthly values are stored in calendar order, i.e. from
January to December, for each year from 1800 to 2004. For instance, the
gridded precipitation time series nearest to Zurich is stored as:

East:  8 deg 35 min, North: 47 deg 25 min

1800     57    29    42    34   218   136    95    92   143    84   175    39
1801     69    88   124    89   146   146   169    63   201    76    89   150
1802     49   110    59    63   104    71   152    75    45    61   130    75
1803     35    62    77   102   151   201   111    80    47    74   180   145
1804     81    89    64   134    98    94   258   126    92    80   162    90
 . . .
2002     31   115    66    71   176    89   105   129   126   134   192    82
2003     74    40    31    56   111    66   104    80    42   137    74    42
2004 999999999999999999999999999999999999999999999999999999999999999999999999

The first line gives the grid-point coordinates (longitude_deg,
longitude_min, latitude_deg, latitude_min in 6x,i2,5x,i2,13x,i2,5x,i2
FORTRAN format). This line is preceded and followed by one blank line. The
precipitation data follow. The first column of the 205 consecutive data
lines gives the year (from 1800 to 2004), followed by 12 monthly
precipitation values in mm/month (in i4,x,12i6 FORTRAN format). Missing
values are denoted by the default value 999999; since each value occupies
exactly 6 characters, consecutive missing values run together without
separating blanks. The grid-point sequence runs zonally from West to East
and then from South to North.

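The layout described above can be read with fixed-width slicing, which is
needed because runs of 999999 are not separated by blanks. The following is
a minimal sketch rather than an official reader: the function names are
illustrative, and it assumes that every coordinate line starts with "East:"
as in the Zurich example and that the file follows the stated FORTRAN
formats exactly.

```python
MISSING = 999999   # default missing-value marker
N_YEARS = 205      # data lines per grid point, 1800..2004


def parse_header(line):
    """Decode 6x,i2,5x,i2,13x,i2,5x,i2 into (lon_deg, lon_min, lat_deg, lat_min)."""
    return (int(line[6:8]), int(line[13:15]), int(line[28:30]), int(line[35:37]))


def parse_data_line(line):
    """Decode i4,x,12i6 into (year, 12 values); missing values become None."""
    year = int(line[0:4])
    values = []
    for k in range(12):
        v = int(line[5 + 6 * k: 11 + 6 * k])
        values.append(None if v == MISSING else v)
    return year, values


def grid_index(lon_deg, lon_min, lat_deg, lat_min):
    """0-based position of a grid point in the file (West to East, then South to North)."""
    i = (lon_deg - 4) * 6 + (lon_min - 5) // 10    # 0..89 along a zonal row
    j = (lat_deg - 43) * 6 + (lat_min - 5) // 10   # 0..35 from South to North
    return j * 90 + i


def read_blocks(lines, n_years=N_YEARS):
    """Yield (coordinates, {year: values}) for each grid point.

    Blank lines are skipped. A truncated final block is yielded as is.
    """
    it = iter(lines)
    for line in it:
        if not line.startswith("East:"):
            continue
        coords = parse_header(line)
        series = {}
        for data in it:
            if data.strip():
                year, values = parse_data_line(data)
                series[year] = values
                if len(series) == n_years:
                    break
        yield coords, series
```

A file object can be passed directly, e.g. `with open("alp_pre_dat.txt") as
f: blocks = list(read_blocks(f))`. Under the ordering described above, the
Zurich grid point (8deg 35min East, 47deg 25min North) is
`grid_index(8, 35, 47, 25)`, i.e. the block at 0-based position 2367 of the
3240 grid points.
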

Data Quality
------------

There is also an accompanying file, named alp_pre_acc.txt, which provides
the relative explained variance score with respect to the 1931-2000
precipitation field (a period of full availability of station network
observations). The scores were estimated by applying the weights used for
the reconstruction and interpolation throughout the 19th-20th centuries to
the time series of the 1931-2000 reference period, and then comparing the
outcome with the original 1931-2000 interpolated time series. The file
structure is the same as that of the monthly precipitation file. The
maximum value is 100 (%), assigned to the 1931-2000 interpolated field,
whereas missing values are again denoted by the default value 999999. For
Zurich, the explained variance scores are:

East:  8 deg 35 min, North: 47 deg 25 min

1800     68    81    75    62    70    52    61    61    71    78    60    71
1801     78    89    82    70    73    55    67    63    74    84    87    80
1802     78    89    82    70    73    55    67    63    74    84    87    80
1803     84    88    83    69    76    58    71    64    73    85    88    81
1804     84    88    83    69    76    58    71    64    73    85    88    81
 . . .
2002    100   100   100   100   100   100   100   100   100   100   100   100
2003    100   100   100   100   100   100   100   100   100   100   100   100
2004 999999999999999999999999999999999999999999999999999999999999999999999999

The explained variance score is a nearly monotonically increasing function
of time from 1800 to 1927 for each individual calendar month. Higher scores
are found in winter months, whereas the lowest scores characterize the June
and July time series. This difference is a consequence of the varying
spatial coherency of the precipitation field throughout the annual cycle.
From 1927 to 2003, when the station network data reach their maximum
availability, the explained variance score is 100 (with an exception for
January 2003 due to a missing station datum).

Not all the gridded precipitation data during the 19th and 20th centuries
are usable: a realistically constructed gridded precipitation time series
should have positive explained variance scores.
Moreover, a meaningful construction appears to correspond to scores of at
least 60 (%).

Although the scores increase with time (due to the denser station network),
in the first decades of the 19th century and in individual years afterwards
the scores sometimes evolve anomalously: drops in score, most of them
smaller than 5 (%), are found between successive years. This results from
the reconstruction/gridding method, which gives some preference to the
nearest station data when interpolating. Hence, as the data-providing
station network becomes denser over the years, some of the nearby data
selected, either for station time series reconstruction or for grid-point
time series construction, come from station time series which may be less
correlated with the time series under construction (in the reference
period, i.e. from 1931 to 2000) than some more distant time series.
Therefore, the resulting reconstruction and gridding may have lower scores
than if distant data had been used. In some cases there is a clear physical
reason why distant time series are more highly correlated than nearby ones,
e.g. partitioning of the precipitation field into zones by mountain chains.

Particular adaptations were applied to the data analysis method to moderate
such effects. However, the adaptations did not aim to eliminate every
anomalous score evolution, since the score is estimated within a specific
reference period (1931-2000) which may not be representative of the exact
precipitation field structure in previous decades. Assuming that the
precipitation field is to some extent non-stationary, we preferred to
select nearby data, even if they came from less-correlated time series
(provided that the correlation difference was not large), since distant
correlations are, in general, less stable in time.

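The usability guidance above can be applied mechanically once the
precipitation and score series of a grid point have been read. The sketch
below is illustrative only: the function names are hypothetical, and the
threshold of 60 is the value suggested in this document, not a fixed part
of the dataset. It masks precipitation values whose score is missing or
below the threshold, and lists the year-to-year score drops for a single
calendar month.

```python
MISSING = 999999   # default missing-value marker
MIN_SCORE = 60     # "meaningful construction" threshold suggested above


def mask_by_score(precip, scores, min_score=MIN_SCORE):
    """Return precip with unusable values replaced by None.

    A value is kept only if neither it nor its score is missing and the
    score is at least min_score.
    """
    masked = []
    for p, s in zip(precip, scores):
        usable = p != MISSING and s != MISSING and s >= min_score
        masked.append(p if usable else None)
    return masked


def score_drops(yearly_scores):
    """List (year, drop) where the score falls relative to the previous year.

    yearly_scores is a list of (year, score) pairs for one calendar month,
    in increasing year order; pairs involving missing scores are ignored.
    """
    drops = []
    for (_, prev), (year, cur) in zip(yearly_scores, yearly_scores[1:]):
        if prev != MISSING and cur != MISSING and cur < prev:
            drops.append((year, prev - cur))
    return drops


# Zurich, year 1800: the June value (score 52) is masked.
precip_1800 = [57, 29, 42, 34, 218, 136, 95, 92, 143, 84, 175, 39]
scores_1800 = [68, 81, 75, 62, 70, 52, 61, 61, 71, 78, 60, 71]
print(mask_by_score(precip_1800, scores_1800))

# Zurich, April scores for 1800-1804: a drop of 1 between 1802 and 1803.
print(score_drops([(1800, 62), (1801, 70), (1802, 70), (1803, 69), (1804, 69)]))
```
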
The data for 2004 were not processed, since only a few station monthly
precipitation totals were available at the time the current gridded
precipitation datasets were developed. This is why default missing values
(999999) are found in all monthly precipitation totals for 2004 and in
their corresponding explained variance scores.


Reference
---------

Efthymiadis, D., P. D. Jones, K. R. Briffa, I. Auer, R. Böhm, W. Schöner,
C. Frei, and J. Schmidli (2006), Construction of a 10-min-gridded
precipitation data set for the Greater Alpine Region for 1800–2003,
Journal of Geophysical Research, 111, D01105, doi:10.1029/2005JD006120.