Dimitrios Gyalistras
Systems Ecology Group, ETH Zurich, Switzerland
Version 1.0, 4. February 2004
An SDT-formatted file contains a simple header that describes the overall data set, plus a data table. The data table's header consists of a series of column identifiers that specify the attributes recorded for every site. The header is followed by an arbitrary number of data lines, one per location. Missing values for numerical attributes (i.e., INTEGER- or REAL-valued attributes) are denoted by the string "NA".
In order to allow for the unique identification of a site, a valid SDT data table must have either two columns named
xCoord and yCoord (of type REAL),
or one column named
SiteId (of type INTEGER).
Furthermore, an SDT table may have an optional column named
SiteDescr (of type STRING),
plus any other columns of any type (BOOLEAN, INTEGER, REAL, STRING, or IDENTIFIER).
The mandatory columns may occur in any order and at any position within the data table. All data within one column must be of the same type. Otherwise, any combination of column types is possible.
Note that all attributes of a location must occur on one and the same line.
Otherwise no format restrictions apply, i.e. data points may be separated
by one or several tabulators and/or spaces. All information enclosed within
"(*" and "*)" is recognized as a comment and is skipped. Comments can
occur anywhere within an SDT-formatted file. Nested comments are possible.
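The comment rule above can be illustrated with a short Python sketch. It is not part of ClimTools; the function name `strip_comments` is hypothetical, and the sketch assumes that comment delimiters never appear inside quoted strings. A depth counter is what allows nested comments to be skipped correctly.

```python
def strip_comments(text: str) -> str:
    """Remove "(*" ... "*)" comments, honouring nesting (illustrative sketch)."""
    out = []
    depth = 0  # current comment nesting level
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "(*":
            depth += 1
            i += 2
        elif pair == "*)" and depth > 0:
            depth -= 1
            i += 2
            if depth == 0:
                # Keep the tokens on either side of the comment separated.
                out.append(" ")
        else:
            if depth == 0:
                out.append(text[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated comment")
    return "".join(out)
```

For example, `strip_comments("Precip (*mm*) 1994").split()` yields the two tokens `Precip` and `1994`.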
SDTFile = SDTHeader DataTable.
SDTHeader = "SITE_DATA" dataDescription.
dataDescription = STRING.
DataTable = TableHeader TableLine {TableLine} "END".
TableHeader = IDENTIFIER {IDENTIFIER} lineSeparator.
TableLine = TableEle {TableEle} lineSeparator.
TableEle = BOOLEAN|INTEGER|REAL|STRING|IDENTIFIER|"NA".
lineSeparator = EOL.
Ex. 1: An SDT with xCoord and yCoord columns
SITE_DATA "My Data 1"
xCoord yCoord Z
6.0 45.0 1201.0
7.0 46.0 2345.0
8.0 47.0 987.0
9.0 46.0 -20.0
10.0 45.0 839.0
11.0 46.0 499.0
12.0 45.0 1207.0
END
Ex. 2: An SDT with a SiteId column
SITE_DATA "Test Data 2"
SiteId Z
1011 1201.0
-2103 2345.0
-2760 987.0
4041 NA
999 839.0
6061 NA
4071 1207.0
END
Ex. 3: An SDT with all kinds of columns
SITE_DATA "Some Swiss precipitation stations"
SiteId SiteDescr Elevation xCoord yCoord
20 "SEDRUN" 1450 701900.0 170900.0
60 "Disentis" 1190 708230.0 173780.0
470 "SERTIG-BUEEL" 1710 783240.0 179830.0
475 "Monstein" 1575 778080.0 176230.0
490 "LATSCH" 1585 777140.0 167290.0
5350 "ZWEISIMMEN" 960 594800.0 155730.0
9930 "Scuol(Schuls)" 1295 817470.0 186600.0
9990 "Muestair" 1248 831170.0 169340.0
END
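The SDT rules described above (a "SITE_DATA" header, one line of column identifiers, one line per site, a closing "END", and "NA" for missing values) can be sketched in Python as follows. This is a minimal illustration, not the ClimTools reader: the names `parse_sdt` and `_convert` are hypothetical, comments are assumed to have been removed already, the "SITE_DATA" header is assumed to sit on its own line, and BOOLEAN values are left as plain strings because their spelling is not given here.

```python
import shlex


def _convert(token):
    """Map "NA" to None, numeric tokens to int/float, anything else to str."""
    if token == "NA":
        return None
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token


def parse_sdt(text):
    """Return (description, column names, rows) for comment-free SDT text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    head = shlex.split(lines[0])  # shlex keeps quoted descriptions intact
    if len(head) != 2 or head[0] != "SITE_DATA":
        raise ValueError('expected: SITE_DATA "description"')
    columns = lines[1].split()
    has_xy = "xCoord" in columns and "yCoord" in columns
    if not (has_xy or "SiteId" in columns):
        raise ValueError("need xCoord and yCoord columns, or a SiteId column")
    rows = []
    for ln in lines[2:]:
        if ln == "END":
            return head[1], columns, rows
        tokens = shlex.split(ln)
        if len(tokens) != len(columns):
            raise ValueError(f"wrong number of values in line: {ln!r}")
        rows.append(dict(zip(columns, map(_convert, tokens))))
    raise ValueError('missing "END"')
```

Because quotes are dropped during tokenizing, this sketch cannot tell a quoted "NA" or a quoted number from an unquoted one; a stricter reader would track the declared column type instead.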
A DSD-formatted file contains at least one data set that consists of a simple header describing the station and variable considered, plus an arbitrary number of data vectors with exactly 34 elements each. The first two elements of a data vector specify the year and month considered, the third element specifies the maximum number of days containing valid data for that particular month (28, 29, 30, or 31), and the remaining 31 vector elements contain the daily data for all days of the month. If a month has fewer than 31 days, missing values are expected for the remaining days up to day 31. Missing values are coded throughout as "NA".
The individual data records need not be sorted by year and month. However, year numbers outside the range specified in the header are not accepted.
There are no format restrictions as long as the syntax given below is not
violated, i.e. line breaks can occur anywhere, data points may be separated
by tabulators and/or spaces etc. All information enclosed within "(*" and "*)"
is recognized as a comment and is skipped. Comments can occur anywhere
within a DSD-formatted file. Nested comments are possible.
DSDFile = DataSet {DataSet}.
DataSet = Header DataRecord {DataRecord}.
Header = "#" stationIdent stationName variableIdent firstYear lastYear longitude latitude altitude.
stationIdent = INTEGER.
stationName = IDENTIFIER|STRING.
variableIdent = IDENTIFIER.
firstYear = INTEGER.
lastYear = INTEGER.
longitude = REAL.
latitude = REAL.
altitude = INTEGER.
DataRecord = yearNr monthNr numDays dataPoint {dataPoint}.
yearNr = INTEGER.
monthNr = INTEGER.
numDays = INTEGER.
dataPoint = INTEGER|REAL|"NA".
# 5520 BERN_LIEBEFELD Precip (*mm*) 1994 1997 7.421 46.929 570
1994 7 31 0.00 0.00 0.00 0.41 0.00 0.60 0.00 0.05 0.00 0.00 0.00 0.00 0.00 0.10 0.00 0.00 0.02 1.61 0.75 0.02 0.01 0.00 0.01 0.00 0.03 0.00 0.00 0.16 0.02 0.00 0.00
1994 8 31 0.11 0.00 0.00 0.00 0.00 0.00 1.33 0.00 0.00 2.85 0.90 0.00 1.21 0.00 0.00 0.05 1.23 0.00 0.00 0.00 0.17 0.00 0.11 0.90 0.11 0.00 0.00 0.00 0.00 0.12 1.66
1994 9 30 2.74 0.21 0.00 0.00 0.00 0.05 0.07 2.84 0.22 0.01 0.31 2.91 0.00 0.98 2.58 1.11 0.23 0.03 0.00 0.43 0.01 0.00 0.00 0.00 0.00 0.60 0.08 0.00 0.00 0.01 NA
1995 4 30 0.38 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.42 0.00 0.00 0.07 0.94 0.00 0.00 0.66 0.05 0.00 0.00 0.00 0.15 0.97 1.12 0.39 0.00 0.12 0.33 NA
1995 5 31 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.16 0.00 0.00 0.11 3.15 3.03 0.29 0.63 0.99 0.94 1.11 0.01 0.01 0.00 0.00 0.00 0.00 0.81 2.25 0.00 0.00 0.63 2.80 1.95
1996 11 30 0.00 0.00 0.00 0.00 1.04 0.02 0.97 0.00 0.00 0.46 0.00 0.80 2.30 0.04 0.00 0.00 0.00 0.71 0.41 0.42 0.19 0.00 0.29 0.08 0.84 0.35 2.34 0.09 2.37 0.95 NA
1997 1 31 1.04 0.31 0.00 0.00 0.03 0.00 0.00 0.03 0.56 0.02 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.01 2.06 0.46 0.00 0.00 0.00 0.00 0.00 0.02 0.00 0.00 0.00 0.00
1997 2 28 0.00 0.00 0.00 0.00 0.56 0.00 0.00 0.00 0.00 0.02 0.39 0.02 0.29 0.74 0.63 0.00 0.00 0.86 0.00 0.40 0.00 0.00 0.23 0.18 0.14 1.95 0.05 0.00 NA NA NA
1997 3 31 0.00 0.00 0.00 0.10 0.03 0.17 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.14 0.00 0.03 0.00 0.36 0.86 0.00 0.00 0.00 0.18 0.04 0.00 0.00 0.00 0.47 0.10 0.00 0.00
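The constraints on a single DSD data record (34 elements, a day count of 28 to 31, "NA" padding up to day 31, and a year within the range given in the header) can be checked with a short Python sketch. The function name `parse_dsd_record` is hypothetical, and the month range check of 1 to 12 is an assumption that the format description does not state explicitly.

```python
def parse_dsd_record(tokens, first_year, last_year):
    """Validate one DSD data vector; return (year, month, daily values)."""
    if len(tokens) != 34:
        raise ValueError("a DSD data vector must have exactly 34 elements")
    year, month, num_days = (int(t) for t in tokens[:3])
    if not first_year <= year <= last_year:
        raise ValueError(f"year {year} is outside the range given in the header")
    if not 1 <= month <= 12:  # assumption: months are numbered 1..12
        raise ValueError(f"invalid month number: {month}")
    if num_days not in (28, 29, 30, 31):
        raise ValueError(f"invalid number of days: {num_days}")
    # "NA" marks a missing value; it is mapped to None here.
    values = [None if t == "NA" else float(t) for t in tokens[3:]]
    if any(v is not None for v in values[num_days:]):
        raise ValueError("days beyond the month length must be NA")
    return year, month, values[:num_days]
```

Applied to the tokens of a February record with 28 values followed by three "NA" entries, the function returns the year, the month, and the 28 daily values.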
A MAT-formatted file consists of header information plus the actual data matrix. The header may optionally include a data type and a code number, plus a descriptor string. The MAT format further supports optional column and/or row labels, and the specification of an arbitrary "no data" string used to denote missing data points. The default "no data" string is "NA".
There are no format restrictions, with the only exception that all column
labels must occur on the same line. All information enclosed within "(*"
and "*)" is recognized as a comment and is skipped. Comments can occur
anywhere within a MAT-formatted file. Nested comments are possible.
MatrixFile = MatrixHeader MatrixData.
MatrixHeader = [OptionalInfo] [NoDataSpecif] RowsColsSpecif [ColumnLabels].
OptionalInfo = "MATRIX" (TypeSpecif CodeSpecif descriptor)|(TypeSpecif descriptor)|(CodeSpecif descriptor)|descriptor.
TypeSpecif = "TYPE" type.
type = INTEGER.
CodeSpecif = "CODE" code.
code = INTEGER.
descriptor = STRING.
NoDataSpecif = "NODATA_STR" missingValCode.
missingValCode = IDENTIFIER|INTEGER|REAL.
RowsColsSpecif = "N_ROWS" numRows "N_COLS" numCols.
numRows = INTEGER.
numCols = INTEGER.
ColumnLabels = columnLabel {columnLabel} EOL.
columnLabel = IDENTIFIER.
MatrixData = [rowLabel] dataPoint {dataPoint}.
rowLabel = IDENTIFIER.
dataPoint = IDENTIFIER|INTEGER|REAL.
Ex. 1: A minimally specified matrix
N_ROWS 2 N_COLS 3
1.1 1.2 NA
-2.1 +2.2 -2.3
Ex. 2: A fully specified matrix
MATRIX TYPE 111 CODE -111 "The matrix description"
NODATA_STR NAN
N_ROWS 3 N_COLS 3
Col1 Col2 Col3
Row1 1.1 1.2 1.3
Row2 -2.1 -2.2 -2.3
Row3 +3.1 NAN +3.3
Ex. 3: A matrix with two columns and no row labels
MATRIX 'This is "my matrix"'
NODATA_STR -99.999
N_ROWS 3 N_COLS 2
TheCol1 TheCol2
1.1 1.2
-99.999 2.2
+3.1 -99.999
Ex. 4: A matrix with only one row and no column labels
NODATA_STR xxx
N_ROWS 1 N_COLS 5
TheRow 1.0 2.0 xxx -4.0 5.5
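The optional parts of the MAT header can be handled by consuming tokens in the order given by the grammar. The following Python sketch is illustrative only: `parse_mat_header` is a hypothetical name, comments are assumed to have been stripped beforehand, and the detection of column and row labels is deliberately left to the caller because it depends on where the line break after the column labels falls.

```python
import shlex


def parse_mat_header(text):
    """Parse the MAT header; return (header dict, remaining tokens)."""
    tokens = shlex.split(text)  # handles both "..." and '...' descriptors
    pos = 0
    header = {"type": None, "code": None, "descriptor": None, "nodata": "NA"}
    if tokens[pos] == "MATRIX":
        pos += 1
        if tokens[pos] == "TYPE":
            header["type"] = int(tokens[pos + 1])
            pos += 2
        if tokens[pos] == "CODE":
            header["code"] = int(tokens[pos + 1])
            pos += 2
        header["descriptor"] = tokens[pos]
        pos += 1
    if tokens[pos] == "NODATA_STR":
        # Kept as a string so it can be compared with raw data tokens.
        header["nodata"] = tokens[pos + 1]
        pos += 2
    if tokens[pos] != "N_ROWS" or tokens[pos + 2] != "N_COLS":
        raise ValueError("expected: N_ROWS <int> N_COLS <int>")
    header["n_rows"] = int(tokens[pos + 1])
    header["n_cols"] = int(tokens[pos + 3])
    return header, tokens[pos + 4:]
```

The remaining tokens hold the optional labels and the data points; a data token equal to `header["nodata"]` denotes a missing value, and the default of "NA" applies when no NODATA_STR is given.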
A GDX-formatted file consists of header information, plus an arbitrary number of data vectors. The first element of a data vector is interpreted as a date (given in YYYYMMDD format), the second element as a time (given in HHMM format). The remaining elements are expected to contain the field data, which must run from N to S and E to W, i.e. the third vector element corresponds to the "top left", the last one to the "bottom right" field point. The data vectors must be sorted in ascending order by date and time, and no missing values for the date and time codes are accepted.
The GDX format is an extension of the MAT (matrix) format. Accordingly, optional column and/or row labels, and the specification of an arbitrary string used to denote missing data points are supported. The default "no data" string is "NA".
There are no format restrictions, i.e. line breaks can occur anywhere,
data points may be separated by tabulators and/or spaces etc. All information
enclosed within "(*" and "*)" is recognized as a comment and is skipped.
Comments can occur anywhere within a GDX-formatted file. Nested comments are
possible.
GDXFile = GDXHeader GDXData.
GDXHeader = GeneralInfo LonLatInfo MatrixHeader.
GeneralInfo = "FIELD" descriptor "CODE" code "LEVEL" level.
descriptor = STRING.
code = INTEGER.
level = INTEGER.
LonLatInfo = "LONGITUDES" nLon minLon maxLon "LATITUDES" nLat minLat maxLat.
nLon = INTEGER.
minLon = REAL.
maxLon = REAL.
nLat = INTEGER.
minLat = REAL.
maxLat = REAL.
MatrixHeader (see the MAT format)
GDXData = dateCode timeCode dataPoint {dataPoint}.
dateCode = REAL.
timeCode = REAL.
dataPoint = REAL.
FIELD "A small field" CODE -100 LEVEL 200
LONGITUDES 4 -5.0 10.0 (* 5E to 10W *)
LATITUDES 2 35.0 40.0 (* 35N to 40N *)
N_ROWS 5 N_COLS 10
19810800 0 1.22 6.4 7.1 4.768 1.09 2.738 7.5 2.07
19820800 0 2.8 7.5 NA NA 6.00 1.583 2.7 2.27
19830800 0 1.01 6.7 5.7 8.98 3.02 2.103 2.0 6.95
19840800 0 3.222 NA NA 8.813 7.01 1.754 2.8 2.36
19850800 0 2.32 7.3 6.8 9.2 1.04 2.070 5.1 9.96
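A single GDX data vector can be split into its date, time, and field parts as in the following Python sketch. Several points are assumptions rather than documented behavior: the name `parse_gdx_vector` is hypothetical, the expected vector length of nLon times nLat plus 2 is inferred from the example above (4 longitudes, 2 latitudes, N_COLS 10), and the field values are assumed to be stored one latitude row after another, starting with the northernmost row. The date is only split into its components, not checked against a calendar, since the example uses day 00.

```python
def parse_gdx_vector(tokens, n_lon, n_lat, nodata="NA"):
    """Split one GDX data vector into a time stamp and rows of field values."""
    if len(tokens) != n_lon * n_lat + 2:  # inferred from the example file
        raise ValueError("unexpected number of elements in data vector")
    # The grammar declares both codes as REAL, so go through float first.
    date = int(float(tokens[0]))  # YYYYMMDD
    time = int(float(tokens[1]))  # HHMM
    year, month, day = date // 10000, date // 100 % 100, date % 100
    hour, minute = divmod(time, 100)
    values = [None if t == nodata else float(t) for t in tokens[2:]]
    # Assumption: one row per latitude, first row northernmost.
    rows = [values[r * n_lon:(r + 1) * n_lon] for r in range(n_lat)]
    return (year, month, day, hour, minute), rows
```

The order of the values within each row is taken as stored in the file; the sketch does not attach longitudes to them.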
GDS-formatted data can occur in three sub-formats:
A GDS list-formatted file contains header information, plus one or several lists, one per data set stored in the file. A list consists of the x- and y-coordinates of each data point, plus the data point's value. Like the GDS standard format, the list format supports multiple data sets per file, but it does not support undefined values for individual data points.
A GDS Arc/Info-formatted file consists of header information, plus exactly one gridded data set that is stored in a rectangular data field. The GDS Arc/Info format corresponds to the "gridascii" type of output as produced by the Arc/Info GIS software.
Important note:
In the GDS Arc/Info format the header attributes "xllcorner" and "yllcorner" denote the location of the lower left corner of the lower left grid cell. In contrast, in the GDS standard and GDS list formats, these attributes denote the location of the lower left gridpoint. Hence, depending on the output format chosen, the attribute values given for "xllcorner" and "yllcorner" will differ by 0.5 times the cellsize.
There are no format restrictions as long as the syntax given below is not violated, i.e. line breaks can occur anywhere, data points may be separated by tabulators and/or spaces etc. All information enclosed within "(*" and "*)" is recognized as a comment and is skipped. Comments can occur anywhere within a GDS-formatted file. Nested comments are possible.
However, note that if a GDS Arc/Info-formatted file is to be processed by
software other than the ClimTools software (e.g., the Arc/Info GIS
software), more restrictive formatting rules will typically apply.
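The difference between the two "xllcorner"/"yllcorner" conventions noted above can be made concrete with a small Python sketch. It assumes that the lower left gridpoint lies at the center of the lower left grid cell, which is consistent with the stated offset of 0.5 times the cellsize but is not spelled out in the format description. Both function names are hypothetical.

```python
def arcinfo_to_gridpoint(xllcorner, yllcorner, cellsize):
    """Cell-corner convention (Arc/Info) -> gridpoint convention (standard, list)."""
    half = 0.5 * cellsize
    return xllcorner + half, yllcorner + half


def gridpoint_to_arcinfo(xllcorner, yllcorner, cellsize):
    """Gridpoint convention (standard, list) -> cell-corner convention (Arc/Info)."""
    half = 0.5 * cellsize
    return xllcorner - half, yllcorner - half
```

With a cellsize of 20, an Arc/Info corner at (814100.0, 171420.0) would correspond to a lower left gridpoint at (814110.0, 171430.0) under this assumption.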
GDSFile = StandardFormatFile | ListFormatFile | ArcInfoFormatFile.
StandardFormatFile = StandardFormatHeader StandardFormatData.
ListFormatFile = ListFormatHeader ListFormatData.
ArcInfoFormatFile = ArcInfoFormatHeader ArcInfoFormatData.
StandardFormatHeader = DataDescr SectorDescr GridSpecif MissingValSpecif.
ListFormatHeader = DataDescr SectorDescr GridSpecif.
ArcInfoFormatHeader = GridSpecif MissingValSpecif.
StandardFormatData = DataField | NumberedDataField {NumberedDataField}.
ListFormatData = DataList | NumberedDataList {NumberedDataList}.
ArcInfoFormatData = DataField.
DataDescr = "GRIDDED_DATA" dataIdent dataDescr.
dataIdent = INTEGER.
dataDescr = STRING.
SectorDescr = "SECTOR" sectorIdent sectorDescr.
sectorIdent = INTEGER.
sectorDescr = STRING.
GridSpecif = "ncols" numCols "nrows" numRows "xllcorner" xLowerLeftCoord "yllcorner" yLowerLeftCoord "cellsize" cellSize.
numCols = INTEGER.
numRows = INTEGER.
xLowerLeftCoord = INTEGER|REAL.
yLowerLeftCoord = INTEGER|REAL.
cellSize = INTEGER|REAL.
MissingValSpecif = ("NODATA_Value"|"NODATA_value"|"nodata_value") missingValCode.
missingValCode = IDENTIFIER|INTEGER|REAL.
DataField = dataPoint {dataPoint}.
dataPoint = IDENTIFIER|INTEGER|REAL.
NumberedDataField = "DATASET_NR" dataSetNumber DataField.
dataSetNumber = INTEGER.
DataList = ListEntry {ListEntry}.
ListEntry = xCoord yCoord listedValue.
xCoord = INTEGER|REAL.
yCoord = INTEGER|REAL.
listedValue = INTEGER|REAL.
NumberedDataList = "DATASET_NR" dataSetNumber DataList.
Ex. 1: GDS standard format
GRIDDED_DATA -10 "My test data"
SECTOR -3000 "The sector"
ncols 5
nrows 4
xllcorner -10.0
yllcorner -50.0
cellsize 0.5
NODATA_Value NA
DATASET_NR 1
NA 11.0 12.0 13 14.0
20.0 21.0 22.0 +23 +24.0
30.0 31.0 NA -33 -34.0
40.0 41.0 42.0 43 44.0
DATASET_NR 2
-11.1 +11.0 -66.0 99.0 +333.0
22.1 22.0 -55 NA -334
-33.2 +33.0 44.0 77.0 -335.0
44.3 44.0 33 66.0 +336
Ex. 2: GDS list format
GRIDDED_DATA 1002 "Temperature"
SECTOR 3002 "MAB Davos"
ncols 5
nrows 7
xllcorner 783000.0
yllcorner 192500.0
cellsize 100.0
(* X *) (* Y *) (* value *)
783000.0 193100.0 10.2
783000.0 192500.0 11.4
783400.0 193100.0 9.3
783400.0 192500.0 8.4
783000.0 192600.0 8.9
783000.0 192700.0 12.4
783000.0 192800.0 11.6
783200.0 192700.0 12.1
783200.0 192800.0 9.7
783200.0 192900.0 8.3
783300.0 192800.0 10.6
783300.0 192900.0 10.9
783300.0 193000.0 7.5
Ex. 3: GDS Arc/Info format
ncols 25 nrows 21 xllcorner 814100.0 yllcorner 171420.0 cellsize 20 nodata_value -9999 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 4 4 4 4 4 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 5 4 6 6 6 6 6 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 5 3 6 6 6 6 1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 5 4 8 6 6 6 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 5 5 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 5 5 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 5 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 5 4 4 4 4 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 5 5 4 4 4 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 4 4 4 4 4 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 4 4 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 4 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 1 1 2 3 3 3 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 3 3 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 3 3 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 3 3 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 3 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 4 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 4 4 4 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 1 1 1 1 1 1 1 1 1 1
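To relate the GDS list format to the gridded formats, the following Python sketch places list entries (x, y, value) into a rectangular field using the header attributes "ncols", "nrows", "xllcorner", "yllcorner" and "cellsize". It relies on the gridpoint convention that the list format uses for "xllcorner" and "yllcorner". Two points are assumptions: the first row of the resulting field is taken to be the northernmost one, and the `fill` parameter for unlisted gridpoints is a convenience of this sketch, not something the list format defines. The name `list_to_grid` is hypothetical.

```python
def list_to_grid(entries, ncols, nrows, xll, yll, cellsize, fill=None):
    """Place (x, y, value) list entries into a row-major grid."""
    grid = [[fill] * ncols for _ in range(nrows)]
    for x, y, value in entries:
        col = round((x - xll) / cellsize)
        row_from_south = round((y - yll) / cellsize)
        if not (0 <= col < ncols and 0 <= row_from_south < nrows):
            raise ValueError(f"point ({x}, {y}) lies outside the grid")
        # Assumption: row 0 of the grid is the northernmost row.
        grid[nrows - 1 - row_from_south][col] = value
    return grid
```

With the header of the list-format example (ncols 5, nrows 7, xllcorner 783000.0, yllcorner 192500.0, cellsize 100.0), the entry at (783000.0, 193100.0) lands in the top left cell and the entry at (783400.0, 192500.0) in the bottom right cell.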