ClimTools Reference - Part A:
File Formats

Dimitrios Gyalistras
Systems Ecology Group, ETH Zurich, Switzerland

Version 1.0, 4. February 2004


1. SDT - The "Site Data Table" Format

1.1 Description

The SDT format is a human readable, ASCII-based format specifically designed for the storage of attributes of geographically referenced point (site-specific) data, such as the identifier, name and elevation of a meteorological station, or the soil's field capacity at a given location.

An SDT-formatted file contains a simple header that describes the overall data set, plus a data table. The data table's header consists of a series of column identifiers that specify the attributes recorded for every site. The header is followed by an arbitrary number of data lines, one per location. Missing values for numerical attributes (i.e., those of type INTEGER or REAL) are denoted by the string "NA".

In order to allow for the unique identification of a site, a valid SDT data table must have either two columns named

xCoord and yCoord (of type REAL),

or one column named

SiteId (of type INTEGER).

Further, an SDT table may have an optional column named

SiteDescr (of type STRING),

plus any other columns of any type (BOOLEAN, INTEGER, REAL, STRING, or IDENTIFIER).

The mandatory columns may occur in any order and position within the data table. All data within one column must be of one and the same type. Otherwise, any combination of column types is possible.

Note that all attributes of a location must occur on one and the same line. Otherwise no format restrictions apply, i.e. data points may be separated by one or more tabs and/or spaces. All text enclosed within "(*" and "*)" is treated as a comment and is skipped. Comments can occur anywhere within an SDT-formatted file, and they may be nested.
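
The comment rule can be sketched in a few lines of Python. This is a minimal illustration, not the ClimTools implementation: the function name strip_comments is hypothetical, replacing each comment by a single space is an assumption, and quoted strings are not treated specially.

```python
def strip_comments(text: str) -> str:
    """Remove (* ... *) comments from text, honouring nested comments."""
    out = []
    depth = 0  # current comment nesting level
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "(*":
            depth += 1
            i += 2
        elif pair == "*)" and depth > 0:
            depth -= 1
            i += 2
            if depth == 0:
                # Assumption: a comment acts like a separator, so emit a space.
                out.append(" ")
        else:
            if depth == 0:
                out.append(text[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated comment")
    return "".join(out)
```

Because characters inside a comment are dropped, including line breaks, a comment that spans several lines would join the surrounding lines. For the line-oriented SDT data table, a stricter variant might keep the line breaks.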

1.2 Syntax

SDTFile = SDTHeader DataTable.
SDTHeader = "SITE_DATA" dataDescription.
dataDescription = STRING.
DataTable = TableHeader TableLine {TableLine} "END".
TableHeader = IDENTIFIER {IDENTIFIER} lineSeparator.
TableLine = TableEle {TableEle} lineSeparator.
lineSeparator = EOL.

1.3 Examples

Ex. 1: An SDT with xCoord and yCoord columns

 SITE_DATA "My Data 1" 
   xCoord  yCoord     Z
      6.0    45.0   1201.0
      7.0    46.0   2345.0
      8.0    47.0    987.0
      9.0    46.0    -20.0
     10.0    45.0    839.0
     11.0    46.0    499.0
     12.0    45.0   1207.0

Ex. 2: An SDT with a SiteId column

 SITE_DATA "Test Data 2"
   SiteId       Z  
     1011     1201.0  
    -2103     2345.0  
    -2760      987.0  
     4041         NA  
      999      839.0  
     6061         NA  
     4071     1207.0  

Ex. 3: An SDT with all kinds of columns

 SITE_DATA "Some Swiss precipitation stations"
   SiteId  SiteDescr       Elevation   xCoord      yCoord  
     20    "SEDRUN"        1450        701900.0    170900.0   
     60    "Disentis"      1190        708230.0    173780.0   
    470    "SERTIG-BUEEL"  1710        783240.0    179830.0   
    475    "Monstein"      1575        778080.0    176230.0   
    490    "LATSCH"        1585        777140.0    167290.0   
   5350    "ZWEISIMMEN"     960        594800.0    155730.0   
   9930    "Scuol(Schuls)" 1295        817470.0    186600.0   
   9990    "Muestair"      1248        831170.0    169340.0   
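
As an illustration of how such tables might be read, the following Python sketch parses SDT text into a list of rows. It is not the ClimTools reader, and it rests on several assumptions: comments have already been removed, the "SITE_DATA" header occupies the first non-empty line (as in the examples above), the closing "END" is treated as optional, and column types are guessed from the values rather than declared. The names parse_sdt and _convert are hypothetical.

```python
import shlex


def _convert(token):
    """Map "NA" to None, numbers to int/float, and keep anything else as text."""
    if token == "NA":
        return None
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token


def parse_sdt(text):
    """Return (description, column names, list of row dicts) for SDT text."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = shlex.split(lines[0])  # shlex keeps quoted strings together
    if len(header) != 2 or header[0] != "SITE_DATA":
        raise ValueError('expected: SITE_DATA "description"')
    columns = lines[1].split()
    # A valid table identifies sites by coordinates or by a site identifier.
    has_xy = "xCoord" in columns and "yCoord" in columns
    if not (has_xy or "SiteId" in columns):
        raise ValueError("need xCoord and yCoord columns, or a SiteId column")
    rows = []
    for line in lines[2:]:
        fields = shlex.split(line)
        if fields == ["END"]:
            break
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} values, got: {line!r}")
        rows.append(dict(zip(columns, map(_convert, fields))))
    return header[1], columns, rows
```

Since shlex discards the quotes, a quoted STRING value such as "NA" or "1450" would be converted like an unquoted token. A stricter reader would fix the type of each column first and convert accordingly.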


2. DSD - The "Daily Station Data" Format

2.1 Description

The DSD format is a human readable, ASCII-based format specifically designed for the storage of univariate time series with a daily time step, e.g. daily measured weather data for one variable from a climate station.

A DSD-formatted file contains at least one data set that consists of a simple header describing the station and variable considered, plus an arbitrary number of data vectors with exactly 34 elements each. The first two elements of a data vector specify the year and month considered, the third element specifies the maximum number of days containing valid data for that particular month (28, 29, 30, or 31), and the remaining 31 elements contain the daily data for the days of the month. If a month has fewer than 31 days, missing values are expected for the remaining days up to day 31. Missing values are coded throughout as "NA".

The individual data records need not be sorted by year and month. However, year numbers outside the range specified in the header are not accepted.

There are no format restrictions as long as the syntax given below is not violated, i.e. line breaks can occur anywhere, and data points may be separated by tabs and/or spaces. All text enclosed within "(*" and "*)" is treated as a comment and is skipped. Comments can occur anywhere within a DSD-formatted file, and they may be nested.
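
The record layout described above (year, month, number of days, then 31 daily values) can be sketched in Python as follows. This is only an illustration under stated assumptions: comments have already been removed, the station name is a single whitespace-free token, and the function name parse_dsd and the dictionary keys are hypothetical.

```python
def parse_dsd(text):
    """Parse comment-free DSD text into a list of data sets."""
    tokens = text.split()  # line breaks carry no meaning in DSD files
    datasets = []
    i = 0
    while i < len(tokens):
        if tokens[i] != "#":
            raise ValueError(f"expected '#', found {tokens[i]!r}")
        head = tokens[i + 1:i + 9]  # the eight header fields after "#"
        if len(head) != 8:
            raise ValueError("incomplete header")
        first_year, last_year = int(head[3]), int(head[4])
        dataset = {
            "station_id": int(head[0]),
            "station_name": head[1],
            "variable": head[2],
            "years": (first_year, last_year),
            "longitude": float(head[5]),
            "latitude": float(head[6]),
            "altitude": int(head[7]),
            "months": {},
        }
        i += 9
        while i < len(tokens) and tokens[i] != "#":
            record = tokens[i:i + 34]
            if len(record) != 34:
                raise ValueError("a data vector must have exactly 34 elements")
            year, month, num_days = (int(t) for t in record[:3])
            if not first_year <= year <= last_year:
                raise ValueError(f"year {year} outside {first_year}-{last_year}")
            if num_days not in (28, 29, 30, 31):
                raise ValueError(f"invalid number of days: {num_days}")
            values = [None if t == "NA" else float(t) for t in record[3:]]
            if any(v is not None for v in values[num_days:]):
                raise ValueError("expected NA after the last day of the month")
            # Records need not be sorted, so they are keyed by (year, month).
            dataset["months"][(year, month)] = values[:num_days]
            i += 34
        datasets.append(dataset)
    return datasets
```

Keying the records by (year, month) reflects the rule that records may appear in any order; callers can sort the keys when a chronological series is needed.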

2.2 Syntax

DSDFile = DataSet {DataSet}.
DataSet = Header DataRecord {DataRecord}.
Header = "#" stationIdent stationName variableIdent firstYear lastYear longitude latitude altitude.
stationIdent = INTEGER.
variableIdent = IDENTIFIER.
firstYear = INTEGER.
lastYear = INTEGER.
longitude = REAL.
latitude = REAL.
altitude = INTEGER.
DataRecord = yearNr monthNr numDays dataPoint {dataPoint}.
yearNr = INTEGER.
monthNr = INTEGER.
numDays = INTEGER.
dataPoint = INTEGER|REAL|"NA".

2.3 Example

 #  5520 BERN_LIEBEFELD  Precip (*mm*)   1994 1997   7.421 46.929 570
 1994  7  31  0.00  0.00  0.00  0.41  0.00  0.60  0.00  0.05  0.00  0.00  0.00  0.00  0.00  0.10  0.00  0.00  0.02  1.61  0.75  0.02  0.01  0.00  0.01  0.00  0.03  0.00  0.00  0.16  0.02  0.00  0.00   
 1994  8  31  0.11  0.00  0.00  0.00  0.00  0.00  1.33  0.00  0.00  2.85  0.90  0.00  1.21  0.00  0.00  0.05  1.23  0.00  0.00  0.00  0.17  0.00  0.11  0.90  0.11  0.00  0.00  0.00  0.00  0.12  1.66   
 1994  9  30  2.74  0.21  0.00  0.00  0.00  0.05  0.07  2.84  0.22  0.01  0.31  2.91  0.00  0.98  2.58  1.11  0.23  0.03  0.00  0.43  0.01  0.00  0.00  0.00  0.00  0.60  0.08  0.00  0.00  0.01    NA   
 1995  4  30  0.38  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.42  0.00  0.00  0.07  0.94  0.00  0.00  0.66  0.05  0.00  0.00  0.00  0.15  0.97  1.12  0.39  0.00  0.12  0.33    NA   
 1995  5  31  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.16  0.00  0.00  0.11  3.15  3.03  0.29  0.63  0.99  0.94  1.11  0.01  0.01  0.00  0.00  0.00  0.00  0.81  2.25  0.00  0.00  0.63  2.80  1.95   
 1996 11  30  0.00  0.00  0.00  0.00  1.04  0.02  0.97  0.00  0.00  0.46  0.00  0.80  2.30  0.04  0.00  0.00  0.00  0.71  0.41  0.42  0.19  0.00  0.29  0.08  0.84  0.35  2.34  0.09  2.37  0.95    NA   
 1997  1  31  1.04  0.31  0.00  0.00  0.03  0.00  0.00  0.03  0.56  0.02  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  1.01  2.06  0.46  0.00  0.00  0.00  0.00  0.00  0.02  0.00  0.00  0.00  0.00   
 1997  2  28  0.00  0.00  0.00  0.00  0.56  0.00  0.00  0.00  0.00  0.02  0.39  0.02  0.29  0.74  0.63  0.00  0.00  0.86  0.00  0.40  0.00  0.00  0.23  0.18  0.14  1.95  0.05  0.00    NA    NA    NA   
 1997  3  31  0.00  0.00  0.00  0.10  0.03  0.17  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.14  0.00  0.03  0.00  0.36  0.86  0.00  0.00  0.00  0.18  0.04  0.00  0.00  0.00  0.47  0.10  0.00  0.00   


3. MAT - The "Matrix" Format

3.1 Description

The MAT format is a human readable, versatile and general, ASCII-based format for the storage of real-valued matrices.

A MAT-formatted file consists of header information plus the actual data matrix. The header may optionally include a data type and a code number, plus a descriptor string. The MAT format further supports optional column and/or row labels, and the specification of an arbitrary "no data" string used to denote missing data points. The default "no data" string is "NA".

There are no format restrictions, with the only exception that all column labels must occur on the same line. All information enclosed within "(*" and "*)" is recognized as a comment and is skipped. Comments can occur anywhere within a MAT-formatted file. Nested comments are possible.

3.2 Syntax

MatrixFile = MatrixHeader MatrixData.
MatrixHeader = [OptionalInfo] [NoDataSpecif] RowsColsSpecif [ColumnLabels].
OptionalInfo = "MATRIX" (TypeSpecif CodeSpecif descriptor)|(TypeSpecif descriptor)|(CodeSpecif descriptor)|descriptor.
TypeSpecif = "TYPE" type.
type = INTEGER.
CodeSpecif = "CODE" code.
code = INTEGER.
descriptor = STRING.
NoDataSpecif = "NODATA_STR" missingValCode.
RowsColsSpecif = "N_ROWS" numRows "N_COLS" numCols.
numRows = INTEGER.
numCols = INTEGER.
ColumnLabels = columnLabel {columnLabel} EOL.
columnLabel = IDENTIFIER.
MatrixData = [rowLabel] dataPoint {dataPoint}.
rowLabel = IDENTIFIER.

3.3 Examples

Ex. 1: A minimally specified matrix

    N_ROWS 2                                  
    N_COLS 3                                  
       1.1       1.2      NA                  
      -2.1      +2.2     -2.3                 

Ex. 2: A fully specified matrix

        TYPE 111    CODE -111                   
        "The matrix description"                
        NODATA_STR  NAN   
        N_ROWS 3    N_COLS 3    
            Col1      Col2      Col3            
     Row1     1.1       1.2      1.3            
     Row2    -2.1      -2.2     -2.3            
     Row3    +3.1      NAN      +3.3            

Ex. 3: A matrix with two columns and no row labels

    'This is "my matrix"'  
    NODATA_STR  -99.999                     
    N_ROWS 3  N_COLS 2                      
    TheCol1   TheCol2                      
    1.1         1.2                       
    -99.999     2.2                       
    +3.1      -99.999                       

Ex. 4: A matrix with only one row and no column labels

    NODATA_STR xxx                              
    N_ROWS 1  N_COLS 5                      
    TheRow  1.0   2.0   xxx   -4.0   5.5        
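
The optional header parts and labels make the MAT format slightly harder to read than the other formats. The following Python sketch shows one possible way to handle them. It is not the ClimTools reader and makes several assumptions: comments have already been removed, the "MATRIX" keyword is treated as optional (the examples above omit it), a line is taken to hold column labels if it has exactly N_COLS tokens and none of them is a number or the "no data" string, and row labels are inferred from the number of remaining tokens. The names parse_mat and _is_number are hypothetical.

```python
import shlex


def _is_number(token):
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_mat(text):
    """Parse comment-free MAT text into a dict; missing points become None."""
    # Keep the line number of every token: column labels are bound to one line.
    toks = [(t, n) for n, line in enumerate(text.splitlines())
            for t in shlex.split(line)]
    info = {"type": None, "code": None, "descriptor": None, "nodata": "NA"}
    i = 0
    if toks and toks[0][0] == "MATRIX":
        i = 1
    while i < len(toks) and toks[i][0] != "N_ROWS":
        key = toks[i][0]
        if key in ("TYPE", "CODE"):
            info[key.lower()] = int(toks[i + 1][0])
            i += 2
        elif key == "NODATA_STR":
            info["nodata"] = toks[i + 1][0]
            i += 2
        else:
            info["descriptor"] = key  # any other header token: the descriptor
            i += 1
    spec = [t for t, _ in toks[i:i + 4]]
    if len(spec) != 4 or spec[0] != "N_ROWS" or spec[2] != "N_COLS":
        raise ValueError("expected: N_ROWS numRows N_COLS numCols")
    n_rows, n_cols = int(spec[1]), int(spec[3])
    rest = toks[i + 4:]
    nodata = info["nodata"]

    def is_data(token):
        return token == nodata or _is_number(token)

    first_line = [t for t, n in rest if n == rest[0][1]] if rest else []
    col_labels = None
    if len(first_line) == n_cols and not any(is_data(t) for t in first_line):
        col_labels, rest = first_line, rest[n_cols:]
    values = [t for t, _ in rest]
    row_labels = None
    if len(values) == n_rows * (n_cols + 1):
        # One extra token per row: the leading token of each row is its label.
        row_labels = values[::n_cols + 1]
        values = [v for k, v in enumerate(values) if k % (n_cols + 1)]
    if len(values) != n_rows * n_cols:
        raise ValueError("number of data points does not match N_ROWS x N_COLS")
    data = [[None if v == nodata else float(v)
             for v in values[r * n_cols:(r + 1) * n_cols]]
            for r in range(n_rows)]
    info.update(col_labels=col_labels, row_labels=row_labels, data=data)
    return info
```

The "no data" string is compared as text before any numeric conversion, so a code such as "NAN" or "-99.999" is recognised even though Python could also read it as a number.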


4. GDX - The "Gridded Data as Text" Format

4.1 Description

The GDX format is a human readable, ASCII-based format specifically designed for the storage of time series of geophysical data given on a regular longitude-latitude grid.

A GDX-formatted file consists of header information, plus an arbitrary number of data vectors. The first element of a data vector is interpreted as a date (given in YYYYMMDD format), the second element as a time (given in HHMM format). The remaining elements are expected to contain the field data, which must run from N to S and E to W, i.e. the third vector element corresponds to the "top left", the last one to the "bottom right" field point. The data vectors must be sorted in ascending order by date and time, and missing values are not accepted for the date and time codes.

The GDX format is an extension of the MAT (matrix) format. Accordingly, it supports optional column and/or row labels, and the specification of an arbitrary string used to denote missing data points. The default "no data" string is "NA".

There are no format restrictions, i.e. line breaks can occur anywhere, data points may be separated by tabulators and/or spaces etc. All information enclosed within "(*" and "*)" is recognized as a comment and is skipped. Comments can occur anywhere within a GDX-formatted file. Nested comments are possible.

4.2 Syntax

GDXFile = GDXHeader GDXData.
GDXHeader = GeneralInfo LonLatInfo MatrixHeader.
GeneralInfo = "FIELD" descriptor "CODE" code "LEVEL" level.
descriptor = STRING.
code = INTEGER.
level = INTEGER.
LonLatInfo = "LONGITUDES" nLon minLon maxLon "LATITUDES" nLat minLat maxLat.
minLon = REAL.
maxLon = REAL.
minLat = REAL.
maxLat = REAL.
MatrixHeader (see the MAT format)
GDXData = dateCode timeCode dataPoint {dataPoint}.
dateCode = REAL.
timeCode = REAL.
dataPoint = REAL.

4.3 Example

 FIELD       "A small field"
 CODE        -100
 LEVEL       200
 LONGITUDES  4  -5.0    10.0  (*  5E to 10W *)  
 LATITUDES   2   35.0   40.0  (* 35N to 40N *)
 N_ROWS      5
 N_COLS      10
 19810800  0   1.22    6.4    7.1   4.768   1.09   2.738   7.5   2.07
 19820800  0   2.8     7.5    NA      NA    6.00   1.583   2.7   2.27     
 19830800  0   1.01    6.7    5.7   8.98    3.02   2.103   2.0   6.95     
 19840800  0   3.222   NA      NA   8.813   7.01   1.754   2.8   2.36     
 19850800  0   2.32    7.3    6.8   9.2     1.04   2.070   5.1   9.96     
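
To illustrate how a single data vector from the example above might be taken apart, the following Python sketch splits it into its date, time and field components. It is a sketch under assumptions: the function name decode_gdx_vector is hypothetical, and the field values are assumed to be stored row by row, with all longitudes of the northernmost latitude first. The date is split arithmetically rather than with a calendar library because the example uses day number 00.

```python
def decode_gdx_vector(tokens, n_lon, n_lat, nodata="NA"):
    """Split one GDX data vector into a time stamp and an n_lat x n_lon field."""
    if len(tokens) != 2 + n_lon * n_lat:
        raise ValueError("unexpected number of elements in data vector")
    # The grammar types both codes as REAL, so accept e.g. "19810800.0" too.
    date, time = int(float(tokens[0])), int(float(tokens[1]))
    stamp = {
        "year": date // 10000,
        "month": date // 100 % 100,
        "day": date % 100,
        "hour": time // 100,
        "minute": time % 100,
    }
    values = [None if t == nodata else float(t) for t in tokens[2:]]
    # Assumption: row 0 is the northernmost latitude, n_lon values per row.
    field = [values[r * n_lon:(r + 1) * n_lon] for r in range(n_lat)]
    return stamp, field
```

For the example header (4 longitudes, 2 latitudes), each data vector holds 2 + 4 * 2 = 10 elements, which matches the N_COLS value of 10.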


5. GDS - The "Gridded Data Set" Format

5.1 Description

The GDS format is a human readable, ASCII-based format specifically designed for the storage of gridded data sets that are given on a regular longitude-latitude grid.

GDS-formatted data can occur in three sub-formats:

1. The GDS standard format
2. The GDS list format
3. The GDS Arc/Info format

A GDS standard-formatted file contains header information, plus at least one gridded data set that is typically stored in a rectangular data field.

A GDS list-formatted file contains header information, plus one or several lists, one per data set stored in the file. A list consists of the x- and y-coordinates of each data point, plus the data point's value. Like the GDS standard format, the list format supports multiple data sets per file, but it does not support undefined values for individual data points.

A GDS Arc/Info-formatted file consists of header information, plus exactly one gridded data set that is stored in a rectangular data field. The GDS Arc/Info format corresponds to the "gridascii" type of output as produced by the Arc/Info GIS software.

Important note:

In the GDS Arc/Info format, the header attributes "xllcorner" and "yllcorner" denote the location of the lower left corner of the lower left grid cell. In contrast, in the GDS standard and GDS list formats, these attributes denote the location of the lower left gridpoint. Hence, depending on the format chosen, the values given for "xllcorner" and "yllcorner" differ by 0.5 times the cellsize.

There are no format restrictions as long as the syntax given below is not violated, i.e. line breaks can occur anywhere, and data points may be separated by tabs and/or spaces. All text enclosed within "(*" and "*)" is treated as a comment and is skipped. Comments can occur anywhere within a GDS-formatted file, and they may be nested.

However, note that if a GDS Arc/Info-formatted file is to be processed by software other than ClimTools (e.g., the Arc/Info GIS software), more restrictive formatting rules will typically apply.
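
The half-cell offset between the sub-formats, described in the important note above, can be written down as a pair of small Python helpers. The function names are hypothetical, and the direction of the shift rests on the assumption that a gridpoint lies at the centre of its grid cell.

```python
def cell_corner_to_gridpoint(xllcorner, yllcorner, cellsize):
    """Arc/Info convention (cell corner) -> standard/list convention (gridpoint)."""
    half = 0.5 * cellsize
    return xllcorner + half, yllcorner + half


def gridpoint_to_cell_corner(xllcorner, yllcorner, cellsize):
    """Standard/list convention (gridpoint) -> Arc/Info convention (cell corner)."""
    half = 0.5 * cellsize
    return xllcorner - half, yllcorner - half
```

With a cellsize of 20, for instance, the two conventions differ by 10 units in each direction.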

5.2 Syntax

GDSFile = StandardFormatFile | ListFormatFile | ArcInfoFormatFile.
StandardFormatFile = StandardFormatHeader StandardFormatData.
ListFormatFile = ListFormatHeader ListFormatData.
ArcInfoFormatFile = ArcInfoFormatHeader ArcInfoFormatData.
StandardFormatHeader = DataDescr SectorDescr GridSpecif MissingValSpecif.
ListFormatHeader = DataDescr SectorDescr GridSpecif.
ArcInfoFormatHeader = GridSpecif MissingValSpecif.
StandardFormatData = DataField | NumberedDataField {NumberedDataField}.
ListFormatData = DataList | NumberedDataList {NumberedDataList}.
ArcInfoFormatData = DataField.
DataDescr = "GRIDDED_DATA" dataIdent dataDescr.
dataIdent = INTEGER.
dataDescr = STRING.
SectorDescr = "SECTOR" sectorIdent sectorDescr.
sectorIdent = INTEGER.
sectorDescr = STRING.
GridSpecif = "ncols" numCols "nrows" numRows "xllcorner" xLowerLeftCoord "yllcorner" yLowerLeftCoord "cellsize" cellSize.
numCols = INTEGER.
numRows = INTEGER.
xLowerLeftCoord = INTEGER|REAL.
yLowerLeftCoord = INTEGER|REAL.
cellSize = INTEGER|REAL.
MissingValSpecif = ("NODATA_Value"|"NODATA_value"|"nodata_value") missingValCode.
DataField = dataPoint {dataPoint}.
NumberedDataField = "DATASET_NR" dataSetNumber DataField.
dataSetNumber = INTEGER.
DataList = ListEntry {ListEntry}.
ListEntry = xCoord yCoord listedValue.
listedValue = INTEGER|REAL.
NumberedDataList = "DATASET_NR" dataSetNumber DataList.
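
As an illustration of the ArcInfoFormatFile production above, the following Python sketch reads a GDS Arc/Info-formatted text into a nested list. It is not the ClimTools reader: the name parse_gds_arcinfo is hypothetical, comments are assumed to have been removed beforehand, and the missing value code is compared as text, so "-9999" and "-9999.0" would not be treated as equal.

```python
def parse_gds_arcinfo(text):
    """Parse comment-free GDS Arc/Info text; missing points become None."""
    tokens = text.split()
    if len(tokens) < 12:
        raise ValueError("incomplete header")
    # GridSpecif: five keyword/value pairs in a fixed order.
    keys = ("ncols", "nrows", "xllcorner", "yllcorner", "cellsize")
    header = {}
    for k, key in enumerate(keys):
        found = tokens[2 * k]
        if found != key:
            raise ValueError(f"expected {key!r}, found {found!r}")
        header[key] = tokens[2 * k + 1]
    # MissingValSpecif: the keyword is accepted in three spellings.
    if tokens[10] not in ("NODATA_Value", "NODATA_value", "nodata_value"):
        raise ValueError("expected the missing value specification")
    nodata = tokens[11]
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    values = tokens[12:]
    if len(values) != ncols * nrows:
        raise ValueError("number of data points does not match ncols x nrows")
    grid = [[None if v == nodata else float(v)
             for v in values[r * ncols:(r + 1) * ncols]]
            for r in range(nrows)]
    return {
        "ncols": ncols,
        "nrows": nrows,
        "xllcorner": float(header["xllcorner"]),
        "yllcorner": float(header["yllcorner"]),
        "cellsize": float(header["cellsize"]),
        "nodata": nodata,
        "grid": grid,
    }
```

The data field is read as one token stream and cut into rows of ncols values, which follows the rule that line breaks may occur anywhere.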

5.3 Examples

Ex. 1: GDS standard format

 GRIDDED_DATA    -10   "My test data"  
 SECTOR  -3000 "The sector"               
 ncols   5
 nrows   4
 xllcorner       -10.0
 yllcorner       -50.0
 cellsize        0.5
 NODATA_Value    NA
 NA      11.0     12.0    13      14.0
 20.0    21.0     22.0   +23     +24.0
 30.0    31.0    NA      -33     -34.0
 40.0    41.0     42.0    43      44.0
 -11.1   +11.0   -66.0    99.0   +333.0
  22.1    22.0   -55     NA      -334
 -33.2   +33.0    44.0    77.0   -335.0
  44.3    44.0    33      66.0   +336

Ex. 2: GDS list format

 GRIDDED_DATA    1002    "Temperature"
 SECTOR  3002    "MAB Davos"
 ncols   5
 nrows   7
 xllcorner       783000.0
 yllcorner       192500.0
 cellsize        100.0
 (* X *)    (* Y *)    (* value *)
 783000.0   193100.0    10.2
 783000.0   192500.0    11.4
 783400.0   193100.0     9.3
 783400.0   192500.0     8.4
 783000.0   192600.0     8.9
 783000.0   192700.0    12.4
 783000.0   192800.0    11.6
 783200.0   192700.0    12.1
 783200.0   192800.0     9.7
 783200.0   192900.0     8.3
 783300.0   192800.0    10.6
 783300.0   192900.0    10.9
 783300.0   193000.0     7.5

Ex. 3: GDS Arc/Info format

   ncols         25
   nrows         21
   xllcorner     814100.0
   yllcorner     171420.0
   cellsize      20
   nodata_value  -9999
   1 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 4 4 4 4 4 2 2 1 1
   1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 5 4 6 6 6 6 6 1 1 1
   1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 5 3 6 6 6 6 1 1 1 1
   1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 5 4 8 6 6 6 1 1 1 1
   1 1 1 1 1 1 1 1 3 3 3 3 3 3 5 5 4 4 4 4 1 1 1 1 1
   1 1 1 1 1 1 1 3 3 3 3 3 3 3 5 5 4 4 4 4 1 1 1 1 1
   1 1 1 1 1 1 3 3 3 3 3 3 3 3 5 4 4 4 4 1 1 1 1 1 1
   1 1 1 1 1 3 3 3 3 3 3 3 3 3 5 4 4 4 4 1 1 1 1 1 1
   1 1 1 1 2 2 2 2 2 2 3 3 3 5 5 4 4 4 1 1 1 1 1 1 1
   1 1 1 2 2 2 2 2 2 2 3 3 3 4 4 4 4 4 1 1 1 1 1 1 1
   1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 4 4 1 1 1 1 1 1 1 1
   1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 4 1 1 1 1 1 1 1 1 1
   1 2 2 2 2 2 2 2 2 2 3 3 3 4 4 4 1 1 1 1 1 1 1 1 1
   1 1 1 2 2 2 2 1 1 2 3 3 3 4 4 4 1 1 1 1 1 1 1 1 1
   1 1 1 1 1 1 1 1 2 2 3 3 4 4 4 4 1 1 1 1 1 1 1 1 1
   1 1 1 1 1 1 1 1 2 2 3 3 4 4 4 4 1 1 1 1 1 1 1 1 1
   1 1 1 1 1 1 1 1 1 2 3 3 4 4 4 4 1 1 1 1 1 1 1 1 1
   1 1 1 1 1 1 1 1 1 2 2 3 4 4 4 4 1 1 1 1 1 1 1 1 1
   1 1 1 1 1 1 1 1 1 2 2 2 4 4 4 4 1 1 1 1 1 1 1 1 1
   1 1 1 1 1 1 1 1 1 1 1 2 4 4 4 1 1 1 1 1 1 1 1 1 1
   1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 1 1 1 1 1 1 1 1 1 1

This documentation is maintained by Dimitrios Gyalistras. Last updated 10-Oct-2006.