6 Introduction to the GDAL library
GDAL (Geospatial Data Abstraction Library) is an open source, MIT-licensed library which is a collection of functions used to convert between different vector and raster GIS formats, data types, and coordinate reference systems (the PROJ.4 library is integrated into the GDAL library). Besides this, GDAL contains other functions for manipulating raster data. In the following, its functionality will be illustrated through several examples.
Hence, using the GDAL library, vector data can be converted from one coordinate system to another and from one format to another. Using cartographic projections with vector data is simple, the conversion is carried out point by point, and examples for points are given in the decription of the PROJ.4 library.
Using the ogrinfo
function, basic information on the vector file can be obtained.
ogrinfo -so test.kml test
INFO: Open of `test.kml'
using driver `LIBKML' successful.
Layer name: test
Geometry: Unknown (any)
Feature Count: 1
Extent: (20.476164, 44.805681) - (20.476164, 44.805681)
Layer SRS WKT:
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.0174532925199433,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]]
Name: String (0.0)
description: String (0.0)
timestamp: DateTime (0.0)
begin: DateTime (0.0)
end: DateTime (0.0)
altitudeMode: String (0.0)
tessellate: Integer (0.0)
extrude: Integer (0.0)
visibility: Integer (0.0)
drawOrder: Integer (0.0)
icon: String (0.0)
The -so
parameter is given to show only the summary data about the file, without listing all entities; this is followed by the file and layer names, usually identical to the file name without the extension. If the -so
command is given without parameters, then the section relating to the vector entities’ geometry would added to the previous output:
OGRFeature(test):1
tessellate (Integer) = -1
extrude (Integer) = 0
visibility (Integer) = -1
POINT (20.4761644699725 44.8056810018923)
If we want to transform this file into ESRI Shapefile format or the Gauss-Krüger projection for Serbia with the Hernannskogel datum, the ogr2ogr
command would be used with the following parameters:
ogr2ogr -f "ESRI Shapefile" testGK.shp -t_srs "+proj=tmerc +lat_0=0 +lon_0=21 +k=0.9999 +x_0=7500000 +y_0=0 +ellps=bessel +towgs84=574.027,170.175,401.545,4.88786,-0.66524,-13.24673,6.89 +units=m" -s_srs "EPSG:4326" test.kml
The parameter -f
defines the output format, and the list of all formats can be obtained using the ogr2ogr –formats command. This is followed by the resulting file’s name; -t_srs
is the argument used for defining the target coordinate system using the PROJ.4 notation or EPSG code; -s_srs
is the definition of the input file’s coordinate system, which can be omitted if it is contained in the file itself. It should be noted that when the ESRI Shapefile is used and accompanied by a .prj
file, then, the datum parameters are not written in the description and need to be defined again. This is often the case for other datum-related formats not defined by the EPSG code. After the command comes the input file’s name. A summary description of the new file is given as follows:
ogrinfo testGK.shp testGK
INFO: Open of `testGK.shp'
using driver `ESRI Shapefile' successful.
Layer name: testGK
Metadata:
DBF_DATE_LAST_UPDATE=2018-02-09
Geometry: Point
Feature Count: 1
Extent: (7458993.628067, 4962479.113279) - (7458993.628067, 4962479.113279)
Layer SRS WKT:
PROJCS["Transverse_Mercator",
GEOGCS["GCS_Bessel 1841",
DATUM["unknown",
SPHEROID["bessel",6377397.155,299.1528128]],
PRIMEM["Greenwich",0],
UNIT["Degree",0.017453292519943295]],
PROJECTION["Transverse_Mercator"],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",21],
PARAMETER["scale_factor",0.9999],
PARAMETER["false_easting",7500000],
PARAMETER["false_northing",0],
UNIT["Meter",1]]
...
However, the same logic cannot be applied to raster data as will be illustrated in the following example. Suppose we are converting raster data given in geographic coordinates into an equivalent Sinosoidal projection, using the method of first converting each pixel into a polygon and then carrying out a vector transformation, Figure 6.1.
Pixels are arranged as rectangular elements, with all pixels having the same geometry. It is necessary for the transformed raster to have a pixelated structure in which all pixels have the same geometry as well. From the previous example it is clear that this cannot be achieved since pixels are deformed because of the characteristics of the projection; hence, they do not represent rectangular elements.
In raster transformations, it is necessary to again achieve a raster structure; however, in doing this, another problem can arise. Namely, pixels can be mapped outside of the area encompassing the Earth’s surface (a Wraparound Problem), Figure 6.2.
Finally, the transformation for the previous example is shown in Figure 6.3.
From previous figures, it can clearly be seen that the position and number of pixels are not the same in both projections, and that the pixel values are rearranged so as to represent the positions and values of input pixels as closely as possible. Methods that secure the rearrangement of pixel values from the input raster to the output raster are called resampling methods. Here, some of the resampling methods will be briefly mentioned.
The Nearest Neighbor resampling method is a simple resampling method. It takes into account the nearest pixel center from the input raster in relation to the output raster pixel centers, so that it finds the nearest neighbor and assigns it the original input pixel’s value. The method is used when it is necessary to perform a quick resampling, as well as with categorical or normalized data.
Bilinear interpolation takes into account the four nearest cells in the direction of the raster’s main axes, and calculates the values of the output raster as the average value of the nearest cells.
Cubic convolution is similar to the bilinear method but the averaging is performed on 16 adjacent cells.
An example of using the gdalwarp
function is illustrated on Natural Earth data which represent continents and bodies of water, with stilyzation integrated into a TIFF file.
gdalwarp -t_srs '+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs' -r cubic -co COMPRESS=LZW -dstalpha NE1_50M_SR_W.tif NE1_50M_SR_W_sin.tif
The -t_srs
parameter is used to define the output file’s coordinate system, -r
is used to define the resampling method (here, cubic convolution is chosen), -co
is a parameter defining the compression method, and -dstalpha
makes the pixels without an assigned value be displayed as transparent. The input file, NE1_50M_SR_W.tif
, is shown in Figure 6.4. Unlike vector functions, for raster functions, the output file is always the last parameter in the function, NE1_50M_SR_W_sin.tif
, Figure 6.5.