“On the other side of the screen, it all looks so easy.”
Kevin Flynn, Tron
At the end of this session you should be able to
In general, the following spatial data models are available for the modelling of geographical data sets in Geographic Information Systems (GIS):
Geometrically, a raster data set is like a digital image. It consists of columns and rows, and the resulting cells are called grid cells or pixels (i.e. picture elements). Since there are no gaps between the grid cells, a raster data set is truly an area-wide data set. As with tables or data frames, one can access the individual grid cells using a row and column index.
Each grid cell stores exactly one data value. If more than one piece of information should be stored for a location, more than one raster layer is necessary. That is why you get, for example, 11 raster data sets from the Landsat 8 satellite: it records 11 sensor channels and hence 11 measurements per pixel location.
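The following base R sketch illustrates the idea of one value per cell and one layer per measurement. It uses a plain matrix and array as stand-ins for raster layers; the object names and all values are made up for illustration.

```r
# One raster layer as a matrix: 3 rows x 4 columns, one value per grid cell
band1 <- matrix(1:12, nrow = 3, ncol = 4, byrow = TRUE)

# Access a single grid cell by its row and column index
band1[2, 3]

# A second measurement per location needs a second layer of the same size
band2 <- band1 * 10

# Stack both layers; the third dimension indexes the layer (channel)
bands <- array(c(band1, band2), dim = c(3, 4, 2))

# All measurements available at row 2, column 3
bands[2, 3, ]
```

The stacked array mirrors how multi-channel satellite data is organised: the same grid geometry, repeated once per channel.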
One thing that is different from a digital image is that geographical raster data comes with projection information and defined corner coordinates (e.g. in latitude and longitude values, UTM coordinates etc.). The individual cells of the raster data set also all have the same extent in terms of meters or degrees. This extent is called the resolution of the data set. Hence, if one knows the corner coordinates and the resolution, one can easily calculate the real-world coordinates of each grid cell.
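As a minimal sketch of that calculation, the following base R code derives the centre coordinates of a grid cell from the upper-left corner and the resolution. The corner coordinates, the resolution and the function name cell_centre() are hypothetical; rows are assumed to be counted from the top and columns from the left, both starting at 1.

```r
# Hypothetical raster geometry (e.g. UTM coordinates in meters)
x_min <- 500000    # x coordinate of the left edge of the raster
y_max <- 4100000   # y coordinate of the upper edge of the raster
res_x <- 30        # cell width
res_y <- 30        # cell height

# Centre coordinates of the grid cell in a given row and column
cell_centre <- function(row, col) {
  c(x = x_min + (col - 0.5) * res_x,
    y = y_max - (row - 0.5) * res_y)
}

cell_centre(1, 1)  # upper-left cell
cell_centre(3, 2)  # third row, second column
```

The y coordinate decreases with increasing row number because row 1 is the northernmost row of the grid.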
The standard exchange format for geographical raster data sets is the GeoTIFF format.
In contrast to raster data sets, a vector point is defined by the distance to the origin of the coordinate system and the angle between the ordinate axis and the direction in which the distance is measured. In simpler terms, a vector point in 2D space is defined by its pair of values on the x and y axes.
The simplest form of vector data set is the point vector, which consists only of point data (e.g. the centre coordinates of research plots or the peak coordinate of Pico de Fogo). The points can also be connected to form lines, which makes the data set a line vector, or to form closed polygons, which makes it a polygon vector.
In contrast to raster data sets, any vector data set can hold more than one piece of information per location. This is because the information is not stored directly in the vector geometry but in a table which is linked to the vector coordinates. This table holds one piece of information per column and can have basically as many columns as one likes. The table is called the attribute table and the individual columns are called attributes of the vector. For example, the subset you have been working with so far can easily be connected to the geographical locations of the survey plots where it was collected.
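The link between geometry and attribute table can be sketched in base R with two data frames that share a plot identifier. This is only an illustration of the concept, not how a GIS stores the data internally; all names, coordinates and values are hypothetical.

```r
# Point geometry: one coordinate pair per survey plot (hypothetical values)
plots <- data.frame(
  plot_id = c("P1", "P2", "P3"),
  x = c(780100, 780250, 780400),
  y = c(1653200, 1653350, 1653100)
)

# Attribute table: one column per attribute, as many columns as needed
survey <- data.frame(
  plot_id = c("P1", "P2", "P3"),
  elevation = c(1520, 1710, 1985),
  species_count = c(12, 9, 4)
)

# Link the attributes to the plot locations via the shared identifier
plots_attr <- merge(plots, survey, by = "plot_id")
plots_attr
```

Adding a further attribute only means adding a further column to the table; the geometry stays untouched.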
The standard exchange format for geographical vector data sets is ESRI's shape file format.
In order to handle spatial data in R, the following packages are quite important:
The functions in these packages make the handling of geographical data sets quite simple and even provide advanced GIS functionality (e.g. extracting raster values at vector point locations).
Since we will use raster data only to do some prediction in this year's school, let's just have a quick look at how to read spatial data sets.
If you want to read a shape file (i.e. vector) into R, just use the readOGR() function from the rgdal package:
vector <- readOGR("<name of the shape file>", "<name of the data layer>")
The first argument is the path and name of the vector file. The layer name is almost always the file name of the shape file without the "shp" extension. Once you have read it, you get a spatial data frame, which is basically a standard data frame with some additional geographical information attached to it.
If you want to read a GeoTIFF (i.e. raster) into R, just use the raster() function from the raster package:
raster <- raster("<name of the GeoTIFF file>")
Once you have read it, you get an object of class RasterLayer.
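To get a feeling for such an object without needing a file, the following sketch builds a small raster in memory and inspects it. It assumes the raster package is installed; the grid size, the coordinates and the point locations are hypothetical, and the in-memory raster only stands in for one read from a GeoTIFF.

```r
library(raster)  # assumes the raster package is installed

# Hypothetical stand-in for a raster read from a GeoTIFF file:
# 4 rows x 5 columns covering x from 0 to 50 and y from 0 to 40
r <- raster(nrows = 4, ncols = 5, xmn = 0, xmx = 50, ymn = 0, ymx = 40)
values(r) <- 1:ncell(r)  # cells are numbered row by row from the upper left

class(r)    # class of the object
res(r)      # resolution (cell size) in x and y direction
extent(r)   # corner coordinates
r[2, 3]     # value of the grid cell in row 2, column 3

# Extract raster values at two hypothetical point locations (x, y)
pts <- cbind(x = c(5, 25), y = c(35, 15))
extract(r, pts)
```

The same inspection functions work on a raster read from disk, since raster() returns the same kind of object in both cases.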
A more detailed introduction to handling vector and raster data is beyond the scope of this school but will be covered in the one from 2015. For some more information on that topic, please refer to the excursus E10-1: Accessing spatial data sets in R or have a look at the code examples on C10-1 Visualization of spatial data sets in R and C10-2 Preprocessing of spatial data sets with R.