
“You're in trouble, program. Why don't you make it easy on yourself?”

Master Control Program, Tron

- Validating predictions using leave-one-out validations

At the end of this session you should be able to

- handle simple for loops
- compute leave-one-out validations
- interpret the validation results

Ordination analysis is a multivariate data mining method that reduces the dimensionality of an ecological data set and rearranges the reduced data values in a two-dimensional space in such a way that potential relationship patterns between the individual pieces of ecological information become as apparent as possible. For this purpose, common data reduction and transformation techniques such as principal component analysis, multi-dimensional scaling or correspondence analysis are used.

The following plots illustrate this kind of data reduction and transformation.

The black dots show the original distribution of values along two variables called band a and band b. If, for example, a principal component analysis were applied to this data set, a correlation matrix would be computed and decomposed into its eigenvalues and eigenvectors (i.e. an eigentransformation) in order to find a function which maps the data set from one multidimensional space into another one. As a result, the newly defined axis PC1 describes the first dimension of this new multidimensional space. Graphically, it is drawn along the direction of maximum variance in the data set. Afterwards, a second axis (PC2) is drawn perpendicular to PC1, along the direction of maximum variance in the remaining data (i.e. the variation among data points which have the same value on PC1).
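As a minimal sketch of the transformation described above, the following Python code computes the correlation matrix of two variables, derives its eigenvalues and eigenvectors in closed form, and projects the standardized data onto PC1 and PC2. The values in `band_a` and `band_b` are made up for illustration, and `standardize` and `pca_two_variables` are hypothetical helper names, not functions of any library. The closed-form solution only covers the two-variable case shown here.

```python
import math


def standardize(values):
    """Return z-scores (mean 0, sample standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pca_two_variables(var_a, var_b):
    """PCA of two variables based on their correlation matrix.

    Returns (eigenvalues, axes, scores), where axes holds the unit
    vectors of PC1 and PC2 and scores holds the data on the new axes.
    """
    za, zb = standardize(var_a), standardize(var_b)
    n = len(za)

    # Correlation coefficient r; the correlation matrix is [[1, r], [r, 1]].
    r = sum(x * y for x, y in zip(za, zb)) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]].
    a, b, d = 1.0, r, 1.0
    half_trace = (a + d) / 2
    spread = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    eigenvalues = (half_trace + spread, half_trace - spread)

    # Eigenvector of the larger eigenvalue = direction of PC1.
    if b != 0:
        vx, vy = b, eigenvalues[0] - a
    else:
        vx, vy = 1.0, 0.0  # uncorrelated variables: any direction works
    norm = math.hypot(vx, vy)
    pc1 = (vx / norm, vy / norm)
    pc2 = (-pc1[1], pc1[0])  # perpendicular to PC1

    # Project every standardized data point onto the new axes.
    scores = [
        (x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1])
        for x, y in zip(za, zb)
    ]
    return eigenvalues, (pc1, pc2), scores


# Made-up example values for the two variables.
band_a = [2.0, 3.1, 4.2, 5.0, 6.1, 7.3]
band_b = [1.9, 3.0, 3.7, 5.2, 5.8, 7.1]

eigenvalues, axes, scores = pca_two_variables(band_a, band_b)
print("eigenvalues:", eigenvalues)
print("PC1 direction:", axes[0])
```

Because the eigenvalues of a two-variable correlation matrix sum to 2, `eigenvalues[0] / 2` gives the share of the total variance that is captured by PC1.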

Perhaps the most commonly applied technique in ordination analysis is correspondence analysis. It is heavily used to identify and delineate communities of species (i.e. groups of species which frequently occur together at one location). Correspondence analysis uses a Chi-square distance matrix as the basis for its eigenanalysis. To this end, the observed and expected values of each species at each location are compared following the commonly used Chi-square formula:

$$\frac{(\text{observed value} - \text{expected value})^{2}}{\text{expected value}}$$
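The formula can be evaluated cell by cell with two nested for loops. The sketch below assumes that the expected value of a cell is its row total times its column total divided by the grand total, which is the usual definition for a contingency table but is not stated explicitly above. The counts in `observed` (rows are locations, columns are species) are invented for illustration, and `chi_square_contributions` is a hypothetical helper name.

```python
def chi_square_contributions(table):
    """Return (observed - expected)^2 / expected for every cell.

    Assumes expected = row total * column total / grand total and that
    every location and every species has at least one observation
    (otherwise the expected value would be zero).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    contributions = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed_value in enumerate(row):
            expected_value = row_totals[i] * col_totals[j] / grand_total
            out_row.append(
                (observed_value - expected_value) ** 2 / expected_value
            )
        contributions.append(out_row)
    return contributions


# Invented counts: 2 locations (rows) x 3 species (columns).
observed = [
    [12, 3, 5],
    [2, 10, 8],
]

contributions = chi_square_contributions(observed)
chi_square_total = sum(sum(row) for row in contributions)

for location, row in enumerate(contributions, start=1):
    print(f"location {location}:", [round(value, 3) for value in row])
print("total Chi-square:", round(chi_square_total, 3))
```

Large cell values mark species that occur at a location much more or much less often than expected if species and locations were independent.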

In addition, environmental factors can be included as so-called constraining variables (constrained or canonical correspondence analysis).

In general, ordination results can be visualized in a two-dimensional plot.

The red/yellow points represent the locations of the individual field survey plots (this example is from a 2002 study) and the blue labels identify the positions of the individual species within the newly created data space defined by the axes CCA1 and CCA2. The arrows indicate the direction of influence of the constraining variables.

For a more formal and extended introduction to this topic, see, for example, [Syms2008] or [Wildi2010].

en/learning/schools/s01/lecture-notes/ba-ln-13.txt · Last modified: 2017/10/30 10:28 by aziegler

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 4.0 International