# BIS-Fogo


# W07-1 Descriptive data set properties

This worksheet summarizes descriptive statistics functions that can be used to describe the properties of a data set. After completing this worksheet you should know how to use basic descriptive statistics functions and how to produce box-and-whisker plots.

## Things you need for this worksheet

• R: the interpreter can be installed on any operating system. For Linux, you should use the r-cran packages supplied for your Linux distribution. If you use Ubuntu, this is one of many starting points. If you use Windows, you can install R from the official CRAN web page.

• RStudio: we recommend using RStudio for (interactive) programming with R. You can download RStudio from the official web page.

• your script and data from W06-1 Leave-one-out validation

## Learning log assignments

This time, the analysis builds on your script from W06-1. Please copy your script “W06-1.R”, rename the copy to “W07-1.R” and use it for the programming tasks of this worksheet.

Please figure out the functions for some general descriptive statistics: mean, median, standard deviation, maximum and minimum. Let's try those on the observed animal activity data set.
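As a starting point, the five statistics could be computed as in the following minimal sketch. The vector `observed` and its values are made up for illustration; in your script you would use the observed activity column from your W06-1 data instead, whatever it is called there.

```r
# Hypothetical example values; replace them with the observed animal
# activity column from your W06-1 data set.
observed <- c(12, 15, 9, 22, 17, 30, 14, 11)

# Collect the basic descriptive statistics in one named vector.
obs_stats <- c(
  mean   = mean(observed),
  median = median(observed),
  sd     = sd(observed),
  min    = min(observed),
  max    = max(observed)
)

print(round(obs_stats, 2))
```

If your data contain missing values, each of these functions accepts `na.rm = TRUE` to ignore them. The function `summary()` gives a quick overview as well, but it does not report the standard deviation.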

Let's compare the descriptive values with the predicted values of the animal activity from W06-1.
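One possible way to put both sets of values side by side is sketched below. The data frame `activity`, its column names, its values and the helper function `describe()` are all assumptions made for this example; adapt them to the objects your W06-1 script actually produces.

```r
# Hypothetical observed and predicted values; use your own W06-1 results.
activity <- data.frame(
  observed  = c(12, 15, 9, 22, 17, 30, 14, 11),
  predicted = c(13, 14, 12, 19, 16, 24, 15, 13)
)

# Hypothetical helper returning the five statistics for one numeric vector.
describe <- function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x), min = min(x), max = max(x))
}

# Apply the helper to every column; the result is a matrix with one row
# per statistic and one column per data series.
comparison <- sapply(activity, describe)

print(round(comparison, 2))
```

Having both series in one table makes it easy to see, for example, whether the predictions have a narrower range than the observations.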

While those values give us a good impression of the differences between the observed and predicted animal activity data set, a visualization might be more intuitive.

Please visualize the data set properties of the observed and predicted animal activity using a boxplot (e.g. function boxplot()).
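A box plot of both series could look like the following sketch. Again, the data frame `activity` and its values are invented for illustration, and the axis labels are only suggestions.

```r
# Hypothetical observed and predicted values; use your own W06-1 results.
activity <- data.frame(
  observed  = c(12, 15, 9, 22, 17, 30, 14, 11),
  predicted = c(13, 14, 12, 19, 16, 24, 15, 13)
)

# Draw one box per column. boxplot() also returns the numbers behind
# the plot, which are kept here for later inspection.
bp <- boxplot(
  activity,
  names = c("Observed", "Predicted"),
  ylab  = "Animal activity",
  main  = "Observed vs. predicted animal activity"
)

# bp$stats holds, per box, the lower whisker, lower hinge, median,
# upper hinge and upper whisker; bp$out lists points drawn as outliers.
print(bp$stats)
print(bp$out)
```

The returned list is useful if you want to check the plotted values numerically instead of reading them off the figure.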

Please identify the outliers by drawing a scatter plot first and then calling the identify() function. You can then click on a point in the plot to get the meta information for it.

To terminate the identify() function, just press Escape.
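The interactive step could be sketched as follows. The data frame `activity` and its values are hypothetical, and the choice of the row names as point labels is just one option. Because identify() waits for mouse clicks, the call is wrapped in a check so that the script also runs when it is not started interactively.

```r
# Hypothetical observed and predicted values; use your own W06-1 results.
activity <- data.frame(
  observed  = c(12, 15, 9, 22, 17, 30, 14, 11),
  predicted = c(13, 14, 12, 19, 16, 24, 15, 13)
)

# Scatter plot of observed against predicted values.
plot(
  activity$observed, activity$predicted,
  xlab = "Observed animal activity",
  ylab = "Predicted animal activity"
)

# identify() only makes sense in an interactive session (e.g. RStudio).
# It returns the row indices of the points that were clicked.
if (interactive()) {
  picked <- identify(
    activity$observed, activity$predicted,
    labels = rownames(activity)
  )
  # Show the full records of the selected points.
  print(activity[picked, ])
}
```

The coordinates passed to identify() must be the same ones used in the preceding plot() call, otherwise the clicks cannot be matched to the right points.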