
“Come on, you scuzzy data, be in there. Come on. ”

Kevin Flynn, Tron

- Logical operations
- Sub-setting data frames
- Handling fill values or NA

At the end of this session you should be able to

- subset data frames by simple boolean expressions

The group of logical operations in programming languages generally encompasses relational and boolean operators.

Relational operators compare two values with regard to equality or order. The result of each comparison is either TRUE or FALSE. The following operators are included in R:

Operator | Operation |
---|---|
`>` | greater than |
`<` | less than |
`==` | exactly equal |
`>=` | greater than or equal |
`<=` | less than or equal |
`!=` | not equal |
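As a minimal sketch of how these operators behave, the following lines apply them to a small numeric vector. The vector `temp` and its values are invented for illustration only. Each comparison is evaluated element by element and returns a logical vector of the same length.

```r
# Hypothetical example vector (values invented for illustration)
temp <- c(12.5, 18.0, 21.3, 18.0, 9.7)

temp > 18     # FALSE FALSE  TRUE FALSE FALSE
temp >= 18    # FALSE  TRUE  TRUE  TRUE FALSE
temp == 18    # FALSE  TRUE FALSE  TRUE FALSE
temp != 18    #  TRUE FALSE  TRUE FALSE  TRUE

# The result can be stored and used like any other vector
warm <- temp > 18
sum(warm)     # counts the TRUE values: 1
```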

A special case is the `isTRUE` function: `isTRUE(x)` returns TRUE only if x is a single, non-missing TRUE value and FALSE in every other case. It can serve as a stricter alternative to `x == TRUE`, which returns NA if x is NA.
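The following sketch compares `isTRUE(x)` with `x == TRUE` for three cases. The variable names are arbitrary. The two forms agree for a single TRUE, but they differ for a missing value and for a vector with more than one element.

```r
x <- TRUE
y <- NA
z <- c(TRUE, FALSE)

isTRUE(x)   # TRUE
x == TRUE   # TRUE

isTRUE(y)   # FALSE
y == TRUE   # NA

isTRUE(z)   # FALSE, because z has more than one element
z == TRUE   # TRUE FALSE
```

Because `isTRUE` always returns exactly one TRUE or FALSE, it is a safe choice inside an `if` condition.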

Boolean operators are another core component. They allow you to combine the boolean values TRUE and FALSE in a boolean algebra. The basic operators implemented in R are the following:

Operator | Operation |
---|---|
`!x` | NOT x (where x is e.g. the result of a boolean expression) |
`x \| y` | x OR y |
`x & y` | x AND y |
`xor(x, y)` | exclusive x OR y |
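A short sketch of the four operators, applied to two logical vectors that cover every combination of TRUE and FALSE. The vectors `x` and `y` are chosen for illustration, and the results form the usual truth table.

```r
x <- c(TRUE, TRUE, FALSE, FALSE)
y <- c(TRUE, FALSE, TRUE, FALSE)

!x          # FALSE FALSE  TRUE  TRUE
x | y       #  TRUE  TRUE  TRUE FALSE
x & y       #  TRUE FALSE FALSE FALSE
xor(x, y)   # FALSE  TRUE  TRUE FALSE
```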

Of course, one can combine such operators but keep in mind that the precedence of these operators is as follows: NOT, AND, OR. Here is an example:

    > A <- TRUE
    > B <- FALSE
    > C <- FALSE
    > B & C | A
    [1] TRUE
    > B & (C | A)
    [1] FALSE
    > !B & C | A
    [1] TRUE
    > !(B & C | A)
    [1] FALSE

Subsetting means reducing a data frame to the rows and/or columns that are needed for your analysis. There are two ways to do this, which work slightly differently:

- subsetting by selecting the rows and columns you want in your final data frame
- subsetting by removing the rows and columns you do not want in your final data frame

Both can easily be done using the indexing methods of the data types already introduced in C00-2 Data frame basics.
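The two approaches can be sketched with plain index notation as follows. The data frame `df`, its column names and its values are invented for this example. Positive indices or column names select, negative indices remove.

```r
# Hypothetical data frame (station names and values are invented)
df <- data.frame(
  station = c("A", "B", "C", "D"),
  temp    = c(12.5, 18.0, 21.3, 9.7),
  precip  = c(0.0, 2.4, 0.0, 5.1)
)

# Selecting: keep rows 1 to 2 and the columns station and temp
selected <- df[1:2, c("station", "temp")]

# Removing: drop row 3 and the third column using negative indices
reduced <- df[-3, -3]

dim(selected)   # 2 rows, 2 columns
dim(reduced)    # 3 rows, 2 columns
```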

The main difference (or advantage now) is that you can derive the indices using relational and boolean expressions.
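As a minimal sketch, a logical vector produced by such an expression can be placed directly in the row position of the index. The data frame `df` and its contents are invented for illustration. The example also shows one common way to deal with a missing value (NA), which would otherwise produce a row filled with NA in the result.

```r
# Hypothetical data frame with one missing temperature value
df <- data.frame(
  station = c("A", "B", "C", "D"),
  temp    = c(12.5, 18.0, 21.3, NA),
  precip  = c(0.0, 2.4, 0.0, 5.1)
)

# Rows with a temperature above 15 and no precipitation
dry_warm <- df[df$temp > 15 & df$precip == 0, ]

# NA > 15 evaluates to NA, so the result contains an extra row of NAs
warm <- df[df$temp > 15, ]

# Excluding missing values explicitly avoids that extra row
warm_clean <- df[!is.na(df$temp) & df$temp > 15, ]

nrow(warm)         # 3 (two matching rows plus one NA row)
nrow(warm_clean)   # 2
```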

Have a look at E12-1 Subsetting data frames now for more information on this subject.

A more detailed introduction to subsetting or, more generally, cleaning data frames is beyond the scope of this school. For some more information on that topic, please refer to the excursus E12-1: Cleaning data frames.

en/learning/schools/s01/lecture-notes/ba-ln-12.txt · Last modified: 2015/09/22 16:22 (external edit)

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 4.0 International