# BIS-Fogo



====== L12: Subsetting data frames ======
"Come on, you scuzzy data, be in there. Come on."

Kevin Flynn, Tron

==== Things we cover in this session ====
  * Logical operations
  * Subsetting data frames
  * Handling fill values or NA

==== Things you need for this session ====
  * [[en:learning:schools:s01:worksheets:ba-ws-12-1|W12-1 Subsetting data frames]]

==== Things to take home from this session ====
At the end of this session you should be able to
  * subset data frames by simple boolean expressions

===== Logical operations =====
The group of logical operations in programming languages generally encompasses relational and boolean operators.

Relational operators compare two values with respect to equality or order. They return either TRUE or FALSE (or NA if one of the values is missing). R provides the following relational operators:
^ Operator ^ Operation ^
| > | greater than |
| < | less than |
| == | exactly equal |
| >= | greater than or equal |
| %%<%%= | less than or equal |
| != | not equal |

A special case of the == comparison is implemented in the ''isTRUE'' function: isTRUE(x) returns TRUE only if x is a single TRUE value and FALSE in every other case. It is a stricter alternative to x == TRUE, which can also return NA or a whole vector.

Boolean operators are another core component. They allow you to combine the boolean values TRUE and FALSE in a boolean algebra. The basic operators implemented in R are the following:
^ Operator ^ Operation ^
| !x | NOT x (where x is, e.g., the result of a boolean expression) |
| x %%|%% y | x OR y |
| x & y | x AND y |
| xor(x, y) | exclusive x OR y |

These operators can be combined, but keep in mind their precedence, from highest to lowest: NOT, AND, OR.
Here is an example:

  > A <- TRUE
  > B <- FALSE
  > C <- FALSE
  > B & C | A
  [1] TRUE
  > B & (C | A)
  [1] FALSE
  > !B & C | A
  [1] TRUE
  > !(B & C | A)
  [1] FALSE

===== Subsetting data frames =====
Subsetting means removing certain rows and/or columns from a data frame in order to reduce the data set to what is needed for your analysis. It can be approached in two ways:

  * subsetting by selecting the rows and columns you want in your final data frame
  * subsetting by removing the rows and columns you do not want in your final data frame

Both can easily be done using the indexing methods of the data types already introduced in [[en:learning:schools:s01:code-examples:ba-ce-00-02|C00-2 Data frame basics]].

The main difference (and advantage) now is that you can derive the index from relational and boolean expressions instead of writing it out by hand.

8-O Have a look at [[en:learning:schools:s01:code-examples:ba-ce-12-01|E12-1 Subsetting data frames]] now for more information on this subject.

m( A more detailed introduction to subsetting or, more generally, cleaning data frames is beyond the scope of this school. For more information on that topic, please refer to the excursus [[en:learning:schools:s01:excursus:ba-ex-12-01|E12-1: Cleaning data frames]].

===== Time for practice =====
[[en:learning:schools:s01:worksheets:ba-ws-12-1|W12-1 Subsetting data frames]]
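
As a warm-up for the worksheet, here is a minimal sketch of how relational and boolean expressions can drive both ways of subsetting, together with basic NA handling. The data frame, its column names, and the fill value -9999 are assumptions made up for illustration; they are not taken from the course data.

```r
# Hypothetical example data (station, temp, rain are assumed names).
df <- data.frame(
  station = c("A", "B", "C", "D"),
  temp    = c(12.5, -9999, 18.1, 21.3),
  rain    = c(0.0, 3.2, NA, 1.1)
)

# Fill values: replace the assumed fill value -9999 with NA.
df$temp[df$temp == -9999] <- NA

# Selecting: keep rows with a known temperature above 15
# and only the columns "station" and "temp".
warm <- df[!is.na(df$temp) & df$temp > 15, c("station", "temp")]

# Removing: drop the second row (negative index) and
# the column "rain" (logical index on the column names).
trimmed <- df[-2, names(df) != "rain"]

# NA handling: keep only rows without any missing value.
complete <- df[complete.cases(df), ]
```

The check ''!is.na(df$temp)'' is combined with the comparison because ''df$temp > 15'' alone yields NA for the missing value, and an NA in a row index produces a row filled with NA instead of dropping it.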
en/learning/schools/s01/lecture-notes/ba-ln-12.txt · Last modified: 2015/09/22 16:22 (external edit)