CSV files are read with the read.table function from R’s utils package. The function returns a data frame that contains the content of the CSV file.
df <- read.table("D:/active/moc/dm/examples/data_raw/wb-db_co2em_1960-2013.csv", header = TRUE, sep = ",", skip = 2)
As you can see, the read.table function takes several arguments. Since this is probably the first time you use an R function, let’s have a closer look at them. The first one gives the filename including the path to the file; the path is either absolute, as in the example, or relative to the current working directory. The second argument, header = TRUE, tells the function that the CSV file has a header line, which read.table uses to name the columns of the returned data frame. The third one, sep = ",", defines the character that separates the columns in the file. Finally, since our CSV file has an additional header line and an empty line (lines 1 and 2) above the actual column header, the function should skip the first two lines (skip = 2).
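To try these arguments without the World Bank file, the following minimal sketch writes a small made-up CSV file to a temporary location and reads it back. The file content, the column names and the variable names tmp and demo are assumptions for illustration only; the two leading lines mimic the additional header and the empty line described above.

```r
# Hypothetical sample file: a title line, an empty line, then the real header.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Some title line",
             "",
             "Country,Code,Value",
             "Aruba,ABW,1.5",
             "Andorra,AND,2.5"),
           tmp)

# Skip the first two lines, use the third as column names, split at commas.
demo <- read.table(tmp, header = TRUE, sep = ",", skip = 2)

print(demo)
str(demo)
```

If skip is left out of this sketch, read.table would try to interpret the title line as the column header, which is why the argument is needed for files with such a preamble.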
A note on the sequence of the arguments: the sequence does not matter as long as you name the arguments explicitly. If you do not use the argument name, as is the case for the first argument (the filename) in the example, then the sequence matters. To get information on the default sequence and, of course, on the general usage of any R function, type ?<function name> (e.g. ?read.table) in an R console.
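The effect of naming arguments can be checked with a short sketch. The temporary file and the variable names (tmp, a, b, c1) are made up for illustration; the positional call relies on file, header and sep being the first three arguments of read.table, which args() lets you verify.

```r
# Hypothetical sample file for illustration.
tmp <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2", "3,4"), tmp)

# Named arguments can be given in any order.
a <- read.table(tmp, header = TRUE, sep = ",")
b <- read.table(sep = ",", header = TRUE, file = tmp)
identical(a, b)

# Unnamed arguments are matched by position: file, header, sep, ...
c1 <- read.table(tmp, TRUE, ",")
identical(a, c1)

# args() prints the argument list in its default order.
args(read.table)
```

Naming the arguments is usually the safer choice, since the call stays readable and does not depend on remembering the default order.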
After you have executed the read.table function above, the content of the CSV file is stored in a two-dimensional data frame called df. A quick way to check if everything is fine is to display the first few lines of the data frame using the head function, e.g. head(df, 2):
##   Country.Name Country.Code     Indicator.Name Indicator.Code X1960 X1961
## 1        Aruba          ABW CO2 emissions (kt) EN.ATM.CO2E.KT    NA    NA
## 2      Andorra          AND CO2 emissions (kt) EN.ATM.CO2E.KT    NA    NA
##   X1962 X1963 X1964 X1965 X1966 X1967 X1968 X1969 X1970 X1971 X1972 X1973
## 1    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 2    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
##   X1974 X1975 X1976 X1977 X1978 X1979 X1980 X1981 X1982 X1983 X1984 X1985
## 1    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
## 2    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA
##   X1986 X1987 X1988 X1989 X1990 X1991 X1992 X1993 X1994 X1995  X1996
## 1 179.7 447.4 612.4 649.1  1841  1929  1723  1771  1764  1782 1811.5
## 2    NA    NA    NA    NA    NA    NA    NA    NA    NA   407  425.4
##    X1997 X1998  X1999  X2000  X2001  X2002  X2003  X2004  X2005  X2006
## 1 1851.8  1668 1683.2 2233.2 2236.9 2255.2 2255.2 2258.9 2273.5 2273.5
## 2  458.4   484  513.4  524.4  524.4  531.7  535.4  564.7  575.7  546.4
##   X2007 X2008 X2009 X2010 X2011 X2012 X2013  X
## 1  2358  2288  2296  2321    NA    NA    NA NA
## 2   539   539   517   517    NA    NA    NA NA
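Besides printing the first rows, a few other base R functions give a quick overview of a data frame. The sketch below uses a tiny made-up data frame named demo as a stand-in for df; its column names and values are assumptions and not taken from the CSV file.

```r
# Made-up data frame standing in for df.
demo <- data.frame(Country = c("A", "B", "C"),
                   Value = c(1, 2, NA))

head(demo, 2)   # first two rows
dim(demo)       # number of rows and number of columns
names(demo)     # column names
str(demo)       # compact overview of column types and first values
```

For a wide data frame like the one read above, dim and names are often easier to scan than the wrapped output of head.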
Writing data is as easy as reading it. Just use the write.table function:
write.table(df, file = "D:/active/moc/dm/examples/data_procd/test.csv", sep = ";", dec = ",")
The example above also illustrates another argument called dec, which defines the character used as decimal separator. In addition, the value of the sep argument is “;”, which causes the new CSV file to use “;” instead of “,” as column separator. This can be useful for German CSV files, which use “,” as decimal sign. But to put this straight: CSV means comma-separated values, so just use “,” as separator and “.” as decimal point.
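To see what sep and dec do to the written file, the following sketch writes a small made-up data frame to a temporary file and reads it back. The data, the variable names (demo, tmp, back) and the use of row.names = FALSE are assumptions for illustration; row.names = FALSE is not part of the call above and is added here only to keep the file free of a row-number column.

```r
# Made-up data for illustration.
demo <- data.frame(name = c("a", "b"), value = c(1.5, 2.25))
tmp <- tempfile(fileext = ".csv")

# German-style file: ";" between columns, "," as decimal sign.
write.table(demo, file = tmp, sep = ";", dec = ",", row.names = FALSE)
readLines(tmp)

# Read it back with the same sep and dec settings.
back <- read.table(tmp, header = TRUE, sep = ";", dec = ",")
back$value
```

Whichever combination is chosen for writing, the same sep and dec values have to be passed to read.table when reading the file again, otherwise the numbers end up as text.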