read.table with comma separated values and also commas inside each element

Question

I'm trying to create a table from a comma-separated CSV file. I know that not all rows have the same number of elements, so I plan to write some code to drop those rows. The problem is that some rows contain numbers in the thousands, which themselves include a comma, and I can't split those rows properly. Here's my code:

pURL <- "http://financials.morningstar.com/ajax/exportKR2CSV.html?&callback=?&t=EI&region=FRA&order=asc"
res <- read.table(pURL, header=T, sep='\t', dec = '.', stringsAsFactors=F)
x <- unlist( lapply(keyRatios, function(u) strsplit(u,split='\n')) [[1]] )

Simon O'Hanlon · Accepted Answer

You need to make use of the quote = argument of either read.table or read.delim...

res <- read.delim( pURL, header=F, sep=',', dec = '.', stringsAsFactors=F , quote = "\"" ,   fill = TRUE , skip = 2 )

The separator is "," not "\t". Numbers of a thousand or more are written with a comma and are always quoted in this file, so setting quote = "\"" makes R treat a comma inside the quotes as part of the value rather than as a field separator. You also want to skip the first two lines with skip = 2, and use fill = TRUE to pad rows that have fewer fields than the others.
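To see the effect of the quote argument without hitting the URL, here is a minimal offline sketch. The contents of `sample_lines` are made up for illustration (they only mimic the layout described above and are not taken from the Morningstar file), and `tmp`, `n_plain`, `n_quoted` and `demo` are hypothetical names.

```r
# Made-up sample lines (not the real export): two preamble lines,
# a header row, a row with quoted thousands, a plain row, and a short row.
sample_lines <- c(
  "Hypothetical report title",
  "Hypothetical subtitle",
  ",2011-12,2012-12,TTM",
  "Revenue EUR Mil,\"4,190\",\"4,989\",\"5,034\"",
  "Gross Margin %,55.4,55.8,56.1",
  "Short row,1.0"
)
tmp <- tempfile(fileext = ".csv")
writeLines(sample_lines, tmp)

# Fields per line when quotes are ignored: the quoted row splits into 7 pieces.
n_plain <- count.fields(tmp, sep = ",", quote = "", skip = 2)

# Fields per line when double quotes protect the embedded commas.
n_quoted <- count.fields(tmp, sep = ",", quote = "\"", skip = 2)

# Same arguments as in the answer; fill = TRUE pads the short row.
demo <- read.delim(tmp, header = FALSE, sep = ",", dec = ".",
                   stringsAsFactors = FALSE, quote = "\"",
                   fill = TRUE, skip = 2)

print(n_plain)
print(n_quoted)
print(demo)
unlink(tmp)
```

Comparing `n_plain` with `n_quoted` is also a quick way to check which rows are genuinely uneven and which only look uneven because of commas inside quoted values.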

head( res )

#                           2003-12 2004-12 2005-12 2006-12 2007-12 2008-12 2009-12 2010-12 2011-12 2012-12   TTM
#2          Revenue EUR Mil   2,116   2,260   2,424   2,690   2,908   3,074   3,268   3,892   4,190   4,989 5,034
#3           Gross Margin %    60.6    60.3    57.3    58.2    57.6    56.9    56.1    55.5    55.4    55.8  56.1
#4 Operating Income EUR Mil     365     404     394     460     505     515     555     618     683     832   841
#5       Operating Margin %    17.2    17.9    16.2    17.1    17.4    16.7    17.0    15.9    16.3    16.7  16.7
#6       Net Income EUR Mil     200     227     289     331     371     389     402     472     518     584   594
#7   Earnings Per Share EUR    3.90    4.30    5.44    6.22    3.48    3.62    3.78    4.36    4.82    2.77  2.80

I set the column names of res afterwards like this...

names( res ) <- res[1,]; res <- res[-1,]

This gives better formatting than the default V1, V2, ... column names.
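Because the header row is read in as data, every column ends up as character, and values such as "4,190" still carry their thousands separator. One possible follow-up, sketched below on a small hypothetical stand-in for res (`res_demo`, with an invented "Metric" label for the first column), is to promote the first row to names and then strip the commas before converting to numeric.

```r
# Hypothetical stand-in for `res` as read with header = FALSE.
res_demo <- data.frame(
  V1 = c("", "Revenue EUR Mil", "Gross Margin %"),
  V2 = c("2011-12", "4,190", "55.4"),
  V3 = c("2012-12", "4,989", "55.8"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names; "Metric" is an invented label
# for the first column, whose header cell is empty.
hdr <- unlist(res_demo[1, ], use.names = FALSE)
hdr[1] <- "Metric"
names(res_demo) <- hdr
res_demo <- res_demo[-1, ]

# Remove thousands separators, then convert the value columns to numeric.
res_demo[-1] <- lapply(res_demo[-1], function(col) {
  as.numeric(gsub(",", "", col, fixed = TRUE))
})

str(res_demo)
```

The conversion assumes that every value column contains only numbers once the commas are removed; anything else would become NA with a warning.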

Tags: split, r, csv, read.table

nopeva
