Code to import data from a Stack Overflow query into R

Tags: r

When I try to answer a question about R on Stack Overflow, a good part of my time is spent rebuilding the data given as an example (unless the question author has been nice enough to provide it as R code).

So my question is: if somebody asks a question and gives their sample data frame in the following way:

a  b   c
1 11 foo
2 12 bar
3 13 baz
4 14 bar
5 15 foo

Do you have a tip or a function to import this easily into an R session, without having to type the entire data.frame() instruction?

Thanks in advance for any hint!

PS: sorry if the term "query" is not ideal in my question title, but it seems you can't use the word "question" in a question title on Stack Overflow :-)

juba asked Jun 01 '12 11:06


3 Answers

Maybe textConnection() is what you want here:

R> zz <- read.table(textConnection("a  b   c
1 11 foo
2 12 bar
3 13 baz
4 14 bar
5 15 foo"), header=TRUE)
R> zz
  a  b   c
1 1 11 foo
2 2 12 bar
3 3 13 baz
4 4 14 bar
5 5 15 foo
R> 

It allows you to treat the text as a "connection" from which to read. You can also just copy and paste, but access from the clipboard is more dependent on the operating system and hence less portable.
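A minimal sketch of the same idea with the connection held in a variable, so that it can be closed explicitly once the data has been read. The names `sample_lines` and `con` are illustrative assumptions and do not come from the answer itself:

```r
# The sample text, one element per line (hypothetical variable name)
sample_lines <- c("a  b   c",
                  "1 11 foo",
                  "2 12 bar",
                  "3 13 baz",
                  "4 14 bar",
                  "5 15 foo")

con <- textConnection(sample_lines)   # read-only connection over the text
zz <- read.table(con, header = TRUE)  # parsed exactly as a file would be
close(con)                            # release the connection when done

str(zz)
```

Keeping the connection in a variable is a matter of taste; the one-liner above works just as well for a quick interactive session.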

Dirk Eddelbuettel answered Oct 14 '22 01:10


Recent versions of R offer an even lower-keystroke option than the textConnection route for entering columnar data into read.table and friends. Faced with this:

zz
  a  b   c
1 1 11 foo
2 2 12 bar
3 3 13 baz
4 4 14 bar
5 5 15 foo

One can simply insert <- read.table(text=" after the zz, delete the carriage return, then insert ", header=TRUE) after the last foo and press [enter].

zz<- read.table(text="  a  b   c
1 1 11 foo
2 2 12 bar
3 3 13 baz
4 4 14 bar
5 5 15 foo", header=TRUE)
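Because the header line here has one field fewer than the data lines, read.table treats the first column as row names rather than data. A short sketch that makes this visible; the `stringsAsFactors = FALSE` argument is an addition for illustration, used only so that column c comes back as character regardless of the R version's default:

```r
# Pasted printout, including the row-number column from the console
zz <- read.table(text = "  a  b   c
1 1 11 foo
2 2 12 bar
3 3 13 baz
4 4 14 bar
5 5 15 foo", header = TRUE, stringsAsFactors = FALSE)

ncol(zz)       # only a, b and c are data columns
rownames(zz)   # the leading 1..5 became the row names
class(zz$c)    # character, because stringsAsFactors = FALSE
```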

One can also use scan to efficiently enter long sequences of pure numbers or pure character-vector entries. Faced with: 67 75 44 25 99 37 6 96 77 21 31 41 5 52 13 46 14 70 100 18, one can simply type zz <- scan() and hit [enter], then paste the selected numbers and hit [enter] again, and perhaps a second time to produce a blank line; the console should then respond "Read 20 items".

> zz <- scan()
1: 67  75  44  25  99  37   6  96  77  21  31  41   5  52  13  46  14  70 100  18
21: 
Read 20 items

The "character" task: after pasting into the console, editing out extraneous line feeds, adding quotes, and then hitting [enter]:

> countries <- scan(what="character")
1:     'republic of congo'
2:     'republic of the congo'
3:     'congo, republic of the'
4:     'congo, republic'
5: 'democratic republic of the congo'
6: 'congo, democratic republic of the'
7: 'dem rep of the congo'
8: 
Read 7 items
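
For scripts, where there is no console to paste into, scan also accepts a text argument. The following is a sketch under that assumption; the variable names `nums` and `countries` are illustrative, and `sep = "\n"` is one way to keep multi-word entries whole without adding quotes:

```r
# Numeric vector straight from a pasted string
nums <- scan(text = "67 75 44 25 99 37 6 96 77 21 31 41 5 52 13 46 14 70 100 18",
             quiet = TRUE)

# Character vector: one entry per line, so embedded spaces and commas survive
countries <- scan(text = "republic of congo
congo, republic of the
democratic republic of the congo",
                  what = "character", sep = "\n", quiet = TRUE)

length(nums)
countries
```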

IRTFM answered Oct 14 '22 01:10


You can also ask the questioner to use the dput function, which dumps any data structure in a form that can be copy-pasted straight into R. For example:

> zz
  a  b   c
1 1 11 foo
2 2 12 bar
3 3 13 baz
4 4 14 bar
5 5 15 foo

> dput(zz)
structure(list(a = 1:5, b = 11:15, c = structure(c(3L, 1L, 2L, 
1L, 3L), .Label = c("bar", "baz", "foo"), class = "factor")), .Names = c("a", 
"b", "c"), class = "data.frame", row.names = c(NA, -5L))

> xx <- structure(list(a = 1:5, b = 11:15, c = structure(c(3L, 1L, 2L, 
+ 1L, 3L), .Label = c("bar", "baz", "foo"), class = "factor")), .Names = c("a", 
+ "b", "c"), class = "data.frame", row.names = c(NA, -5L))
> xx
  a  b   c
1 1 11 foo
2 2 12 bar
3 3 13 baz
4 4 14 bar
5 5 15 foo
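
The round trip can also be checked programmatically. This sketch captures the dput output as text and evaluates it again, which mimics copy-pasting it into another session; `txt` is a hypothetical name, and the data frame is rebuilt here with `stringsAsFactors = TRUE` only to match the factor column shown above:

```r
# Rebuild the example data frame (factor column, as in the printout above)
zz <- data.frame(a = 1:5, b = 11:15,
                 c = c("foo", "bar", "baz", "bar", "foo"),
                 stringsAsFactors = TRUE)

# Capture what dput() would print, as a single string
txt <- paste(capture.output(dput(zz)), collapse = "\n")

# Evaluating that string stands in for pasting it into a fresh session
xx <- eval(parse(text = txt))

all.equal(zz, xx)
```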

huon answered Oct 14 '22 02:10