
Recreate vector from print() console output

Tags:

r

Regrettably often you see questions on SO that present data in a format that's not reproducible; often just the copied result of print() ...

set.seed(1)

x <- sample(LETTERS, 40, replace = TRUE)
y <- rnorm(20)

... such as this:

x
 [1] "G" "J" "O" "X" "F" "X" "Y" "R" "Q" "B" "F" "E" "R" "J" "U" "M" "S"
[18] "Z" "J" "U" "Y" "F" "Q" "D" "G" "K" "A" "J" "W" "I" "M" "P" "M" "E"
[35] "V" "R" "U" "C" "S" "K"

... or this:

y
 [1]  0.91897737  0.78213630  0.07456498 -1.98935170  0.61982575
 [6] -0.05612874 -0.15579551 -1.47075238 -0.47815006  0.41794156
[11]  1.35867955 -0.10278773  0.38767161 -0.05380504 -1.37705956
[16] -0.41499456 -0.39428995 -0.05931340  1.10002537  0.76317575

Ideally I'd like to be able to copy, for example, the text from the chunk above to my clipboard, and call some function foo() such that all.equal(foo(), x) for discrete data types, and all(near(foo(), y)) for floats (given the printed accuracy).

Is there an easy way to (approximately) reconstruct a simple vector from the copied result of print()'ing it?


Edit: Ironically, I realized that my own example wasn't fully reproducible. Here's the code to create the copied print output:
y_printed <- capture.output(y)
Mikko Marttila asked Jul 24 '18 08:07


2 Answers

I use scan() for this problem.

Can you make a function out of the code below?

y <-
  '[1]  0.91897737  0.78213630  0.07456498 -1.98935170  0.61982575
 [6] -0.05612874 -0.15579551 -1.47075238 -0.47815006  0.41794156
[11]  1.35867955 -0.10278773  0.38767161 -0.05380504 -1.37705956
[16] -0.41499456 -0.39428995 -0.05931340  1.10002537  0.76317575'

y <- scan(what = character(), text = y)
y <- sub("^\\s*\\[\\d+\\]", "", y)
y <- as.numeric(y[y != ""])

Following the suggestion in the comment by @Moody_Mudskipper,

The pattern can be updated to "^\\s*\\[\\d+\\]" to support the OP's example (which starts with a space).

a function could be

recreateVector <- function(X, numeric = TRUE, quiet = FALSE){
  X <- scan(what = character(), text = X, quiet = quiet)
  X <- sub("^\\s*\\[\\d+\\]", "", X)
  X <- X[X != ""]
  if(numeric) X <- as.numeric(X)
  X
}


recreateVector(y)   # Use the original y
#Read 24 items
# [1]  0.91897737  0.78213630  0.07456498 -1.98935170  0.61982575
# [6] -0.05612874 -0.15579551 -1.47075238 -0.47815006  0.41794156
#[11]  1.35867955 -0.10278773  0.38767161 -0.05380504 -1.37705956
#[16] -0.41499456 -0.39428995 -0.05931340  1.10002537  0.76317575

For a character vector, set the argument numeric = FALSE (the default is TRUE).

x <-
'[1] "G" "J" "O" "X" "F" "X" "Y" "R" "Q" "B" "F" "E" "R" "J" "U" "M" "S"
[18] "Z" "J" "U" "Y" "F" "Q" "D" "G" "K" "A" "J" "W" "I" "M" "P" "M" "E"
[35] "V" "R" "U" "C" "S" "K"'

recreateVector(x, numeric = FALSE)
#Read 43 items
# [1] "G" "J" "O" "X" "F" "X" "Y" "R" "Q" "B" "F" "E" "R" "J" "U"
#[16] "M" "S" "Z" "J" "U" "Y" "F" "Q" "D" "G" "K" "A" "J" "W" "I"
#[31] "M" "P" "M" "E" "V" "R" "U" "C" "S" "K"

Note the argument quiet. I have set its default to FALSE, as in the definition of scan(), because I prefer to see whether anything was actually read in.
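
As a rough check of the round trip the question asks for, the sketch below prints the example vectors with capture.output(), feeds the text back through the function, and compares the result with the originals. The function is repeated so the snippet runs on its own. The names x_printed, y_printed, x_back, y_back, x_ok and y_ok, and the tolerance of 1e-6, are illustrative choices; the tolerance is picked to sit comfortably above the rounding that print() applies with its default of 7 significant digits.

```r
# Same parsing steps as recreateVector() above, repeated so this runs standalone
recreateVector <- function(X, numeric = TRUE, quiet = FALSE) {
  X <- scan(what = character(), text = X, quiet = quiet)
  X <- sub("^\\s*\\[\\d+\\]", "", X)   # blank out index markers such as [1]
  X <- X[X != ""]                      # drop the now-empty tokens
  if (numeric) X <- as.numeric(X)
  X
}

set.seed(1)
x <- sample(LETTERS, 40, replace = TRUE)
y <- rnorm(20)

# capture.output() returns one string per printed line; join them with newlines
x_printed <- paste(capture.output(print(x)), collapse = "\n")
y_printed <- paste(capture.output(print(y)), collapse = "\n")

x_back <- recreateVector(x_printed, numeric = FALSE, quiet = TRUE)
y_back <- recreateVector(y_printed, quiet = TRUE)

# Character data should come back exactly
x_ok <- identical(x_back, x)
# Doubles only come back to the printed precision, so compare with a tolerance
y_ok <- isTRUE(all.equal(y_back, y, tolerance = 1e-6))
```

Both flags should come out TRUE. A much tighter tolerance may fail for y, because the printed text carries only the displayed digits.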

Rui Barradas answered Oct 17 '22 04:10


We can mimic the data-type guessing that is done when reading CSV files:

library(tidyverse)
unprint <- function(s) {
  s %>% str_replace_all(" *\\[\\d+\\] *","") %>% str_replace_all(" +","\n") %>% 
  textConnection %>% read.table
}
unprint(' [1]  0.91897737  0.78213630  0.07456498 -1.98935170  0.61982575
 [6] -0.05612874 -0.15579551 -1.47075238 -0.47815006  0.41794156
[11]  1.35867955 -0.10278773  0.38767161 -0.05380504 -1.37705956
[16] -0.41499456 -0.39428995 -0.05931340  1.10002537  0.76317575') %>% head

#           V1
#1  0.91897737
#2  0.78213630
#3  0.07456498
#4 -1.98935170
#5  0.61982575
#6 -0.05612874


unprint(' [1] "G" "J" "O" "X" "F" "X" "Y" "R" "Q" "B" "F" "E" "R" "J" "U" "M" "S"
[18] "Z" "J" "U" "Y" "F" "Q" "D" "G" "K" "A" "J" "W" "I" "M" "P" "M" "E"
[35] "V" "R" "U" "C" "S" "K"') %>% head

#  V1
#1  G
#2  J
#3  O
#4  X
#5  F
#6  X

A more elaborate version that handles brackets inside strings. It also returns the right kind of output: a vector, not a data frame.

unprint <- function(s) {
  t <- s %>% textConnection %>% readLines %>% 
    str_replace(" *\\[\\d+\\] *","") %>%
    paste(collapse=' ') %>% str_replace_all(" ","\n") %>% 
    textConnection %>% read.table(stringsAsFactors=FALSE) 
  t$V1 %>% str_replace_all("\n"," ")
}

x <- unprint(' [1] "x + y  [1]" "x + z  [2]"')
x
#[1] "x + y  [1]" "x + z  [2]"
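
For comparison, the same type guess can be sketched in base R, without the tidyverse: scan() splits the text into tokens while keeping quoted strings whole, and type.convert() then picks a type much as read.table() does for a column. unprint_base is a hypothetical name, and the sketch assumes the input is the print() output of an unnamed atomic vector whose strings contain no escaped quotes. Because scan() strips the quotes before the index markers are filtered out, an element that is exactly the string "[1]" would be dropped as well.

```r
# Hypothetical base-R variant; assumes print() output of an unnamed atomic vector
unprint_base <- function(s) {
  # Split on whitespace; quoted strings stay whole and lose their quotes
  tokens <- scan(text = s, what = character(), quiet = TRUE)
  # Drop tokens that consist only of an index marker such as [1] or [18]
  tokens <- tokens[!grepl("^\\[[0-9]+\\]$", tokens)]
  # Guess the type (logical, integer, double, ...), falling back to character
  type.convert(tokens, as.is = TRUE)
}

# Strings containing brackets and spaces survive as single elements
unprint_base(' [1] "x + y  [1]" "x + z  [2]"')

# Numeric output spread over several printed lines comes back as a numeric vector
unprint_base("[1] 1.5 2.5\n[3] -3.0")
```

The result is a plain vector in both cases, so no column has to be extracted from a data frame afterwards.
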

Nicolas2 answered Oct 17 '22 03:10