
How to import a text file into R as a character vector

I would like to know whether R already has a simple command that imports a text file (.txt) into a character vector, one character per element.

The file might contain English text such as "Hello my name is Fagui Curtain", and the output in R would be a character vector A such that A[1]<-"H", A[2]<-"e", A[3]<-"l", etc.

I've tried the scan function, but it returns whole words: A[1]<-"Hello", A[2]<-"my", and so on.

I googled for my question but couldn't find anything useful.

Thanks

Fagui Curtain asked Dec 25 '22 20:12


2 Answers

Try strsplit after removing the whitespace with gsub (lines is defined under "data" at the end of this answer):

A <- strsplit(gsub('\\s+', '', lines),'')[[1]]
A
#[1] "H" "e" "l" "l" "o" "m" "y" "n" "a" "m" "e" "i" "s" "F" "a" "g" "u" "i" "C"
#[20] "u" "r" "t" "a" "i" "n"

Or

library(stringi)
stri_extract_all_regex(lines, '\\w')[[1]]
#[1] "H" "e" "l" "l" "o" "m" "y" "n" "a" "m" "e" "i" "s" "F" "a" "g" "u" "i" "C"
#[20] "u" "r" "t" "a" "i" "n"

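Or, if you are using Linux, scan can be piped with awk. This relies on awk treating an empty FS as "one field per character", which gawk does but POSIX does not guarantee: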

scan(pipe("awk 'BEGIN{FS=\"\";OFS=\" \"}{$1=$1}1' file.txt"), 
                  what='', quiet=TRUE)
#[1] "H" "e" "l" "l" "o" "m" "y" "n" "a" "m" "e" "i" "s" "F" "a" "g" "u" "i" "C"
#[20] "u" "r" "t" "a" "i" "n"

data

lines <- readLines('file.txt')
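
The strsplit approach above takes only the first line of the file because of the [[1]]. As a minimal, self-contained sketch (not part of the original answer), the same idea can be extended to a file with several lines by flattening the per-line results with unlist. The temporary file and the name chars are hypothetical, used only for illustration.

```r
# Hypothetical two-line sample file, created only for this illustration
path <- tempfile(fileext = ".txt")
writeLines(c("Hello my name is", "Fagui Curtain"), path)

# One element per line of the file
lines <- readLines(path)

# Remove all whitespace from each line, split each line into single
# characters, then flatten the per-line list into one character vector
chars <- unlist(strsplit(gsub("\\s+", "", lines), ""))

print(chars)
```

Using unlist instead of [[1]] means no line is silently dropped. If spaces should be kept, skip the gsub step.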
akrun answered Jan 13 '23 14:01


An alternative solution uses the stringr package (I like it because it produces very readable syntax).

Contents of sample_text.txt

Hello my name is Fagui Curtain

File reading

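library(stringr)
# pattern = "" splits into single characters (spaces included);
# pattern = " " would split into words instead
testVector <- str_split(readLines("sample_text.txt"), pattern = "")[[1]]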
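
For comparison, here is a base-R sketch that needs no extra package. It swaps in a different technique: readChar reads the whole file as one string, which is then split into characters. It also shows how to keep or drop spaces. The temporary file and the names txt, with_spaces and without_spaces are assumptions made for this example.

```r
# Hypothetical sample file, created only for this illustration
path <- tempfile(fileext = ".txt")
writeLines("Hello my name is Fagui Curtain", path)

# Read the whole file as a single string; the file size in bytes is an
# upper bound on the number of characters to read
txt <- readChar(path, nchars = file.size(path))

# Drop the trailing line ending, then split into single characters
txt <- sub("[\r\n]+$", "", txt)
with_spaces <- strsplit(txt, "")[[1]]

# Filter out the spaces if only the letters are wanted
without_spaces <- with_spaces[with_spaces != " "]

print(with_spaces)
print(without_spaces)
```

Whether spaces belong in the result depends on what the character vector is used for. The question's example output does not say either way.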
Konrad answered Jan 13 '23 15:01