R - Removing the first and last character of every factor in a data.table

Tags: r, data.table

I'm new to R and have a quick question: what is the best way to delete the first and last character of each "cell" in a data.table? I imported the data from a .txt file that uses a three-character separator, "^|^". Reading it with sep = "|" leaves a "^" at each end of every value:

DT <- fread("file.txt", header = TRUE, sep = "|")

  Row     Conc   group
  ^1^     ^2.5^    ^A^
  ^2^     ^3.0^    ^A^
  ^3^     ^4.6^    ^B^
  ^4^     ^5.0^    ^B^
  ^5^     ^3.2^    ^C^
  ^6^     ^4.2^    ^C^
  ^7^     ^5.3^    ^D^
  ^8^     ^3.4^    ^D^ 

I am able to remove the "^"s column by column using the stringi package:

DT[, Row := stri_sub(Row,2,-2)]    

This converts the column to character, which is fine. However, the data.table I am using has 46 columns, so I am looking for a more efficient way to do this for all of them at once.

Tsvetan Nikolov asked Dec 07 '22 21:12


1 Answer

To continue your approach, you can apply stri_sub to every column at once:

library(data.table)
library(stringi)

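cols <- names(df)  # every column; df is the imported table (DT in the question)
# Drop the first and last character of each column, updating by reference
setDT(df)[, (cols) := lapply(.SD, function(x) stri_sub(x, 2, -2))]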
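
For a quick check, here is a minimal reproducible sketch of the same idea. The three-row table is a hypothetical sample copied from the first rows of the question, built directly with `data.table()` because the original file is not available. It also passes `.SDcols` explicitly, which is optional here but makes it easy to restrict the operation to a subset of columns.

```r
library(data.table)
library(stringi)

# Hypothetical sample mirroring the first rows of the question's data
DT <- data.table(
  Row   = c("^1^", "^2^", "^3^"),
  Conc  = c("^2.5^", "^3.0^", "^4.6^"),
  group = c("^A^", "^A^", "^B^")
)

cols <- names(DT)

# Keep characters 2 through second-to-last in every listed column, in place
DT[, (cols) := lapply(.SD, function(x) stri_sub(x, 2, -2)), .SDcols = cols]

print(DT)
```

All columns remain character after this step, including the ones that hold numbers.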

If you want to convert columns containing numbers to an appropriate type, you can use the code provided by @Frank in the comments:

# as.is = TRUE keeps text columns as character instead of converting them to factors
setDT(df)[, (cols) := lapply(.SD, function(x) type.convert(stri_sub(x, 2, -2), as.is = TRUE))]
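
To see what the conversion does, the sketch below runs it on a small hypothetical sample (again copied from the first rows of the question) and inspects the resulting column classes. It assumes `as.is = TRUE` is wanted, so text columns stay character; use `as.is = FALSE` if you prefer factors.

```r
library(data.table)
library(stringi)

# Hypothetical sample mirroring the first rows of the question's data
df <- data.frame(
  Row   = c("^1^", "^2^", "^3^"),
  Conc  = c("^2.5^", "^3.0^", "^4.6^"),
  group = c("^A^", "^A^", "^B^"),
  stringsAsFactors = FALSE
)

cols <- names(df)

# Strip the surrounding "^" and let type.convert pick a type for each column
setDT(df)[, (cols) := lapply(.SD, function(x) {
  type.convert(stri_sub(x, 2, -2), as.is = TRUE)
})]

# Inspect the class chosen for each column
print(sapply(df, class))
```

On this sample, Row comes back as integer, Conc as numeric, and group stays character.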
Sumedh answered Dec 11 '22 11:12