
Extracting a string from a data frame

Tags: dataframe, r

I have a data frame in R that looks like this:

head(span_data)   

FECHA....DIA.Cá01TMax.Cá01HTMax.Cá01TMin.Cá01HTMin.Cá01TMed.Cá01HumMax.Cá01HumMin.Cá01HumMed.Cá01VelViento.Cá01DirViento.Cá01Rad.Cá01Precip.Cá01ETo
1 -------- --- -------- --------- -------- --------- -------- ---------- ---------- ---   ------- ------------- ------------- ------- ---------- ------- 
2  21-05-12 142     21.0     15:08      9.1      5:28     15.3       91.9       45.2        72.3           2.2         270.2    30.0        0.0    4.81
3  20-05-12 141     19.1     15:12     11.3      4:50     14.6       94.9       46.6       74.4           2.6         273.0    23.2       12.6     4.0
4  19-05-12 140     22.6     14:26     14.8     23:50     18.5       92.6       36.3       66.5           3.7         250.1    24.9        0.4    5.29
5  18-05-12 139     23.4     14:30     17.2     23:58     19.4       87.4       55.5       72.0           3.1         218.5    24.2        0.0    4.75
6  17-05-12 138     31.2     13:08     13.9      5:32     22.4       78.5       26.7       51.0           2.3         164.9    23.6        0.0    6.36

Right now, each row is a single long string, which I will later convert to numbers. However, when I extract one of the rows, I get:

span_data[3,1]
[1] 20-05-12 141     19.1     15:12     11.3      4:50     14.6       94.9       46.6       74.4           2.6         273.0    23.2       12.6     4.0
4272 Levels: -------- --- -------- --------- -------- --------- -------- ---------- ---------- ---------- ------------- ------------- ------- ---------- -------  ...

I don't want the "Levels" part. How do I extract just the string? (I'm sure this question has been answered before, but I just didn't know exactly how to pose the question.)

ChunkyRice asked May 27 '12 13:05




3 Answers

You could try:

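# Since R 4.0.0, strings are no longer converted to factors by default,
# so ask for it explicitly to reproduce the situation in the question
a <- c("1-1", "2-1", "3-1")
b <- 1:3
ab <- data.frame(a, b, stringsAsFactors = TRUE)

x <- ab[3, 1] # what you don't want
x
#[1] 3-1
#Levels: 1-1 2-1 3-1

z <- as.character(ab[3, 1]) # no levels, because it is no longer a factor
z
#[1] "3-1"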
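
If every row is needed as a plain string, it may be simpler to convert the whole column once rather than wrapping each lookup in as.character(). The sketch below uses a made-up two-row data frame (df, rows and f are hypothetical names, not from the question) and also shows a pitfall that matters if the values are later turned into numbers:

```r
# Hypothetical stand-in for span_data: one factor column holding whole rows
rows <- c("20-05-12 141 19.1", "19-05-12 140 22.6")
df <- data.frame(line = rows, stringsAsFactors = TRUE)
class(df$line)                 # "factor", which is why "Levels:" is printed

# Convert the whole column once, so indexing returns plain strings
df$line <- as.character(df$line)
first_row <- df[1, 1]          # a plain character string, no levels

# Pitfall when going on to numbers: as.numeric() on a factor returns
# the internal level codes, not the values shown on screen
f <- factor(c("19.1", "22.6"))
codes  <- as.numeric(f)                 # 1 2
values <- as.numeric(as.character(f))   # 19.1 22.6
```

Going through as.character() first is the usual way to recover the printed values from a factor.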
user1317221_G answered Oct 21 '22 03:10


You could try this:

 # text= expects a character vector, so convert the factor column first
 newdat <- read.table(text=as.character(span_data[[1]]), stringsAsFactors=FALSE)
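
As a rough illustration of that idea, here is a minimal sketch on a made-up sample. It assumes the single column is a factor, as in the question, and that the dashed separator row has to be dropped before parsing; span_small, txt and the four column names are hypothetical and cover only the first few fields:

```r
# Hypothetical sample mimicking span_data: a dashed row plus two data rows
lines <- c("-------- --- -------- ---------",
           "21-05-12 142     21.0     15:08",
           "20-05-12 141     19.1     15:12")
span_small <- data.frame(raw = lines, stringsAsFactors = TRUE)

txt <- as.character(span_small[[1]])        # factor -> character vector
txt <- txt[!grepl("^[[:space:]]*-", txt)]   # drop the dashed separator row

# Split each string on whitespace and let read.table guess column types
newdat <- read.table(text = txt, stringsAsFactors = FALSE,
                     col.names = c("FECHA", "DIA", "TMax", "HTMax"))
str(newdat)
```

With the separator row removed, the numeric fields should come back as numbers, while the date and time fields stay as character strings.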
IRTFM answered Oct 21 '22 04:10


It would be best to skip the first 3 (?) lines, and provide the appropriate colClasses argument to read.table. But you can get what you want from what you have quite easily:

as.character(span_data[3,1])
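
The first suggestion, re-reading the data with skipped lines and explicit column types, might look roughly like the sketch below. The original file is not shown in the question, so the raw text, the skip count and the four column names here are all assumptions for illustration; with a real file, the text = argument would be replaced by the file path:

```r
# Hypothetical raw contents: a header line, a dashed line, then data rows
raw <- "FECHA DIA TMax HTMax
-------- --- -------- ---------
21-05-12 142     21.0     15:08
20-05-12 141     19.1     15:12"

# Skip the two non-data lines and declare each column's type up front
dat <- read.table(text = raw, skip = 2,
                  col.names = c("FECHA", "DIA", "TMax", "HTMax"),
                  colClasses = c("character", "integer",
                                 "numeric", "character"))
str(dat)
```

Declaring colClasses avoids factors entirely, so no conversion is needed afterwards.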
Matthew Lundberg answered Oct 21 '22 04:10