Read in .xrdml data within a complex array

I'm trying to read several files of type ".xrdml" and combine them into a single data frame with intuitive column labels. The problem is that this file type carries a large amount of metadata.

I've tried the following.

Required package:

library(rxylib)

Things I tried:

temp = list.files(pattern="*.xrdml")
xyz<-do.call(rbind,sapply(temp, read_xyData,verbose = TRUE,metaData = FALSE))

I ended up with a list. I can access each member of the list with, for example, xyz[[2]]:

          2Theta    V2
   [1,]  4.006565  3496
   [2,]  4.019695  3417
   [3,]  4.032826  3520
   [4,]  4.045956  3516
   [5,]  4.059086  3480
   [6,]  4.072217  3343
   [7,]  4.085347  3466
   [8,]  4.098477  3552
   [9,]  4.111607  3425
  [10,]  4.124738  3384

If I try to flatten the list with the unlist function, the result becomes messy.

What I would like to do is read in all the files and combine them by column. Every file has the same first column, i.e. 2Theta. I would also like to use the unique part of each file name to label V2.

My files have names like "BBHD-FASS_4-70_step01_40s_LM 11_5.xrdml". In the end, I hope to have a data frame similar to the sample below:

2Theta   LM 6-26  LM 6-27  LM 6-28 LM 4-10 LM 4-11 LM 4-12
4.006565    3576    3535    3677    3576    3535    3677
4.019695    3526    3552    3662    3526    3552    3662
4.032826    3584    3581    3657    3584    3581    3657
4.045956    3489    3535    3539    3489    3535    3539
4.059086    3496    3507    3525    3496    3507    3525
4.072217    3335    3466    3628    3335    3466    3628
4.085347    3353    3456    3444    3353    3456    3444
4.098477    3430    3479    3588    3430    3479    3588
4.111607    3334    3547    3535    3334    3547    3535
4.124738    3424    3342    3439    3424    3342    3439
4.137868    3349    3384    3459    3349    3384    3459
4.150998    3318    3395    3413    3318    3395    3413
4.164129    3208    3490    3457    3208    3490    3457
4.177259    3357    3295    3519    3357    3295    3519
4.190389    3254    3372    3450    3254    3372    3450

Here are samples of my files: sample files

Sadly, I've already spent a lot of time trying several things that didn't work.

I'd be very grateful for any help or guidance on how to approach this problem.

asked Sep 02 '18 05:09 by Hammao


1 Answer

To get to the data, you need to find the correct position in the list that read_xyData returns. You can do this by inspecting str(lst), where lst is the list of results built below. The data itself is at ...$dataset[[1]]$data_block. (There may be extractor functions in the package, but I have not checked.)
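As a minimal sketch of that inspection step, the snippet below runs str() on a hand-built mock object instead of a real file. The nesting of `mock` is an assumption modeled on the accessor above, not the package's documented return structure, and the two rows of numbers are copied from the question's output.

```r
# Mock object (assumption): mimics the nesting implied by $dataset[[1]]$data_block
blk <- matrix(c(4.006565, 4.019695, 3496, 3417), ncol = 2,
              dimnames = list(NULL, c("2Theta", "V2")))
mock <- list(dataset = list(list(data_block = blk)))

# str() prints the nesting, which shows where the numeric matrix sits
str(mock, max.level = 3)

# Follow the path that str() revealed
xy <- mock$dataset[[1]]$data_block
print(xy)
```

With a real file, the same str() call on one element of lst should show whether the path is the same for your data.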

# download data: the original link is dead, so `temp` must point to your own zip archive of the .xrdml files
#download.file("https://ucc93bf0aa50821e11b95c9530f5.dl.dropboxusercontent.com/zip_by_token_key?_download_id=9101556320431172280658295109635067362614982268430911643523348&_notify_domain=www.dropbox.com&dl=1&key=AV5mxk0trnetzASlH9_xJijTiGE55mUz0qa-x7JveZ7-Rdp3Z8i7GmwwQoWj8tUO14RKj51huhb5CuBdoxAC3WLuHvOMr7_bul691AmGpmwZgWWy0STezjFRnq0CVUR-iHNnZUHk9-t-i72nYODDpjXvo0PBhWTXwJuNWCSL4bnAauZREQtZwzNlspMF8PwZ37E9enf1WUUakLJwE43GbV2lAkuOTDghfcMmwokulIMEGA", destfile=temp<-tempfile())
unzip(temp, exdir=xdir<-tempdir())  

nms <- list.files(xdir, pattern="xrdml", full.names=TRUE)
# grab the labels used to name the columns later
cnms <- gsub(".*(LM \\w+).*$", "\\1", basename(nms))


library(rxylib)

# loop through files to read in
lst <- lapply(nms, read_xyData, verbose = TRUE, metaData = FALSE)

# grab the data
dats <- lapply(lst, function(x) x$dataset[[1]]$data_block)

# rename second column
dats <- lapply(seq_along(dats), function(x) {
                          colnames(dats[[x]])[2] <- cnms[x] ; dats[[x]]})

# merge
alldat <- Reduce(function(...) merge(..., by="2Theta"), dats)
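The renaming and merging steps can be tried without any .xrdml files. The sketch below uses two small made-up matrices and two hypothetical file names that follow the pattern quoted in the question. `make_block`, `files`, and the count values are illustrative only, and Map() is used here as an alternative to the seq_along() loop above.

```r
# Hypothetical file names following the pattern quoted in the question
files <- c("BBHD-FASS_4-70_step01_40s_LM 11_5.xrdml",
           "BBHD-FASS_4-70_step01_40s_LM 11_6.xrdml")

# Same regular expression as above: keep "LM " plus the word characters after it
labels <- gsub(".*(LM \\w+).*$", "\\1", files)
print(labels)  # "LM 11_5" "LM 11_6"

# Toy stand-ins for the data_block matrices (values are illustrative)
theta <- c(4.006565, 4.019695, 4.032826)
make_block <- function(counts) {
  matrix(c(theta, counts), ncol = 2,
         dimnames = list(NULL, c("2Theta", "V2")))
}
blocks <- list(make_block(c(3496, 3417, 3520)),
               make_block(c(3576, 3526, 3584)))

# Pair each matrix with its label and rename the second column
renamed <- Map(function(m, nm) { colnames(m)[2] <- nm; m }, blocks, labels)

# Merge pairwise on the shared 2Theta column
combined <- Reduce(function(x, y) merge(x, y, by = "2Theta"), renamed)
print(combined)
```

Two caveats: merge() keeps only the 2Theta values present in every file unless all = TRUE is passed, and \\w does not match a hyphen, so a name such as "LM 6-26" would be cut to "LM 6" unless the pattern is extended.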
answered Sep 28 '22 18:09 by user20650