So, I downloaded a dataset containing 900 txt files, one for each biological sample. What I want to do is merge all of this data into one data matrix in R.
txt_files = list.files()
# read each txt file in turn (x is overwritten on every iteration)
for (i in seq_along(txt_files)){
x <- read.table(file=txt_files[i], sep="\t", header=TRUE, row.names=1)
}
All files are in one folder, so I use list.files() to query all file names. Then I want to read each table into a separate R object (which is called x in this case). The problem is that I would like to name each object after the name of the actual file instead of x.
I've tried a couple of things and tried to search the internet, but haven't found a solution yet. One thing I did find was to use lapply to import them all into a data list.
data_list = lapply(txt_files, read.table, sep = "\t")
However, I don't think this will be appropriate for me, since the data matrices are not available as separate objects anymore after this. I hope someone can help me.
Giving connected (especially sequential) things individual names is in general a bad idea. The next thing you'll want to do is loop over these things, and that means constructing names by pasting bits together. It's a mess.
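To make the contrast concrete, here is a minimal sketch. The file names and the toy data frames are made up for illustration and stand in for the real tables; the point is only how each approach is accessed afterwards.

```r
# Hypothetical file names, with toy data standing in for the real tables
files <- c("f1.txt", "f2.txt", "f3.txt")

# One object per file: names have to be built and looked up as strings
for (f in files) {
  assign(f, data.frame(V1 = 1, V2 = 2, V3 = 3))
}
for (f in files) {
  print(nrow(get(f)))   # every access has to go through get()
}

# Same data in a list: no name pasting, plain indexing
data_list <- lapply(files, function(f) data.frame(V1 = 1, V2 = 2, V3 = 3))
n_rows <- sapply(data_list, nrow)
```

With the list, any function can be applied to all tables in one call, whereas the separate objects need assign() and get() at every step.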
Store things in a list whenever possible. You've done that. I created a few CSV files:
> txt_files=c("f1.txt","f2.txt","f3.txt","f4.txt","f5.txt")
> data_list = lapply(txt_files, read.table, sep = ",")
> data_list[[1]]
V1 V2 V3
1 1 2 3
> data_list[[3]]
V1 V2 V3
1 1 2 3
2 5 4 3
3 1 2 3
So now I can loop over them with for(i in 1:length(txt_files)) and get the name of the file with txt_files[i] and so on:
> for(i in 1:length(txt_files)){
+ cat("File is ",txt_files[i],"\n")
+ print(summary(data_list[[i]]))
+ }
File is f1.txt
V1 V2 V3
Min. :1 Min. :2 Min. :3
1st Qu.:1 1st Qu.:2 1st Qu.:3
Median :1 Median :2 Median :3
Mean :1 Mean :2 Mean :3
3rd Qu.:1 3rd Qu.:2 3rd Qu.:3
Max. :1 Max. :2 Max. :3
File is f2.txt
V1 V2 V3
Min. :1 Min. :2 Min. :3
1st Qu.:1 1st Qu.:2 1st Qu.:3
Median :1 Median :2 Median :3
Mean :1 Mean :2 Mean :3
3rd Qu.:1 3rd Qu.:2 3rd Qu.:3
Max. :1 Max. :2 Max. :3
...
[etc]
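Since the stated goal is a single data matrix, the list can also be combined in one step. The following is only a sketch under assumptions not given in the question: every file is tab-separated with a header, holds one value column, and has the same row names in the same order. The sample names, the temporary folder, and the toy values are hypothetical.

```r
# Toy setup: three tab-separated files in a temporary folder (hypothetical data)
tmp <- tempfile("samples_")
dir.create(tmp)
for (s in c("s1", "s2", "s3")) {
  df <- data.frame(gene = c("g1", "g2"), value = c(1, 2))
  write.table(df, file.path(tmp, paste0(s, ".txt")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

txt_files <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)
data_list <- lapply(txt_files, read.table,
                    sep = "\t", header = TRUE, row.names = 1)

# Assumes every file has the same row names in the same order
data_matrix <- as.matrix(do.call(cbind, data_list))

# Label each column with the file it came from, minus the extension
colnames(data_matrix) <- sub("\\.txt$", "", basename(txt_files))
```

If the files do not share identical row names, cbind is not safe and a merge on the row names would be needed instead.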
You can do something like this:
names(data_list) <- txt_files
Or perhaps:
names(data_list) <- basename(txt_files)
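Or maybe use sapply instead of lapply, which names the result after txt_files automatically (pass simplify = FALSE to be sure you get a list back).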
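As a rough sketch of the named-list idea, assuming comma-separated files without a header as in the example above (the temporary folder and file contents here are made up):

```r
# Toy setup: two comma-separated files in a temporary folder (hypothetical data)
tmp <- tempfile("named_")
dir.create(tmp)
writeLines("1,2,3", file.path(tmp, "f1.txt"))
writeLines(c("1,2,3", "5,4,3"), file.path(tmp, "f2.txt"))

txt_files <- list.files(tmp, full.names = TRUE)

# simplify = FALSE keeps the result a list; USE.NAMES = TRUE (the default)
# names each element after the corresponding entry of txt_files
data_list <- sapply(txt_files, read.table, sep = ",", simplify = FALSE)

# Drop the folder part so the names are just the file names
names(data_list) <- basename(txt_files)

# Look a table up by its file name
data_list[["f2.txt"]]
```

Each table is then reachable by file name, which gives the effect of one named object per file without creating 900 separate variables.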