 

Extract first three columns from all tsv files in a folder

Tags: linux, bash, csv

I have several tsv files in a folder that add up to over 50 GB in total. To reduce memory usage when loading these files into R, I want to extract only the first 3 columns of each file.

How can I extract these columns from all of the files at once in the terminal? I am running Ubuntu 16.04.

Keshav M asked Feb 08 '18 11:02



1 Answer

This looks like a perfect use case for the cut utility.

You can use it as follows:

cut -d$'\t' -f 1-3 folder/*

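Here -d specifies the field delimiter (a tab, which is also cut's default, so the option can be omitted), -f specifies the fields to extract, and folder/* is a glob matching all files to be parsed. Note that cut writes the result for all files to standard output as one concatenated stream.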
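If each input file should get its own trimmed copy rather than one combined stream, a loop around cut is one way to do it. The following is a minimal sketch, not the only approach: the function name extract_first3, the .tsv extension filter, and the output directory are assumptions for illustration.

```shell
# Sketch: copy the first three tab-separated columns of every .tsv file
# in one directory into same-named files in another directory.
# extract_first3 is a hypothetical helper name; adjust the paths as needed.
extract_first3() {
    local in_dir="$1" out_dir="$2" f
    mkdir -p "$out_dir"
    for f in "$in_dir"/*.tsv; do
        # If the glob matched nothing, "$f" is the literal pattern; skip it.
        [ -e "$f" ] || continue
        # Tab is cut's default delimiter, so -d is not needed here.
        cut -f 1-3 "$f" > "$out_dir/$(basename "$f")"
    done
}
```

It could be called as extract_first3 folder folder_first3, which leaves the originals untouched. Because cut streams its input line by line, the size of the files should not be a memory concern.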

Tobias Ribizel answered Sep 19 '22 17:09
