I am trying to extract a specific field from a specific line of a CSV file.
I can do it by row number, but the row number sometimes changes from file to file, so that approach is not very flexible.
Instead, I would like to select the line by the name that precedes the field I'm interested in.
[Header]
File,5
Researcher Name,Joe Black
Experiment,Illumina-Project
Date,05/02/2021
Pipeline,RNA_Pipeline
In this case, I want to extract the researcher and the Experiment Name from the CSV file:
Joe Black Illumina-Project
The following works but as I said it is not as flexible:
awk -F',' 'NR == 3 { print $2 }' test.csv
So I tried something like the following, based on what I've found, but without success:
awk -F',' 'Line == "Researcher Name" { print $1 }' test.csv
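You could use awk for this. Change $2 to whichever column you want. Note that this separator does not parse quoted fields: echo '1,"2,3,4,5",6' | awk -F "\"*,\"*" '{print $2}' prints 2 rather than 2,3,4,5, because the regular expression still splits on the commas inside the quotes.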
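To make the behaviour of that field separator concrete, here is a small sketch that counts the fields awk actually sees. The variable names (`line`, `second`, `count`) are only illustrative.

```shell
# The sample record from the snippet above: the middle field is quoted
# and contains commas of its own.
line='1,"2,3,4,5",6'

# FS is the regex "*,"* : a comma with optional surrounding double quotes.
# It matches every comma, including the ones inside the quoted field.
second=$(printf '%s\n' "$line" | awk -F '"*,"*' '{ print $2 }')
count=$(printf '%s\n' "$line" | awk -F '"*,"*' '{ print NF }')

echo "second field: $second"
echo "field count:  $count"
```

The record is split into six fields rather than three, so this separator only helps when quoted fields never contain commas.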
Make a list of the columns to extract. Use the read_csv() method to load the CSV file into a data frame. Print the extracted data. Plot the data frame using the plot() method.
Printing specific lines from a file is no exception. To display the 13th line, you can combine head and tail (head -n 13 file_name | tail -n +13) or use the sed command. To display lines 20 to 25, you can likewise combine head and tail or use sed. A detailed explanation of each command follows.
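The line-selection commands described above can be sketched as follows. The 30-line test file is a made-up stand-in for file_name, and the variable names are only illustrative.

```shell
# Build a hypothetical 30-line file: "line 1" ... "line 30".
file=$(mktemp)
seq 1 30 | sed 's/^/line /' > "$file"

# Line 13: keep the first 13 lines, then start output at line 13 of those.
line13_headtail=$(head -n 13 "$file" | tail -n +13)
# Line 13 with sed: -n silences the default output, 13p prints only line 13.
line13_sed=$(sed -n '13p' "$file")

# Lines 20-25: keep the first 25 lines, then take the last 6 of them.
range_headtail=$(head -n 25 "$file" | tail -n 6)
# Lines 20-25 with sed, using an address range.
range_sed=$(sed -n '20,25p' "$file")

echo "$line13_headtail"
echo "$range_sed"
rm -f "$file"
```

Both variants select by position, so they share the weakness raised in the question: they break as soon as the row numbers shift.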
The input.txt file contains the output of the ls command in the long listing format. We can use the awk command to display specific columns, for example the 5th column of the file. print is awk's built-in statement for writing text to the standard output stream. Note that awk uses $N to represent the Nth column.
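As a rough illustration of printing the 5th column, the sketch below uses two invented lines in the style of a long listing instead of real ls output, so the file names, owner, and sizes are all assumptions.

```shell
# Hypothetical stand-in for input.txt: two lines shaped like "ls -l" output.
input=$(mktemp)
cat > "$input" <<'EOF'
-rw-r--r-- 1 alice staff 1024 Jan 10 09:15 notes.txt
-rw-r--r-- 1 alice staff 2048 Jan 11 10:30 report.csv
EOF

# With the default field separator (runs of blanks), $5 is the size column.
sizes=$(awk '{ print $5 }' "$input")
echo "$sizes"
rm -f "$input"
```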
The powerful sed command provides several ways of printing specific lines. For example, to display the 10th line, you can use sed -n '10p' file_name. The -n option suppresses sed's default output, while the p command prints the selected lines.
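A short sketch of what -n changes, using a made-up 12-line file. The `10!d` form is shown as one alternative way to keep only line 10; the variable names are illustrative.

```shell
# Hypothetical test file containing the numbers 1 to 12, one per line.
file=$(mktemp)
seq 1 12 > "$file"

# With -n, only the line selected by 10p is printed.
with_n=$(sed -n '10p' "$file")

# Without -n, sed prints every line once and line 10 a second time.
without_n_count=$(sed '10p' "$file" | wc -l | tr -d ' ')

# Alternative: delete every line that is not line 10.
alt=$(sed '10!d' "$file")

echo "with -n:     $with_n"
echo "without -n:  $without_n_count lines"
echo "using 10!d:  $alt"
rm -f "$file"
```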
Whenever your input data contains name-value pairs, it's best to first create an array that holds those mappings (f[] below). You can then print, test, or modify whatever values you like, in whatever order you like, just by indexing the array by the names.
Look at how easy it is to do what you want with this approach:
$ awk -F, '{f[$1]=$2} END{print f["Researcher Name"], f["Experiment"]}' file
Joe Black Illumina-Project
but also how easy it is to do whatever else you might need in future, e.g.:
$ awk -F, '
    { f[$1]=$2 }
    END {
        if ( (f["File"] > 3) && (f["Date"] ~ /2021/) ) {
            print f["Experiment"], f["Pipeline"], f["Researcher Name"]
        }
    }
' file
Illumina-Project RNA_Pipeline Joe Black
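For comparison, the attempt in the question can be made to work by testing the first field ($1) instead of the undefined variable Line. This is only a sketch of that alternative, not a replacement for the array approach; the temporary file stands in for test.csv, and `out` and `sep` are illustrative names.

```shell
# Recreate the sample data from the question in a temporary file.
csv=$(mktemp)
cat > "$csv" <<'EOF'
[Header]
File,5
Researcher Name,Joe Black
Experiment,Illumina-Project
Date,05/02/2021
Pipeline,RNA_Pipeline
EOF

# Select lines by the name in field 1 and collect field 2 of each match.
# Values are joined with a single space and printed once at the end.
result=$(awk -F, '
    $1 == "Researcher Name" || $1 == "Experiment" { out = out sep $2; sep = " " }
    END { print out }
' "$csv")

echo "$result"
rm -f "$csv"
```

Unlike the array version, this prints the values in the order they appear in the file, so it cannot reorder them.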