Problem: I have a CSV dump file with in excess of 250,000 lines. When I use while read, it takes a while (no pun intended). I would like to process only the last 10,000 lines instead of all 250,000.
Code Snippet: My current code is this:
IFS=","
while read line
do
awk_var=`echo "$line" | awk -F" " '{print $0}'`
var_array=($awk_var)
read -a var_array <<< "${awk_var}"
echo "${var_array[1]}"
done </some_directory/directory/file_in_question.csv
Question: How can I use tail -n10000 with while read line when reading file_in_question.csv in a bash script?
Replace:
done </some_directory/directory/file_in_question.csv
with:
done < <(tail -n10000 /some_directory/directory/file_in_question.csv)
The <(...) construct is called process substitution. It creates a file-like object that bash can read from. Thus, this replaces reading from /some_directory/directory/file_in_question.csv directly with reading from the output of tail -n10000 /some_directory/directory/file_in_question.csv.
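Applied to the snippet above, the whole loop becomes something like this (a minimal sketch that also lets read split each line on commas directly, assuming the fields contain no quoted commas, so the separate awk and array steps are unnecessary):
while IFS=',' read -r -a var_array
do
    echo "${var_array[1]}"    # second field; bash arrays are zero-indexed
done < <(tail -n10000 /some_directory/directory/file_in_question.csv)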
Using process substitution like this allows you to keep your while loop in the main shell rather than a subshell. Because of this, variables that you create in the while loop will retain their values after the loop exits.
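For contrast, a small demonstration of the difference, using a hypothetical data.csv:
count=0
tail -n10000 data.csv | while read -r line
do
    count=$((count + 1))    # increments a copy inside the pipe's subshell
done
echo "$count"               # prints 0: the subshell's count was discarded

count=0
while read -r line
do
    count=$((count + 1))    # increments count in the current shell
done < <(tail -n10000 data.csv)
echo "$count"               # prints the number of lines actually read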
The code as shown prints the second column of a CSV file. If that is all that the code is supposed to do, then it can be replaced with:
awk -F, '{print $2}' /some_directory/directory/file_in_question.csv
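And if the 10,000-line restriction still applies, tail can feed awk the same way:
tail -n10000 /some_directory/directory/file_in_question.csv | awk -F, '{print $2}'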