I have written a shell script in ksh to convert a CSV file into a Spreadsheet XML file. It takes an existing CSV file (the path to which is a variable in the script) and creates a new .xls output file. The script has no positional parameters. The file name of the CSV is currently hardcoded into the script.
I would like to amend the script so it can take the input CSV data from a pipe, and so that the .xls output data can also be piped or redirected (>) to a file on the command line.
How is this achieved?
I am struggling to find documentation on how to write a shell script that takes input from a pipe. It appears that 'read' is only used for standard input from the keyboard.
Thanks.
Edit: script below for info (now amended to take input from a pipe via cat, as per the answer to the question).
#!/bin/ksh
#Script to convert a .csv data to "Spreadsheet ML" XML format - the XML scheme for Excel 2003
#
# Take CSV data as standard input
# Out XLS data as standard output
#
DATE=`date +%Y%m%d`

#define tmp files
INPUT=tmp.csv
IN_FILE=in_file.csv

#take standard input and save as $INPUT (tmp.csv)
cat > $INPUT

#clean input data and save as $IN_FILE (in_file.csv)
grep '.' $INPUT | sed 's/ *,/,/g' | sed 's/, */,/g' > $IN_FILE

#delete original $INPUT file (tmp.csv)
rm $INPUT

#detect the number of columns and rows in the input file
ROWS=`wc -l < $IN_FILE | sed 's/ //g' `
COLS=`awk -F',' '{print NF; exit}' $IN_FILE`
#echo "Total columns is $COLS"
#echo "Total rows is $ROWS"

#create start of Excel File
echo "<?xml version=\"1.0\"?>
<?mso-application progid=\"Excel.Sheet\"?>
<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\" xmlns:o=\"urn:schemas-microsoft-com:office:office\" xmlns:x=\"urn:schemas-microsoft-com:office:excel\" xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\" xmlns:html=\"http://www.w3.org/TR/REC-html40\">
<DocumentProperties xmlns=\"urn:schemas-microsoft-com:office:office\">
<Author>Ben Hamilton</Author>
<LastAuthor>Ben Hamilton</LastAuthor>
<Created>${DATE}</Created>
<Company>MCC</Company>
<Version>10.2625</Version>
</DocumentProperties>
<ExcelWorkbook xmlns=\"urn:schemas-microsoft-com:office:excel\">
<WindowHeight>6135</WindowHeight>
<WindowWidth>8445</WindowWidth>
<WindowTopX>240</WindowTopX>
<WindowTopY>120</WindowTopY>
<ProtectStructure>False</ProtectStructure>
<ProtectWindows>False</ProtectWindows>
</ExcelWorkbook>
<Styles>
<Style ss:ID=\"Default\" ss:Name=\"Normal\">
<Alignment ss:Vertical=\"Bottom\" />
<Borders />
<Font />
<Interior />
<NumberFormat />
<Protection />
</Style>
<Style ss:ID=\"AcadDate\">
<NumberFormat ss:Format=\"Short Date\"/>
</Style>
</Styles>
<Worksheet ss:Name=\"Sheet 1\">
<Table>
<Column ss:AutoFitWidth=\"1\" />"

#for each row in turn, create the XML elements for row/column
r=1
while (( r <= $ROWS ))
do
    echo "<Row>\n"
    c=1
    while (( c <= $COLS ))
    do
        DATA=`sed -n "${r}p" $IN_FILE | cut -d "," -f $c `
        if [[ "${DATA}" == [0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9] ]]; then
            DD=`echo $DATA | cut -d "." -f 1`
            MM=`echo $DATA | cut -d "." -f 2`
            YYYY=`echo $DATA | cut -d "." -f 3`
            echo "<Cell ss:StyleID=\"AcadDate\"><Data ss:Type=\"DateTime\">${YYYY}-${MM}-${DD}T00:00:00.000</Data></Cell>"
        else
            echo "<Cell><Data ss:Type=\"String\">${DATA}</Data></Cell>"
        fi
        (( c+=1 ))
    done
    echo "</Row>"
    (( r+=1 ))
done

echo "</Table>\n</Worksheet>\n</Workbook>"
rm $IN_FILE > /dev/null
exit 0
Commands inherit their standard input from the process that starts them. In your case, your script provides its standard input for each command that it runs. A simple example script:
#!/bin/bash
cat > foo.txt
Piping data into your shell script causes cat to read that data, since cat inherits its standard input from your script.
$ echo "Hello world" | myscript.sh
$ cat foo.txt
Hello world
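Standard output is inherited in the same way, so anything the script's commands print can be piped onward or redirected with > by whoever runs the script. As a minimal sketch (the function name clean_csv is hypothetical and only wraps the cleaning step from the question), a filter that names no input or output files simply reads what is piped in and writes to wherever its output is sent:

```shell
#!/bin/sh
# clean_csv is a hypothetical name used only for this illustration.
# It names no files: grep reads the standard input inherited from the
# caller, and sed writes to the inherited standard output.
clean_csv() {
    # Drop empty lines, then remove spaces around the commas.
    grep '.' | sed -e 's/ *,/,/g' -e 's/, */,/g'
}
```

Used as `cat data.csv | clean_csv > cleaned.csv` (both file names are placeholders), the function needs no temporary file, because the input arrives on the pipe and the output goes to whatever the caller chose.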
The read command is provided by the shell for reading text from standard input into a shell variable if you don't have another command to read or process your script's standard input.
#!/bin/bash
read foo
echo "You entered '$foo'"

$ echo bob | myscript.sh
You entered 'bob'
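A single read consumes only one line. For multi-line input such as a CSV file, read is usually placed in a while loop. The following is a rough sketch, not a full CSV parser: the function name print_fields and the variable names first and rest are made up for illustration, and it assumes fields never contain quoted commas.

```shell
#!/bin/sh
# print_fields is a hypothetical helper for illustration only.
# It reads lines from standard input until end of file.
print_fields() {
    # IFS=, makes read split each line on commas: the first field goes
    # into "first" and the remainder of the line into "rest".
    # -r keeps backslashes literal.
    # The || test also processes a final line that lacks a trailing newline.
    while IFS=, read -r first rest || [ -n "$first$rest" ]; do
        printf 'first=%s rest=%s\n' "$first" "$rest"
    done
}
```

For example, `printf 'bob,1,2\nalice,3\n' | print_fields` should print one line per input row, with everything after the first comma left in rest.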