I'm trying to remove stop words from the sentences in a file.
The stop words I mean are: [I, a, an, as, at, the, by, in, for, of, on, that]
I have this sentence in the file my_text.txt:
One of the primary goals in the design of the Unix system was to create an environment that promoted efficient program
I want to remove the stop words from the sentence above.
I used this script:
array=( I a an as at the by in for of on that )
for i in "${array[@]}"
do
cat $p | sed -e 's/\<$i\>//g'
done < my_text.txt
But the output is:
One of the primary goals in the design of the Unix system was to create an environment that promoted efficient program
The expected output is:
One primary goals design Unix system was to create an environment promoted efficient program
Note: I want to remove the stop words, not duplicated words.
Like this, assuming $p is an existing file:
sed -i -e "s/\<$i\>//g" "$p"
You have to use double quotes, not single quotes, for the variable to be expanded.
The -i switch edits the file in place.
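To make the quoting difference concrete, here is a minimal sketch that runs both variants on a string instead of a file. The sample line and the variable names single and double are only for illustration, and GNU sed is assumed for the \< and \> word boundaries.

```shell
# Illustrative values, not taken from a real file.
i="the"
line="One of the primary goals"

# Single quotes: the shell passes \<$i\> to sed untouched, so sed looks for
# a literal "$i" and the line comes back unchanged.
single=$(printf '%s\n' "$line" | sed -e 's/\<$i\>//g')

# Double quotes: the shell expands $i first, so sed receives \<the\>.
double=$(printf '%s\n' "$line" | sed -e "s/\<$i\>//g")

printf 'single quotes: %s\n' "$single"
printf 'double quotes: %s\n' "$double"
```

Note that the double-quoted version removes the word but leaves the space that followed it.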
Learn how to quote properly in shell, it's very important:
"Double quote" every literal that contains spaces/metacharacters and every expansion: "$var", "$(command "$var")", "${array[@]}", "a & b". Use 'single quotes' for code or literal $'s: 'Costs $5 US', ssh host 'echo "$HOSTNAME"'. See
http://mywiki.wooledge.org/Quotes
http://mywiki.wooledge.org/Arguments
http://wiki.bash-hackers.org/syntax/words
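The quoting rules above can be checked with a short bash sketch. The array contents and the counter names are made up for the example; the point is only to compare quoted and unquoted expansion.

```shell
# Hypothetical array with elements that contain spaces and a metacharacter.
items=( "a & b" "two words" single )

# Quoted expansion: each element stays one word.
quoted=0
for item in "${items[@]}"; do
    quoted=$((quoted + 1))
done

# Unquoted expansion: elements are split again on whitespace.
unquoted=0
for item in ${items[@]}; do
    unquoted=$((unquoted + 1))
done

# Single quotes keep the dollar sign literal.
price='Costs $5 US'

printf 'quoted: %s, unquoted: %s\n' "$quoted" "$unquoted"
printf '%s\n' "$price"
```

The quoted loop sees the three original elements, while the unquoted loop sees every whitespace-separated piece as a separate word.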
array=( I a an as at the by in for of on that )
for i in "${array[@]}"
do
sed -i -e "s/\<$i\>\s*//g" Input_File
done
Try it without \s* to understand why I added it to the regex: it also removes the whitespace that follows each deleted word.
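As a rough illustration of that suggestion, the sketch below applies both patterns to a string rather than editing Input_File. The sample text and the variable names are assumptions for the example, and \s is a GNU sed extension.

```shell
# Illustrative input, processed through a pipe instead of sed -i.
i="the"
line="goals in the design of the Unix system"

# Without \s*: the word is removed but the space after it stays behind.
without=$(printf '%s\n' "$line" | sed -e "s/\<$i\>//g")

# With \s*: the trailing whitespace is removed together with the word.
with=$(printf '%s\n' "$line" | sed -e "s/\<$i\>\s*//g")

printf 'without: %s\n' "$without"
printf 'with:    %s\n' "$with"
```

The first result contains doubled spaces where each stop word used to be; the second keeps single spacing.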