Awk: How to work on multiple files.txt in folder and subfolders?

Question

Given a folder whose subfolders contain multilingual .txt files such as:

But where is Esope the holly Bastard
But where is 생 지 옥 이 군
지 옥 이
지 옥
지
我 是 你 的 爸 爸 ！
爸 爸 ！ ！ ！
你 不 會 的 ！

I already know how to count space-separated word frequencies within ONE file.txt:

$ grep -o '\w*' myfile.txt | awk '{a[$1]++}END{for(k in a)print a[k],k}' | sort > myoutput.txt

This gives the elegant output:

1 생
1 군
1 Bastard
1 Esope
1 holly
1 the
1 不
1 我
1 是
1 會
2 이
2 But
2 is
2 where
2 你
2 的
3 옥
4 지
4 爸
5 ！

How can I change the code to work on multiple files within a folder and its subfolders, all matching a similar pattern (*.txt at least)?

hek2mgl · Accepted Answer

You can use the find command for that. Like this:

find -iname '*.txt' -exec cat {} \; | grep -o '\w*' | awk '{a[$1]++}END{for(k in a)print a[k],k}' | sort

I'm using the -exec option to cat every *.txt file in the current directory and its subdirectories (-iname matches the extension case-insensitively). The output is then piped into your grep | awk | sort pipeline.
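As a rough sketch of the same idea, the pipeline can be wrapped in a small function. The name count_words and its directory argument are hypothetical, not part of the answer above. This variant makes a few substitutions that are assumptions on my part rather than the answer's method: it passes an explicit start directory to find (some non-GNU find implementations require one), it lets find hand the files straight to grep with -exec ... {} + instead of going through cat (so a file without a trailing newline cannot glue its last word onto the first word of the next file), it writes the pattern as -E '\w+', and it uses sort -n so that counts of 10 or more still sort numerically.

```shell
# count_words DIR
# Hypothetical helper: prints "count word" for every word found in the
# *.txt files under DIR (default: current directory), lowest count first.
count_words() {
  # -type f     : regular files only
  # -iname      : case-insensitive match, so .TXT is included as well
  # -exec ... + : run grep on batches of files rather than once per file
  # grep -o     : print each match on its own line
  # grep -h     : never prefix matches with the file name
  find "${1:-.}" -type f -iname '*.txt' -exec grep -ohE '\w+' {} + |
    awk '{ a[$1]++ } END { for (k in a) print a[k], k }' |
    sort -n
}
```

Calling count_words with no argument scans the current directory, and count_words path/to/folder > myoutput.txt writes the result to a file as in the question. Whether \w matches Hangul or Chinese characters depends on the grep implementation and on running under a UTF-8 locale, so the multilingual sample may need something like LC_ALL=en_US.UTF-8 set first; treat that as an assumption to verify on your system.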

Tags: regex, bash, shell, awk, cjk

Hugolpz
