Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does sort k1,1 changes the output?

I am new in hadoop world and I am trying to learn writing a code with map reduce mindset.

So, I was following the michael-noll tutorial.

One of the challenges, I am facing (besides understanding a new framework) is the amount of terminal tricks this framework uses.

So What does.

  $echo "foo foo quux labs foo bar quux" | /home/hduser/mapper.py | sort -k1,1 | /home/hduser/reducer.py

means??? what does echo does??

Also, the output of above code is:

  bar     1
  foo     3
  labs    1
   quux    2

Now if i dont have the sort -k1,1 thingy

  foo     2
  bar     1
  labs    1
  foo     1
   quux    2

What is the effect that sort flag is having? what does -k1,1 means?

Thanks..

Reference: http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/

like image 900
frazman Avatar asked Nov 23 '25 03:11

frazman


1 Answers

In Linux, the vertical bar, | is used to redirect output of one command to be the input of another.

The echo command writes the following string to the standard output. So in your case, it is writing foo foo quux labs foo bar quux which is then passed as the input to /home/hduser/mapper.py, whose output is then passed as input to sort, and so on.

sort is a Linux command that sorts text. The -k flag tells it which column to sort by. So the 1,1 tells it to sort starting at column 1, ending at column 1.

Type man sort in your Linux terminal to learn more about the command. I hope this helps!

like image 101
Eric Alberson Avatar answered Nov 24 '25 23:11

Eric Alberson