How to implement counters in Hadoop streaming in Python

Tags: python, hadoop

I am new to Hadoop streaming. I have a few filter conditions in my reduce code, and I would like to know how many records pass these conditions. I have learned that this can be done with custom counters. Can somebody show me how to write custom counters?

I emit three columns from the mapper, say a, b, and c. The key is a, and the value is a list, [b, c]. For example, one mapper output record looks like ['I'^['C','P']].

Here is my reduce code.

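import sys
import pandas as pd

labels = ["a", "b", "c"]  # one label per emitted column
records = []
for line in sys.stdin:
    l = line.strip().split("^")
    key = l[0]
    value = l[1].split(",")  # assumes the value arrives serialized as "b,c"
    record = [key] + value
    records.append(record)
df = pd.DataFrame.from_records(records, columns=labels)
# Boolean indexing uses square brackets, not a call
df = df[(df['a'] == 'I') & (df['b'] == 'C')]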

I would like to know, at the reducer level, how many records df contains.

Thank you.

subro asked Mar 01 '17 07:03




1 Answer

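You can write a specially formatted line to stderr, which Hadoop Streaming interprets as a counter update:

sys.stderr.write("reporter:counter:CUSTOM,NbRecords,1\n")

The format is reporter:counter:<group>,<counter>,<amount>, so this increments the counter "NbRecords" in the counter group "CUSTOM" by 1. Avoid spaces after the commas, because they would become part of the counter name.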
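Applied to the reducer in the question, the same idea might look like the sketch below. It is only an illustration under some assumptions: each input line is taken to be serialized as "a^b,c", the filter is done in plain Python rather than pandas, and the helper names counter_line, count_matching, and main are made up for this example. The group and counter names "CUSTOM" and "NbRecords" are the ones used above.

```python
import sys


def counter_line(group, counter, amount=1):
    """Build one Hadoop Streaming counter update line.

    Format: reporter:counter:<group>,<counter>,<amount>
    """
    return "reporter:counter:{},{},{}\n".format(group, counter, amount)


def count_matching(lines):
    """Count records whose key is 'I' and whose first value field is 'C'.

    Assumption: each input line looks like "a^b,c".
    """
    matched = 0
    for line in lines:
        parts = line.strip().split("^")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        fields = value.split(",")
        if key == "I" and fields[0] == "C":
            matched += 1
    return matched


def main(stdin=sys.stdin, stderr=sys.stderr):
    matched = count_matching(stdin)
    # Send a single update carrying the total instead of one line per record
    stderr.write(counter_line("CUSTOM", "NbRecords", matched))


# Call main() at the bottom of the actual reducer script.
```

Each reducer task reports its own total, and the framework sums the updates from all tasks into one job-level counter value.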

user1151446 answered Oct 08 '22 21:10
