
New posts in hadoop-streaming

Can I force my reducers (copy phase) to start only when all mappers are completed?

Amazon Elastic MapReduce - SIGTERM

Python MapReduce Hadoop Streaming Job that requires multiple input files?

Hive FAILED: ParseException line 2:0 cannot recognize input near ''macaddress'' 'CHAR' '(' in column specification

hadoop, python, subprocess failed with code 127

POC for Hadoop in real time scenario

Map Reduce output to CSV or do I need Key Values?

hadoop 2.4.0 streaming generic parser options using TAB as separator

Processing images using hadoop

Pass directories not files to hadoop-streaming?

hadoop hadoop-streaming

What is the difference between Rack-local map tasks and Data-local map tasks?

Python hadoop streaming: Setting a job name

How to get the name of input file in MRjob

How to use a file in a hadoop streaming job using python?

How to set the precise max number of concurrently running tasks per node in Hadoop 2.4.0 on Elastic MapReduce

How to read hadoop sequential file?

Using python efficiently to calculate hamming distances [closed]

Hadoop: job runs okay on smaller set of data but fails with large dataset

Amazon MapReduce best practices for logs analysis