I've noticed that neither mrjob nor boto supports a Python interface to submit and run Hive jobs on Amazon Elastic MapReduce (EMR). Are there any other Python client libraries that support running Hive on EMR?
With boto you can do something like this (the `s3_query_file_uri`, `s3_log_uri`, and instance-type variables are assumed to be defined elsewhere):

from boto.emr.connection import EmrConnection
from boto.emr.step import JarStep

# Step 1: install Hive on the cluster
args1 = ['s3://us-east-1.elasticmapreduce/libs/hive/hive-script',
         '--base-path',
         's3://us-east-1.elasticmapreduce/libs/hive/',
         '--install-hive',
         '--hive-versions',
         '0.7']

# Step 2: run a Hive script stored in S3
args2 = ['s3://us-east-1.elasticmapreduce/libs/hive/hive-script',
         '--base-path',
         's3://us-east-1.elasticmapreduce/libs/hive/',
         '--hive-versions',
         '0.7',
         '--run-hive-script',
         '--args',
         '-f',
         s3_query_file_uri]

steps = []
for step_name, args in zip(('Setup Hive', 'Run Hive Script'), (args1, args2)):
    step = JarStep(step_name,
                   's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar',
                   step_args=args,
                   #action_on_failure="CANCEL_AND_WAIT"
                   )
    steps.append(step)

# Kick off the job
jobid = EmrConnection().run_jobflow('Hive Job Flow', s3_log_uri,
                                    steps=steps,
                                    master_instance_type=master_instance_type,
                                    slave_instance_type=slave_instance_type,
                                    num_instances=num_instances,
                                    hadoop_version="0.20")