How to specify the maximum vcores allocated to a query in Hive?

Tags: python, hive, hadoop-yarn, pyhive

I am running multiple queries on Hive. I have a Hadoop cluster with 6 nodes, and the cluster has 21 vcores in total.

I want only 2 cores to be allocated to a Python process, so that the remaining cores stay available for another, main process.

Code

from pyhive import hive
hive_host_name = "subdomain.domain.com"
hive_port = 20000
hive_user = "user"
hive_password = "password"
hive_database = "database"

conn = hive.Connection(host=hive_host_name, port=hive_port, username=hive_user, database=hive_database, configuration={})
cursor = conn.cursor()
cursor.execute('select count(distinct field) from somedata')

asked Nov 13 '19 06:11

Vishnu

1 Answer

Try passing the following setting in the configuration map:

yarn.nodemanager.resource.cpu-vcores=2

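The default value for this setting is 8.

Description: Number of CPU cores that can be allocated for containers. Note that this is a NodeManager property, normally set in yarn-site.xml, and it limits the vcores available for containers on each node rather than for a single query, so check that a session-level value actually takes effect on your cluster.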

Your updated code would look like this:

from pyhive import hive
hive_host_name = "subdomain.domain.com"
hive_port = 20000
hive_user = "user"
hive_password = "password"
hive_database = "database"
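# The session configuration is sent as a string-to-string map,
# so the value is passed as a string rather than an int.
configuration = {
    "yarn.nodemanager.resource.cpu-vcores": "2"
}

conn = hive.Connection(
    host=hive_host_name,
    port=hive_port,
    username=hive_user,
    database=hive_database,
    configuration=configuration,
)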
cursor = conn.cursor()
cursor.execute('select count(distinct field) from somedata')
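
The HiveServer2 Thrift interface declares the session configuration as a map of strings to strings, so it can help to normalize the settings before handing them to `hive.Connection`, and to read a setting back afterwards to see what the session reports. The following is only a minimal sketch under stated assumptions: `as_hive_conf` and `read_setting` are hypothetical helper names, not part of PyHive, and `read_setting` assumes that Hive answers `SET <key>` with a single row of the form `key=value`, which may differ between Hive versions.

```python
def as_hive_conf(settings):
    """Return a copy of `settings` with every key and value as a string.

    Hypothetical helper: booleans become "true"/"false", the spelling
    Hive uses for boolean properties; everything else goes through str().
    """
    conf = {}
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        conf[str(key)] = str(value)
    return conf


def read_setting(cursor, key):
    """Ask the session for `key` and return its value, or None.

    Assumption: `SET <key>` yields one row whose first column looks
    like "key=value". Any other shape is treated as "not set".
    """
    cursor.execute("SET " + key)
    row = cursor.fetchone()
    if not row:
        return None
    name, sep, value = str(row[0]).partition("=")
    if sep and name.strip() == key:
        return value.strip()
    return None
```

With these in place, the connection above could be opened with `configuration=as_hive_conf({"yarn.nodemanager.resource.cpu-vcores": 2})`, and `read_setting(cursor, "yarn.nodemanager.resource.cpu-vcores")` would show which value, if any, the session reports. A reported value only confirms that the session accepted the setting, not that YARN enforces it for the query.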

answered Oct 19 '22 07:10

Ambrish
