New posts in amazon-emr

Persisting data to DynamoDB using Apache Spark

How do I run Spark jobs concurrently in the same AWS EMR cluster?

How to fix error in a PySpark EMR Notebook - AnalysisException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

Running Amazon EMR with a custom AMI?

How to install custom packages in an Amazon EMR bootstrap action in code?

python boto amazon-emr

How do I specify multiple bootstrap actions using aws cli?

Use S3DistCp to copy a file from S3 to EMR

How to specify the location of a custom log4j.configuration when using spark-submit on Amazon EMR?

Trying to install pandas for PySpark running on Amazon EMR

pandas pyspark amazon-emr

Processing (OSM) PBF files in Spark

How to access a public S3 bucket from another AWS account?

"Parquet record is malformed" while column count is not 0

How to use Hadoop Streaming with LZO-compressed Sequence Files?

hadoop mapreduce amazon-emr

Which policy is needed for elasticmapreduce:RunJobFlow in AWS?

How to wait until all executors are allocated before Spark application starts on YARN?

Adding external jars in EMR Notebooks

How to set PYTHONHASHSEED on AWS EMR

How can I use graphframes with pyspark on AWS EMR?