New posts in emr

How do I submit more than one job to Hadoop in a step using the Elastic MapReduce API?

Get a YARN configuration from the command line

Terminating a Spark step in AWS

SparkUI for pyspark - corresponding line of code for each stage?

apache-spark pyspark emr

Force Server Side Encryption for S3 Bucket

How to suppress INFO messages for spark-sql running on EMR?

log4j apache-spark emr

Pyspark - Load file: Path does not exist

AWS EMR: run a "bootstrap" script on all already-running machines in the cluster

EMR Spark - TransportClient: Failed to send RPC

Spark - Which instance type is preferred for AWS EMR cluster? [closed]

amazon-ec2 apache-spark emr

Where are the Spark logs on EMR?

scala apache-spark emr

Any Scala SDK or interface for AWS?

SQL query in Spark/scala Size exceeds Integer.MAX_VALUE

How do I copy files from S3 to Amazon EMR HDFS?

amazon-s3 hadoop hive hdfs emr

How to restart yarn on AWS EMR

hadoop hadoop-yarn emr

Compress file on S3

How do you make a HIVE table out of JSON data?

json hadoop hive amazon-emr emr

How to bootstrap installation of Python modules on Amazon EMR?

"Container killed by YARN for exceeding memory limits. 10.4 GB of 10.4 GB physical memory used" on an EMR cluster with 75GB of memory