
New posts in emr

File already exists error writing new files from dataframe

apache-spark emr

Optimizing GC on EMR cluster

How do I submit more than one job to Hadoop in a step using the Elastic MapReduce API?

Get a YARN configuration from the command line

Terminating a Spark step in AWS

Spark UI for PySpark - corresponding line of code for each stage?

apache-spark pyspark emr

Force Server Side Encryption for S3 Bucket

How to suppress INFO messages for spark-sql running on EMR?

log4j apache-spark emr

PySpark - Load file: Path does not exist

AWS EMR: run a "bootstrap" script on all already-running machines in a cluster

EMR Spark - TransportClient: Failed to send RPC

Spark - Which instance type is preferred for AWS EMR cluster? [closed]

amazon-ec2 apache-spark emr

Where are the Spark logs on EMR?

scala apache-spark emr

Any Scala SDK or interface for AWS?

SQL query in Spark/Scala: Size exceeds Integer.MAX_VALUE

How do I copy files from S3 to Amazon EMR HDFS?

amazon-s3 hadoop hive hdfs emr

How to restart YARN on AWS EMR

hadoop hadoop-yarn emr

Compress file on S3

How do you make a Hive table out of JSON data?

json hadoop hive amazon-emr emr

How to bootstrap installation of Python modules on Amazon EMR?