 

How to set a custom environment variable in EMR to be available for a spark Application

I need to set a custom environment variable in EMR to be available when running a spark application.

I have tried adding this:

                   ...
                   --configurations '[                                    
                                      {
                                      "Classification": "spark-env",
                                      "Configurations": [
                                        {
                                        "Classification": "export",
                                        "Configurations": [],
                                        "Properties": { "SOME-ENV-VAR": "qa1" }
                                        }
                                      ],
                                      "Properties": {}
                                      }
                                      ]'
                   ...

and I also tried replacing "spark-env" with "hadoop-env", but nothing seems to work.

There is this answer from the AWS forums, but I can't figure out how to apply it. I'm running on EMR 5.3.1 and launching it with a preconfigured step from the CLI: aws emr create-cluster...

asked Feb 22 '17 by NetanelRabinowitz

People also ask

How does EMR pass environment variables?

Use classification yarn-env to pass environment variables to the worker nodes. Use classification spark-env to pass environment variables to the driver, with deploy mode client. When using deploy mode cluster, use yarn-env.
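As a sketch, a yarn-env classification takes the same shape as the spark-env one from the question, with an export sub-classification holding the variables (the variable name and value here are placeholders):

```json
[
  {
    "Classification": "yarn-env",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "MY_ENV_VAR": "qa1"
        }
      }
    ]
  }
]
```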

Can we run PySpark on EMR?

You can use AWS Step Functions to run PySpark applications as EMR Steps on an existing EMR cluster. Using Step Functions, we can also create the cluster, run multiple EMR Steps sequentially or in parallel, and finally, auto-terminate the cluster.
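A minimal sketch of such a Step Functions task state, using the EMR addStep service integration (the cluster id, step name, bucket, and script path are placeholders):

```json
{
  "Comment": "Sketch: submit a PySpark script as an EMR step; names and paths are placeholders",
  "StartAt": "RunPySparkStep",
  "States": {
    "RunPySparkStep": {
      "Type": "Task",
      "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
      "Parameters": {
        "ClusterId": "j-XXXXXXXXXXXXX",
        "Step": {
          "Name": "my-pyspark-step",
          "ActionOnFailure": "CONTINUE",
          "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://my-bucket/my_app.py"]
          }
        }
      },
      "End": true
    }
  }
}
```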


2 Answers

Add the custom configuration, like the JSON below, to a file, say custom_config.json:

[
  {
    "Classification": "spark-env",
    "Properties": {},
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "VARIABLE_NAME": "VARIABLE_VALUE"
        }
      }
    ]
  }
]

Then, when creating the EMR cluster, pass the file reference to the --configurations option:

aws emr create-cluster --configurations file://custom_config.json --other-options...
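Once the variable is exported this way, the Spark application can read it like any other process environment variable. A minimal sketch in Python (SOME_ENV_VAR is a placeholder name; note that hyphenated names like the question's SOME-ENV-VAR are not valid shell identifiers and cannot be exported):

```python
import os

# Read the variable exported via the spark-env/export classification.
# "SOME_ENV_VAR" is a hypothetical name; fall back to a default if unset.
value = os.environ.get("SOME_ENV_VAR", "default")
print(value)
```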
answered Oct 13 '22 by franklinsijo


For me, replacing spark-env with yarn-env fixed the issue.

answered Oct 13 '22 by Przemek