Questions tagged with Amazon EMR

We have an Airflow setup that runs EMR jobs on a daily basis. I noticed an odd behavior: when I resubmit a job to calculate the ad hoc reports, the Spark application fails with the error below; the arguments seem...
Accepted Answer
Tags: Amazon EMR
1 answer, 0 votes, 236 views
Vaas
asked 8 months ago
When I try to create a new workspace for an AWS EMR Studio in the AWS Console, I get a blank page and a JavaScript error in the console ("Failed to execute 'mark' on 'Performance':...
0 answers, 0 votes, 129 views
asked 8 months ago
I am trying to use the Glue Data Catalog as the Hive metastore. I stood up an EMR cluster (emr-6.15.0) with the following node classification config per AWS, and it always initializes a default Glue catalog...
Accepted Answer
Tags: Amazon EMR, AWS Glue
1 answer, 0 votes, 635 views
zying
asked 8 months ago
So I set the cluster to be terminated manually by passing `"KeepJobFlowAliveWhenNoSteps": True` to the RunJobFlow operator (https://docs.aws.amazon.com/emr/latest/APIReference/API_RunJobFlow.html). However, the cluster...
1 answer, 0 votes, 210 views
asked 8 months ago
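For readers unfamiliar with the flag mentioned in the question above, here is a minimal sketch of where `KeepJobFlowAliveWhenNoSteps` sits in a RunJobFlow request: it is nested under `Instances`, not at the top level. The question text is truncated, so this does not address the poster's specific problem. The helper name, cluster name, instance type, and role names are placeholders chosen for illustration, not values from the question.

```python
from typing import Any, Dict


def build_run_job_flow_request(name: str, keep_alive: bool) -> Dict[str, Any]:
    """Build keyword arguments shaped like a RunJobFlow request.

    Hypothetical helper: all concrete values below are placeholders.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # placeholder release label
        "Instances": {
            "InstanceGroups": [
                {
                    "InstanceRole": "MASTER",
                    "InstanceType": "m5.xlarge",  # placeholder instance type
                    "InstanceCount": 1,
                }
            ],
            # The flag lives inside "Instances". True keeps the cluster
            # running after the last step finishes.
            "KeepJobFlowAliveWhenNoSteps": keep_alive,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # placeholder role names
        "ServiceRole": "EMR_DefaultRole",
    }


request = build_run_job_flow_request("adhoc-cluster", keep_alive=True)
# With AWS credentials configured, the request could be submitted as:
#   import boto3
#   boto3.client("emr").run_job_flow(**request)
```

With `keep_alive=True` the cluster stays up until it is terminated explicitly or by an auto-termination policy, so this setting is worth checking first when a cluster shuts down or lingers unexpectedly.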
I would like to know the log4j configuration needed to get container logs into a more structured format such as JSON, so I can use other automation to parse the files and train some customization to filter...
2 answers, 0 votes, 587 views
Scott M
asked 8 months ago
Hello, I have upgraded EMR from 6.14 to 6.15 and started seeing errors on the existing core node: `org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException: IAMInstanceCredentialsProvider:...
1 answer, 0 votes, 187 views
ruakn
asked 8 months ago
I am trying to connect to my DocumentDB through the spark-mongodb connector, but it looks like DocumentDB does not support collStats. How do I disable the collStats command so I can do my transformations...
1 answer, 0 votes, 558 views
asked 8 months ago
How can I add an additional library, e.g. Databricks spark-xml, to a running EMR cluster and access it in a notebook?
1 answer, 0 votes, 288 views
Rajeev
asked 8 months ago
I am using emr-6.12.0 and trying to set environment variables, which are stored in Secrets Manager, in a bootstrap.sh file. ``` SECRET_NAME="/myapp/dev/secrets" SECRETS_JSON=$(aws secretsmanager...
1 answer, 0 votes, 444 views
vivek
asked 8 months ago
I want my EMR cluster to terminate automatically after an idle period. I have configured 'Automatically terminate cluster after idle time' and set the idle time to '5 minutes'. In my cluster I have...
1 answer, 0 votes, 337 views
Joswa
asked 8 months ago
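As context for the idle-time setting in the question above, here is a rough sketch of the same setting expressed through the API rather than the console. It assumes the EMR auto-termination policy takes its `IdleTimeout` in seconds, within a range of 60 to 604800 (seven days). The helper name is hypothetical, and because the question text is truncated, this does not diagnose why the poster's cluster stayed up.

```python
from typing import Dict

# Assumed bounds for the auto-termination idle timeout, in seconds.
MIN_IDLE_SECONDS = 60
MAX_IDLE_SECONDS = 604800  # seven days


def auto_termination_policy(idle_minutes: int) -> Dict[str, int]:
    """Convert an idle time in minutes into an AutoTerminationPolicy dict.

    Hypothetical helper for illustration only.
    """
    idle_seconds = idle_minutes * 60
    if not MIN_IDLE_SECONDS <= idle_seconds <= MAX_IDLE_SECONDS:
        raise ValueError(
            f"idle time must be between {MIN_IDLE_SECONDS} and "
            f"{MAX_IDLE_SECONDS} seconds, got {idle_seconds}"
        )
    return {"IdleTimeout": idle_seconds}


policy = auto_termination_policy(5)
# With AWS credentials configured, the policy could be attached as:
#   import boto3
#   boto3.client("emr").put_auto_termination_policy(
#       ClusterId="j-XXXXXXXXXXXXX",  # placeholder cluster ID
#       AutoTerminationPolicy=policy,
#   )
```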
My environment relies heavily on Apache Hudi integrated with EMR and Lake Formation, and I have found that the Hudi environment is not easy to use from Redshift or Athena. There are many advanced...
2 answers, 0 votes, 560 views
Ray Lai
asked 8 months ago
My customer is using AWS EMR and stores all the Hive metadata on an external RDS instance running MySQL 5.7.*. Since MySQL 5.7 is reaching the end of its lifecycle, we are pushing them to upgrade...
Accepted Answer
Tags: Amazon EMR, MySQL
1 answer, 1 vote, 374 views
AWS
Lei
asked 8 months ago