Questions tagged with Amazon EMR
We have an Airflow setup that runs EMR jobs on a daily basis. I noticed an odd behavior: when I resubmit a job for calculating the ad hoc reports, the Spark application fails with the error below; the arguments seem...
When I try to create a new workspace for an AWS EMR Studio in the AWS Console, I get a blank page and a JavaScript error in the console
("Failed to execute 'mark' on 'Performance':...
I am trying to use the AWS Glue Data Catalog as the Hive metastore. I stood up an EMR cluster (emr-6.15.0) with the following node classification config per AWS, and it always initializes a default Glue catalog...
So I configure manual termination through the RunJobFlow operator (https://docs.aws.amazon.com/emr/latest/APIReference/API_RunJobFlow.html) with `"KeepJobFlowAliveWhenNoSteps": True`. However, the cluster...
I would like to know the Log4j configuration needed to write container logs in a more structured format such as JSON, so that I can use another automation to parse the files and train some customization to filter...
Hello,
I have upgraded EMR from 6.14 to 6.15 and started seeing errors on the existing core node:
`org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException: IAMInstanceCredentialsProvider:...
I am trying to connect to my DocumentDB through the spark-mongodb connector, but it looks like DocumentDB does not support collStats. How do I disable the collStats command so I can do my transformations...
How do I add an additional library, i.e. Databricks spark-xml, to a running EMR cluster and access it in a notebook?
I am using emr-6.12.0 and trying to set environment variables that are stored in AWS Secrets Manager in a bootstrap.sh file.
```
SECRET_NAME="/myapp/dev/secrets"
SECRETS_JSON=$(aws secretsmanager...
```
I want my EMR cluster to be terminated automatically after an idle period.
I have configured 'Automatically terminate cluster after idle time' and set the idle time to '5 minutes'.
In my cluster I have...
My environment relies heavily on Apache Hudi integrated with EMR and Lake Formation, and I have found that the Hudi environment is not very friendly to use from Redshift or Athena. There are many advanced...
My customer is using AWS EMR and is storing all the Hive metadata on an external RDS instance, using MySQL 5.7.*. Since MySQL 5.7 is reaching the end of its lifecycle, we are pushing them to upgrade...