Questions tagged with Amazon EMR


Browse through the questions and answers listed below or filter and sort to narrow down your results.

Hello. As part of a cloud migration and modernization approach using AWS, the requirement is to migrate HBase data directly to S3, then read the data from S3 using Java microservices. (EMR would not...
1
answers
0
votes
279
views
bnaha
asked 7 days ago
I have a use case where I need to run a batch EMR job on a daily schedule. I can create folders by date for the data coming from IoT devices. Or I can create a folder for each device sending IoT data and put...
1
answers
0
votes
272
views
asked 21 days ago
I'm trying to load 200 GB of data into DynamoDB using Spark on EMR but am facing performance issues. """ Copy paste the following code in your Lambda function. Make sure to change the following key parameters for...
4
answers
0
votes
611
views
asked a month ago
I'm trying to create an EMR 7.1.0 cluster with HBase enabled for full S3 backup (including WAL) via the web console. However, no AWSServiceRoleForEMRWAL role is automatically being created and thus my...
2
answers
0
votes
258
views
asked a month ago
I'm trying to find out whether Trino on EMR supports access controls maintained in Lake Formation. My catalog is AWS Glue. I couldn't find any documentation on either the Lake Formation or the EMR side that would talk...
1
answers
0
votes
329
views
Saawgr
asked a month ago
Hello, can we get a solution for this error, `Service: EmrServerlessResourceManager; Status Code: 403; Error Code: AccessDeniedException`, which occurs while running spark-submit jobs on EMR Serverless? Below is...
1
answers
0
votes
541
views
Ashwath
asked a month ago
I noticed that when you create a new EMR cluster using Spark, the default Python environment includes two different packages that both provide the "dateutil"...
1
answers
1
votes
467
views
dgibson
asked 2 months ago
Hello Experts, Technically speaking, EBS volumes assigned to the EMR core nodes are persistent storage and I have specifically created them to not delete on cluster termination. Then, I have attached...
Accepted Answer
Amazon EMR
1
answers
0
votes
448
views
Scott M
asked 2 months ago
I know the recommended strategy is to use EMR Serverless or EMR. However, I have a particular use case where I only need to run a fairly small PySpark job and need quick results. I've already gotten...
1
answers
0
votes
677
views
asked 2 months ago
Why does Amazon EMR create inbound rule entries for master and core security groups? ![Core SG](/media/postImages/original/IM6Mggxg_vTQSTJFNCM0FRPA) ![Master...
1
answers
0
votes
569
views
asked 3 months ago
I have an EMR workspace containing 4 Jupyter notebooks in which PySpark code blocks are run. I want to get the time of the most recent code block execution across all 4 notebooks to determine the...
1
answers
0
votes
553
views
Sukrit
asked 3 months ago
I want to change the default S3 storage class of the Hive connector in Trino 426 on EMR (EMR 6.15.0) to INTELLIGENT_TIERING. I found the [hive.s3.storage-class option in the Trino 426 official...
Accepted Answer
Amazon EMR
2
answers
0
votes
622
views
asked 4 months ago