Questions tagged with Amazon EMR

Unable to load data to Apache in EMR cluster notebook

I am running an EMR cluster with an attached notebook and using Apache Spark to load and process data; however, I have not been able to load data into Apache. Whenever I try to run...

Analytics Amazon EMR Extract Transform & Load Data Amazon EMR Studio

769 views

Music Dev

asked 4 months ago

Spark application takes longer than expected in EMR 7

I have a Spark application running on EMR 7 that took 15+ hours, whereas it took 9 hours on EMR 6.14. There is no code change and data volume changes. One observation is the application attempted thrice...

Accepted Answer

Amazon EMR

731 views

Vaas

asked 4 months ago

How should I configure my EMR cluster to handle large data

I have an EMR cluster and I have used the Treasure Data connector to read data from a table into a DataFrame using PySpark. Now these tables that I'm trying to read have approximately 100 million to 500...

Amazon EMR

688 views

Nakshtra

asked 4 months ago

EMR Jupyter Notebook: PySpark Imports Work in Shell, Not in Notebook - Issue is importing custom files

Issue: PySpark works in the first cells (likely SparkSession creation) but throws import errors when using my Python files in later cells. Environment: AWS EMR (Amazon EMR...

Amazon EMR

590 views

Harish

asked 4 months ago

Studio Workspace can't see my running EMR EC2 cluster to attach to

Let me know if this is something AWS EMR Studio does: 1. In Databricks Community Edition and in Google Colab, one can fire up a simple Jupyter notebook with an automatically started cluster (small...

Amazon WorkSpaces Amazon EMR Amazon EMR Serverless

636 views

ken cottrell

asked 4 months ago

AWS EMR - YARN Resource Issue

Hi everyone, I am using AWS EMR to do some ETL operations on very large datasets (millions to billions of records). I am using PySpark and reading the CSV files using *spark.read.csv*. The results...

Amazon EMR Compute

648 views

vsk95

asked 4 months ago

Serverless job failure

While running a serverless job, I am getting the error below: "Number of cores specified by 'spark.driver.cores '7' is invalid".

Amazon EMR Amazon EMR Serverless

519 views

Akash

asked 4 months ago

refresh_hfiles not working

Hi, I have an EMR cluster with HBase in S3 storage mode. I have a read replica cluster pointing to the same S3 bucket. Now when I add a record in the primary cluster and flush the table on the primary, and then run refresh_hfiles...

Amazon EMR Database AWS IAM Identity Center Amazon S3 Access Grants

503 views

shushant

asked 4 months ago

AWS EMR WAL creation error

Hi, I am getting an error while launching EMR with HBase in S3 storage mode and WAL backup enabled. Caused by: java.lang.RuntimeException: createWal failed for wal WALMetadata(WALWorkspace=testworkspace2,...

AWS Identity and Access Management Developer Tools Amazon EMR IAM Policies

635 views

shushant

asked 4 months ago

I have a Python package saved in CodeCommit and I need it to run in the notebook linked to an EMR cluster.

I have a Python package saved in CodeCommit and need to use it in the notebook attached to my EMR cluster workspace. The package is already successfully installed via bootstrap. To do this, in my .sh...

AWS CodeCommit Amazon EC2 Amazon EMR Amazon EMR Studio

533 views

amanda_oliveira

asked 5 months ago

How do I connect Amazon MQ to AWS EMR Serverless?

I have an EMR Serverless application, and I am submitting a Spark job via a Python script. I have packaged all the dependencies and the script to an S3 bucket. When I execute the job, the Spark job is...

Amazon EMR Amazon MQ Amazon EMR Serverless

615 views

Tushar

asked 5 months ago

Unable to run Iceberg insert in Hive deployed on EMR

Hello, I configured an Iceberg-formatted table with transactions in Hive on EMR 6.4.1. When I insert data into the table, the operation gets stuck without any error. Any insights are highly...

Accepted Answer

Amazon EMR

470 views

Mark

asked 5 months ago
