WeSearch

Read-Write ETL on NAS Data with EMR Serverless Spark — No Cluster, No Copy

#aws #spark #emr #data #etl
⚡ TL;DR · AI summary

The article shows how to run read-write ETL on NAS data with EMR Serverless Spark, with no cluster to manage and no data to copy. The full ETL pipeline ran in 37 seconds at low cost. Because FSx for ONTAP integrates with EMR Serverless, Spark jobs read from and write to NAS storage directly.
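As a rough illustration of the serverless submission flow summarized here, the sketch below assembles the arguments for the EMR Serverless StartJobRun API, pointing a Spark script at input and output prefixes on the same NAS-backed location. It is not taken from the original article: the access point alias, application ID, role ARN, script location, and the `raw/` and `curated/` prefixes are all hypothetical placeholders, and addressing the FSx for ONTAP S3 access point through an `s3://` alias URI is an assumption here.

```python
from typing import Any, Dict


def build_job_request(application_id: str, role_arn: str, script_uri: str,
                      input_uri: str, output_uri: str) -> Dict[str, Any]:
    """Assemble keyword arguments for the EMR Serverless StartJobRun call."""
    return {
        "applicationId": application_id,
        "executionRoleArn": role_arn,
        "name": "nas-etl-sketch",  # hypothetical job name
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": script_uri,
                # The Spark script receives the read and write locations as argv.
                "entryPointArguments": [input_uri, output_uri],
            }
        },
    }


# Every value below is a placeholder, not a value from the article.
AP_ALIAS = "example-access-point-alias"  # assumed S3 access point alias
request = build_job_request(
    application_id="00example",
    role_arn="arn:aws:iam::123456789012:role/example-emr-serverless-role",
    script_uri="s3://example-scripts-bucket/etl_job.py",
    input_uri=f"s3://{AP_ALIAS}/raw/",
    output_uri=f"s3://{AP_ALIAS}/curated/",
)


def submit(job_request: Dict[str, Any]) -> str:
    """Send the request with boto3; needs AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the sketch runs without boto3 installed

    client = boto3.client("emr-serverless")
    return client.start_job_run(**job_request)["jobRunId"]
```

Calling `submit(request)` would start the run and return its job run ID. Keeping the request builder separate from the network call makes the read and write locations easy to inspect before anything is submitted.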

Original article
DEV.to (Top)
Opening excerpt (first ~120 words)

Yoshiki Fujiwara (藤原 善基) @AWS Community Builder, for AWS Community Builders
Posted on May 26

Read-Write ETL on NAS Data with EMR Serverless Spark — No Cluster, No Copy
#aws #spark #emr #amazonfsxfornetappontap

FSx for ONTAP S3 Access Points × Lakehouse Deep Dive (7 Part Series)
1 Query NAS Data In Place with Athena and FSx for ONTAP S3 Access Points
2 FSx for ONTAP S3 Access Points Lakehouse — What Works, What Doesn't, and Why
... 3 more parts...

Excerpt limited to ~120 words for fair-use compliance. The full article is at DEV.to (Top).
