
#spark

Jul 23, 2020

How to use Spark Checkpoints on new Clusters

Resuming a Spark Streaming application from a checkpoint stored in S3 results in a NoRouteToHostException

May 29, 2020

How to tune your Writes to Postgres when migrating from S3 using Glue

Let us understand how to identify the bottlenecks and what can be done to speed up your writes.

May 11, 2020

Spark Executor on Task Node

Configure your EMR cluster to run Executors only on Spot TASK nodes and save money!

Mar 21, 2020

How to smartly provision resources for a Spark Job

Feb 19, 2020

Working with Postgres tables in Glue ETL and calling stored procedures

Glue is king when it comes to simple, easily chained ETL that ties into the catalog.


© - Coded by hand and headaches

This site is built with Gatsby.js and hosted on Netlify. The source code is hosted on GitHub.