Spark - Needle in a haystack story

In this short blog post, I will share an example of how executor logs helped us narrow down a malformed record that was causing our code to throw a java.lang.ArithmeticException (long overflow).

3 min read

Localstack & Terraform

In this blog post, we will see an example of how to use the LocalStack framework for testing Terraform deployments.

6 min read

Spark Patterns - FlatMapGroups

In this blog post, I will explain, with an example, how to use the FlatMapGroups API to implement complex logic against grouped datasets.

5 min read

Working with schema in SparkSQL

In this blog post, we will see how to apply a schema to SparkSQL DataFrames. We will also see how to use Scala’s implicits to convert a DataFrame into strongly typed entities.

5 min read

SparkSQL Getting Started

In this blog post, I will walk through the steps required to configure Spark on your machine. I will also present a simple SparkSQL program that runs a SQL query against a sample CSV file.

9 min read

Spark Recipes

If we ignore the complexities of running Spark applications, getting up to speed with the Spark programming API is relatively straightforward. However, like any other programming API, Spark contains some elements that aren’t obvious. In this post, I will share some not-so-obvious things about the Spark programming API.

4 min read