Things I use to get stuff done

Big Data Stack

  • Distributed Computing Frameworks - Apache Spark, Apache Hive, MapReduce
  • AWS Big Data Services - EMR, Athena, Data Pipeline, Redshift, Kinesis, Networking (VPC), CloudWatch, CloudTrail, Lambda, S3, IAM
  • Data Warehouse & MPP - Redshift, Presto on EMR
  • Data Formats - Parquet, Avro, JSON, CSV
  • ETL/ELT tools - AWS Data Pipeline, AWS Glue
  • HBase - Highly available and scalable, but a pain to set up.

Full Stack Development

  • React - Love it
  • Flask - I find it more convenient to build microservice APIs with Flask, while I personally find Django's batteries-included philosophy a tad difficult to work with.
  • Spring - Lately I've been spending a fair share of my time working with Java enterprise solutions.
  • Postgres - Probably the most user-friendly database.
  • DynamoDB - I always advise customers to think of DDB as a partitioned B-tree data structure. If you can model your use case as such, then DynamoDB is a viable choice for your needs.
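
The partitioned B-tree view of DynamoDB mentioned above can be sketched roughly as follows. This is only a toy in-memory model, not the DynamoDB API: the class `PartitionedSortedTable`, its methods, and the key names are all hypothetical, and a sorted list with `bisect` stands in for the B-tree. The idea is that the partition key picks a bucket, and items inside that bucket are kept ordered by sort key, so point lookups and range queries within one partition are cheap.

```python
import bisect
from collections import defaultdict


class PartitionedSortedTable:
    """Toy model (hypothetical): hash on partition key, ordered sort keys inside."""

    def __init__(self):
        # partition key -> sorted list of sort keys (stand-in for a B-tree)
        self._sort_keys = defaultdict(list)
        # partition key -> {sort key -> item}
        self._items = defaultdict(dict)

    def put(self, pk, sk, item):
        # Insert the sort key in order only if it is new; otherwise overwrite.
        if sk not in self._items[pk]:
            bisect.insort(self._sort_keys[pk], sk)
        self._items[pk][sk] = item

    def get(self, pk, sk):
        # Point lookup by full key; returns None when the item is absent.
        return self._items.get(pk, {}).get(sk)

    def query(self, pk, sk_from, sk_to):
        # Range scan over the sort keys of a single partition (inclusive bounds).
        keys = self._sort_keys.get(pk, [])
        lo = bisect.bisect_left(keys, sk_from)
        hi = bisect.bisect_right(keys, sk_to)
        return [self._items[pk][k] for k in keys[lo:hi]]


# Made-up example data: orders keyed by customer, sorted by date.
table = PartitionedSortedTable()
table.put("customer#42", "order#2023-01-05", {"total": 30})
table.put("customer#42", "order#2023-03-17", {"total": 12})
table.put("customer#7", "order#2023-02-01", {"total": 99})

early_orders = table.query("customer#42", "order#2023-01-01", "order#2023-02-28")
```

If your access patterns look like "fetch one item by key" or "fetch a sorted range within one partition", they fit this shape. Queries that need to cut across partitions do not, which is usually the signal to reconsider the key design or the database.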

Language

  • Java - For competitive programming
  • JavaScript - Getting better at this and learning new stuff!
  • Python - Who doesn't like 🐍?
  • Bash - Either you love it or hate it; guess which side I'm on
  • SQL - I love that there are no layers. It's just data and you

Tools & Certifications

  • AWS ASA - AWS Certified Solutions Architect, Associate
  • VSCode - The best editor out there
  • HUE editor for HiveQL - It's not perfect, but it does a pretty decent job.
  • DBeaver - My favourite JDBC tool