Saurav Kaushik

Saurav is a distinguished full-stack data scientist with extensive experience in competitive data science, research, and open-source contributions, and in providing end-to-end solutions to real-world business and data problems across multiple industrial domains.

Recently, he was ranked among the top 40 young data scientists in India. He's proficient in Python, R, Hive, SQL and C++, and is the author and maintainer of an R package named ensembleR, hosted on CRAN, and a Python package named pyensembler, hosted on PyPI.

In this discussion we talked about how to approach self-learning for machine learning and competitive data science, what brings a person into machine learning, how it compares with other forms of engineering, and general recommendations on approaching data science.

Direct Download:

MP3 Audio File size: 49.76 MB



Some links and descriptions:


  • Spark: Apache Spark is a fast and general engine for large-scale data processing.

  • Scala: Scala is a general-purpose programming language providing support for functional programming and a strong static type system. Designed to be concise, many of Scala's design decisions aimed to address criticisms of Java.

  • Seaborn: Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.

  • bbplot

  • Plotly: Python Data Visualisation Tools and Techniques

  • Kaggle/Kaggle Competitions/Kaggle Kernels

Awesome reading material:

Favorite Language:

  • Python

Favorite Editors:


LinkedIn | GitHub



