August 16th, 2019
A Maker Data Scientist’s journey: from Sudoku to Kaggle
Category: H2O Driverless AI, Machine Learning, Makers
By: Parul Pandey
"If you put enough smart people together in one space, good things happen." - Erik Hersman
One of the perks of being a part of H2O.ai is that you get to work with some of the brightest minds on the planet. Here you get to closely engage with people who have a great deal of experience as well as expertise. One such set of specialists is the Kaggle Grandmasters, who have time and again proven their prowess in the Data Science realm.
I had a chance to interact with my colleague Rohan Rao, a Kaggle Grandmaster and a 7-time National Sudoku Champion, and I thought of sharing the conversation with the community. Through this conversation, you’ll get a chance to know about his journey, inspirations, and accomplishments. You’ll also get to know what inspires him to compete and what keeps him motivated.
Rohan Rao, who goes by the name Vopani on Kaggle, dons several hats. Apart from being a Kaggle Grandmaster and a Data Scientist here at H2O.ai, he is also an accomplished Sudoku and Puzzle solver, widely considered the best sudoku solver in India.
He is a 15-time national champion in sudoku and puzzles, the first Indian to be ranked in the top 10 in the world and the only Indian to be on the podium (Top-3) at the Asian Championships in 2018 and 2019. He recently won the latest edition of the National Sudoku Championship and also stood 2nd in the prestigious Brands Brain International challenge held in Bangkok in July this year.
Here is an excerpt from my conversation with Rohan:
You are the first Indian to be ranked among the top 10 in the world in Sudoku. What initially inspired you to get started in Sudoku?
Rohan: I’ve always been fascinated with logic, numbers, patterns, and sports since I was very young. I learnt to solve sudoku in 2005, one day before a competition in Mumbai, which I ended up winning in the U-16 category. That was the tipping point for me, and I started pursuing it actively out of interest and as a hobby.
Being a problem solver by nature, I see sudokus as a problem to solve. Going through the logical path of eliminating and putting digits to ultimately reach the solution always gives me a feeling of accomplishment and happiness.
Practice, preparation, hard work and a lot of effort over the years enabled me to win various national and international championships. It gave me the opportunity to represent my country in the sport and helped me achieve my goal of becoming the first Indian to be in the Top-10 in the world and Top-3 in Asia.
How did you get interested in Data Science? Did being good at solving puzzles have a role to play in it?
Rohan: On completing my Masters in Applied Statistics, I was looking for fields where I could apply statistics to solve real-world problems. I came across Data Science (DS) as an exciting area of work that uses a fair amount of mathematics and statistics. Fortunately, I got my first job with a machine learning (ML) consultancy company in 2013, where I began my career.
Sudoku and ML started independently as separate areas of interest and work. Initially, it was hard to keep up with both, but I gradually discovered some areas where the two overlap.
Sudoku taught me to think, strategise and plan for a given problem. It helped me build mental stamina, speed and the ability to find crucial elements of solutions that stand out.
On the other hand, ML taught me to combine theoretical insights and ideas into practical hands-on solutions smartly. It made me appreciate the importance of analysis, multi-dimensional insights and the concept of optimizing solutions to the brink.
Becoming a Kaggle Grandmaster is an impressive feat which requires a lot of perseverance and hard work. How did your tryst with Kaggle begin and what kept you motivated throughout your grandmaster’s journey?
Rohan: I was fortunate to have a great mentor during my first couple of years of working who exposed me to Kaggle and worked closely with me. My competitive spirit, hunger, and drive to achieve success then took over, which was a constant motivation through my journey.
Becoming a Kaggle Grandmaster was undoubtedly a defining moment in my career, and it involved some personal sacrifices and a lot of support from family and friends.
What are your favourite resources when it comes to Data Science in general? Which programming languages do you prefer?
Rohan: Kaggle, Google and Stack Overflow would be my top three, which is not at all surprising. Depending on the problem statement, there are many useful GitHub repositories and open source libraries/packages that can all add value in building the solution.
I started with R, and it remains my preferred language for DS. I learnt Python subsequently along with Scala, which I’ve been using for the last few years in building production-ready solutions.
A lot of people, especially newbies, get overwhelmed by Kaggle. Any suggestions on how they should approach a data science competition?
Rohan: Kaggle has become vast and extensive, a goldmine of useful information, code, ideas, discussions, and solutions. It can be a little overwhelming at first due to the enormity and depth of the available content.
My suggestion would be to carve out a path and, depending on what the goal is, identify small Kaggle tasks and devote time to completing them. During the initial stages, it is better to focus on one competition entirely. One can start by exploring the kernels and being part of the discussions of that competition. This can then be followed by taking part in different types of competitions, one after the other, to get maximum exposure.
As a Data Scientist here at H2O.ai, what are your roles and in which specific areas do you work?
Rohan: As a Kaggle Grandmaster at H2O.ai, my role is primarily in developing H2O’s products like Driverless AI to help our customers build machine learning solutions for a wide range of use-cases across various industries, including fin-tech, manufacturing, retail, healthcare, and marketing. I specialize in recommendation engines, credit risk modelling, the digital payments ecosystem and optimizing digital marketing campaigns.
The challenge is two-fold: Building an industry-agnostic platform for scalable machine learning solutions as well as enabling intelligence through various recipes pertaining to data or models or domain expertise and being able to combine everything into an end-to-end packaged product.
What are some of the best things that you have learnt via Kaggle that you apply in your professional work at H2O.ai?
Rohan: There is a wide variety of problems on Kaggle, each with its own way of processing the data, building models and optimising solutions. The most significant learning for me has been the ability to understand a dataset and then devise a framework for the data science solution.
I automated many components of the machine learning workflow to improve my efficiency while working on Kaggle competitions. I now use many of those modules to enhance H2O.ai’s products like Driverless AI for building machine learning solutions across various industries, so that they are available in a broader ecosystem.
Of late, there has been a lot of hype around ML and AI, in general. What are your thoughts about it?
Rohan: It is essential to understand that AI is making massive breakthroughs across industries and has a bright future in products and applications built around it. It is also important to realise that almost all of the innovative solutions are extremely hard to develop, have been developed over years of research and are not just magic.
It will slowly but certainly become part and parcel of our lives in many ways that will hopefully solve many problems in the world, and with time, it’ll become easier to understand, build and share.
Are there any specific areas or problems where you would want to apply your expertise in ML?
Rohan: Predicting crime across the globe using data and ML is one of my most prominent areas of interest, and it can make a significant positive impact in the world.
Any advice for the Data Science aspirants who have just started or wish to start their Data Science journey?
Rohan: DS is becoming a vast umbrella with a lot of exciting work and projects across a wide variety of industries. A lot of aspirants get bogged down with the enormity of the content and depth of the information that is available.
My advice to others would be to identify an area that suits their skill-sets and then work towards solving a problem in that direction. It is essential to be hands-on in any DS work because only then can one understand all the nitty-gritty of the profile. Working on DS projects as part of a company/team as a full-time contributor is also preferable because it gives exposure to the entire workflow of a DS project.
It requires a significant amount of effort, time, patience and sacrifice. Keeping this in mind will help you prepare better for the fast-paced DS community and improve your chances of being successful, while enjoying the work and the journey in the process.
Success never comes easy. It is often an arduous path with a lot of hurdles and obstacles. Patience, perseverance, and practice are the three virtues which constitute the pillars of success. Grandmasters are not born in a day. Instead, they spend days and years working relentlessly to achieve their goal. I hope this conversation also inspires and motivates you to work towards your desired goal in life.
Originally published on Medium