This course, part of H2O.ai University’s certification program, teaches participants to use H2O.ai Driverless AI effectively. Jonathan Farinela, Solutions Engineer at H2O.ai, explains why data quality is crucial to successful outcomes and walks through the principles and procedures of data preparation.
The course is divided into two main sections:
- The first section covers why the tabular format matters in classical machine learning, the distinction between supervised and unsupervised learning, and common methods such as classification and regression. It highlights the importance of defining the unit of analysis when constructing a dataset. Participants will also see demonstrations of data preparation in Driverless AI, which can automate preprocessing tasks and allows customization using Python code.
- The second section focuses on time series data preparation. It covers the fundamentals of time series problems, including the need for a date column and the autoregressive nature of such data, addresses the challenges of handling multiple series within a single dataset, and offers best practices for improving model performance. Jonathan will demonstrate dataset preparation and splitting techniques for time series analysis in Driverless AI.
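
For readers who want a concrete picture of the time series ideas above before starting, here is a minimal sketch in plain Python. It is not Driverless AI code and does not use its API: the toy data, the column names (`store`, `date`, `sales`) and the helper functions `add_lag` and `split_by_date` are all hypothetical, chosen only to illustrate a date column, multiple series in one table, a lag feature that reflects the autoregressive nature of the data, and a split in which training rows precede validation rows in time.

```python
from datetime import date

# Hypothetical toy data: one row per (store, day), which is the unit of analysis.
rows = [
    {"store": "A", "date": date(2024, 1, 1), "sales": 10},
    {"store": "A", "date": date(2024, 1, 2), "sales": 12},
    {"store": "A", "date": date(2024, 1, 3), "sales": 11},
    {"store": "B", "date": date(2024, 1, 1), "sales": 20},
    {"store": "B", "date": date(2024, 1, 2), "sales": 19},
    {"store": "B", "date": date(2024, 1, 3), "sales": 23},
]


def add_lag(rows, group_key, date_key, target_key):
    """Return copies of the rows with a lag-1 target, computed within each series."""
    out = []
    previous = {}  # most recent target value seen so far, per series
    for row in sorted(rows, key=lambda r: (r[group_key], r[date_key])):
        new_row = dict(row)
        # None marks the first observation of a series, which has no history.
        new_row[f"{target_key}_lag1"] = previous.get(row[group_key])
        previous[row[group_key]] = row[target_key]
        out.append(new_row)
    return out


def split_by_date(rows, date_key, cutoff):
    """Split by time: rows before the cutoff train, rows on or after it validate."""
    train = [r for r in rows if r[date_key] < cutoff]
    valid = [r for r in rows if r[date_key] >= cutoff]
    return train, valid


lagged = add_lag(rows, "store", "date", "sales")
train, valid = split_by_date(lagged, "date", date(2024, 1, 3))
```

Splitting on a cutoff date rather than at random keeps future observations out of the training set, and computing the lag within each store keeps one series from leaking into another.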
Enjoy the learning journey!