After several weeks of active development, we’re proud to unveil H2O Flow, our brand new, open-source user interface for H2O! We used it live during our H2O World keynote today, and this blog post is a brief introduction to some of the core ideas behind H2O Flow.
H2O Flow is a web-based interactive computational environment where you can combine code execution, text, mathematics, plots and rich media into a single document, much like IPython Notebooks.
Flow not only enables you to use H2O interactively, but also provides you with mechanisms for capturing, replaying, annotating, sharing and presenting your analysis workflow. You can import files, build models, iteratively improve them, make predictions and finally add rich text to build up vignettes of your work for sharing and presentation, all from within Flow’s browser-based environment.
Flow has a “hybrid” user interface that seamlessly blends the concept of a command-line, text-based shell with that of a modern graphical user interface. But where traditional interactive computing environments output text, Flow displays purpose-built point-and-click user interfaces for every operation in H2O.
Flow lets you drive H2O’s API interactively and inspect every little detail inside H2O’s data store. It gives you access to all of H2O’s objects in the form of nicely arranged and formatted tabular data that you can then further manipulate, analyze and visualize.
Flow’s interface is composed of a sequence of executable cells. You can add more cells, move them around, clip them into a clip library and save/retrieve flows. Each cell has one input box into which you can enter commands, define functions, call other functions and access other cells and objects in the page. Executing the cell produces one or more outputs. Outputs are always graphical objects that you can point, click and inspect. Even raw objects and outputs from H2O’s API can be inspected graphically as navigable tree structures.
That said, unlike other interactive computing environments, you do not need any computer programming experience to be able to use Flow. As shown in the video above, you can simply point-and-click your way through all of H2O’s operations without writing a single line of code. In fact, you can simply turn off all input cells and use it like a “regular” GUI.
You do not need to be familiar with Flow functions or H2O’s REST API to be able to import datasets, build models, make predictions, compare models, etc. All operations in Flow are capable of prompting you for inputs and guiding you every step of the way.
Flow currently supports the CoffeeScript programming language. If you know programming, you can use Flow’s GUI to drive H2O, and then pick and choose from the step-by-step inputs generated by Flow to build up your own custom scripts to drive H2O.
For example, parameter-less functions like getFrames(), getModels(), getJobs() all return graphical lists of objects that you can further manipulate. However, if you call parameterized functions without providing arguments, e.g. if you call buildModel() instead of, say, buildModel(args...), Flow displays a GUI form that prompts you for arguments, and then emits a new fully atomic executable cell that calls buildModel(args...) with the correct set of parameters.
This bi-directional traversal from code-to-GUI and GUI-to-code is what makes Flow unique in its ability to drive H2O without the need for programming.
In effect, you “learn by example” to get acquainted with the API, and master it at your own pace.
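To make that round trip concrete, here is a minimal Python sketch of the pattern. It is not Flow's implementation (Flow itself runs CoffeeScript), and the names build_model, Form and submit, as well as the sample values "gbm" and "iris.hex", are hypothetical and exist only for illustration. Calling the function without arguments returns a form object standing in for the GUI, and submitting the form emits the source of a new cell that makes the fully parameterized call.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Form:
    """Hypothetical stand-in for the GUI form Flow displays."""
    function: str
    fields: Tuple[str, ...]

    def submit(self, **values: Any) -> str:
        """Return the source of a new cell that calls the function with every field filled in."""
        missing = [name for name in self.fields if name not in values]
        if missing:
            raise ValueError(f"missing fields: {missing}")
        args = ", ".join(f"{name}={values[name]!r}" for name in self.fields)
        return f"{self.function}({args})"


def build_model(algorithm: Optional[str] = None,
                training_frame: Optional[str] = None) -> Union[Form, Dict[str, str]]:
    """Hypothetical analogue of a parameterized Flow function."""
    if algorithm is None or training_frame is None:
        # Missing arguments: return a form that prompts for them instead of failing.
        return Form("build_model", ("algorithm", "training_frame"))
    # All arguments present: do the work (here, just echo a job description).
    return {"algorithm": algorithm, "training_frame": training_frame, "status": "submitted"}


if __name__ == "__main__":
    form = build_model()                      # code-to-GUI: no arguments, so we get a form
    cell = form.submit(algorithm="gbm", training_frame="iris.hex")
    print(cell)                               # GUI-to-code: the emitted cell source
    print(eval(cell, {"build_model": build_model}))  # the emitted cell is itself executable
```

The point of the design is that the emitted cell is ordinary code: you can read it, edit it, and re-run it later without going through the form again.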
Flow uses the bog-standard H2O REST API under the hood, so not only can you access and manipulate raw H2O objects, you can also build custom GUI applications using H2O’s REST API, right from within Flow. More about the CoffeeScript API and GUI development in a future post. Stay tuned!
There’s a rich set of features built into H2O Flow, and we’ll be actively writing about these in detail in the coming weeks. Until then, here are some screenshots of Flow.
You can inspect raw H2O objects with dump(). Commonly used commands are available in the form of executable clips. You can add more clips to this list to build up a repository of your commonly used snippets.
Like H2O, Flow is open-sourced under the Apache 2 License, and is currently bundled with all h2o-dev releases.
The Flow repo is located at https://github.com/0xdata/h2o-flow.
To run Flow, simply follow the build and launch instructions for h2o-dev, then point your browser to http://localhost:54321/flow/index.html
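If the page does not load, it can help to check from a script whether anything is serving that address. The following is a small sketch using only the Python standard library. The helper names flow_url and flow_is_reachable are our own, and the host and port defaults simply mirror the URL above, so adjust them if your H2O instance was launched elsewhere.

```python
import urllib.error
import urllib.request


def flow_url(host: str = "localhost", port: int = 54321) -> str:
    """Build the Flow address; defaults match the URL given in this post."""
    return f"http://{host}:{port}/flow/index.html"


def flow_is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on the URL answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status.
        return False


if __name__ == "__main__":
    url = flow_url()
    state = "reachable" if flow_is_reachable(url) else "not reachable"
    print(f"{url} is {state}")
```

A False result only says that nothing answered with a 200 at that address. It does not tell you why, so the H2O launch log is still the first place to look.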
The quickest path to experience H2O and Flow is to use the assist() function, which you can run by clicking the Assist Me! button in the Help panel on the sidebar. Thereafter, just click importFiles on the Assistance menu and follow the instructions.
Flow is pre-alpha, so let us know if you run into issues.
Also, this is just the beginning. The core of Flow is ready, and this opens up avenues for the next wave of features and extensions requested by our community. So, tell us what you need, and what you would like us to build going forward. Would you like support for more languages like R and Python? Export and run Flows on Node.js? Get CoffeeScript or JavaScript running inside H2O for data-munging and map/reduce/aggregate operations on big datasets?
We are fully committed to making H2O + Flow the best machine-learning platform on the planet, so please feel free to be a part of Flow’s roadmap – join us at h2o-stream / h2ostream@googlegroups.com.