May 12th, 2021

How Much is My Property Worth?

Note: this is a guest blog post by Jaafar Almusaad.

This is the million-dollar question – both figuratively and literally.

Traditionally, qualified property valuers are tasked with answering this question. It’s a lengthy and costly process, but more critically, it’s inconsistent and largely subjective. Mind you, valuation is an “art,” not a “science.”

In reality, “qualified” valuers often end up having different “opinions” about the value of properties, and it’s up to the customer to pick the “opinion” that better serves their interest.

To address this issue, AVMs (Automated Valuation Models) have been developed. The aim was to use data and hand-crafted computer algorithms to estimate the market value of properties instantaneously and consistently. However, a major caveat with AVMs is that the biases of the people who design them can propagate to the final product, potentially resulting in biased valuations.

A more recent approach is to rely on Machine Learning and AI. Indeed, AI has outperformed human experts in many fields, and more data is now available than we can ever digest.

I was curious if AI can replace current AVMs, so I curated a dataset from different sources. It comprises tens of millions of property transactions and data about locations in the UK (safety, income, education, etc.). 

I typically use R and the data.table library for my data workflows, and these work flawlessly with H2O-3. So, I used H2O to train Deep Learning models to predict property values.

As you may know, Deep Learning models can be very accurate, but they are pretty much a “black box.” Therefore, it’s crucial to validate the models, not only mathematically but also with human “common sense.” H2O has a whole arsenal of tools for validating and interpreting trained models. In addition to the common statistical metrics (RMSE, deviance, etc.), H2O has recently added Residual Analysis. In layman’s terms, residual analysis takes observations with known target values, predicts each value, and compares the prediction with the actual value; a residual of zero means a perfect prediction. The results are plotted for additional convenience.
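To make the idea of a residual concrete, here is a minimal sketch in plain Python. It does not use the H2O API; the function names and the three prices are made up for illustration, and the sign convention (actual minus predicted) is an assumption.

```python
import math


def residuals(actual, predicted):
    """Return one residual (actual - predicted) per observation."""
    return [a - p for a, p in zip(actual, predicted)]


def rmse(actual, predicted):
    """Root mean squared error: the square root of the mean squared residual."""
    res = residuals(actual, predicted)
    return math.sqrt(sum(r * r for r in res) / len(res))


# Hypothetical sale prices and model predictions (illustrative values only)
actual_prices = [250_000, 410_000, 180_000]
predicted_prices = [245_000, 410_000, 195_000]

res = residuals(actual_prices, predicted_prices)
error = rmse(actual_prices, predicted_prices)

print(res)    # the second residual is 0: a perfect prediction for that property
print(error)  # a single summary number for the whole set
```

Plotting the residuals against the predicted values is one common way to spot where a model systematically over- or under-values properties.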

Another couple of tools that I find very helpful are Variable Importance and Partial Dependence Plots (PDPs). As the name suggests, Variable Importance tells us which variables (i.e., predictors) are the most influential. In my case, since the variables are well defined and understood, Variable Importance helps validate the models further. For instance, we know (by experience) that the total floor area has the biggest impact on property value. Reassuringly, we can see the floor area at the top of the chart. In other words, the model agrees with what we already know about the market.

However, if we want to dive deeper and see exactly how floor area affects the price, we can use a Partial Dependence Plot. A PDP shows the impact of a specific variable (here, floor area) on the target (here, price). Under the hood, it sets that variable to a range of values in turn and averages the model’s predictions over the other variables, which isolates the effect of the variable in question. In our case, we can see that the relationship between floor area and price is pretty much linear, which is what we would expect.
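The averaging behind a partial dependence curve can be sketched in a few lines of plain Python. This is not H2O’s implementation: `partial_dependence`, `toy_model`, and the feature names `floor_area` and `bedrooms` are hypothetical, and the toy model is deliberately linear so that the resulting curve is easy to check by hand.

```python
def partial_dependence(predict, rows, feature, grid):
    """For each grid value, overwrite `feature` in every row, predict,
    and average the predictions over all rows."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve


def toy_model(row):
    """Hypothetical stand-in for a trained model (not a real valuation model)."""
    return 50_000 + 3_000 * row["floor_area"] + 20_000 * row["bedrooms"]


# Hypothetical observations (illustrative values only)
rows = [
    {"floor_area": 60, "bedrooms": 2},
    {"floor_area": 95, "bedrooms": 3},
    {"floor_area": 140, "bedrooms": 4},
]

curve = partial_dependence(toy_model, rows, "floor_area", [50, 100, 150])
for value, avg_price in curve:
    print(value, avg_price)
```

Because the toy model is linear in floor area, each step of 50 on the grid raises the averaged prediction by the same amount, which is the straight-line shape described above.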

As my dataset continues to grow, both in the number of observations and the number of features, I find myself spending more time on the technical side when I should be focusing on growing the business. Thankfully, H2O has automated most of this work with Driverless AI, which I am considering exploring in the next phase.

AccuVal, the property valuation platform, is publicly and freely available here.

Community Contributions

Please let me know if you want to talk about your H2O use cases. We welcome all kinds of community contributions (blog posts, tech talks, apps, etc.).

About the Author

Jo-Fai Chow

Jo-fai (or Joe) has multiple roles (data scientist / evangelist / community manager) at H2O.ai. Since joining the company in 2016, Joe has delivered H2O talks/workshops in 40+ cities around Europe, the US, and Asia. Nowadays, he is best known as the H2O #360Selfie guy. He is also the co-organiser of H2O's EMEA meetup groups, including London Artificial Intelligence & Deep Learning, one of the biggest data science communities in the world with more than 11,000 members.
