August 16th, 2013

Public Data Sets


For your data analysis pleasure, I give you a giant list of super cool publicly available data. If you’re looking at the data sets and wondering “now what?”, you can find this list and tutorials on how to use H2O for analysis at the H2O docs page (http://docs.0xdata.com).
You can also get detailed hands-on experience analyzing any of this data, random numbers you might have lying around, stuff you made up, or whatever you want by coming to any of our upcoming meetups and hanging out with the 0xdata math team (http://www.meetup.com/H2Omeetup/).
Open City Datasets
Palo Alto Open Data
http://www.cityofpaloalto.org/gov/depts/it/open_data/default.asp
Chicago
https://data.cityofchicago.org/
Chicago crime data (2001 to present)
https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2
NYC
https://nycopendata.socrata.com/
Rents & Neighborhoods
http://www.huduser.org/portal/datasets/HUD_data_matrix.html
Transportation and Travel
Airlines Dataset
http://stat-computing.org/dataexpo/2009/the-data.html (so far it contains years 1987-2007, based on http://www.stat.purdue.edu/~sguha/rhipe/doc/html/airline.html)
Data source: http://www.transtats.bts.gov/Fields.asp?Table_ID=236
Open Flights Database
http://openflights.org/data.html
Capital Bikeshare Data
https://www.capitalbikeshare.com/trip-history-data
Sciences and Engineering
NASA Open Data
http://data.nasa.gov/
Seismic Data
http://sioseis.ucsd.edu/segy.header.html
Weather Public Data
http://OpenWeatherMap.org
http://OpenMeteoData.org
Diverse Data Sets

Many Eyes Community Datasets
http://www-958.ibm.com/software/analytics/manyeyes/
Kaggle Competitions
http://www.kaggle.com/
UCI Machine Learning Repository
http://archive.ics.uci.edu/ml/datasets.html
Human Activity Recognition Using Smartphones http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
MLData repository
http://mldata.org/
GitHub Challenge
https://github.com/blog/1450-the-github-data-challenge-ii
Yelp Dataset Challenge
https://www.yelp.com/dataset_challenge
Netflix Prize
http://stackoverflow.com/questions/1407957/netflix-prize-dataset
Infochimps
Stanford Large Network Dataset Collection (SNAP)
http://snap.stanford.edu/data/index.html
Million Song Dataset
http://labrosa.ee.columbia.edu/millionsong/pages/getting-dataset
Caret
http://caret.r-forge.r-project.org/datasets.html
Public Policy Data
European Open Data
http://open-data.europa.eu/en/
US Open Data
opendatasites
WorldBank Data
http://data.worldbank.org/data-catalog
Guardian Data
http://www.guardian.co.uk/news/datablog/interactive/2013/jan/14/all-our-datasets-index
Statistics Netherlands
http://www.cbs.nl/en-GB/menu/home/default.htm?Languageswitch=on
Quandl 6M Financial, Economics, and Social Datasets
http://www.quandl.com/
