Data Commons

This is an open knowledge repository that combines data from public datasets using mapped common entities. It includes tools to easily explore and analyze data across different datasets without data cleaning or joining.

In addition to a Data Commons about places (demographics, health, crime, economics, etc.), we are also building a Biomedical Data Commons and are starting on an Energy/Climate Data Commons.

I started working on this during my stint away from Google, collaborating with Andrew Moore and Chaitanya Baru in the context of the Open Knowledge Network effort. I returned to Google in November 2017 and started building Data Commons.

Schema.org

Schema.org provides schemas for structured data on the web. It is in use by over 25 million sites and powers a range of applications in search, personal assistants, email, etc.

We started this project in 2010 together with collaborators from Microsoft and Yahoo. It launched in 2011, and shortly after we were joined by Dan Brickley, who has been co-running it with me since.

Custom Search

We started Google Custom (or Programmable) Search to explore the idea of a platform where someone could apply their knowledge of a domain to create a better search for that domain, built on top of Google's infrastructure and leveraging its web crawl, etc.


While at Apple, we created MCF as an attempt to introduce structured data as a first-class citizen on the Web. It introduced simple knowledge representation ideas, notably in the form of directed labelled graphs, as a general data model for structured data on the Web. Later, at Netscape, this was submitted to the W3C (MCF Using XML), which eventually evolved into the RDF family of standards, including some which I authored, like RDF Schema.

In 1999, Eckart Walther and I created the first version of RSS as a mechanism for obtaining content for Netscape's portal. It seems to have survived beyond Netscape.


I spent my twenties on the Cyc project at MCC. It was an attempt to create a system capable of basic common sense reasoning, using the kind of architecture advocated by proponents of symbolic logic (McCarthy, Feigenbaum and others). I left the project at the end of 1994, and Doug Lenat carried on with it at Cycorp.

Many ideas from that project flowed into the Semantic Web, the use of Knowledge Graphs in search, etc.