OSCON 2010 - Data freedom and the semantic web

I presented CubicWeb at OSCON 2010. I could only stay for a day and I did not get a chance to see a lot of talks, but judging from the conference schedule it seems only a few of them were related to making data available on the web. I will focus on these talks, for they are very relevant to us who are building the semantic web.

http://assets.en.oreilly.com/1/event/45/oscon2010_125x125.jpg

I highly encourage you to watch the video of Stormy Peters' talk, "Is Your Data Free?". It addresses the privacy of data that you think belongs to you but actually doesn't. This is exactly what is behind the CubicWeb design: build your own web of data in a permission-based environment in order to preserve your privacy.

http://wiki.freebase.com/skins/freebaseUpdate/freebaselogo.png

Open source, Open data, presented by the Freebase folks, draws a very interesting parallel between open source and open data, raising the problems of versioning open data and providing quality data. There are methodologies and tools for open source software to ensure well-designed and reliable code. There is nothing comparable so far that properly handles data versioning and data quality assurance. That is the biggest concern Freebase has, and through this talk they asked the open source community for help, so that more people would get involved in finding solutions to serve open data.

An attendee raised an interesting question about which format everybody would agree to use to represent the data. I was surprised by the answer. It seems that so far they do not believe this is a concern; I would not say they don't care, but almost. For Freebase, the main concern and the most challenging part of data representation is having a unique identifier. I am not quite sure I agree on that part. Yes, this is important, even mandatory, but there is also the need to define or use a known format (RDF, for example) to represent this data, so that we can source it. To be semantic data, it needs to be both identifiable and readable. And I do not see the point of publishing data on the web if it is not ready to use.
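To make "identifiable and readable" concrete, here is a minimal sketch in plain Python that writes a single statement as an N-Triples line: a URI identifies the thing, and a known RDF serialization makes it readable by others. The `example.org` identifier, the name "Jane Doe" and the helper functions are made up for illustration; they come from neither Freebase nor CubicWeb.

```python
# FOAF property URI for a person's name.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def escape_literal(value):
    """Escape backslashes, double quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def to_ntriple(subject_uri, predicate_uri, literal):
    """Serialize one (subject, predicate, plain literal) statement as an N-Triples line."""
    return '<%s> <%s> "%s" .' % (subject_uri, predicate_uri, escape_literal(literal))


# Hypothetical identifier and value, for illustration only.
line = to_ntriple("http://example.org/person/42", FOAF_NAME, "Jane Doe")
print(line)
```

The identifier alone says which thing we are talking about; the shared format and vocabulary are what let someone else's tool actually consume the statement.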

Just for fun, look at Rewrite or Refactor: When to Declare Technical Bankruptcy. It might sound familiar to you...

The CubicWeb presentation went well. The audience was interested and very happy to see that we could aggregate multiple types of sources in a CubicWeb application. Of course, it would be even better if we supported an RDF source such as dbpedia: don't worry, that's going to happen. The semantic views already integrated in the framework also raised interest: SIOC for blog entries, OWL for the schema, FOAF for users and DOAP for projects.

 


By providing a platform for using data from multiple sources and publishing semantic data, CubicWeb is already a piece of the web of open data!