Tuesday, January 30, 2007

Microformats

As far as I understand it, besides using templates and categories, there is no real way to structure the content on a MediaWiki page. For example, every team on the iGEM 2006 wiki provided a picture and a project abstract somewhere. The information was available, but not accessible without visiting every single team's page and actively looking for it. This year, one of our goals for the iGEM 2007 website & wiki is to make sure this kind of information is standardized across all the teams, whether by tagging it, marking it up, annotating it, putting it in a special area on a template, or some other method. If information common to all teams is standardized, it will be much easier to find and reuse, for both humans and machines.

I haven't learned much about them yet, but I'm excited about microformats (also see Alex Faaborg's blog). If you already know about them, please let me know what you think. Here's a popular definition from the microformats website: "simple conventions for embedding semantics in HTML to enable decentralized development." They are basically just standardized class names applied to ordinary XHTML tags, and so should be easy to integrate with MediaWiki content. The biggest hurdle would be making them simple for users to use.

Here's an example of the adr microformat:
32 Vassar st.
MIT 32-314
Cambridge, MA 02139
U.S.A.

N 42° 21'42.94
W 71° 05'28.36

It looks normal, but check out the source: the address has actually been marked up with extra XHTML. Software agents, either in the browser (see Operator) or scraping the page from elsewhere, should be able to understand the address.
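To make "software agents should be able to understand the address" a bit more concrete, here is a rough Python sketch of a scraper pulling the adr fields out of a page. The class names (`street-address`, `locality`, and so on) come from the adr microformat itself; the `SAMPLE` markup, `AdrParser`, and `parse_adr` are my own illustrative assumptions, not the actual source of this post or the code of any real tool.

```python
from html.parser import HTMLParser

# Property class names defined by the adr microformat.
ADR_PROPERTIES = {
    "post-office-box", "street-address", "extended-address",
    "locality", "region", "postal-code", "country-name",
}

# Hypothetical markup for the address above (an assumption for illustration).
SAMPLE = """
<div class="adr">
  <div class="street-address">32 Vassar st.</div>
  <div class="extended-address">MIT 32-314</div>
  <span class="locality">Cambridge</span>,
  <span class="region">MA</span>
  <span class="postal-code">02139</span>
  <div class="country-name">U.S.A.</div>
</div>
"""


class AdrParser(HTMLParser):
    """Collects the text of elements whose class is an adr property.

    Simplifications: assumes well-formed markup with matching end tags
    (no bare <br>), and does not check for an enclosing class="adr" root.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open tag: a property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in ADR_PROPERTIES), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only keep text that sits directly inside a property element.
        prop = self._stack[-1] if self._stack else None
        if prop and data.strip():
            self.fields[prop] = self.fields.get(prop, "") + data.strip()


def parse_adr(html):
    """Return a dict mapping adr property names to their text."""
    parser = AdrParser()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    for name, value in parse_adr(SAMPLE).items():
        print(f"{name}: {value}")
```

The point is only that the agent never has to guess which line is the city and which is the postal code; the class names say so.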

The registry is one attempt at combining a database of user-submitted structured data with totally freeform wiki pages: special Perl scripts provide a seamless interface between the registry database and what look like normal wiki pages with forms on them. However, that solution does not seem as flexible or granular as microformats; we need to find a way to make standardizing so easy that everyone will do it most of the time. Microformats are good at letting users standardize a little bit of information on any wiki page. With forms, we would have to anticipate in advance what that information would be and where it would go, which would be hard.

I imagine special little buttons on the wiki WYSIWYG editor, the one that appears when users edit a page, that format their information in the right way. A user could press the address button, which would produce a template of the XHTML right in their article, just like the link and media buttons do.
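As a sketch of what such an address button might insert, here is a small Python function that builds the adr markup from whatever fields the user fills in. The function name `adr_template` and its parameters are hypothetical; a real editor button would more likely be written in JavaScript inside the wiki, so treat this purely as an illustration of the output.

```python
from html import escape


def adr_template(street="", extended="", locality="",
                 region="", postal_code="", country=""):
    """Build adr microformat markup, skipping any field left empty.

    Hypothetical helper: the class names are from the adr microformat,
    everything else here is an assumption made for illustration.
    """
    parts = [
        ("street-address", street),
        ("extended-address", extended),
        ("locality", locality),
        ("region", region),
        ("postal-code", postal_code),
        ("country-name", country),
    ]
    lines = ['<div class="adr">']
    for css_class, value in parts:
        if value:
            # Escape user input so it cannot break the surrounding markup.
            lines.append(f'  <span class="{css_class}">{escape(value)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(adr_template(street="32 Vassar st.", extended="MIT 32-314",
                       locality="Cambridge", region="MA",
                       postal_code="02139", country="U.S.A."))
```

Leaving out empty fields keeps the inserted snippet short, which matters if the goal is for people to actually use it.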

EDIT: I just realized that Operator doesn't support the adr microformat (as I understand it), so I'm adding our lat & lon in the geo format.
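The geo format expects latitude and longitude as signed decimal degrees, so the degrees/minutes/seconds values above need converting first. Here is a minimal Python sketch of that arithmetic; `dms_to_decimal` and `geo_markup` are names I made up, and the six-decimal rounding is an arbitrary choice.

```python
def dms_to_decimal(hemisphere, degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South and west are negative, north and east are positive.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def geo_markup(lat, lon):
    """Wrap a decimal lat/lon pair in geo microformat class names."""
    return (
        '<span class="geo">'
        f'<span class="latitude">{lat:.6f}</span>, '
        f'<span class="longitude">{lon:.6f}</span>'
        "</span>"
    )


if __name__ == "__main__":
    lat = dms_to_decimal("N", 42, 21, 42.94)
    lon = dms_to_decimal("W", 71, 5, 28.36)
    print(geo_markup(lat, lon))
```

For the coordinates in this post that works out to roughly 42.3619 and -71.0912.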
