Tuesday, January 30, 2007

Microformats

As far as I understand it, besides using templates and categories, there is no way to really structure the content on a mediawiki page. For example, every team on the iGEM 2006 wiki provided a picture and a project abstract somewhere. The information was available, but not accessible without visiting every single team's page and actively looking for it. This year, one of our goals for the iGEM 2007 website & wiki is to make sure this kind of information is tagged, or marked-up, or annotated, or put in a special area on a template, or by some other method standardized across all the teams. If information common to all teams is standardized, it will be much easier to find and reuse, from both a human and machine perspective.

I haven't learned much about them yet, but I'm excited about microformats (also see Alex Faaborg's blog). If you already know about them, please let me know what you think. Here's a popular definition from the microformats website: "simple conventions for embedding semantics in HTML to enable decentralized development." They are basically just standardized class names applied to ordinary xhtml tags, and so should be easy to integrate with mediawiki content. The biggest hurdle would be making them simple for users to use.

Here's an example of the adr microformat:
32 Vassar st.
MIT 32-314
Cambridge, MA 02139
U.S.A.

N 42° 21'42.94
W 71° 05'28.36

It looks normal, but check out the source - the address has actually been marked up with extra xhtml. Software agents, either in the browser (see Operator) or scraping the page from elsewhere, should be able to understand the address.
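To make the idea concrete, here is a minimal sketch of how a scraper might read such an address. The markup in SAMPLE is a reconstruction, not the original source of this page: it assumes the standard adr class names (street-address, extended-address, locality, region, postal-code, country-name) and the geo class names (latitude, longitude), with the coordinates written as decimal degrees. The parser class and function names are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical markup for the address above (a reconstruction, not the page's real source).
SAMPLE = """
<div class="adr">
  <div class="street-address">32 Vassar st.</div>
  <div class="extended-address">MIT 32-314</div>
  <span class="locality">Cambridge</span>,
  <span class="region">MA</span>
  <span class="postal-code">02139</span>
  <div class="country-name">U.S.A.</div>
</div>
<div class="geo">
  <span class="latitude">42.361928</span>
  <span class="longitude">-71.091211</span>
</div>
"""

# Class names this sketch recognizes as microformat properties.
FIELDS = {
    "street-address", "extended-address", "locality", "region",
    "postal-code", "country-name", "latitude", "longitude",
}


class MicroformatParser(HTMLParser):
    """Collects the text of elements whose class is a known property.

    Assumes well-formed xhtml (every tag closed) and that the text sits
    directly inside the tagged element, not in a nested child.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open tag: property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._stack.append(next((c for c in classes if c in FIELDS), None))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        name = self._stack[-1] if self._stack else None
        if name and data.strip():
            self.fields[name] = self.fields.get(name, "") + data.strip()


def parse_microformats(html):
    """Return a dict mapping property names to their text content."""
    parser = MicroformatParser()
    parser.feed(html)
    parser.close()
    return parser.fields


if __name__ == "__main__":
    for key, value in parse_microformats(SAMPLE).items():
        print(f"{key}: {value}")
```

The point of the sketch is that the scraper never needs to know where on the page the address lives; it only looks for the agreed-upon class names.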

The registry is one attempt at combining a database of user-submitted structured data and totally freeform wiki pages: special perl scripts provide a seamless interface between the registry database and what looks like normal wiki pages with forms on them. However, that solution does not seem as flexible or granular as the microformats; we need to find a way to make standardizing so easy everyone will do it most of the time. The microformats are good at letting users standardize a little bit of information on any wiki page. It would be hard to anticipate what or where that information would be in advance and then build forms.

I imagine special little buttons in the wysiwyg editor that appears when users edit a wiki page, each of which formats their information in the right way. A user can press the address button, which produces a template of the xhtml right in their article, just like the link and media buttons do.

EDIT: I just realized that Operator doesn't support the adr microformat (as I understand it), so I'm adding our lat & lon in the geo format.
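As far as I know, the geo format expects latitude and longitude as signed decimal degrees, so the degrees/minutes/seconds values shown above have to be converted first. A small sketch of that arithmetic (the function name is made up for illustration):

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South and West come out negative, North and East positive.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


if __name__ == "__main__":
    # The coordinates from the address example above.
    print(round(dms_to_decimal(42, 21, 42.94, "N"), 6))
    print(round(dms_to_decimal(71, 5, 28.36, "W"), 6))
```

For the coordinates in this post that works out to roughly 42.361928 and -71.091211.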

Monday, January 29, 2007

Diving into Drupal, or, List 'o' Modules

This post of links should help me dive deeply into drupal.

First, getting started with content in Drupal: Node Types.
Also see the docs on Drupal's Taxonomy system.
Also see the great IBM intro to developing a collaborative web site.

Most of the non-essential core modules listed here will be useful (comment, forum, node, profile, search, statistics, taxonomy, tracker, user).

Here's a list of interesting contributed modules I'd like to learn more about:
  • Simplenews: a simple newsletter module that lets both anonymous and registered users subscribe to different newsletters
  • Massmailer: manage mailing lists (based on PHPList)
  • Listhandler: synchronize mailing lists and forums
  • Organic Groups: Enable users to create collaborative groups (incompat. w/ taxonomy?)
  • Organic Groups List Manager: integrated mailing list/forum for OGs.
  • Node Vote: a node voting system
  • Interwiki: wiki syntax for linking
  • URL Filter: automatically turn URL text into hyperlinks
  • User Points: users gain points as they do certain actions
  • Tagadelic: weighted tags in a cloud
  • Privatemsg: an internal messaging system
  • Pathauto: automatic path aliases for nodes and categories
  • Biblio: manage lists of scholarly publications (including .pdf upload)
  • SPAM: tools to stop unwelcome posts
  • Services: standardized API for drupal
  • Google Analytics: free advanced web stats

Installing Drupal 5.0 on OS X 10.4

I started by archiving and emptying /Library/Webserver/Documents (I'm using OS X), then unpacked the drupal-5.0 archive there (without the containing drupal-5.0 directory).

I couldn't find the .htaccess file mentioned by Lullabot. I think it might have gotten lost as I moved the files around graphically using the Finder. It was present, however, when I used the commands suggested in the install instructions. Then I got the following error navigating to the install directory with firefox:
The Drupal installer requires write permissions to ./sites/default/settings.php during the installation process.

So my permissions are effed up. Boo. Just doing

chmod 777 ./sites/default/settings.php


seems to have corrected the problem.
The location of the programs for connecting to and administering my mysql database hasn't been added to my PATH, so I have to remember to go to /usr/local/mysql/bin for now to run them. I created a user for the drupal database as outlined in INSTALL.mysql.txt.

Great. That seems to have worked. I changed the permissions on the settings.php (back) to 644. And Drupal 5.0 is online. Amazing.

Made a ./files directory and
sudo chown www:admin files
sudo chmod 755 files


Apparently I didn't install the GD library with PHP, so I'll have to recompile it.
Oh! Looks like I can use a precompiled binary put together by www.entropy.ch. I've got Apache 1.3.33. To successfully install the new PHP module (with GD support!), I've got to uncomment the lines that enable the pre-installed PHP module in SERVER_CONFIG_FILE="/etc/httpd/httpd.conf"

Goody! Now my system has PHP 5.2.0 with a bunch of libraries, including the GD library. However, the switch seems to have broken the connection between MySQL and PHP - Drupal can't get to the MySQL database. Ah, PHP is looking for the MySQL socket in /tmp/mysql.sock - exactly the opposite of the problem I was having when installing Vanilla forums a couple weeks ago. The solution was to change the php.ini file (now located at /usr/local/php5/lib/php.ini) to override the compiled-in default and look for the MySQL socket at /var/mysql/mysql.sock, which is "more secure," according to this Apple developer document. It took me a while to realize that the personal web sharing control panel buttons weren't actually causing httpd to restart and reload the php.ini file, and neither was apachectl graceful, for reasons I don't understand. Rebooting did the trick.
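When PHP and MySQL disagree about the socket like this, it helps to check which of the two locations actually holds a socket before editing anything (the php.ini setting involved should be mysql.default_socket). A rough sketch, assuming only the two paths mentioned in this post; the function name is made up:

```python
import os
import stat

# The two socket locations mentioned in this post.
CANDIDATE_SOCKETS = ["/var/mysql/mysql.sock", "/tmp/mysql.sock"]


def find_mysql_socket(candidates=CANDIDATE_SOCKETS):
    """Return the first candidate path that is a Unix socket, or None."""
    for path in candidates:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            continue  # path is missing or unreadable
        if stat.S_ISSOCK(mode):
            return path
    return None


if __name__ == "__main__":
    found = find_mysql_socket()
    print(found if found else "no MySQL socket found (is mysqld running?)")
```

If the path it reports differs from the one PHP is configured to use, that mismatch is the thing to fix.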

The only task left is to set up the cron jobs. Then I'll have a completely generic drupal install.

Edit: got the cron working (with curl - wget and lynx are not pre-installed binaries on OS X). There was a little trick to get the clean URLs to work - the httpd.conf file has to be changed to allow .htaccess overrides. The Mac OS X specific guidelines explain it:
In httpd.conf (in /etc/httpd), locate the following section and allow overrides, so that Drupal's clean urls will work (they depend upon rewrite rules in .htaccess). You'll need to be root (or sudo) to do this. Don't forget to restart apache after modifying httpd.conf (turn personal web sharing off, then back on again, or use /usr/sbin/apachectl restart).

Friday, January 26, 2007

Content Abstraction in the Joomla! CMS

I've been evaluating Joomla and Drupal (and looked briefly at Plone) for the CMS of iGEM2007.com. Superficially, I've gotten some bad vibes about Drupal from the developer of a big site that is based on it, popsugar.com, and from attendees of MashupCamp3. Joomla seems newer and brighter and, in a way, more promising. But it doesn't seem to have as robust documentation or as established a user & development community as Drupal, and it also doesn't seem to be as flexible in terms of extending it in ways the core developers hadn't expected. I feel pretty confident that we could use and extend Drupal to do what we want, but I'm not sure about Joomla, so I'm giving it one last hard look-over. I'd like to use it if we can.

I found the following buried in a visual tutorial in part of the Help section of Joomla.org. I don't know why it isn't prominently displayed on the first page of the developer docs - I think it should be.
One way of looking at Joomla is that a Joomla site really only consists of one page (plus a lot of content stored in a database). As you click on menu items, Joomla rebuilds the content of that one page, as if you are navigating to another page.

When you click on a menu item, that menu item is going to load a single main piece of content, such as an article or list of articles, or calendar or whatever else into wherever you've specified to be the main content area in index.php.

In addition, modules, that is, all of the smaller items such as menus, are going to either display or not depending on whether you've configured them to show for that menu item.

The idea of Joomla is to build a site not by creating pages, but by configuring menu items. A given menu item will load a particular main content item, plus whatever modules you want. The modules are always displayed in the location as set up by index.php. (However, you can make it appear as if modules move to different places on the page, however, by cloning modules, placing them in more than one location, and then hiding or showing them depending on the menu item.)
Great insight! But I'm still leaning towards drupal...

Friday, January 19, 2007

Promote / Digg anything

In the post about Team Blogs I mentioned a page that would show the latest updates to all the team blogs and some kind of "Digg It" system that would let community members promote a cool blog post to the Community News Feed on the main page. This promotion mechanism was a consistent theme in the brainstorming sessions, and the general consensus was that it would be really cool if, once a user was logged in, she or he was able to Digg just about anything - any thread in the forum, any blog post, any part in the registry, and any wiki page.

It would probably be useful to see a list of all recently Dugg items, and certainly it would be useful to see a "Most diggs of all-time" list. Furthermore, I think it would provide a real incentive to write interesting posts in the team blogs, particularly if getting a post dugg (just once? how many times?) would promote it to the igem2007 main page and into the Community News RSS feed.

But despite the simplicity of the Digg operation, it seems like this could be a pretty complicated thing to implement across the entire site. And how do we deal with team members digging each other's content?
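Here is one possible shape for the bookkeeping, just a sketch and not a decision. It assumes three rules that are still open questions: each user can digg an item once, diggs from members of the team that posted the item don't count, and an item is promoted once it reaches a threshold. The class name, method names, and the threshold of 3 are all made up for illustration.

```python
from collections import defaultdict

PROMOTE_THRESHOLD = 3  # hypothetical; how many diggs it should take is undecided


class DiggBoard:
    """Tracks diggs for any kind of item (blog post, forum thread, part, wiki page)."""

    def __init__(self, threshold=PROMOTE_THRESHOLD):
        self.threshold = threshold
        self._diggs = defaultdict(set)  # item id -> usernames who dugg it
        self._owner_team = {}           # item id -> team that posted it

    def add_item(self, item_id, owner_team):
        self._owner_team[item_id] = owner_team

    def digg(self, item_id, user, user_team):
        """Record a digg and return True if it counted."""
        if item_id not in self._owner_team:
            raise KeyError(item_id)
        if user_team == self._owner_team[item_id]:
            return False  # assumed rule: teammates can't promote their own team's content
        if user in self._diggs[item_id]:
            return False  # assumed rule: one digg per user per item
        self._diggs[item_id].add(user)
        return True

    def count(self, item_id):
        return len(self._diggs[item_id])

    def promoted(self):
        """Item ids with enough diggs for the front page, most dugg first."""
        hits = [i for i in self._owner_team if self.count(i) >= self.threshold]
        return sorted(hits, key=lambda i: (-self.count(i), i))


if __name__ == "__main__":
    board = DiggBoard(threshold=2)
    board.add_item("blog/42", owner_team="MIT")
    board.digg("blog/42", "bob", "Cambridge")
    board.digg("blog/42", "carol", "Edinburgh")
    print(board.promoted())
```

Because items are identified only by an id string, the same structure could sit behind forum threads, blog posts, registry parts, and wiki pages alike, which is most of what "digg anything" needs.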

Another way of thinking about the digg feature might be to contextualize it as a "Favorite this" operation. Users should click the favorite button next to any content that they want to show up in a dynamic list. I could see myself clicking this for help documents or discussion topics I like to visit often... but I don't think it would work for promoting blog posts. Hmm.

Team Blogs

I want the new site to provide a window into the experience of iGEM as it is happening, and I think the wikis last year did a poor job of capturing the "instantaneous narrative" of each team, although to be fair, that was never an articulated goal. Nonetheless, I am sure blogs would do a much better job than wiki pages. My intuition is that even if their usage won't be any more familiar to team members than that of a wiki, their purpose is much more clear. What I mean is that the very concept and structuring of a blog suggests a certain kind of information that a wiki, even a page called "team blog", wouldn't because of the associations with openness and flexibility and revision wikis usually have.

That said, even if we provided a little dialog box on the team portal page for team members to update their team blog, would anyone use it? It's hard to say. I hope that they would, and Meagan and I want to set a good example and have a registry blog running by the springtime. The purpose of the blog is for the teams to share their experience with the community. We are going to provide a central aggregation page showing the latest posts from all of the teams, the ability to Digg or Promote the coolest posts to the Community News Feed on the front page, and RSS feeds for everything. Hopefully this will provide enough of an incentive for at least someone on each team to post to the blog every week or so.

Structuring Team & User Data: Team Portal pages

Prelude
James Brown, Brendan Hickey, Kim De Mora, Randy, Meagan, Tom, and I have talked a lot about what features we want the igem2007.com site to have. I have about 15 pages of scribbled notes and drawings exploring and defining these ideas with various degrees of clarity, but nothing that ties them all together. So I'm going to describe each of the main features we want separately, in the hope that doing so will make it easier to tie them all together afterward. And because this is a blog, I'm going to describe each one in a separate blog post, mostly so that it will be easier to comment on specific ideas, but also because hey, blogs are supposed to be bite-size.

Structuring Team & User Data: Team Portal pages

One of the main focuses of the new site is dynamic content. The main page is a place that everyone in the community should want to visit frequently because it provides an instantaneous snapshot of the state of the community. To that end the site must be built so that the content users and teams put online can be aggregated automatically into composite pages like the main page. Last year we required the teams to build at least one page on our wiki to represent themselves to the community, and encouraged them to put up much more content - but we left the organizational scheme up to them. The content and structure each team settled on was often similar, but never the same, and so a fair amount of effort was required to find the same information for different teams.

So, the new site will feature a Team Portal page for each team, dynamically generated from content teams upload via specific forms or perhaps even wiki pages (marked up using microformats so we could scrape the pages?). Here's a list of specific pieces of content we will want from each team:
  • Team name (short name, long name, official name)
  • Team picture (800px wide)
  • School logo (128px x 128px)
  • Project Abstract
  • list of links to main wiki pages
    • Elaboration of Project description, updates
    • Calendar
    • Protocols
    • Etc. ( other favorite links)
  • Team Blog
All of the data will be arranged in the same way on each portal page, ensuring a visitor to any team's portal page will be able to find the same information where they expect to find it.
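As a rough sketch of what "the same data for every team" could look like behind the scenes, the list above maps onto a simple record plus a check for what a team still has to fill in. The field names here are made up for illustration, and nothing about how the data is stored has been decided.

```python
from dataclasses import dataclass, field, fields


@dataclass
class TeamPortal:
    """One team's portal data; every field mirrors an item in the list above."""
    short_name: str = ""
    long_name: str = ""
    official_name: str = ""
    picture_url: str = ""   # team picture, 800px wide
    logo_url: str = ""      # school logo, 128px x 128px
    abstract: str = ""
    wiki_links: dict = field(default_factory=dict)  # label -> wiki page
    blog_url: str = ""


def missing_fields(team):
    """Names of the fields the team has not filled in yet, in declaration order."""
    return [f.name for f in fields(team) if not getattr(team, f.name)]


if __name__ == "__main__":
    team = TeamPortal(short_name="MIT", abstract="A project abstract goes here.")
    print(missing_fields(team))
```

A list like the one missing_fields returns could drive a "your portal page still needs..." reminder on each team's page.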

I believe that part of the problem with the team wikis last year was that there was a general ambiguity about what was to be done with them. Teams knew they had to put something online, but there were no clear or specific guidelines as to what it was we wanted or how to do it. It's the structured data problem again. So in a lot of cases, those teams that did use the wikis ended up using them for two purposes: for project management, i.e. scheduling and posting the results of experiments, and for telling the team's story, i.e. introducing the members and giving the background and progress of their project. I think the wiki is an ok way for teams to get this content online, but we have to help them be more intentional about what information goes where.

Thursday, January 18, 2007

MashupCamp3 (at MIT) - post 2

Lots of links from MashupCamp3. I really should put them into del.icio.us, but my tags are so messy and unorganized I will just list them here until I have time to overhaul my bookmarks.
Be sure to check out Dapper & openkapow - they are awesome. The names alone are almost enough for this crappy non-metadata flat list, and I'll add tags and descriptions when I put them into del.icio.us, but just how much more useful will the contextualized collection be? What are the positive benefits, in all practicality, of using a site like del.icio.us? Network effects of the folksonomy?

Wednesday, January 17, 2007

MashupCamp3 (at MIT) - post 1

Well, I got back from BioSysBio 2007 and 2 great days of iGEM2007 work in Cambridge, UK yesterday afternoon, crashed for about 12 hours, and woke up for MashupCamp3, which happens to be located at Hotel@MIT this year and occurring today and tomorrow. I'm really hoping I get a chance to run some of the iGEM2007 ideas by the experience and expertise concentrated at this conference. Right this second, everyone in the room here at MashupCamp3 is determining the schedule of sessions for the day, because MashupCamp is an Unconference. It seems like many of the sessions are about mobile technologies and geospatial mashups.

BioSysBio

Well, BioSysBio 2007 has come and gone! What a blast! I was on the organizing committee and contributed mainly by videoing the talks and posting them on google video. About 2/3 of them are online now - just search for BioSysBio on google video. I'd like to post more about the conference and about some of the incredibly productive brainstorming sessions about the iGEM2007 website I had with James, Kim, & Brendan over the two days following the BioSysBio.... but, I'm at another conference right now, MashupCamp3 (@MIT)....

Saturday, January 06, 2007

Vanilla Forums... so sweet, so tasty

Vanilla is a forum engine that is simply beautiful. It is not clunky. It seems straightforward. It seems very extensible. It rocks. And I am hoping it will make one sweet foundation for our community forums at the igem2.0 website.

Anyway, after reading a bunch of the information at onLamp.com, I downloaded and installed MySQL 5.0.27 and enabled the PHP module in apache. There was some confusion in all the documentation I was following as to how to set the mysql socket to the right location... apparently the binary I installed from MySQL.com (for Mac OS X 10.4 (x86)) sets the socket to /tmp/mysql.sock, but PHP expects it to be at /var/mysql/mysql.sock. Two Apple docs here and here indicate that /var/mysql/mysql.sock is better for security purposes, so I attempted to change the MySQL defaults to move the socket by making a MySQL configuration file at /etc/my.cnf that contained
[mysqld]
socket=/var/mysql/mysql.sock

[client]
socket=/var/mysql/mysql.sock
Unfortunately, this didn't quite fix things immediately, but with some very non-scientific, uncontrolled fiddling and reconfiguring, I got the server online. So now we have a pretty cool forum. Oh, also, for the record, I tried hardening the default install of MySQL by removing anonymous access and defining real, encrypted passwords for the remaining accounts... but I have no idea what gaping holes I'm leaving open.

Next up, extensions for Vanilla and choosing a blog content management system.

Friday, January 05, 2007

parts.mit.edu/igem 2.0

Over the next month or so I'll be redesigning and rebuilding the iGEM website. My design goals are to make it prettier, simpler, easier to use, much more dynamic, and above all, overflowing with features and functionality that naturally generate a much stronger sense of community than what we currently have.

So far Randy and Meagan and I have done some brainstorming and I've started doing some detailed concept sketches of the features we talked about. After I finish with those, I'll make some even more detailed mock-ups in photoshop. I'll explain and show a bunch of these features in another post. This is just to get my foot through the blogging door again.

While the main site is in development, we want to have a discussion board and mailing list online at a skeletal interim site about iGEM2007. I am really excited about the Vanilla discussion forum. It looks... awesome. Today my goal is to get a test version of it running on my machine here, blamo. I found some nice tutorials on getting into the superficial layers of apache in OS X at O'Reilly's onlamp.com.