Hi, confusing switches-of-first-person-and-third fans, and welcome back to my OSCON incoherentfest. You join us slamming into the Marriott, late enough to miss the redeye all-Dyson triple-slam keynote completely. Sorry. I ended up staying up too late trying to stop Ada bowling into Larry Wall’s legs, tipping him over Nat Torkington’s hotel balcony, and other acts of chaotic-god toddler action. So you get the first post-keynote session: Edd Dumbill’s (whose name I always misspell) DOAP talk.
Aim: to cut down the amount of work a software maintainer has to do to get the news out about their work. Too many project registries: Freshmeat, OSDir, GNOME software map, blah, blah, blah. Hard to keep up to date. Flip-side is if you’re trying to maintain your own registry, it’s hard to keep track.
Goals: it had to cope with internationalized descriptions. There had to be tools for the creation and consumption of these descriptions. It had to have interoperability with other Web metadata – FOAF, Dublin Core, RSS.
Use cases: easy importing of details into registries; exchange between registries; automatic configuration of resources – finding CVS repositories or bug trackers; and assisting packagers.
Tried to learn from recent metadata successes and failures. Dublin Core is a double-edged sword; mostly goodness, great documentation, raising awareness. Not done so well is that they underspecified in various ways, so there are questions about how to use certain terms (what’s “author”? Name? E-mail address? URL?). RSS: very messy history, suffered from underspecification too. (Edd was involved in RSS 1.0, which was very much not underspecified, if I recall). ebXML is an electronic business vocabulary – boring, but they have schema and lots of documentation. HTML – hard to retrofit validation. Lessons Edd drew from this: docs, interop, schema, community.
XML or RDF, that is the question. Straightforward XML? Or RDF? XML comes in many flavours, though: well-formed, with a W3C XML Schema (huge processing overhead; Edd doesn’t like), RELAX NG (feels a lot more lightweight, but has its own issues). RDF provides “webby data” – semantics as well as syntax. Edd likes RDF.
Surveyed existing work: Freshmeat, SourceForge, GNOME, KDE, Open Metadata Framework, Advogato. Here he shows a huge spreadsheet of relative features like mirror site lists, purchase links, demo site, license, etc. He thought the social relations between developers and projects were particularly important, because egoboo is so important. And screenshots! Must have screenshots!
Fields that weren’t anywhere else, but he stuck in: non-CVS repositories, wikis, more project roles: translator (woefully underrecognised), testers, etc. Spread the egoboo.
Biggest issue: how do you uniquely identify a project? Can’t use name, names change, names clash. How about a URL? But what URL? What happens if you lose that domain? He picked homepage, but what if homepage moves? So Edd added “old homepage” property. If two DOAP descriptions share a homepage (either their current or their old homepage), then they’re the same project.
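For my own benefit, here's roughly how I understand that matching rule, as a minimal Python sketch. The `Description` class, its field names, and `same_project` are all made up by me for illustration; they aren't from Edd's slides or any DOAP tool, and I'm comparing URLs as plain strings with no normalization.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """Hypothetical stand-in for one DOAP description (not a real DOAP API)."""
    name: str
    homepage: str
    old_homepages: set = field(default_factory=set)

    def identity_urls(self):
        # The current homepage plus every former homepage.
        return {self.homepage} | set(self.old_homepages)


def same_project(a, b):
    # Two descriptions are the same project if they share any homepage URL,
    # whether it is current or old on either side.
    return bool(a.identity_urls() & b.identity_urls())


# The project moved: its new description records the previous homepage.
moved = Description("Foo NG", "http://foo.example.org/", {"http://example.com/foo/"})
stale = Description("Foo", "http://example.com/foo/")
other = Description("Bar", "http://bar.example.org/")

print(same_project(moved, stale))
print(same_project(moved, other))
```

Note the names differ ("Foo" versus "Foo NG") and the match still holds, which is the whole point of not keying on names.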
What about license? So that people can compare licenses in different projects, we give a unique URI for each license. He’s defined URIs for common licenses. If you actually resolve those URIs you get an RDF file that points to the GNU site, etc. (Edd makes the argument that FSF might move their license, so he’s pointing to his own URIs. But what if Edd loses his domain or gets hit by a truck? Not sure this isn’t just adding an indirection, and not solving the problem.)
Shows a simple DOAP file. Looks good, nice and simple: DOAP file pulls in FOAF and RDF namespaces to define a bunch of stuff. Looks easily creatable with a template.
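I didn't copy down the file from the slide, so the sample below is my own made-up project, not Edd's example. It's a rough Python sketch of what consuming such a file might look like, using the DOAP, FOAF and RDF namespaces. Big caveat: this treats the file as plain XML via the standard library, which only works for this one nesting style; a real consumer would want a proper RDF parser.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DOAP = "http://usefulinc.com/ns/doap#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Invented sample description; the project and maintainer are hypothetical.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:doap="http://usefulinc.com/ns/doap#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <doap:Project>
    <doap:name>Example Project</doap:name>
    <doap:homepage rdf:resource="http://example.org/"/>
    <doap:shortdesc xml:lang="en">A made-up project.</doap:shortdesc>
    <doap:maintainer>
      <foaf:Person>
        <foaf:name>A. Maintainer</foaf:name>
      </foaf:Person>
    </doap:maintainer>
  </doap:Project>
</rdf:RDF>
"""


def summarize(xml_text):
    """Pull a few fields out of a DOAP file laid out like SAMPLE."""
    root = ET.fromstring(xml_text)
    project = root.find(f"{{{DOAP}}}Project")
    homepage = project.find(f"{{{DOAP}}}homepage")
    return {
        "name": project.findtext(f"{{{DOAP}}}name"),
        # The homepage is a resource reference, so the URL is an attribute.
        "homepage": homepage.get(f"{{{RDF}}}resource"),
        # Every foaf:name nested anywhere under the project element.
        "people": [el.text for el in project.iter(f"{{{FOAF}}}name")],
    }


info = summarize(SAMPLE)
print(info)
```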
Tools: need a Creator, Viewer, and a Validator. Someone else wrote a DOAP-a-matic. There’s also a rel link for autodiscovering projects on Webpages. Someone else has written a Firefox plugin that shows a colourful human-readable version of the DOAP data if an HTML page has a link to a DOAP file in it. Edd is writing a toolkit for validating, written in Mono.
Participants: OSDir.com and GNOME Software Map are already interested, looking to engage others. Needs more tools: things like autoconf, distutils and MakeMaker should spit out info about the project they’re managing.
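The autodiscovery bit is easy to imagine in code. A hedged sketch in Python follows: I didn't note the exact attributes Edd showed, so I'm assuming a `<link>` element with `rel="meta"` and `type="application/rdf+xml"`; treat both values, and the `DoapLinkFinder` name, as my guesses rather than the spec.

```python
from html.parser import HTMLParser


class DoapLinkFinder(HTMLParser):
    """Collects hrefs of <link> tags that look like DOAP autodiscovery links.

    Assumption: rel="meta" plus type="application/rdf+xml" marks such a link.
    """

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        is_rdf = a.get("type") == "application/rdf+xml"
        if "meta" in rels and is_rdf and a.get("href"):
            self.hrefs.append(a["href"])


def find_doap_links(html):
    finder = DoapLinkFinder()
    finder.feed(html)
    return finder.hrefs


# Hypothetical page: one stylesheet link, one DOAP-style link.
PAGE = """
<html><head>
  <link rel="stylesheet" type="text/css" href="/style.css">
  <link rel="meta" type="application/rdf+xml" title="DOAP" href="/doap.rdf" />
</head><body>Hello</body></html>
"""

links = find_doap_links(PAGE)
print(links)
```

Something like this is presumably what a browser plugin would do before fetching the file and rendering it.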
Q and As: Edd says he’s deliberately staying out of category discussions. You can add categories by just pointing to the URI of the category, so you can point to Freshmeat categories, Debian tags, etc. And of course because it’s RDF, you can assert relationships between those categories.