Tuesday, August 01, 2006

On Googleability and Hyperlinkability

This is something I wrote some time ago when I was at Yale University's Technology and Planning group. Note that gmane has addressed some of my concerns.


The Sakai Communications Working Group recently polled a list I'm on for feedback about the tools in support of collaborating on Sakai. This is what I had to say:


Please contact the Sakai Communications WG at sakai-comm@collab.sakaiproject.org with any comments. Thanks in advance for any feedback.

Most importantly, we want to know how you use the current collaborative tools available to the Sakai community (Confluence, Jira, SakaiCollab, Project MatchmakingTool, etc.) What do you like or dislike about the current format? What is the most important feature for you in collaborating with others?


My comments are specifically about googleability and hyperlinkability as important features in collaboration software in support of Sakai-related work.

Email archives within Sakai currently lack two key features. Firstly, the archives are not Google-indexable. This means that content on otherwise public lists (such as sakai-user, sakai-dev, sepp-ent, etc.), to which anyone can subscribe and which anyone can browse by creating an account on collab.sakaiproject.org, is not searchable via a generic web search, e.g. from www.google.com.

Secondly (and technically related), these archives are not hyperlinkable. I can't (or at least, I don't see how to) produce a hyperlink that will take you directly to an archived message.

Lacking both of these features makes Sakai email archives less valuable than they might otherwise be.

Other list archival software has these features.


  1. JA-SIG uPortal developer list archives

  2. CAS user list archives



It would be an improvement for the public Sakai email lists to have archives with these features. This can probably be accomplished as a supplemental email archive driven by an archival user / email address subscribed to the lists.
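As a rough illustration of the supplemental-archive idea, here is a minimal Python sketch of what an archival subscriber might do with each message it receives: derive a stable page name from the Message-ID header and render the message as a static HTML page that can be linked to and crawled. The function names and the page layout are my own assumptions for illustration, not part of Sakai or of any existing archiver, and mail delivery and file storage are left out.

```python
import hashlib
import html
from email import message_from_string
from email.message import Message


def permalink_slug(msg: Message) -> str:
    """Derive a stable file name from the Message-ID header (hypothetical scheme)."""
    message_id = msg.get("Message-ID", "")
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    # The same Message-ID always yields the same name, so links stay valid.
    return digest[:16] + ".html"


def render_message(msg: Message) -> str:
    """Render one plain-text message as a minimal static HTML page."""
    subject = html.escape(msg.get("Subject", "(no subject)"))
    sender = html.escape(msg.get("From", "(unknown sender)"))
    body = msg.get_payload()
    if not isinstance(body, str):
        # Multipart messages return a list of parts; not handled in this sketch.
        body = "(multipart message; body omitted in this sketch)"
    return (
        "<html><head><title>" + subject + "</title></head><body>"
        "<h1>" + subject + "</h1>"
        "<p>From: " + sender + "</p>"
        "<pre>" + html.escape(body) + "</pre>"
        "</body></html>"
    )
```

Writing each rendered page under a publicly served folder, named by its slug, would give every message its own URL; an index page linking to those files would give a crawler a way in.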

The advantage of Googleable, hyperlinkable email archives is that conversations on the email lists can later be reused with minimal additional effort, either to answer questions on list ("Your question was answered in this email thread") or to keep questions from being brought up on list again, because the answers are found via Google searches.

Making conversations on the email lists Googleable and hyperlinkable, transparently and automatically, frees the harvesting of information from the list archives into places like the SakaiPedia to be more of a value-adding effort of organizing and refining the product of discussions on the list, and less of a required effort just to make content Googleable and hyperlinkable.

Using Sakai Resources as the mechanism whereby PDFs, Word documents, Powerpoints, and other documents are distributed does currently offer hyperlinkability.

This link will take you to one particular set of minutes of the SEPP-Enterprise working group:

Minutes of the SEPP-Enterprise working group

However, such links are not often indexed by Google.

In contrast, placing documents in a publicly accessible Apache folder, for instance, does make them available for Google indexing:

Dr. Chuck's talks

A powerpoint from among Dr. Chuck's talks

I'd suggest that publicly available documents used for Sakai collaboration, which are currently stored as archived attachments to email messages, as Resources in the collab.sakaiproject.org Sakai instance, or as attachments to Wiki pages, would ideally be presented at URLs that Google is able to index, so that they can be included in Google search results. This gets maximum value out of a document-production investment.

One way to go about achieving this would be to make Sakai able to produce Apache-like directory indexes of content.

http://collab.sakaiproject.org/access/content/ 

would produce a directory-index-like list of links corresponding to sites with publicly available content. Those links might look something like this:

http://collab.sakaiproject.org/access/content/group/1103227538796-57571/ 

and would in turn produce a directory-index-like list of links to folders of content (the content tree), ultimately providing links to the actual entries.

Google's entry point becomes

http://collab.sakaiproject.org/access/content/ 

from which it can find and index all the publicly-available content.
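To make the directory-index idea concrete, here is a minimal Python sketch of rendering an Apache-style index page for one level of the content tree. The function name `render_index` and the convention that folder names end in "/" are assumptions made for this illustration; how Sakai would actually enumerate publicly available content is not covered.

```python
import html
from typing import Iterable
from urllib.parse import quote


def render_index(base_url: str, names: Iterable[str]) -> str:
    """Return an Apache-style index page linking to each child of base_url.

    Names ending in "/" are treated as folders, everything else as documents
    (an assumed convention for this sketch).
    """
    if not base_url.endswith("/"):
        base_url += "/"
    items = []
    for name in sorted(names):
        # quote() percent-encodes spaces and leaves "/" untouched by default.
        href = base_url + quote(name)
        items.append(
            '<li><a href="' + html.escape(href) + '">' + html.escape(name) + "</a></li>"
        )
    title = "Index of " + html.escape(base_url)
    return (
        "<html><head><title>" + title + "</title></head><body>"
        "<h1>" + title + "</h1><ul>" + "".join(items) + "</ul></body></html>"
    )
```

Because every entry is a plain anchor at a stable URL, a crawler starting from the top-level page can walk down through the folders to the documents themselves without logging in.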

In general, it's important that as much content as possible is available without logging in, at URLs that Google can find, follow, and index, and that people can bookmark and share.


Our goal is a simple way to access and learn about existing projects, to start development on new projects, and to disseminate information to the community. This environment will give the public a straightforward way to browse existing Sakai tools and participate in tool and/or core development. Our goal is to have a board approved tool and model by December the latest.


Best wishes in this endeavor. I hope these comments will help inform the effort.
