Data Mining

First it was tubes, now it’s pipes. I was watching a video from the New Media and Social Memory conference and still feel as perplexed as ever about “folksonomies.” It seems natural that pipes (which have walls as a matter of course) would reemerge as a useful construct for dealing with the multiple types and pathways of information on the net. I think the difficulty with this conceptualization is the huge gap between thinking of pipes as vectors (a direction that information can be made to follow) and as physical “pipes” subject to the limitations of physicality, with notions such as “bandwidth.” To consider a pipe without thinking of some limitations (as half-baked as Ted Stevens’s argument was) means that the meaning of “pipe” is closer to the concept of “vector” or “path.” I suppose I’d be much happier if Yahoo elected to call their approach “pathways” rather than pipes. But “vector” is really best of all, because a vector passing through any information cloud is sure to encounter information that has been mislabeled or ultimately doesn’t fit. That is the byproduct of aiming to universalize information.

As I was researching, trying to remember what was different about the programming concept of “pipe” and the physical one, I was sucked into a weird time loop. I haven’t programmed anything in decades; I started working with the 6502 processor in the early 1980s and then stopped completely about 1986. The 6502 is credited as having the first “instruction pipe,” but it dawned on me while reading the Wikipedia entry that it was actually more of a cache holding a single instruction ready to go. The Yahoo effort and Apple’s Automator (another “pipe” technology) are great if you already know what you’re doing. But the real hazard of this sort of vector approach is that in order to be effective you must limit the array of available operations to a carefully controlled, universal vocabulary (or instruction set). This is not the same as tying the tubes (as Ted Stevens would have it) but rather a matter of charting only predetermined destinations: the path to the CPU, to another process, or to executing a complex query. In short, walking only in the ruts.

I digress. Back to the folksonomy thing.



February 19, 2007 3:06 PM