James Leftwich, IDSA (jleft) Thu 31 May 07 12:21
I anticipated the emergence of Tags (keyword-based metadata), but also other classes of objective and subjective metadata, and ways in which these could be associated with visual, spatial, and behavioral attributes and interactively used to explore and understand information and search results on a level far beyond our present standard methods. The following slide, <http://www.well.com/user/jleft/orbit/vizrev/slides/8.html>, shows a model for a desktop that has become a visualization space for viewing search returns (or directory or collected file/data sets). At the lower left you see a "Filter Set" which is being used, and which consists of a set of metadata types (AV1 through AV16 in the illustration) that have been associated with some particular visual, spatial, or behavioral attribute (volume, height, color, shape/form, opacity, spatial location, etc.). These filters could be used passively (running new data through them, for example "What's popular on YouTube today?" or "New Flickr Photos"), or could be interactively "twiddled," with the sliders moved up and down between full on and full off. The result is the ability to interactively explore and tease out different types of comparative relationships among very large sets. As I've said numerous times, so much of the traditional data research has been about "boiling down" vast sets of information to a simplified form, rather than using metadata to make it interactively visualizable, which then produces something that our highly evolved visual cortexes can easily understand in a pre-cognitive manner. I use the example of a lawn to illustrate this point. When I walk outside, I don't see a sign that says, "I'm a lawn. I have 72,345,432,847 blades of grass in me. Here are the first ten." I see instead the entire lawn and my visual cortex can quickly spot variations in length, dandelions, stray toys, etc.
Of course a lawn, even as complex as it is, is relatively static in that you cannot then ask it to rearrange or revisualize itself using new metadata attribute filter sets. But it illustrates the underlying point. Humans will not be able to fully move into the information age until we begin to harness the power of next-generation metadata models coupled with interactive visualization capabilities. I'm less interested in using computers to think for me (boil things down), than I am in using computers to convert complex data and metadata (which can come from many sources in addition to coming attached to data) into something that the visual cortex is already evolved to process much more efficiently than through traditional, primarily textual and static means.
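A minimal sketch of the "Filter Set" idea described above, for readers who want something concrete. Everything in it is an assumption made for illustration and none of it comes from the slide: the names `Filter` and `apply_filter_set`, the metadata keys, and the choice to scale each value to a 0-to-1 range. Each filter ties one metadata type to one visual attribute, and `weight` stands in for the slider position between full off (0.0) and full on (1.0).

```python
from dataclasses import dataclass


@dataclass
class Filter:
    metadata_key: str       # hypothetical metadata type, e.g. "views"
    visual_attribute: str   # hypothetical visual attribute, e.g. "height"
    weight: float = 1.0     # slider position: 0.0 = full off, 1.0 = full on


def apply_filter_set(items, filters):
    """Map each item's metadata to visual attribute values in the range 0..1."""
    if not items:
        return []

    # Find the range of every metadata type once, so values can be compared.
    ranges = {}
    for f in filters:
        values = [item[f.metadata_key] for item in items]
        ranges[f.metadata_key] = (min(values), max(values))

    rendered = []
    for item in items:
        visual = {}
        for f in filters:
            lo, hi = ranges[f.metadata_key]
            span = hi - lo
            # Items that all share one value carry no contrast, so use 0.0.
            normalized = (item[f.metadata_key] - lo) / span if span else 0.0
            visual[f.visual_attribute] = normalized * f.weight
        rendered.append(visual)
    return rendered


items = [
    {"views": 0, "age": 10},
    {"views": 50, "age": 10},
    {"views": 100, "age": 10},
]
full_on = apply_filter_set(items, [Filter("views", "height", 1.0)])
half_on = apply_filter_set(items, [Filter("views", "height", 0.5)])
```

Moving the slider only changes `weight`, so the same set can be re-rendered repeatedly without touching the underlying data, which is the interactive "twiddling" part of the idea.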
Harmless drudge (ckridge) Thu 31 May 07 12:41
>But tags are entirely new, already have several meanings, and are used to define material in many different contexts.< Look at the index of any scholarly work, and you will see a set of tags selected by the author of that work, or by someone hired to do the job. These terms are not controlled; they were selected for that specific work. Similarly, every local library once did its own subject cataloging, according to the specific needs of that collection. The use of standardized, controlled subject headings is relatively recent. There is no hope of getting anyone but users to catalog most of the internet, or even most sites on the internet, and that means uncontrolled tagging, for the most part. I can imagine one place where one might want authority control, though, and that is in marking terms as related to one another or as synonyms of one another. You are going to get furious arguments as to whether "Holocaust, 1939-1945" should pull up a reference to "Urban folklore." Some people will insist that "Taiwan" should be treated as a synonym of "China," and some will disagree furiously. Some users will insist that "Homeopathy" should appear as a subheading of "Medical sciences," and some the contrary. In some of these discussions there will be a fact of the matter, and in that case, the argument should be settled from the top down.
David Weinberger (dweinberger) Thu 31 May 07 13:06
Wow, jleft, I look forward to reading through the material you've posted and linked to. Sounds fascinating. ckridge, your Holocaust example is right on the point, but I draw the opposite conclusion. There is no possibility of us all agreeing on the categories we want to use. Sometimes people are just wrong (as in your Holocaust example), sometimes the point is endlessly arguable, and sometimes there just isn't a right answer. Top down systems have to fail at capturing how the world is categorized because there is no one way it's categorized. We don't agree, haven't agreed, and never will agree. That seems to be one of the unambiguous lessons of history. Tagging systems help us deal with that situation. The same thing can be tagged in inconsistent ways, therefore allowing navigation as one prefers.
Harmless drudge (ckridge) Thu 31 May 07 13:17
Here, I think we disagree. We don't find out how the world is by consensus. It is not a democratic process, and it is certainly not a democratic process that requires universal consent. Some opinions count more than others. It doesn't matter whether someone is deeply interested in the Holocaust being a myth, or in homeopathy being a branch of science. Those things are just not so, and if they are cataloging a public resource, they can't say they are so. That would be vandalism. How they catalog their own private libraries is, of course, their own business.
Jon Lebkowsky (jonl) Thu 31 May 07 13:41
I think that just gets back to the question of authority, and the extent to which tagging doesn't claim to provide an authoritative namespace, even when it's social and public. Social tagging can accommodate diverse visions without acknowledging any particular vision as right or wrong, no?
David Weinberger (dweinberger) Thu 31 May 07 14:36
ckridge, I don't think truth is a matter of consensus. Personally, I like science. But tags don't always make categorical judgments. I may tag a Holocaust resource as "myth" because I'm researching the mythic elements of large-scale historical events. It could happen (which, on the Web, usually means "it has happened"). Further, tagging systems allow for difference because systems don't have to settle on only one tag. 73% of the tags of an acupuncture manual may be "medicine," but the 27% who tag it as "superstition" can still find it. The array of tags may well indicate the range of popular beliefs, but no one is suggesting that that array determines what's true. Behind our disagreement, ckridge, may be a difference about the role of categorization. Tagging systems think of themselves as providing navigational and re-finding aid. Traditional categorization systems sometimes thought of themselves as attempts to parse the real order of the universe. Parsing the real order is a worthwhile activity, especially if one does not insist that there is only one way to carve up the goat (or the Solar System). And acknowledging there are many ways does _not_ imply that _all_ ways work.
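The point that a system does not have to settle on only one tag can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any real tagging service: the class name `TagIndex` and its methods are hypothetical, and the 73/27 split simply reuses the numbers from the post above.

```python
from collections import Counter, defaultdict


class TagIndex:
    """Hypothetical index that keeps every tag applied to an item."""

    def __init__(self):
        self._tags_by_item = defaultdict(Counter)  # item -> tag counts
        self._items_by_tag = defaultdict(set)      # tag -> items carrying it

    def tag(self, item, tag):
        """Record one user applying one tag to one item."""
        self._tags_by_item[item][tag] += 1
        self._items_by_tag[tag].add(item)

    def find(self, tag):
        """Return every item carrying the tag, however few people used it."""
        return sorted(self._items_by_tag.get(tag, set()))

    def distribution(self, item):
        """Return the share of each tag among all tags applied to the item."""
        counts = self._tags_by_item.get(item, Counter())
        total = sum(counts.values())
        return {t: n / total for t, n in counts.items()} if total else {}


index = TagIndex()
for _ in range(73):
    index.tag("acupuncture manual", "medicine")
for _ in range(27):
    index.tag("acupuncture manual", "superstition")

minority_hits = index.find("superstition")
shares = index.distribution("acupuncture manual")
```

The minority tag still retrieves the manual, and the distribution reports how people tagged it without declaring either tag correct.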
Ari Davidow (ari) Thu 31 May 07 18:38
I've just finished reading the chapter on the Dewey Decimal system (and on the limits of categorizing when everything has to occupy just one space). I suddenly had a picture of sad discussions with librarians years ago about how the disappearance of card catalogs was going to mean the disappearance of amazing serendipity. As anyone who was (un)fortunate enough to use those beasts knows, looking at cards on the way to, or just past, or clustered around the one you wanted, or looking at the "see also" cards, all contributed to a wonderful exploration of the world, no less wonderful than wandering through encyclopedias or browsing shelves - any placement that juxtaposed unrelated items. But between tags and collaborative filtering, serendipity is more fun than ever. When categorization first went electronic, most of us just saw what we were leaving behind. We're now starting to see what we can gain, and some of it is very, very exciting.
bill braasch (bbraasch) Thu 31 May 07 19:49
the book _Stumbling on Happiness_ describes this flaw in our prospective ability. we tend to see what's leaving and can't see what's coming next. what's coming next? seems like there's no shortage of data to tag. I can see that we'll all have plenty of photos to reminisce over in years to come. eBay bought StumbleUpon today. $75 million for a site that delivers serendipity according to your preferences. They also preview ads on there. Google's Eric Schmidt said in a speech that the goal was to enable us to ask Google questions like 'what should I do tomorrow' and 'which job should I take'. The EU reacted in fear that the do no evil algorithms threaten to take away human freedom. Nightly Business Report closed with that today. Algorithms and alchemy. How do you see the politics taking shape around these abilities?
David Weinberger (dweinberger) Fri 1 Jun 07 11:23
I'm sure the mix of algorithms and human intervention, explicit and implicit, structured and unstructured, precise and fuzzy, personal and group, and just about every other opposition is going to increase. We'll use top down taxonomies where they make sense and ESP-based tags when that's what works. We're quite practical. There's a politics to each of these decisions. That's inevitable because the tools we use make assumptions about us as social creatures with needs. But the mix of tools in this case makes a discussion of "the" politics difficult. Nevertheless... Since the idea of the miscellaneous is that information is becoming available outside of its normal domains and sometimes free of its traditional owners, the institutions that relied on controlling that information lose some of their power. Some of that power will go to grassroots, democratic semi-organizations. But some inevitably will go to new authorities who come up with useful ways of aggregating and/or standardizing data. E.g., Google Print puts Google in a position to make its metadata standards for printed materials the de facto standard. The One Laptop Per Child group is also in an interesting position of potential power. On the other hand, the uBio project is connecting "messy" species information back into the "clean" data of the Encyclopedia of Life. In short, bbraasch, my answer to your excellent question is: I dunno.
bill braasch (bbraasch) Fri 1 Jun 07 11:31
change that to 'it depends' and you can sell it by the hour. I had a conversation with a google developer yesterday, talking about their Gears announcement. He started telling me about all the interesting business models that wrap the stuff they give away. We talked about bloggers. A fair number of the developers are tapping ad revenues with some sort of blog. He asked a successful blogger what advice he would give to someone starting out. "start five years ago, just writing down every day what interests you, and you will have your audience". at the fireside chat, they introduced the Gears team. One of them is the official blogger. one funny thing about the Gears demo was that they had to find a way to include search, so they built a search page that finds the last page in their demo.
Jon Lebkowsky (jonl) Fri 1 Jun 07 21:58
Speaking of which, I've been thinking how much my experience of the Internet these days touches Google. Since Google's mission is "to organize the world's information and make it universally accessible and useful," its operations and future would seem very miscellaneous. Where would we be if Google went away? And where will we be if it doesn't?
bill braasch (bbraasch) Fri 1 Jun 07 22:23
Sergey showed up at the keynote and advised the developers to be thoughtful in the use of the tools. We're headed for the wisdom of the commons, for better or worse. I expect better to win. Behaviors change quickly around new information. Look what's happened with google street views. Hello kitty!
Jamais Cascio (cascio) Fri 1 Jun 07 23:47
David, any thoughts on strategic metadata -- particularly metadata added with the conscious intent to confuse or obfuscate? (I sometimes think of these as "fauxsonomies.") Spammers using lines from novels to muddy Bayesian spam filters offer one example; efforts on the part of the residents of the northern California town of Bolinas to misdirect and confuse potential visitors offer another.
bill braasch (bbraasch) Sat 2 Jun 07 09:01
now you've said it. The BoGas station is on GPS now, so people who need gas find the town even without the signs. The gas station supports community housing, so they need the sales. There was a meeting last year to discuss ways to promote the station without putting any signs on Highway 1. Along came some Scottish tourists, looking to buy gas. Someone asked how they found the place and they said 'GPS'. So now there's a local arts store next to the station. People who drive cars with GPS receivers have money to spend. Strategic metadata is also an adwords game when you're looking to make impressions at lower cost per click (just to swerve this back on topic).
David Weinberger (dweinberger) Sat 2 Jun 07 11:47
By "strategic metadata" you mean the equivalent of search engine optimization? (SEO is a bit of a euphemism, since it often intends to optimize search results for the commercial entity, which often -- not always -- means degrading them for users. There's the SEO that aims to help the search engines understand what the site is about, and there's the SEO that tries to distract users from useful results.) Spamming tag sets will only get more and more attractive to spammers. The tag search engines and aggregators are going to be trying to stay one step ahead, just as the search engines and aggregators do now with plain old spam. I'm hopeful that we'll beat out the bad guys, although it will be a perpetual struggle. The analysis of patterns and of the associated metadata may well give the engines enough to go on. I hope. As for Google: I'm a fan not only because it's a great search engine, but because if you look for large companies that are out front lobbying for what I consider to be core Internet values, the list begins with Google...and doesn't end far past them. (No, I don't agree with everything Google has done.) Jamais, I love your term "fauxsonomy," and blogged about it over at www.EverythingIsMiscellaneous.com. Thanks.
bill braasch (bbraasch) Sat 2 Jun 07 12:02
I met a fellow recently who sells what he called buzz channel marketing. His idea is to spot the new keywords when they hit the blogs, then jump on them while they are still inexpensive. Just like in politics, there are buzz creation channels. He's doing arbitrage by tapping the buzzwords early. This is what futurists do to spot trends, but this fellow's model worked over a number of days. He said he could see buzzwords jumping from blogs to corporate websites in the span of three days. These ad words may cost a nickel one day and $3.00 the next.
Jamais Cascio (cascio) Sat 2 Jun 07 12:04
In this case, more like the obverse of SEO -- the intentional pollution of metadata in order to achieve a particular outcome. "Google bombing" is another example: people linking a particular term to a particular location, across thousands of sites, intending to make that location the #1 hit for that term. The canonical example is the (now defunct) link from "miserable failure" to the George W. Bush bio at the White House -- or, even better, the attempts by conservatives to dilute the miserable failure=W googlebomb by trying to link it instead to Democratic/liberal figures. It seems to me that, as more people begin to recognize the importance of digital metadata as a way of filtering and finding useful information, the more we'll see strategic efforts to modify where that metadata points.
Jon Lebkowsky (jonl) Sat 2 Jun 07 12:44
There's also SMO - "social media optimization" - an approach to Internet marketing that advises site owners to make tagging and bookmarking easy (i.e. use metadata more effectively), and generally to increase your visibility by leveraging social media on your blog and across the web.
Andrew Alden (alden) Sat 2 Jun 07 14:56
You make the "miserable failure" struggle look like Capture the Flag.
Brian Slesinsky (bslesins) Sat 2 Jun 07 18:00
I started to write a post here but it turned into an essay, so it's on my blog: It occurred to me while participating in a discussion about David Weinberger's Everything is Miscellaneous that the problems with what he calls the first order of order aren't entirely solved by putting information online, because they often have nothing to do with physical constraints of stores and libraries. They're actually about how we express ourselves. http://slesinsky.org/brian/misc/chop_it_all_up_and_start_anywhere.html
bill braasch (bbraasch) Sat 2 Jun 07 19:41
that's miscellaneous for ya. I just took a couple hundred digital photos of a prom party. I'll upload them and pass out a pointer, but I'll toss out obvious duds. everyone at the party had their own context, except perhaps the pork loin and the strawberries. our kids learned the mouse and the mac on cosmic osmo. the guy who could talk his sister through the clicks to get to the pinball machine that had a removable front plate with all the quarters is now taking engineering classes. he likes the information this way. he was confused in a chem class and found a better one at the dartmouth site, so he learned at dartmouth and took the tests at the UC. if at first you don't succeed, search. It's about knowing what to search for. <cascio>'s concerns bother me. I was in a tagging session at a conference and the In-Q-Tel vc stood up and said there were 7 investments in the room. the cost of spidering and spinning is coming down, but what happens when the spiders and the spinners get the hang of this? Can mass media take over the Internet anyway? I was wearing a YouTube shirt today and the guy at the tea shop told me about a guy in Germany who got naked and protested the taxes, and of course got his message out. All he needed was a sign and a camera.
David Weinberger (dweinberger) Sun 3 Jun 07 11:56
bslesins, I don't think we're that far apart. It's true that since we expect shorter pieces on the Web, we need to make the connections more explicit ("You're reading part 3 of a 7-part post," to take your example). That's in part an artifact of the crappiness of computer displays, in part due to our reduced attention span, and in part due to a recognition that if we chop things up and give them accessible labels, they can be found and reused. But as a result, we get the author's order (which, in an extended work counts for a lot) plus the ability to reuse the pieces in new contexts. And if you don't want to, you can post the whole thing as one, massive PDF file.
Jon Lebkowsky (jonl) Sun 3 Jun 07 21:46
My thought was that putting information online wasn't so much a solution (of problems with the first order) as a change as we facilitate the third order, which has problems of its own, no?
Harmless drudge (ckridge) Mon 4 Jun 07 13:04
>Tagging systems think of themselves as providing navigational and re-finding aid. Traditional categorization systems sometimes thought of themselves as attempts to parse the real order of the universe.< A navigational and finding aid for information is, in essence, a way of associating thoughts. If one can seize control of the way that people associate thoughts, one has taken a step toward seizing control of those people. One can hide information, or color it by association. The old cross reference "WOMEN: see MAN" and LC's shelving of marxism near criminology are examples. Sanford Berman has devoted a good part of his life to writing about the political implications of subject headings. If one ceases to base one's subject headings, subject cross-references, and classification systems on the way the world is, one has opened oneself to continual rhetorical struggle over what they are going to be. It will be competing sets of fauxsonomies fighting one another forever.
David Weinberger (dweinberger) Mon 4 Jun 07 18:48
jonl, often I think we do put things online to solve problems with the first order, i.e., with atoms. But the third order certainly has its own problems. For one thing, it's way honking huge. Back in 1995 or so, when I worked at Open Text, we thought we were breaking ground by indexing more than 100,000 pages and then a million pages. Hah! For another, the third order is decentralized (which is how it got so honking huge, of course). For a third thing, it's therefore essentially disorganized, with people sticking pages here and links everywhere willy-nilly. That's why we keep inventing new technologies and techniques. And it's why we need to develop new principles of organization.