inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #26 of 155: David Weinberger (dweinberger) Wed 30 May 07 13:09
    
Harmless Drudge, nicely put. But there's one more limitation of
traditional library taxonomies: They require classifiers to decide what
books are about. Some systems allow only one heading; some allow ten.
Nevertheless, classifiers are deciding for others what a book is about.
They have to in the first and second orders of order because someone
has to, and professional classifiers are expert at it. And their
decisions are almost always right. But there's no telling what a book
will be about to a particular reader. In the digital world, we can have
it all. We can feature the official classification on the home page
and allow tags, too. (The U of Penn library has a hybrid system like
this.) 

And, of course, thanks to thesauruses (compiled manually or inferred
by computers), we can permit an indefinite number of ways of referring
to the same object. There are, of course, exceptions where we
absolutely need to find every scrap of information about a topic -
e.g., a lawyer researching cases - in which case we'll insist on a
controlled vocabulary.

But, again, we can have it all. That's the first principle of the
third order of order, imo.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #27 of 155: Jon Lebkowsky (jonl) Wed 30 May 07 13:52
    
One concept you refer to in the book is that of natural "joints." 
Could you explain how knowing the world is like butchering an 
animal?
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #28 of 155: David Weinberger (dweinberger) Wed 30 May 07 16:17
    
Ah, Plato's phrase from the Phaedrus. He says that the skilled
thinker/talker carves nature at its joints.  It's such a vivid and
obvious image. 

The notion is that the world comes divided up into natural units that
come apart cleanly when we're thinking right. But when we're not, it's
like hacking away at a bone. Plato is expressing a belief that there is
a single, knowable order of the universe, just as there's a single,
natural way to butcher a goat.

A joint in this case is an essential property of a thing. But it turns
out that every property or attribute of a thing can serve as a joint,
i.e., a likeness by which we can cluster it with other things. So,
big-and-roundness is one type of joint that enables us to carve up the
Solar System in a way that gets most of the planets, while "has water"
is a joint that gets us the objects in the Solar System that might
support life, and "going counter-clockwise around the Sun" is a joint
that gets us a different set of objects. There is an indefinite number
of joints available to us.

Which ones matter to us depends on what we're trying to do. We cluster
one set of ingredients if we're trying to bake a cake and another set
if we're looking for food to donate to the local open pantry. 
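
A minimal sketch of the point above, that the same items cluster differently depending on which attribute ("joint") we carve along. The pantry contents and the attribute names `baking` and `shelf_stable` are made up for illustration; nothing here comes from the book.

```python
# Hypothetical pantry; items and attributes are assumptions for illustration.
PANTRY = [
    {"name": "flour", "baking": True, "shelf_stable": True},
    {"name": "eggs", "baking": True, "shelf_stable": False},
    {"name": "canned beans", "baking": False, "shelf_stable": True},
]


def carve(items, joint):
    """Cluster items along one attribute: return the names where it is true."""
    return [item["name"] for item in items if item[joint]]


# Baking a cake and stocking a food pantry pick different clusters.
cake_cluster = carve(PANTRY, "baking")
donation_cluster = carve(PANTRY, "shelf_stable")
```

Neither cluster is the "real" one; each is only as good as the project it serves.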

That's why the notion of there being a single way of carving up the
universe doesn't make sense, short of G-d declaring some joints more
real than others. The single order of the universe would be the one
that was independent of all our projects and interests.

That is, the single order of the universe would be, by definition, the
one we don't care about.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #29 of 155: bill braasch (bbraasch) Wed 30 May 07 17:33
    
That takes me right back to Zen and the Art of Motorcycle Maintenance.

maybe we went the wrong way on that.

I suppose a leading indicator of that would be the Geritol ads that pay for
the network news broadcasts.  To the kids, it's only news if it comes in a
text message.  Everything else is ads.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #30 of 155: bill braasch (bbraasch) Wed 30 May 07 19:10
    
I saw a presentation recently by a company that built software for fraud
detection in casinos.  it basically matched up metadata to see who had
something in common with someone else on their blacklist.

The government came to see it and they're using it now.

We're defining ourselves by the photos we tag, maybe the places we visit
(http://www.plazes.com has that), the breadcrumbs we drop on twitter.  It's
a much richer model of the dog on the internet than we've had before.  So
far, I'm really me on the internets, but it might be handy to have a couple
more me's, depending on who's looking.

the kids have all seen this coming on facebook.  we're miscellaneous until
we add tags.

How well tagged are you?  How well do you think your tags define you?
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #31 of 155: Ari Davidow (ari) Wed 30 May 07 19:33
    
I'm interested in how we deal with synonyms. In traditional taxonomies 
there will be an authority file. That is sort of like a thesaurus, with 
the added attribute that if you type one of the synonyms, you will not 
only get back all results, but you'll also get information on the 
"authority" term.

For some purposes I can see how that gets in the way of people typing 
according to how they will look for information again. But in a lot of 
cases, people shouldn't have to choose between SF and Frisco - the 
computer should be smart enough (or humans should be able to point these 
synonyms out to the computer, which will then act on them).

But is anyone supporting such a thing? I guess so, or we wouldn't have 
such smart 3rd generation searches. On the other hand, there is a 
difference between a search engine able to know which "Capri" a tag might 
refer to, and being able to know that "The Dude" should return the same 
results as a search on "The Big Lebowski" or whatever - or that 
searches on Peking and Beijing will usually want to return the same 
results.
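
A rough sketch of the authority-file behavior described above: typing any synonym returns the full result set plus the "authority" term. The table contents, the item ids, and the function name `search` are all hypothetical; a real library system would be far richer.

```python
# Hypothetical authority file: every variant points at one preferred term.
AUTHORITY = {
    "sf": "San Francisco",
    "frisco": "San Francisco",
    "san francisco": "San Francisco",
    "peking": "Beijing",
    "beijing": "Beijing",
}

# Hypothetical index from authority term to item ids.
INDEX = {
    "San Francisco": ["photo-1", "photo-2"],
    "Beijing": ["photo-3"],
}


def search(term):
    """Return (authority_term, results); unknown terms give (None, [])."""
    key = " ".join(term.lower().split())  # normalize case and whitespace
    authority = AUTHORITY.get(key)
    if authority is None:
        return None, []
    return authority, list(INDEX.get(authority, []))
```

Note that this only handles plain synonyms. It does nothing for the harder "which Capri?" problem, where one term has several meanings.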
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #32 of 155: David Weinberger (dweinberger) Wed 30 May 07 20:46
    
bbraasch, you can often tell more about what a person is interested in
by her tag cloud than from her explicitly constructed "profile"
precisely because the tag cloud is based on implicit metadata.

Someone in an article recently (sorry...too tired to try to find it!)
said that the bottom of your Netflix queue is who you'd like to be and
the top is who you are :)
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #33 of 155: David Weinberger (dweinberger) Wed 30 May 07 20:57
    
ari, I've suggested exactly that to Technorati.com (disclosure: I'm on
their board of advisors). Right now, when you search there for
"america," you are told the "related tags" are politics, bush, iraq,
news, war, usa, religion, government, terrorism, and islam. Now, only
one of those is a synonym, and Technorati doesn't know which one that
is. It only knows that where the "america" tag is used, those other
tags are likely also to be used. Over time, with enough tags, perhaps
algorithms will be able to figure out that "america" and "usa" are
(nearly) synonyms, and that not only are "San Francisco" and "Frisco"
synonyms, that city is in CA. Or perhaps someone will allow users to
click on the tags that are synonyms, to help our poor, silicon-based
partners along. Or maybe Technorati will just buy a damn thesaurus and
gazetteer. 

One way or another, we're going to get there, or at least get much
better at it.
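
For illustration only, here is a minimal sketch of how "related tags" can fall out of simple co-occurrence counting. This is an assumption about the general technique, not a description of how Technorati actually works; the sample posts and function names are invented.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical tagged posts, invented for this example.
SAMPLE_POSTS = [
    ["america", "usa", "politics"],
    ["america", "politics", "iraq"],
    ["america", "usa"],
    ["usa", "news"],
]


def cooccurrence(posts):
    """Count, for every tag, how often each other tag shares a post with it."""
    counts = defaultdict(Counter)
    for tags in posts:
        for a, b in combinations(sorted(set(tags)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def related_tags(counts, tag, n=3):
    """Return up to n tags seen most often with `tag`; ties sort alphabetically."""
    ranked = sorted(counts.get(tag, Counter()).items(),
                    key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]
```

In this toy data "politics" and "usa" co-occur with "america" equally often, which is exactly the limitation described above: the counts alone cannot say which of the two is a synonym.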
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #34 of 155: Brian Slesinsky (bslesins) Wed 30 May 07 23:33
    
One way to reduce unnecessary synonyms is simply to spell out subject
headings.  That is, tag things with "Worldwide Association
Confederation Guild 2007," not just wacg07.  Of course that's harder to
type, but that's what field auto-completion is for.
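
A small sketch of the kind of field auto-completion meant here, assuming the site keeps a list of tags already in use. The function name `complete` and the sample tags are hypothetical.

```python
import bisect


def complete(known_tags, prefix, limit=5):
    """Return up to `limit` known tags that start with `prefix`, ignoring case."""
    ordered = sorted(known_tags, key=str.lower)
    keys = [tag.lower() for tag in ordered]
    wanted = prefix.lower()
    start = bisect.bisect_left(keys, wanted)  # first key >= prefix
    matches = []
    for tag, key in zip(ordered[start:], keys[start:]):
        if not key.startswith(wanted):
            break  # sorted order: no later key can match
        matches.append(tag)
        if len(matches) == limit:
            break
    return matches
```

With this in place, typing "worldw" is enough to pull up the fully spelled-out tag.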

Tags strike me as abbreviations that we should just get over.  In
programming, we've seen this already, where older languages used names
like strlen and newer languages use String.getLength().  Having to
remember abbreviations (other than the truly common ones) is just not
worth it.  Correctly-spelled words are a wonderful standard and we
should make the most of it.

I'm actually very impressed by how Wikipedia manages its namespace. 
They just give each article an encyclopedia-like name and add
disambiguation pages when needed.  So many sites got it wrong by using
a Yahoo directory-like hierarchy rather than a flat namespace, and even
most Wikis got it wrong by using InterCaps rather than sticking to
regular English.  (And the same applies for many other languages of
course.)
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #35 of 155: David Weinberger (dweinberger) Thu 31 May 07 03:49
    
bslesins, really interesting point. It's the first time I've heard your
suggestion, which seems so obvious (the mark of many a good thought).
It'd help, of course, if tagging engines uniformly allowed us to use
spaces as characters in tags. But that's just whining.

Prolix tags work in the example I gave of a conference encouraging
bloggers to use a uniform tag. That example has some peculiarities,
though: 1. A central authority can stipulate a particular vocabulary;
2. Taggers are highly motivated to use the standardized tag (because
they want their posts to be included in the conference cluster); 3.
Taggers are tagging more than one item -- all their posts about the
conf -- with that tag, so  auto-complete can amortize their labor.

In many other environments, no one is in a position to stipulate the
tag set, in which case prolix tags may actually increase the number of
synonyms: You tag the photo "New York City at sunset," but I tag it
"Night falls on Manhattan." In such a case, having a mix of short tags
would probably make it easier for a computer to figure out that the
tags are related. But, I am not a computer scientist (IANACS).

Also, tags have succeeded even though people generally hate explicitly
creating metadata in part because tags don't take a lot of thought or
typing. Increasing either of those is likely to decrease the number of
tags. Somewhere there's a balance of convenience and tolerance of
ambiguity that we will strike.

Wikipedia is a great example of so many things, including what you
point to, Brian. In my book I point out that the list Wikipedia
presents of disambiguated meanings of "elephant" is interesting on its
own terms. But, of course, Wikipedia is a highly cultivated garden,
with a single name for each article (which is part of your point).
Tags, on the other hand, are usually accidental gardens. We want to
allow multiple ways of saying the same thing with a tag because we want
people to remember their stuff the way they want to. Besides, synonyms
are rarely fully synonymous; names are a special case, but even they
rarely have a single way of being expressed. Right, bslesins? I mean
Brian. I mean the Brimeister. :)

So, I admire Wikipedia's way of solving its particular problem within
its particular constraints. It works. The lesson I draw is not that
this is a generalized solution (not that that's what you're suggesting,
Brian). Rather, it's that solutions need to be particularized.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #36 of 155: Jon Lebkowsky (jonl) Thu 31 May 07 03:56
    
But the problem is in lacking a consistent approach across many systems, no? 
E.g. some systems don't handle tags with spaces, others do. I generally don't 
create tags with spaces for that reason, though that's probably relevant to 
the fact that I lean more toward social tagging than selfish tagging.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #37 of 155: Jon Lebkowsky (jonl) Thu 31 May 07 03:57
    
David's post slipped in while I was typing.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #38 of 155: Jon Lebkowsky (jonl) Thu 31 May 07 04:07
    
David, it occurs to me, reading your last post, that metadata can have its 
own metadata, like contextual data. I.e. maybe a system looks, not just at a 
tag, but at the context for that tag in assessing its meaning or relevance. 
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #39 of 155: Sharon Lynne Fisher (slf) Thu 31 May 07 05:17
    
In your example above,

"You appear to be tagging this picture with Manhattan. Our system has
373,082 tags with Manhattan, and 1,483,217 tags with New York. Do you
want to modify this tag? Replace Add Leave the way it is"
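
A sketch of how a prompt like this might be generated, assuming the system already has per-tag counts and some table of related tags. The `related` table and the function name `suggest` are hypothetical; the counts are the ones quoted above.

```python
def suggest(tag, tag_counts, related):
    """Return a prompt if a related tag is more widely used, else None."""
    mine = tag_counts.get(tag, 0)
    candidates = [t for t in related.get(tag, [])
                  if tag_counts.get(t, 0) > mine]
    if not candidates:
        return None  # nothing more popular to offer, so stay quiet
    best = max(candidates, key=lambda t: tag_counts[t])
    return (f"You appear to be tagging this picture with {tag}. "
            f"Our system has {mine:,} tags with {tag}, and "
            f"{tag_counts[best]:,} tags with {best}. "
            "Do you want to modify this tag? "
            "[Replace] [Add] [Leave the way it is]")
```

The function only nudges; "Leave the way it is" stays an option, so people can still tag the way they want to.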
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #40 of 155: Jon Lebkowsky (jonl) Thu 31 May 07 06:42
    
Is this the origin of the phrase "I'll take Manhattan"?
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #41 of 155: David Weinberger (dweinberger) Thu 31 May 07 06:50
    
jonl and slf, these are questions that can only be answered in
practice.  It depends on what the site is trying to accomplish. If it's
trying to pinpoint precise answers, one set of practices is
appropriate. If it's just trying to show you some photos of a city
before you go there on vacation, another set is called for. 

In general, I think the best guideline is to allow for as much
messiness as possible. (Of course, the "as possible" brings it back to
particularities.) This is advisable because getting users to explicitly
create well-behaved, well-formulated metadata not only is a burden
that will chase many users away, it results in metadata that, because
it is explicit, is not as rich. So, rather than insist that users use a
controlled vocabulary, it'd be better (usually) to let them use
whatever words they want, and then have the computers sort it out.

That's true for the meta-metadata that helps us contextualize tags. We
have only scratched the surface of the meta-metadata that's just
sitting around for us to grab it (or deduce it). That should be our
first resort...

...imo, and always paying attention to the particular needs and aims
of the users.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #42 of 155: Jon Lebkowsky (jonl) Thu 31 May 07 07:11
    
I think it's really interesting how those needs and aims can coevolve with 
the technology, where a tool may show me something I couldn't do before and 
didn't know I needed, but once I have it, my behavior with it may suggest 
tweaks to the developer. Hence the "perpetual beta" development loop, which 
amounts to a conversation between developers and users. I think that's how 
we got the evolution of social from selfish tagging.

The developer and user share authority for emerging technologies.

And speaking of authority, that seems to be a key theme of the book. You keep 
showing how the location and flow of authority has changed. In that sense, 
isn't the book implicitly about politics?
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #43 of 155: David Weinberger (dweinberger) Thu 31 May 07 08:29
    
Yes, jonl, the book is implicitly about politics in the extended sense
of the term. That is, it's about the shift in the locus and nature of
authority that has accrued to those who have done the job of filtering
and organizing knowledge -- a job shaped by the accidental nature of
paper.

Ironically (?), politics itself is likely to be one of the last
holdouts in the democratization of knowledge. Politicians are such
dedicated, immersed marketers!
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #44 of 155: Jef Poskanzer (jef) Thu 31 May 07 08:30
    
I'd like to see a collaborative filtering system that lets me give
thumbs up / thumbs down on individual acts of tagging, building
up a trust rating for other taggers.

I'd like to be able to mark pairs of tags as synonyms.  And give
thumbs up / thumbs down to other people's synonym links.

I'd like this to work for other objects besides tags, e.g. flickr
groups - same problems.

And I'd like a pony.
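
The first wish on that list could be sketched roughly as follows. This is only one possible scoring rule: the smoothed ratio (ups + 1) / (ups + downs + 2) is an assumption chosen so that an unrated tagger starts at a neutral 0.5, and the class and method names are hypothetical.

```python
from collections import defaultdict


class TaggerTrust:
    """One user's trust scores for other taggers, built from thumbs up/down."""

    def __init__(self):
        # tagger name -> [thumbs up, thumbs down]
        self._votes = defaultdict(lambda: [0, 0])

    def vote(self, tagger, up):
        """Record one thumbs up (up=True) or thumbs down on an act of tagging."""
        self._votes[tagger][0 if up else 1] += 1

    def trust(self, tagger):
        """Smoothed share of thumbs up; 0.5 for taggers never rated."""
        ups, downs = self._votes.get(tagger, (0, 0))
        return (ups + 1) / (ups + downs + 2)
```

The same up/down bookkeeping could be keyed on pairs of tags instead of taggers to rate synonym links.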
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #45 of 155: David Weinberger (dweinberger) Thu 31 May 07 09:12
    
jef, interesting ideas. Finding a good tagger is a good thing.
Thumbs-upping/downing them is one approach. So is having our computers
derive which ones we trust by watching what we do. Eventually we'll
invent them all.

As far as marking synonyms, you may like Freebase, although it's not
quite what you're looking for. It's a wiki-based approach to coming up
with metadata categories for bunches of different domains (businesses,
movies, etc.), and then to collaboratively filling in those metadata
categories for as many entities as we can find. Very very interesting.

PS: Your pony is in the mail.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #46 of 155: Jef Poskanzer (jef) Thu 31 May 07 09:27
    
I got a Freebase beta account and totally couldn't understand it.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #47 of 155: Ari Davidow (ari) Thu 31 May 07 09:34
    
You know, when a problem is entirely understood, it is easy to look at it 
and say "this is the best practice". Within some limits, Library of 
Congress classification works for other libraries, and probably works 
better than Dewey.

But tags are entirely new, already have several meanings, and are used to 
define material in many different contexts. It feels more constructive
to work out how to help people make links than to try to get
people to all tag things the same way with the same terms. In a broad 
sense, aren't tags an effort to get away from the idea that one set of 
terms can apply to a single item?

David, I'm only a couple of chapters into the new book so far, but am 
finding it fascinating. Many thanks.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #48 of 155: bill braasch (bbraasch) Thu 31 May 07 11:17
    
the wisdom of the commons is cluttered.  no pony for jef til we unclutter
the commons.

I think it will be an old pony.
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #49 of 155: Jim Leftwich (jleft) Thu 31 May 07 11:59
    <scribbled by jleft Thu 31 May 07 11:59>
  
inkwell.vue.300 : David Weinberger, Everything is Miscellaneous
permalink #50 of 155: James Leftwich, IDSA (jleft) Thu 31 May 07 11:59
    

Hi David.  You've written a very important book, and this is a great inkwell
conversation.

I'm a product and software interface designer who began working with
conceptual models of metadata-driven systems in the late 1980s.  My
earliest metadata-driven OS/internet model was InfoSpace, which was
presented at 3CyberConf at UT Austin in 1993:

<http://www.well.com/user/jleft/orbit/infospace/index.html>

Your book is the first I've seen to address many of the same issues which
I'd discussed and illustrated starting back then and continuing on through
1999, culminating with a topic here on the WELL.

In the WIRED conference there's an entire topic where I laid out many of the
same metadata concepts and issues you're addressing in your book, but I had
arrived at them from the direction of the user interface, where I realized
early on that what mattered was the underlying metadata model, and how it
could be used to allow interactive visualization of complex and
interrelated data (retrieved from queries).

I appreciate how incredibly difficult it is to express models regarding
metadata systems, and this is captured in that topic.

Topic 327 [wired]:  (jleft)'s Prophecy: The Visualization Revolution

There's also a set of slides that I did in 1997, which I've lectured on
in several presentations, and which make some of the same points I've picked
up from Cory Doctorow's posts about your book on Boing Boing.

These don't have the text of my lectures associated with them, but I'm
betting you'll realize many of the same underlying issues presented by them.

<http://www.well.com/user/jleft/orbit/vizrev/slides/>

And these slides in particular:

<http://www.well.com/user/jleft/orbit/vizrev/slides/1.html>
<http://www.well.com/user/jleft/orbit/vizrev/slides/2.html>
<http://www.well.com/user/jleft/orbit/vizrev/slides/3.html>
<http://www.well.com/user/jleft/orbit/vizrev/slides/4.html>
<http://www.well.com/user/jleft/orbit/vizrev/slides/5.html>
<http://www.well.com/user/jleft/orbit/vizrev/slides/6.html>
<http://www.well.com/user/jleft/orbit/vizrev/slides/8.html>

In general these show how the current (1990s, but still existing today)
model of a singular presentation of data (both in structure as well as
embodiment) could be evolved towards one that used a number of metadata
models to enable a second user-side interface.

Very difficult to present in the limited text forum here on the WELL, but
much easier to discuss verbally and with examples.

I'd certainly appreciate the opportunity to discuss this work with you at
some point.
  
