Tom Reamy (treamy) Sun 10 Jun 07 13:40
David, on one thing we definitely agree: the importance of metadata. I'm particularly happy to see that people who for years denied any need for metadata (too expensive, not needed, the search can find everything) are now coming around. But one last general point: I know that book titles have to be catchy to sell. I don't think that a book titled "Some things are miscellaneous, and in some circumstances and for some kinds of content the miscellaneous can be used to find and discover things a little better than some of the older, more rigid forms of structure" would sell very well. Still, when you say that **everything** is miscellaneous, well, I'm sure you've heard the nit-picky response before. And, of course, for taxonomists, the term "miscellaneous" has special negative connotations, being what is left over that can't be categorized into the taxonomy and is thus to be avoided like the plague. On a more specific point, you include faceted navigation in the camp of the messy, but I have to disagree. True, facets are not pre-coordinated like a traditional taxonomy, and this allows the really powerful capability of letting users start from any facet and take any path to find and/or explore the domain. However, they support this capability precisely because they are so well structured, both in terms of the full selection of which facets to expose and in that each facet is highly (and often hierarchically) organized. Mapping multiple highly structured facets together to zero in on what you are looking for is not what I call messy, although it is dynamic and often creative. See Marti Hearst's work, http://flamenco.berkeley.edu/
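To make the faceted-navigation point concrete, here is a minimal sketch, not taken from Flamenco or any real system. The catalog, the facet names (`subject`, `medium`, `decade`), and the helper functions are all hypothetical. It only illustrates that, because every item carries a value for every facet, users can apply facets in any order and still arrive at the same result.

```python
from collections import Counter

# Hypothetical catalog: every item has a value for each exposed facet.
CATALOG = [
    {"title": "Retriever on a beach", "subject": "dog", "medium": "photo", "decade": "2000s"},
    {"title": "Terrier portrait", "subject": "dog", "medium": "painting", "decade": "1990s"},
    {"title": "Tabby in a window", "subject": "cat", "medium": "photo", "decade": "2000s"},
    {"title": "Sled team", "subject": "dog", "medium": "photo", "decade": "1990s"},
]


def refine(items, facet, value):
    """Keep only the items whose value for `facet` equals `value`."""
    return [item for item in items if item[facet] == value]


def facet_counts(items, facet):
    """Count how many of the remaining items fall under each value of `facet`."""
    return Counter(item[facet] for item in items)


# Two users take different paths through the same facets.
path_a = refine(refine(CATALOG, "subject", "dog"), "medium", "photo")
path_b = refine(refine(CATALOG, "medium", "photo"), "subject", "dog")

# Counts for a facet not yet chosen can guide the next step.
remaining_decades = facet_counts(path_a, "decade")
```

Both paths end at the same two items, which is the "start from any facet and take any path" capability described above. The structure lives in the facets themselves rather than in a single fixed hierarchy.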
Tom Reamy (treamy) Sun 10 Jun 07 13:43
Returning to folksonomies, <jonl>, I agree that one powerful use for folksonomies is to explore a domain rather than to support findability, but if we restrict them to simply creative browsing for discovery, they become a lot less interesting and useful, and I don't think you will find too many folksonomy advocates agreeing with that restriction. And I'll make a prediction: folksonomies will become less and less useful as the sizes of the various domains grow and everything gets swamped in the info overload (it's not like we haven't seen this before). Having 1,000 photographs labeled "Dog" can be interesting; having a million becomes less so, as meaningful distinctions become harder and harder to find and thus take longer and longer to wade through, something most people don't have the time or inclination for. Another prediction: folksonomies will survive best in the domains of things (photographs and other objects). As pointed out above, documents with all their verbiage are much messier than things. However, even there, if folksonomies continue to be significant, it will probably be because someone has developed ways to add more structure to them.
Jon Lebkowsky (jonl) Sun 10 Jun 07 14:19
Or facilitate constraint in selection, which is an effect of delicious' tag suggestions... I suppose that could be considered 'more structure.' I didn't mean to suggest that tags would be used only for discovery, and my point was (as I suspect you realized) that tags are not limited in their use or value... a label or pointer may be metadata, but it doesn't follow that the sole use for metadata is in labeling or pointing.
James Leftwich, IDSA (jleft) Sun 10 Jun 07 16:50
<treamy>'s statement regarding the difficulties encountered with large sets _only_ highlights the shortcomings of current, primitive metadata usage strategies. But if the data is not limited to simply being the subject data objects (photos, files, media binaries, or even subject groups) and is _additionally represented_ by dense interactive visualizations that are driven by metadata, then large numbers become quite manageable - visually. A dataset of a million photographs with one-dimensional metadata such as the keyword "dog" is so simplistic as to be absurdly useless. However, if each member from among those millions can be separately visualizable using a wide array of metadata types, each interactively mapped to associated visual attributes, then within the activity of interactively toggling (raising and lowering) individual or sets of metadata attributes, individual and comparative differences among the members can be brought out. Each member will have many types of metadata associated with it: size, place of origin, popularity, owner, date, qualities, etc. (both objective as well as metadata of varying degrees of subjectivity, such as good/bad, ratings, etc.) Every day of our lives we confront visual complexities many orders of magnitude greater than this, and with multiple senses. People who believe that data will be "tamed" and "boiled down" for easy digestion, and that this is the way we're going to go deeper into the information age, are mistaken. Our visual cortex already has the means to see subtle differentiations in complex visualizations. And this ability is compounded when the visualization is interactive and dynamic. We simply haven't yet configured our computing experience to augment and utilize this biological capability. Visualization technology remains relegated (mostly) to research, scientific and commercial applications.
But it will eventually be realized that it's the key to allowing everyone to have powerful and interactive overviews of much greater amounts of data from a separate, higher level of the information experience. Computers today, though having evolved a lot from their humble beginnings, are still essentially aping the cave wall. We still treat data, for the most part, as a single-level experience. For example, those vast arrays of photography thumbnails that you can zoom out from and into: all of that zooming is still not the same as an additional, separately interactive dataset visualization layer or display. Every piece of data, media file, subject, etc. deserves to have a separate representation which itself is a member of a larger visualized swarm or field.
Harmless drudge (ckridge) Sun 10 Jun 07 17:30
>a wide array of metadata types, each interactively mapped to associated visual attributes< The thing is that there are only as many kinds of metadata as there are: title, author/creator, publisher, publication or creation date, place of publication or creation, medium, extent or size, and subjects. Computers make them all searchable, and display them in new ways, but do not add to their number. Unless someone can come up with some new sort of metadata, we have about the same number of ways to sort information, and a much larger sea of data to sort it out of.
bill braasch (bbraasch) Sun 10 Jun 07 19:09
David, I'm enjoying your Google talk on YouTube. <http://www.youtube.com/watch?v=43DZEy_J694> the slides are great. they really bring out your ideas. do you sit with a slide sorter and select these from your photo drawer, or did you dig them up on the web?
James Leftwich, IDSA (jleft) Sun 10 Jun 07 19:45
Of all the metadata types you list (title, author/creator, publisher, publication or creation date, place of publication or creation, medium, extent or size, and subjects), all are roughly in the category of metadata I called "objective," and the last type, "subjects," falls toward what's more subjective (as David has elaborated on in his discussions of the different types of valid meanings assignable to anything). Furthermore, you imply metadata that's only coming from/attached to the data itself. It's unfeasible to imagine every file/media object/subject being responsible for carrying around _all_ of its own metadata, even though it can carry around a great deal. There's absolutely no limit on the types and amount of metadata that can be generated and stored separately, with pointers/aliases/references to files/media/subjects, though. Third parties (or the hosting service for the data object itself) can generate more, and this can be brought to bear in a secondary manner and aggregated with that which the data carries with itself (as you'd mentioned):
*bear in mind this list could go on infinitely*
- How many times this has been opened/viewed/cited/linked (relatively objective, in that it's a count)
- How this has been rated (and there could be multiple rating systems) (subjective on an individual rating basis, but more objective as it's accumulated and tabulated - this form of metadata actually points to a potential form of a future election or polling system as well)
- Predominant colors (in the case of photos, this can be discerned computationally)
- How many people are viewing/watching/reading/accessing this right now (relatively objective, in that it's a count)
- How common it is, based on how many of its many metadata attributes are shared with other files/media/subjects (relatively objective/computable)
- Secondarily derived metadata, such as the four-value metadata of season (Spring, Summer, Autumn, Winter), which is derived from the metadata of date and location (northern/southern hemisphere)
- Live/ephemeral vs. permanent/archived (appropriate for certain types of data)
- Metametadata, which allows higher-level filtering of the third-party or external metadata brought in secondarily (by source, reputation, field of interest, etc.)
etc.
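The "secondarily derived metadata" item in the list above can be sketched in a few lines. This is an illustration only: the photo IDs, the store layout, and the month-to-season mapping (meteorological seasons, flipped for the southern hemisphere) are assumptions, not anything specified in the discussion.

```python
from datetime import date

# Hypothetical external store: metadata kept separately from the photos
# themselves and keyed by a pointer (here, a made-up photo ID).
PHOTO_METADATA = {
    "photo-001": {"date": date(2007, 6, 10), "latitude": 37.8},
    "photo-002": {"date": date(2007, 6, 10), "latitude": -33.9},
}

# Assumption: meteorological seasons by calendar month, northern hemisphere.
NORTHERN_SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}

# Seasons are reversed south of the equator.
OPPOSITE_SEASON = {
    "Winter": "Summer", "Summer": "Winter",
    "Spring": "Autumn", "Autumn": "Spring",
}


def derive_season(taken_on, latitude):
    """Derive the four-value season from a date and a hemisphere (via latitude)."""
    season = NORTHERN_SEASON_BY_MONTH[taken_on.month]
    return season if latitude >= 0 else OPPOSITE_SEASON[season]


# New metadata computed from existing metadata, without touching the photos.
derived_seasons = {
    photo_id: derive_season(meta["date"], meta["latitude"])
    for photo_id, meta in PHOTO_METADATA.items()
}
```

Two photos taken on the same day come out with different seasons because their hemispheres differ, which is the sense in which the derived value is new metadata rather than a restatement of the date.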
James Leftwich, IDSA (jleft) Sun 10 Jun 07 19:47
(slipped by <bbraasch>) My response in #132 is to <ckridge>'s #130
bill braasch (bbraasch) Sun 10 Jun 07 21:15
the closing question in the google talk is from a voice asking David how he thinks it will turn out, will the people who think it should all be put in buckets find a way to get along with the people who think it shouldn't? that's it in a nutshell. the response, that some low single digit % of us fill out the document properties in our word documents, but people will tag things in a greater percentage, so get used to it elicited the reply 'well, good luck finding it'. great stuff there. you see a lot of the energy radiating out from the common pools of information as people form or strengthen communities around them. give us ten years to get this together, you say at one point. I realize that this is and can continue to be much more profound as it becomes the way we share what we know. China has Flickr turned off because of pictures they'd rather not share. Even that needles the man. And all the time we're adding facets. Good luck finding it. Good luck finding an address in London without knowing the neighborhood. Your wiki filtering layer for the aspect that interests me is what I want to find.
David Weinberger (dweinberger) Mon 11 Jun 07 13:44
The evidence so far suggests that, at least sometimes, the larger number of tags results in _greater_ precision. Flickr can cluster photos of noses into dogs vs cats only because there are so many photos of each. Further, we are going to get better at figuring out how to sort through this pile, using yet more metadata to tell us what type of metadata the original metadata is metadata of. There will also undoubtedly be realms where the increase in info and metadata creates confusion...but that will spur us to yet new heights of sorting glory. E.g., we can often find people with common names in large namespaces because affordances are built or because people hack their own names. These are problems we have to solve, so we will solve them. And sometimes the solutions will involve human editors filtering info for us. Sometimes that's just what we need. The Web is more of everything. treamy, about the title: The most common misunderstanding of my book I'm encountering is the idea that I'm saying we need never sort or order anything. Swim in the chaos! My subtitle certainly suggests that. But that's not what I mean, and I'm even willing to defend the title "Everything Is Misc." It's a title, so we agree an author gets a little leeway. Nevertheless, everything IS miscellaneous. In the digital world, it stays misc underneath the orders we cast on top of it: We don't know or care which platter the mp3s are on because we have a library list or a playlist. Of course I do mean something special by "misc.," and I am indeed playing on our traditional sense that the misc is the category for failures of the organizational scheme. But, by misc I mean the digital pile that is not just unlike things put together, but unlike things that accrete more and more ways in which they're potentially alike. All those links! So, for me, the misc is a pile that contains an indefinite set of potential orders. It stays gloriously miscellaneous as we layer those orders on top of it.
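The playlist image can be sketched as a data structure. This is a rough illustration under stated assumptions: the track IDs, titles, and years are invented, and `order_by` is a hypothetical helper. The point is only that each order is a list of pointers laid over the pile, so any number of orders can coexist while the pile itself stays unordered.

```python
# The "pile": hypothetical tracks keyed by opaque IDs. Where each one lives
# on disk, and in what sequence, does not matter to any of the orders below.
PILE = {
    "t1": {"title": "Cedar", "year": 2003},
    "t2": {"title": "Alder", "year": 1998},
    "t3": {"title": "Birch", "year": 2005},
}


def order_by(pile, field):
    """Return track IDs sorted by one metadata field; the pile is left untouched."""
    return sorted(pile, key=lambda track_id: pile[track_id][field])


# Several orders layered over the same pile, each just a list of pointers.
by_year = order_by(PILE, "year")
by_title = order_by(PILE, "title")
road_trip = ["t3", "t1"]  # a hand-made playlist is one more order

# Resolving a playlist follows the pointers back into the pile.
road_trip_titles = [PILE[track_id]["title"] for track_id in road_trip]
```

Adding another order means adding another list, not rearranging the tracks, which is one way to read "an indefinite set of potential orders."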
bill braasch (bbraasch) Mon 11 Jun 07 14:07
I fret less over my miscellany having read your book. I spent some time on a website design today. Instead of thinking in outline terms, I drew a tag cloud on a sheet in powerpoint and now I can drag the tags around, see how they cluster and understand the content we'll need. I could have used a mindmap, but the tag cloud gives me a way to vary emphasis and alter placement. Has anyone developed a tag cloud visualizer?
David Weinberger (dweinberger) Mon 11 Jun 07 14:20
bbraasch, someone must have built a tag cloud visualizer. (I wrote a pathetic little cloud maker that turns text into a cloud, but it breaks at the least exception.) As far as the photos in my slides go: I try to use Creative Commons licensed ones from Flickr, but occasionally I cheat and grab one from Google Photos without asking permission.
Jon Lebkowsky (jonl) Mon 11 Jun 07 20:35
David, it strikes me that our computer operating systems store data in profoundly miscellaneous clusters on hard drives, and the more miscellaneous the distribution of these data clusters, the less efficient is their retrieval, so that ultimately we defragment. I'm wondering, in the world of metadata that you've been studying and writing about, whether we will find or construct defragmentation devices and strategies? I could probably use one about now.
Christian Crumlish (xian) Tue 12 Jun 07 00:08
how much longer do we have David? i've been recovering from a nasty flu and reading this conversation has been a nice reward for the time stuck in bed. one question I have, if there's still time, is "Is there a fourth order of order." In the book we have the first three orders. Is it 1, 2, 3 many or is there another dimension to come? (I suspect it's the former, as in induction, when you've generalized the generalization you've got a principle for all the further steps.) A month ago David spoke with Bradley Horowitz at my workplace (video here: <http://video.yahoo.com/video/play?vid=514373>) and I found the dialogue very stimulating. I took copious notes and had the context allowed for me to monopolize the Q&A portion I would have asked about 10 or 15 followup questions. (Instead I plan to blog my questions and ask David to take a look, but since then work, life, etc., have intervened.) In the meantime I've had a chance to read (most of) the book and find it to be a charming and engaging romp through a fascinating range of topics. For one thing, I would make it required reading for anyone interested in information architecture. I enjoy following David as he takes us to various places in history and location and tells fascinating stories while managing to keep a thread of inquiry going. If there were a less cheesy word that meant edutainment, I would use it here. I'll spare the topic all my notes and thoughts from that time but looking over my notebook now I notice a strange concurrence of David's topic and a handful of Steven Wright quips: "You can't have everything. Where would you put it?" "I also have a full-size map of the world. I hardly ever unroll it." and "I have an enormous sea-shell collection. I keep it scattered on beaches all over the world."
Ludo, Ergo Sum (robertflink) Tue 12 Jun 07 03:53
>The thing is that there are only as many kinds of metadata as there are: title, author/creator, publisher, publication or creation date, place of publication or creation, medium, extent or size, and subjects.< And we all know that subjects readily sort out into a few categories.;-). BTW, does anyone have a handle on the label for a phobia about disorder?
David Weinberger (dweinberger) Tue 12 Jun 07 05:46
xian, I'm familiar with the first two Steven Wright jokes, and even refer to the map one in the book. But I don't recall the seashell joke...which I _love_. If you could make sub-collections of shells just by doing playlists that point to where the shells are scattered, you'd have a pretty good metaphor for the miscellaneous (and a totally not-funny joke). And, oddly, today in the shower I was thinking about the arbitrariness of writing stuff down. E.g., in the book I stipulate three orders of order. But I made that up. I could have said there are four, with "social ordering" being the fourth. Or I could have said there have been two dimensions of order, and now we're entering a multidimensional space. But instead I said there are three orders, and now that's what's fixed in ink. I had the same Author's Regret with my previous two books, but the arbitrariness of writing used to strike me regularly back when I was reading systematic philosophers who come up with a neat division (usually into threes or fours). They get stuck with what they write, even though it's really (?) just one way of slicing up the cake. So, are there really three orders? Nah. It's just a useful way of framing some issues ... which means that while it reveals some, it also obscures much. That seems to be how understanding works.
David Weinberger (dweinberger) Tue 12 Jun 07 05:47
jonl, "defrag" is a nice metaphor. In fact, Eric Norlin has started a conference called "Defrag" to talk about how we're pulling things together. From my point of view, all of the ways we pull order out of the miscellaneous are types of defrag.
Jon Lebkowsky (jonl) Tue 12 Jun 07 07:29
Ha! Someone should start by defragging the bazillion conferences that are popping up every year, all of which I want to (but can't) attend. Here's a link for defrag: http://www.defragcon.com/
Christian Crumlish (xian) Tue 12 Jun 07 08:20
David, didn't mean to hoist you on your own canard. I had a philosophy professor who noted that things naturally break down into threes (when doing hierarchical lists: he was constantly writing blackboard notes with three items under each branch), because there is always "A, not A, and everything else.
bill braasch (bbraasch) Tue 12 Jun 07 09:13
metaxian (xian) Tue 12 Jun 07 09:38
and, since I forgot to close the quote*, the rest of this topic is all part of everything else...." * quotation marks are metadata, right?
James Leftwich, IDSA (jleft) Tue 12 Jun 07 09:45
> They get stuck with what they write, even though it's really (?) just
> one way of slicing up the cake. So, are there really three orders?
> Nah. It's just a useful way of framing some issues ... which means
> that while it reveals some, it also obscures much. That seems to
> be how understanding works.
Which is why I feel understanding equals/benefits from/deserves ongoing interaction, as opposed to simply reading/viewing/receiving some authoritative/controlled/static viewpoint/definition/framing/query result. The lack of the ability to dynamically reframe any viewpoint/definition/model/query result represents a limitation, even though convenience is a worthy goal. This has been an artifact and liability of the static written word, with the artifacts often being confused for some kind of actual finality/definition. The map is not the territory, etc., and words/descriptions are maps of concepts. Looking at this on a much larger, epochal scale, humanity now faces essentially an information representation/communication/perception crisis, due to how much information there now is in the world, both recorded (retrieving- and exploration-related) and live and in real-time (accessing- and awareness-related). This is an interesting, and shocking, thing to ponder, given that the written word has brought us from the state of hunter-gatherers to our present world. But we've now (the present era) reached the inherent limitations of that mode (alone). Words and single-level perceptual models for exploring and communicating ideas and information are, well, inefficient. Just as walking was eventually augmented by driving, and then flying, so will our historical information technologies eventually be joined by more powerful higher-level forms of processing and interaction, resulting in similarly higher and more powerfully efficient levels of understanding and awareness.
watch the parking meters (xian) Tue 12 Jun 07 10:04
I refuse to recognize the authority of what you just wrote <ducking> seriously, though, one thing that interests me about the web, about wikipedia, etc., is the idea of how authority can emerge or be constructed tendentiously, temporarily and how the older notion of authority based on credentials and handed down from on high, from a mountaintop perhaps, is seriously under siege.
Jon Lebkowsky (jonl) Tue 12 Jun 07 15:43
Yeah, I'm fascinated by the authority of the lead bird in the flock. It just emerges.
James Leftwich, IDSA (jleft) Tue 12 Jun 07 16:06
Kevin Kelly had some fascinating things to say about that aspect of bird flocking in his book, "Out Of Control," if I remember correctly.