Thesis: personalisation, recommendation and ‘smart’ filtering features have become so ubiquitous and so bad that I now think twice before searching for or clicking on certain items, for fear of the subsequent pollution of my clickstream.
I haven’t undertaken extensive research into the implementation of these algorithms, but it is my experience that there are two basic techniques in use:
- Suggest other items based on an obvious similarity to preceding hits, such as shared authorship or source, common theme, other elements in a series, etc. For example: you like this album by band X, therefore you may also like <all the other albums by band X> and <solo albums by members of band X>.
- Match the customer profile against the overall customer database, locate similar profiles and make suggestions based on any remaining gaps between them. For example: you like bands X and Y, other people who like bands X and Y also like band Z, therefore you may like band Z.
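To make the two techniques concrete, here is a minimal sketch in Python, as I understand them. The catalogue and purchase histories are entirely made up, and the function names, the `min_overlap` threshold and the choice of Jaccard overlap as the similarity measure are my own assumptions for illustration, not anything a real retailer has published.

```python
from collections import Counter

# Hypothetical toy data: every name below is invented for illustration.
CATALOGUE = {
    "Album A1": {"artist": "Band X"},
    "Album A2": {"artist": "Band X"},
    "Album B1": {"artist": "Band Y"},
    "Album C1": {"artist": "Band Z"},
}

PURCHASES = {
    "me": {"Album A1", "Album B1"},
    "customer_1": {"Album A1", "Album B1", "Album C1"},
    "customer_2": {"Album A1", "Album B1", "Album C1"},
    "customer_3": {"Album A2"},
}


def similar_by_attribute(liked, catalogue, attribute="artist"):
    """Technique 1: suggest items sharing a surface attribute with liked ones."""
    wanted = {catalogue[item][attribute] for item in liked}
    return sorted(
        item
        for item, attrs in catalogue.items()
        if attrs[attribute] in wanted and item not in liked
    )


def similar_by_profile(user, purchases, min_overlap=0.5):
    """Technique 2: find similar profiles and suggest whatever fills the gaps."""
    mine = purchases[user]
    votes = Counter()
    for other, theirs in purchases.items():
        if other == user:
            continue
        # Jaccard overlap: shared items divided by all items either of us has.
        overlap = len(mine & theirs) / len(mine | theirs)
        if overlap >= min_overlap:
            # Each sufficiently similar customer votes for items I lack.
            votes.update(theirs - mine)
    return [item for item, _ in votes.most_common()]


print(similar_by_attribute(PURCHASES["me"], CATALOGUE))
print(similar_by_profile("me", PURCHASES))
```

On this toy data the first function offers me the other Band X album and the second offers me Band Z. Note that neither function knows anything about why I bought the albums in the first place.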
If these examples sound most closely like an Amazon shopping experience, you’ve guessed what I had in mind while I was writing them. But note that Amazon will also use as input the items you searched for, briefly perused, added to a wishlist (regardless of whether it’s for you or someone else) and pretty much any interaction you’ve had with their site, no matter how light your touch.
The problem is, neither of these basic techniques is a good model for how people actually like something. They deal only with the most obvious or surface attributes of the items: who made them, what they’re (ostensibly) about, their proclaimed relationship to other items. None of these say anything about the item’s relationship to me, other than in the most vague and coincidental terms. Yes, I like Pink Floyd albums, as many do. But I like them because years ago, something in the music appealed to my stumbling, flailing, exploring adolescent self - and furthermore, other things outside the music such as the example of peers or role models (thinking of you and your mutant camels, Jeff Minter…) also played a strong part. However, by and large, I don’t enjoy the few solo albums of various Floyd members that I’ve heard, and I actively frown on bands that seem to borrow all the trappings of Floyd’s music without adding anything new. But here’s Amazon to recommend not only all the other Floyd albums, including the ones I already own but haven’t told them I’ve got (even though it should be obvious that anyone who owns Wish You Were Here is bound to have Dark Side), the 20th/30th/40th anniversary rereleases, all the various solo duds and, of course, Fucking Coldplay. (There you are, everything that’s wrong with the world today: “people who buy Pink Floyd albums also liked” Fucking Coldplay.)
It doesn’t work like that. First of all, I like the music (occasionally, I like the cover picture or the sexy publicity shot of the artist first, but that may only last as far as hearing the music). And then sometimes, I only liked it because I was in that mood at that moment and that song was just perfect right then. But having bought it, and maybe played it a few times to try to recapture that perfect moment, I might not want anything else by that performer or that sounds like that song. Sometimes, quite often, one track or album is enough (indeed frequently a whole album is way too much of it).
And actually, I’ve heard those Floyd albums so many times now, and they are so much a part of my teenage self, that I rarely play them anymore. Moved on, sorry. Keep up, Amazon.
Music recommendation algorithms that rely on averaging and abstracting a large mass of individuals to find commonalities are doomed to fail, at least if your tastes are at all catholic or outside the ‘average’. Generally, I think it’s fairly clear from my online persona that I’m sad enough to like progressive rock - and even there, I tend to think that 90% of the genre is mince (the other 10% I can’t live without, however). There are people I follow on Twitter who also like prog, but I wouldn’t say I share more than 50% of my tastes in the field with any of them individually, and sometimes what they love the most can be anathema to me (e.g. Anathema).
I wouldn’t describe myself as a fan of classical, folk, jazz or, least of all, country & western either, but I have examples of all of them in my MP3 collection that I love. “On The Shore” by Trees would easily be in my Top Five albums, but I’ve never heard anything else in the folk world that quite hits me personally the same way - not that that has ever stopped Amazon recommending a succession of acoustic, waif-like and ultimately disappointing folk singer-songwriters. (I did my own research and turned up a number of interesting acts who were considered vaguely reminiscent of Trees by someone at one time or another, which was a rewarding exercise but led me to conclude that there just hasn’t been anyone quite like that band since. Which I guess is partly why they’re so great.) Right now, I’m on a jazz kick, a genre about which I was entirely ignorant and unconcerned for forty years. I’m still feeling my way, picking out individual tracks and subcategories rather than following the entire careers of particular players. But on the whole, people who like individual jazz albums tend to be Big Jazz Fans, and what works for them most likely isn’t going to work for me, at least not yet. Or at the other extreme, pretty much everybody across the board likes the Dave Brubeck Quartet’s “Take Five”, so you can’t draw many safe inferences from that either.
My point is: I’m an outlier (“YES! WE ARE ALL INDIVIDUALS!”). There are doubtless lots of other outliers in Amazon’s customer profile database, some of whom may even share my eclectic tastes, but none of them will outlie in the same way. Even the person who has liked everything I also like up to now may diverge on their next purchase, for reasons too ineffable to ever discern. And they are, in any case, buried by the greater mass who strongly like a narrower part of their overall tastes (the Floyd freaks, the jazzers, etc.). Inferring an individual’s taste from all the sheer random, wonderful happenstances that generated that data is a hiding to nothing.
Think music is a bad example? Try photography (or any visual art): taste in that is, it seems to me, almost impossible to define unless you only like one very narrow, tightly constrained thing (only pictures of bins, maybe). I don’t know what sort of pictures I like, except that I’ll know when I like them - or to put it another way, people who like this sort of thing will find that this is the sort of thing they like. I have favourite photographers and many, many favourite photographs, and I can sort of draw connections between them thematically, but I’ve never been able to articulate those well enough to reduce them to a concise search term. Finding suitable Flickr groups is impossible, unless you subscribe to a whole bunch, the sum total of which will cover all your bases. (About the closest I’ve found is in a small moment, but even that only covers one particularly strong preference, and good luck describing it well enough to locate any related groups.)
Then I tried looking for Google+ communities that might offer something similar, because I’d heard so much about how photographers were flocking to G+, and I gave up in short order. A group called simply ‘Photography’ is too broad and shallow. A group called ‘Black and White Photography’ is still too broad (just portraying a subject in monochrome doesn’t automatically make me like it and anyway, I like a lot of colour work too - e.g. Trent Parke does strong work in both). Anything more niche most likely doesn’t exist because after all, Google+ is hardly representative of anything much.
I have the same issue with Flipboard’s choice of photographs and photography-related articles. Too many ‘10 ways to improve your shots’ pieces, too much HDR, too much glitz and oversaturated gloss at the expense of meaning. I made the mistake once of viewing a shot of a white lion from a group called ‘Amazing Things In The World’, mainly for the benefit of my eldest Junior Research Assistant, and now I can’t seem to eradicate their damn tedious travelogue shots from my feed: polluted beyond use because of one idle click. (This isn’t an insurmountable problem caused by the sheer metric tonnage of populist crap in the online photography world: Zite managed to dig out interesting pieces on W. Eugene Smith and Garry Winogrand recently that were right up my street.)
Actually, Flipboard seems to have a problem with inappropriate curation generally: the new version is suggesting that I might like Motorsports, Autos and London which, even considering it might not yet have had enough time to adapt to my preferences, is quite a stretch and reeks more of throwing mud at the wall to see what sticks.
Google+ can now apparently automatically recognise certain common subjects in uploaded images. For example, it can identify the Eiffel Tower in tourist pictures. Let’s say there was an Elliott Erwitt photo of the Tower that I particularly liked. G+ could recognise it, and then it could suggest a thousand more shots of the Tower that I might also like - none of which I probably would, because I’m actually not that interested in it per se. It might even find shots of silhouetted umbrella carriers leaping in front of the Tower, and I suspect that still wouldn’t do it. Thinking laterally, for an algorithm, it might suggest lots of other Erwitt shots instead. Well, that has better odds of turning up some successes, because I do actually like Erwitt’s work in general - but not all of it (for example, the Dogs stuff I’m not too fussed with because I’m not a dog person). And anyway, I already own an enormous book of his work so meh, next please.
You could algorithmically analyse all the photos I’ve ever liked, break them down to statistical representations of the ratios of light and dark pixels, recognise the key subjects, fold in all the surrounding context of photographers, agents and verbiage, and then apply those criteria to your database of all the other available online images, and I bet you still wouldn’t be able to come up with more than a dozen other examples that I hadn’t previously seen and that would have the same impact on me, at least at that moment.
Here’s the problem with the sorting and filtering algorithms that the personalised social media world likes to use: individual taste simply isn’t reducible to tags and keywords. And by continually pretending that it is (because hey, Big Data will solve everything, right?), and pushing that pretence further and deeper into all your products, you are simply pissing off your audience. Telling me I “ought” to like something that I actively hate because your data leads you to believe I should conform to an average is not simply a minor inconvenience - it actually offends me. I don’t just ‘Like’ this stuff; it speaks to The Very Core Of My Being. You claim to know me, you prostrate yourself before me in the name of ‘personalization’ and tell me, “you got it!”, you solicit or gather all this personal data from me so you can assure me you will predict my every desire - and then you suggest I like Fucking Coldplay?? Fuck you. I thought we had somethin’ and all the time, IT’S LIKE YOU DIDN’T REALLY KNOW ME AT ALL!!