Peter Organisciak
PhD Student, Graduate School of Library and Information Science, University of Illinois
orga...@illinois.edu

What do I do?
My research interests lie at the intersection of online systems and users. How do users self-organize within the constraints of a system, and how can systems adapt to these needs? This juncture between the humanistic look at users and the technical considerations of systems design characterizes most of my research, which includes sociological looks at online crowds, the communication of data through visualization, and the data mining of online communication.
Ph.D.
Advisor:
Miles Efron
Education:
MA, Humanities Computing – Library and Information Studies, University of Alberta, 2010
Honours BA, Communications and Multimedia, McMaster University, 2008
Selected Publications, Papers and Presentations
See my Curriculum Vitae!
Honors and Awards
Ian Lancashire Graduate Award (Best student paper, SDH-SEMI 2011)
Blog
Artistic visualisation
March 28th, 2009
This isn’t so much a direct response to Rockwell and Bradley’s Printing in Sand as a reaction to it. As I read through their embrace of scientific visualisation, a thought that I’ve been tossing around came to mind again. If visualising data should be concise, precise and easy, is there any place for the abstract? That is to say, the artistic, the random, and the unfamiliar? Last year, Wordle struck a chord with the masses, despite not providing much meaning beyond a pretty word count. Perhaps there’s a place in our hearts for the puzzle graph, where we don’t know immediately what’s going on, but we like to savour the time spent figuring it out.
Why does iTunes Genius suck so much?
March 28th, 2009
I’ve recently been mulling over the question, “Why does Genius suck so much?” and the implications that it has.
Genius is the playlist generation tool in Apple’s iTunes music software. You choose a song that you’re in the mood for, and it creates an entire playlist of similar songs. Essentially, it’s a recommender system: if you like x, you’ll like y. The problem is that you get a very narrow point of view, with very little genre skipping and no pleasantly clever surprises.
What sets Genius apart from other song recommender systems is that it’s essentially powered by the crowds. Apple has the luxury of a rich data set of habits and ratings, and that data appears to factor heavily into the recommendations. Yet algorithmic playlist generators were creating better results years before Genius came on the scene. So, what does this mean for the crowd?
The fact that computers can be better than humans in understanding art is off-putting. I’m still working through this problem, but here are some thoughts toward untangling it.
Ratings data is emotionless. When you rate a song 1 or 5, you’re giving it a universal ‘like’ or ‘dislike’. This data doesn’t factor in the mood of the song or the emotion of the listener; it is all very removed from circumstance. As I suggested to Bill Turkel, perhaps such simple crowd-based recommendations are better for high-level suggestions, like artists you may like, but useless at the micro-level (unless the data that crowds are contributing is more specific to the topic of recommendations). In contrast, technology can be quite effective at interpreting the types and patterns of sound which represent an emotion. Certainly it can’t easily understand whether a song is good, but if you want a slow, jazzy rock song, that’s fairly achievable. This is something in which music recommendation is fairly unique, as sound is easier to interpret than thousands of movie plots or millions of book themes.
Despite this, perhaps the most-cited example of a good music recommender is Pandora, which is an internet radio based on the Music Genome Project (MGP). The MGP does use humans to categorize songs, having professionals tag each song with over 400 tags and using an algorithm to weigh the values. Pandora’s success shows that humans are indeed effective at understanding music, given that they’re looking at it in the right way.
There’s also the effect of popular media that makes human-based recommendations unbalanced. If a lot of people like Coldplay, the range of music that it will be recommended for will be broad. This additionally creates an echo loop where popular music simply grows in popularity. Conversely, it is very difficult for new music to enter the loop. If everybody who likes The Strokes likes Yeah Yeah Yeahs, the recommender will reinforce this, brushing aside any similar new bands.
However, such problems are limited to the balance of the algorithm. Last.fm, which tracks all of the music its users listen to, is fairly effective in recommending similar music. Also, because of its detailed information on what a user has listened to, it can suggest less-listened-to songs. Though Last.fm doesn’t offer playlist generation, I wouldn’t put this beyond its abilities.
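To make the point about balance concrete, here is a minimal sketch in Python. It is not how Genius or Last.fm actually work (I don't know their algorithms); the listening histories are made up, and the Jaccard normalization is simply one common way to discount popularity that I'm assuming for illustration. It ranks artists for a Strokes fan twice: once by raw co-occurrence, and once after dividing out how popular each artist is overall.

```python
from collections import Counter

# Hypothetical listening histories: each set is one listener's liked artists.
listeners = [
    {"Coldplay", "The Strokes", "Yeah Yeah Yeahs"},
    {"Coldplay", "The Strokes"},
    {"Coldplay", "The Strokes"},
    {"Coldplay", "Radiohead"},
    {"Coldplay", "Lou Reed"},
    {"Coldplay", "Radiohead", "Lou Reed"},
    {"The Strokes", "Yeah Yeah Yeahs"},
    {"The Strokes", "New Band"},
]


def popularity(histories):
    """Count how many listeners like each artist."""
    counts = Counter()
    for liked in histories:
        counts.update(liked)
    return counts


def co_counts(histories, seed):
    """Count how often each other artist is liked alongside the seed artist."""
    counts = Counter()
    for liked in histories:
        if seed in liked:
            counts.update(liked - {seed})
    return counts


def recommend_raw(histories, seed):
    """Rank by raw co-occurrence: popular artists win almost by default."""
    co = co_counts(histories, seed)
    return sorted(co, key=lambda artist: (-co[artist], artist))


def recommend_jaccard(histories, seed):
    """Rank by Jaccard similarity: shared fans divided by combined fans."""
    co = co_counts(histories, seed)
    pop = popularity(histories)

    def score(artist):
        return co[artist] / (pop[artist] + pop[seed] - co[artist])

    return sorted(co, key=lambda artist: (-score(artist), artist))


print(recommend_raw(listeners, "The Strokes"))
# ['Coldplay', 'Yeah Yeah Yeahs', 'New Band']
print(recommend_jaccard(listeners, "The Strokes"))
# ['Yeah Yeah Yeahs', 'Coldplay', 'New Band']
```

With raw counts, Coldplay tops the list simply because nearly everyone likes Coldplay. Once popularity is divided out, Yeah Yeah Yeahs moves ahead, though the brand-new band with a single listener still sits at the bottom, which is the echo loop in miniature.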
So where do crowds factor in here? If anything, Pandora suggests that this is best left to professionals. Certainly, you can’t get that sort of exhaustivity with crowds. The answer may lie in reliability. Large groups would be able to make much simpler connections, but on a larger and more verified scale. When I make a playlist with Lou Reed’s Walk on the Wild Side, I always follow it with Urge Overkill’s Girl, You’ll Be a Woman Soon. The songs are linked very little, but there’s something in me that recognizes the similarly cool feeling they give me. If you could somehow capture millions of these sorts of links, that could lead somewhere.
(Cross-posted to Crowdstorming. Leave any comments there.)
Returning to DIY
March 27th, 2009
After my post on underestimating the ubiquity of data, Jeff Biggar asked me to expand on my prediction that “business practices and marketing will take a back seat to quality and value to society.”
You can see my response there, related to a paper that I wrote last year, but today I’d like to relate this to Willard McCarty again, once again from the narrowed scope of only computing. This is partially for posterity, as his notes greatly overlap with mine, and I’d like to return to them if I ever find myself polishing my paper.
McCarty notes that we’ve seen the “gradual transfer of ability to construct artifact from highly specialized technicians to ordinary users, and the simultaneously increasing technical sophistication of these users”, or DIY computing. This has happened mainly for three reasons: the regaining of computing unity through networking, the development of operating systems so as to free users from lower-level tasks, and an amateurization in the nature of software (notably the introduction of higher-level programming languages).
These three points provide a premise for the trend of increasingly content-driven computing. When more people are able to create, more are likely to do so when an artifact is needed. However, McCarty’s point on operating systems is important as a generalized rule: freedom from lower-level tasks. Rather than many people reworking the same problem, why not standardize the solution and let them worry about other things? The operating system takes you partway there; software libraries and modules take you further. A JavaScript library such as jQuery, for example, lets web developers stop worrying about JavaScript compatibility between browsers by offering its own functions, which it then translates properly into JavaScript based on the quirks of whichever browser it’s running in. Ruby on Rails is another web technology that builds on sensible defaults to allow users to skip lower-level concerns like full links between their modularized code, full functions for common tasks, or complex server interaction. Consider that Twitter was originally built on Ruby on Rails. Twitter was a very novel concept and – as those who’ve tried it can attest – is hard to understand in strictly abstract terms. However, Ruby on Rails allowed the creators to build Twitter as a side project, in time away from their day jobs, and experience the new concept.
Externally looking inwards
March 22nd, 2009
Though class has moved on, I’m still digesting the early chapters of Willard McCarty’s Humanities Computing.
One thought that I posted during my Day of DH blogging is the idea of trying to model oneself. What if you started writing down every self-reflective thought that you have and real-life character example, and subsequently organized them into some sort of logic? Would such a systematic process allow you to derive understanding that you haven’t explicated, by virtue of it seeming “wrong” without it? Would such a removed process help you reach a more concise understanding of your quirks and your motivations?
Computational modelling and the Netflix Prize
March 17th, 2009
In Humanities Computing, Willard McCarty notes that “computational form, which accepts only that which can be told with programmatic explicitness and precision, is thus radically inadequate for representing the full range of knowledge – hence useful for isolating what gets lost when we try to specify the unspecifiable.” (25). In other words, there are certain ways of knowing that we cannot explain, but because computers can only accept precise directions, they allow us to understand what’s missing when we do try to model these ways of knowing. When you attempt to explicate something human through a series of instructions, you can compare the result to what you feel the result should be, and adapt the instructions as necessary. Thus, as Willard McCarty did in his research on personification in Ovid’s Metamorphoses, modelling becomes an iterative process of comparing, identifying, and changing. However, the trick to improvement is that any change affects all examples, and thus a change to accommodate an outlier must not break the model’s tolerance for something already accounted for. Or, if it does break it, perhaps it had not been explained by the model in the first place.
Such a process of modelling is apparent in what’s perhaps the most well-known data mining project: the Netflix Prize. The Netflix Prize is a $1 million prize being offered by Netflix to the team that can improve on the company’s recommendation algorithm by at least 10%. The contest has been running since October 2006 and teams are in the home stretch. Eleven teams are at 9% or higher, with the top two teams at 9.64% and 9.63%. However, progress has slowed to a crawl, as the teams push the limits of how much a computer can understand the intricacies of human preference.
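For readers wondering what a percentage like that measures: the contest scores teams on root mean squared error (RMSE) between predicted and actual ratings, and the improvement is the relative drop in RMSE against Netflix's own system. The sketch below shows the arithmetic only; the function names and the numbers in the usage lines are my own illustrative assumptions, not contest data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))


def improvement_pct(baseline_rmse, candidate_rmse):
    """Relative reduction in error versus the baseline, as a percentage."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse


# Hypothetical ratings on a 1-5 scale, purely for illustration.
actual = [4, 2, 5, 3]
baseline = rmse([3, 3, 3, 3], actual)    # always guess the middle rating
candidate = rmse([4, 3, 4, 3], actual)   # a slightly smarter guess
print(round(improvement_pct(baseline, candidate), 1))
```

Because the measure is relative, each further fraction of a percent requires shaving error off predictions that are already close, which is one way to see why the last stretch is so slow.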
Throughout the contest, teams have been remarkably open about their strategy, “acting more like academics huddled over a knotty problem than entrepreneurs jostling for a $1 million payday” (Wired – This Psychologist Might Outsmart the Math Brains Competing for the Netflix Prize). Thus, we see the effects of the iterative process of modelling: one team has an “a-ha” moment, notes the idea to the community and suddenly, everyone else has the same eureka moment.
What I find most fascinating about the prize is that there is a limit to what can be done. It apparently took only a month, out of the last two and a half years, for the leaders to get halfway there. Yet now everyone’s poring over the outliers, and can’t quite figure out why people like the most polarizing of films. The New York Times Magazine refers to this as the “Napoleon Dynamite” problem, after one of the worst of the outliers. Other ones include “I Heart Huckabees,” “Lost in Translation,” “Fahrenheit 9/11,” “The Life Aquatic With Steve Zissou,” “Kill Bill: Volume 1” and “Sideways”.
History in academia
March 16th, 2009
On Humanist, Willard McCarty recently wrote an eloquent response to the question, “Why is it that you are looking to the past when you search for answers concerning the future?”, and it’s gotten people talking.
Now the question of history and precedent is an interesting one. I very much believe in founding our current knowledge on what we’ve learned from the past. At my old school, I became very outspoken about the fact that, by fourth-year seminars, my fellow communications students still had a lack of historical understanding in their thoughts, resulting in shallowness and alarmism. For example, we’d heard the exact same arguments about email and Facebook that were raised in the face of the telegraph and telephone and haven’t stood the test of time. To be premised in the present inevitably leads to a problematic and erroneous understanding of the world. This is something that I’m sure most of academia would agree with. Yet, I feel that we do not practice it.
The overwhelming feeling that I’ve had for years is that parts of the academic system are stuck on repeat. Tradition has impacted heavily on us, and we find ourselves continuing decades-old practices and discourses that have not affected the world in any discernible way. We believe so strongly in history, and yet we ignore it when something has been shown to be, in the ugliest of terms, useless to society.
I’ve repressed this opinion for a long time, until a recent chat with Kathleen got me thinking about it again. It’s the very reason for my choice to study in this field: I feel like the Digital Humanities, in its unfolding state, is an area where I can make a forward-moving difference. How appropriate, then, was the timing of McCarty’s post. He surprised me by addressing this directly and, taking it a step further, did so within the context of Digital Humanities.
Take text-analysis, for example. As a whole text-analysis isn’t terribly successful or satisfying, as many others in the field keep saying, and have said year after year since the early 1960s. Indeed, the postgraduate course in text-analysis that I teach is based on the question of why it is we (firmly in the present, with eyes fixed on the then present moment) run unto a metaphorical brick wall so soon after getting started; or less metaphorically, how we can get beyond the level of the individual word and individual words nearby, lemmatized or otherwise, to whatever it is that could be considered “context”; or, more philosophically, how we can possibly justify what we consider “context” to mean in any given textual situation. …
So the literary critic or textual editor, focused on interpretation of texts, doesn’t find him- or herself in a particularly good situation with respect to computing. Yet at the same time, let us say, he or she has this nagging feeling that the computer really could be useful, somehow. And, let us say, this critic, firmly in the present moment, has ideas about what went wrong and might be done about it. Isn’t it important at such a moment to know what’s been tried already? Isn’t it equally or more important to be able to extrapolate from the trajectory that text-analysis, say, has taken all these years to where now it makes sense to go?
Sure, McCarty does not directly address my concern of historical ambivalence, but what he does suggest is that people with feelings like mine will emerge, not finding what they have “terribly successful or satisfying”, and extrapolating from history how to evolve past the unsuccessful models. I guess the very fact of this discourse is evidence in support of Willard McCarty’s point.
A Saturday Morning
March 11th, 2009
A glimpse into digital culture
March 7th, 2009
Everybody should drop what they’re doing and head over to Thru-You. It’s a wonderful remix project where the creator takes various samples from YouTube (a cello player here, a guitarist there, perhaps a little a cappella ditty) and makes them into songs. There is no way to overemphasize the wonder in these videos/songs.
I love these glimpses into the real-world people on YouTube. It reminds me of Mick Bianci’s phenomenal Youtubers video from over two years ago. The original video has since been removed, but luckily I found the one below.
Underestimating the ubiquity of data
March 2nd, 2009
Via FlowingData, I came across “Hal Varian on how the Web challenges managers” from the McKinsey Quarterly.
Varian, Google’s Chief Economist, speaks on a wide variety of issues, but all of them centre around the ubiquity of computing and free information. We are in a time of “combinatorial innovation”, where there’s an abundance of raw components, and innovation lies in using what is already available in the right combinations. In other words, we are standing at the start of a period of potential: we have what we need to innovate and now need to play around with it. Such periods revolve around a specific innovation (electronics in the 20s, integrated circuits in the 70s), and this time around, the fulcrum is the Internet.
This is similar to the point I suggested in a paper last term, where I argued that the ubiquity of tools positions us at the beginning of an “age of innovation” (borrowing the term from Felix Janszen). As more people become comfortable with computing and as tools for software innovation become more accessible, we have seen, and will continue to see, an acceleration in the realization of good ideas. Business practices and marketing, I predict, will take a back seat to quality and value to society. This is why the most successful online companies, such as Facebook, Twitter and Google, concentrate on the product first, and the revenue stream later. I have seen this baffle traditional business types (and of course journalists), but a quality product is the only way a company can ensure that a better service created in some kid’s basement bedroom won’t pull the rug out from under it (as Facebook, Twitter, and Google have all done themselves).
