Notes from WWW 2014

Last week, on behalf of SOCIAM, I attended the 23rd International World Wide Web Conference, WWW2014, the premier academic forum for web-related research. As an A* conference, it prides itself on its highly selective (and often brutal) review process, yet by remaining general (pertaining to any aspect of the Web), it seeks to attract submissions from a diverse set of sub-topics. Indeed, a great many topics, new algorithms and methods were discussed. I will cover some of the specific topics I found interesting in a separate post.

But here I wish to discuss one aspect of my experience that left me wondering about the future of the conference and its implications for future attendees. In short: if the trends in the conference’s statistics over recent years continue, I think the conference should change its name – or else we really need a new set of criteria for evaluating submissions to it.

But before getting to this, I want to start on a high note: the opening panel, which lived up to an ambitious goal – to frame a research agenda for the Web for the next 20 years.

Opening Panel: The Next 20 Years of the Web

The opening panel was moderated by Dame Wendy Hall and featured, alongside her, some of the most respected senior academics of the community, including Sir Tim Berners-Lee himself, Ramanathan Guha of Google, Jim Hendler from RPI, Mary Ellen Zurko from Cisco/IW3C2, and several local South Korean academic stars, including sociologist Yong Hak Kim from Yonsei University and Unna Huh from the Startup Forum. What made this panel interesting was that it allowed these luminaries to “spill their guts” about what worries them most about the future, based on what they see today combined with their extensive experience in the field. While a great many ideas were mentioned, at least five major themes emerged, which I briefly summarise here.

Re-Decentralisation and the Web as Critical Infrastructure (Tim BL) – With regard to the next 20 years of the Web, Sir Tim recommended that we start “bringing the web back to its people” instead of “looking at upticks in advert revenue”. In particular, he called for web technologies centred on the individual – “re-decentralisation”, he called it – which would grant people more autonomy and empower them to make choices among the various sources and service providers on the web. He contrasted this with the highly partitioned (“siloed”) environment that the originally decentralised information space he designed has become.

The potential problems arising from a Web controlled entirely by powerful third-party data controllers (in particular, monetisation platforms such as Facebook and Google) are many. First, the Web has begun to lose its neutrality: what was formerly a symmetric space, where everyone derived mutual benefit and equally controlled their share of an interaction, has changed into an environment where an asymmetric power relationship has appeared, and continues to widen, between those who control the platforms and end-users. Moreover, he argued that users are inherently held back from effectively combining their private, personal data with public and social data, due in part to the fragmentation amongst these platforms, as well as to individuals’ privacy concerns about disclosing private information to entities that are essentially incentivised to exploit it.

Based on a comment from Mary Ellen, Sir Tim also discussed thinking about the Web again as critical infrastructure. If increasingly hostile forces (e.g. ISPs / governments) are going to compromise the communication substrate of the Web, why don’t we simply use a different substrate? For example, imagine if we had a dozen different backup channels for the Web, such as HAM/Packet Radio – we could choose the best channel whenever the usual one became hostile or unavailable.

Making Social Science “Scalable” (R. Guha) – One of the first points made by Ramanathan Guha was the need for ‘scalable social science’: there is simply too much social ‘stuff’ going on, at ever increasing rates of change, for traditional methods to be effective. Specifically, he noted that traditional methods require years of training, and take months, if not years, to conduct; this, coupled with the exceedingly small number of trained social scientists available, creates an extremely challenging bottleneck for acquiring critical, timely insight on social technologies. He suggested the need both for research into new research methodologies and for better technological support for “scaling” such investigations, such as applying Web-based Citizen Science methods to democratise social science, along with tools and processes to assist novices in conducting such investigations.

Societal Divides and the Psychology of the Web (Y. Hak Kim, U. Huh) – Several associated points were made by Yong Hak Kim and Unna Huh; these researchers described the open issues in the digital divide, and how significant problems remain in terms of access for those who are less privileged, live in rural communities, or are elderly. Unna Huh said she believed greater research investment needed to be focused on the psychological effects of the kinds of activities fostered on the Web, including the effects of constant interruptions, increased sociality, addiction and gaming behaviours on well-being and mental health.

People’s creative instincts often inspire them to use and apply tools, materials and technologies in ways unanticipated by the original designers or providers of such tools and materials. Such appropriations often give designers insight into the ways new tools could be designed, or existing things could be re-designed, to better meet people’s idiosyncratic needs. No technology has better exhibited the idiosyncrasies (and power) of human creativity than the Web; thanks to ingenious “end users”, the Web has been put to many, vastly unanticipated uses, from starting political revolutions, fighting censorship and raising awareness of human rights issues, to producing history’s largest collections of fanfic. Thus you would think that a conference about the Web would be about studies of how it is used and appropriated, and of its perceived limitations.

Web Security is Core to Societal Wellbeing (M.E. Zurko) – Mary Ellen pointed out that as Web-driven technology finds its way into the most intimate corners of our lives – heating our homes (e.g., Nest), managing our health (e.g., Healthbook) – the security of the Web will more directly impact personal security and wellbeing. In light of the recent Heartbleed bug and the Snowden revelations in particular, the future is even more unpredictable than most of us could have imagined; the very infrastructure of the Web as a common, secure platform for people to exchange information can be undermined by governments and other organisations. This is something we as a community need to think about – our lives may depend on it.

Needs first, increments later (Jim Hendler) – Among Jim’s points during the panel was that user needs are complicated and difficult to anticipate, but need to be identified in order to make the kinds of leaps that were characteristic of the Web’s early days. We should prioritise investigation into fundamental changes that address these needs over working on very slight incremental improvements to well-established systems.

The Research Track

The Highlights

Among the presentations, two stood out as exceptional. The first, notable for its focus on the challenging and timely topic of differential privacy, was Liyue Fan’s paper, Monitoring Web Browsing Behaviors with Differential Privacy. It went beyond merely characterising how one might apply differential privacy ideas to the privacy-preserving disclosure of real-time time-series activity streams, and also showed how the derived statistics could be used to derive optimal estimators of behaviours given the attackers.
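For readers unfamiliar with the idea, here is a minimal sketch of the textbook Laplace mechanism applied to a series of per-interval counts (say, page visits per minute). This is not the method from the paper, whose details I am not reproducing here; the function names, the `epsilon` and `sensitivity` parameters, and the sample counts are all my own illustrative assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_counts(counts, epsilon, sensitivity=1.0, seed=None):
    """Return a noisy copy of `counts` (hypothetical per-interval totals).

    Each count is perturbed with Laplace noise of scale
    sensitivity / epsilon. Assumption: one user changes any single
    count by at most `sensitivity`.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


if __name__ == "__main__":
    visits_per_minute = [12, 15, 9, 30, 28]  # made-up example data
    print(private_counts(visits_per_minute, epsilon=0.5, seed=1))
```

Note that if one user can influence many intervals, the privacy cost of each released count adds up across the series, which is part of what makes the streaming setting hard and the paper’s topic interesting.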

Second, J. Cheng and Jon Kleinberg’s Can Cascades Be Predicted? was among the best presentations at the conference for the clarity and maturity of the work; the first half of the talk described the authors’ justifications for the various metrics chosen to analyse networks, followed by an array of results. The clear explanation of _why_ each metric was chosen, what each metric measured, the mathematical properties of the metrics, and what the results implied achieved a level of excellence unmatched elsewhere at the conference.

Finally, the only piece related to user experience / web interaction design I saw was CityBeat, a ‘real time dashboard for the web’ that was extremely visually appealing, and that also used an interesting process to derive analytics, including constantly employing a pool of Mechanical Turk workers to perform HITs verifying new Twitter trends. Although beautiful, the system seemed to be in the early stages of development and had therefore not been evaluated with real users. Will it prove more informative than reddit/Twitter for breaking news? The users will be the ultimate judge!

Missing: People and Systems

Given the unanimous agreement among the senior academic panelists on the importance of understanding people’s needs, empowering people over their own data, and identifying the impact of the Web and Web systems on learning, development and access, one would expect the main track of the WWW conference to be full of studies of real Web users, and new ideas for systems to meet people’s needs.

Not so. Not so at all. A quick look at the acceptance statistics paints a very different picture from what we might expect:

Topic | Submitted | Accepted
Behavioral Analysis and Personalization | 74 | 14
Content Analysis | 71 | 8
Crowd Phenomena | 43 | 8
Internet Economics & Monetization | 35 | 8
Security, Privacy, Trust, and Abuse | 55 | 7
Semantic Web | 51 | 6
Social Networks and Graph Analysis | 126 | 13
Software Infrastructure, Performance, Scalability, Availability | 23 | 3
User Interfaces, Human Factors, and Smart Devices | 33 | 3
Web Mining & Web Search | 43 | 6
Total | 645 | 84
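
To make the comparison concrete, here is a small sketch that turns the figures in the table above into per-topic acceptance rates. The numbers are copied directly from the table; the function and variable names are my own, and the overall rate uses the Total row as given rather than a sum of the listed rows.

```python
# Figures copied from the table above: (topic, submitted, accepted).
TRACKS = [
    ("Behavioral Analysis and Personalization", 74, 14),
    ("Content Analysis", 71, 8),
    ("Crowd Phenomena", 43, 8),
    ("Internet Economics & Monetization", 35, 8),
    ("Security, Privacy, Trust, and Abuse", 55, 7),
    ("Semantic Web", 51, 6),
    ("Social Networks and Graph Analysis", 126, 13),
    ("Software Infrastructure, Performance, Scalability, Availability", 23, 3),
    ("User Interfaces, Human Factors, and Smart Devices", 33, 3),
    ("Web Mining & Web Search", 43, 6),
]

# The Total row as stated in the table (submitted, accepted).
TOTAL = (645, 84)


def acceptance_rates(tracks):
    """Return (topic, rate) pairs sorted from lowest to highest rate."""
    rates = [(topic, accepted / submitted)
             for topic, submitted, accepted in tracks]
    return sorted(rates, key=lambda pair: pair[1])


if __name__ == "__main__":
    for topic, rate in acceptance_rates(TRACKS):
        print(f"{rate:6.1%}  {topic}")
    print(f"{TOTAL[1] / TOTAL[0]:6.1%}  Overall (from the Total row)")
```

On these figures, the user interfaces and human factors category has the lowest acceptance rate of any listed topic, at roughly 9% (3 of 33).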

Don’t Look Here for People

The first problem is the lack of studies of people. Note the single category for “User interfaces, human factors and smart devices”, which lumps together three very different topics (smart devices: what? really? do you mean ubicomp? Are you really shoehorning ubicomp and HCI together?), and in which a paltry 3 papers were accepted.

Would a Count of Nouns Have Predicted Lolcats? – Among the remaining categories, one might think (hope?) that “Crowd Phenomena”, “Social Networks and Graph Analysis”, or “Security, Privacy, Trust, and Abuse” would involve /understanding/ users, their motivations and their needs. Alas, the vast majority of these papers treated people as merely single data points in a large set of user activity traces. Questions such as “Can we predict when a post will go viral?” were answered not by understanding the psychological aspects of content that drive people to make things popular, but merely by estimating the statistical probability that a post goes viral given its basic extrinsic features, measured at large scale: e.g., when it was posted, its length, the number of nouns and verbs in the post (?!), the popularity of the person (gauged, again, crudely through the number of followers/friends etc), and so on. Would such an analysis ever have predicted why lolcats are so appealing? I would like to know whether such shallow features might /ever/ predict the next meme. While it is merely a hunch, I would predict that things that make successful memes have certain intrinsic properties – humour, novelty, cuteness, memorability and simplicity – all features that require subjective analysis. However, such intrinsic or subjective features seem to be ignored by the community at large. Most frustratingly, the question of /why/ was simply overlooked, in favour of figuring out appropriate distributions/predictive models of large-scale behaviour.
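
To illustrate just how shallow such extrinsic features are, here is a sketch of the kind of feature extraction I have in mind. It is not taken from any particular paper at the conference; the `Post` record and the feature names are hypothetical, and I have left out noun and verb counts since those would need a part-of-speech tagger.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    """Hypothetical record of a post; not any real platform's schema."""
    text: str
    posted_at: datetime
    author_followers: int


def extrinsic_features(post):
    """Extract surface-level features that say nothing about the content's appeal."""
    return {
        "hour_posted": post.posted_at.hour,
        "length_chars": len(post.text),
        "length_words": len(post.text.split()),
        "author_followers": post.author_followers,
    }


if __name__ == "__main__":
    lolcat = Post("I can has cheezburger?", datetime(2014, 4, 10, 9, 30), 120)
    print(extrinsic_features(lolcat))
```

Nothing in that dictionary captures humour, novelty or cuteness, which is exactly my complaint.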

Don’t Look Here for Systems

Equally frustrating was the lack of new web /systems/. Given the huge interest in Bitcoin and in Web P2P technologies such as WebRTC, the pressing need for better ways to keep users safe on the Web, and the problems of preserving privacy and anonymity in an ever-instrumented Web, one would hope some such systems would make their debut at the Web conference.

Not so. The W3C track had a nice set of tutorials and descriptions of upcoming recommendations. However, I noticed an overall decline in the number of systems papers in the main track of the conference this year; systems that fundamentally enable new kinds of _things_.  Where have all the systems gone?

What I Think Has Happened

My hypothesis is that this problem has arisen because, in an effort to maintain high standards of scholarship among accepted papers, the WWW Conference programme committee/community has tended to favour papers that present extensive, challenging algorithmic analyses and problems over those that have practical significance to the Web and the world around it. A large fraction of the papers in this year’s proceedings are complex extensions that present incremental improvements, or formal derivations of properties of some interaction.

While  characterisations of difficult and challenging algorithmic problems, when warranted, can be extremely useful, I am opposed to rewarding merely challenging papers over ones that are insightful, informative, and relevant to the overall goals of the entire research community.

I think the WWW Committee next year should work to get this trend under control and have reviewers carefully consider whether papers 1) are likely to have an impact, 2) provide useful insight on needs surrounding the Web, and 3) present novel ideas or architectures. The CHI committee established a set of guidelines for reviewers a number of years ago, after witnessing a similar phenomenon pertaining to ‘user study fascism’ – the obsession with large user studies without regard to whether they produced particularly insightful results. All reviewers are reminded every year, in CHI’s guidelines for reviewers, of how they should evaluate papers: first and foremost by relevance and impact, and second by sound methodology and evaluation.

Reactions? Comments?

My viewpoint is that of a self-admitted human-computer interaction researcher/social scientist. Therefore, I am naturally biased towards studies that lend insight into how technology is used by humans and that shed light on human psychology and behaviour. Moreover, since WWW is a very busy multi-track conference, it is very likely I missed other great papers, and it is almost certain that I have overgeneralised above.

Therefore, I want to hear from you. Did you attend WWW2014? Do you agree or disagree with what I’ve said? Post comments below, write to me privately via e-mail, or DM me on Twitter at @emax.

Footnote: South Korea

South Korea is currently the world’s most digitally connected country, with more smartphones per capita and a greater penetration of internet access throughout all segments of the population than any other country. It is a country of much complexity, one that experienced an immense tech boom relatively late in its history, which propelled it from a purely industrialised to a post-industrialised (information) economy in the past 10 years. This makes it one of the most interesting settings for a high-tech research conference: a society which in the past 10-15 years has become so comfortable with technology that every single person, regardless of age or social status, has a smartphone in their pocket and is Web- and messaging-literate.

It is beautiful to see a society where elders whip out massive smartphones on the metro to catch up on television with their partners, to send a few messages to their grandchildren, or even to play Candy Crush, comfortably and without hesitation. Metros are safe, clean, on time, and reliable; people are courteous, quiet, and generally content. Digital technology, being so incredibly pervasive, has become virtually invisible: sensors embedded in every wall and staircase, access points embedded in all subway tunnels and above park benches. Many cafés I visited in Seoul had private rooms for people to catch up on studying or work without interruption, with free high-speed wifi and power outlets throughout. Being embedded within this hyper-modern society riding on a tech economic boom gave me insights that may have taught me more than anything at the conference itself!
