Recently, we launched the Emotion Sense app for Android on the Google Play Market. This app combines experience sampling with the passive data that modern smartphones can collect, and gives people feedback about how their reported mood compares to this data. Of course, this app was designed and built to support our research into how daily mood relates to the behavioural signals that phones can capture.

To support the launch, this press release was published, which has since been picked up by a number of newspapers and blogs (a sample of them is listed here). Overall, the response has been overwhelming, and we have learned a number of lessons to which I should dedicate a separate blog post.

Naturally, since this app collects data from a wide variety of sensors, there have been a number of concerns about privacy. This issue has been, and continues to be, very important to us. However, a number of the comments that have been made are misleading, so this post aims to clarify how and why we collect sensor data.

Why do we collect sensor data? What happens to this data?

The aim of collecting sensor data is to support academic research into daily moods and smartphone sensors. We are academic researchers, not marketers: we are ethically bound and fully committed to never sell or share the data the app collects. We are not interested in advertising or making money off your data: we are interested in progressing the state-of-the-art in Computer Science (sensing and data mining) and Psychology (studying daily life). If any of us leave the University, to move on to other projects, we will leave the data behind. We will also never make the data available to anyone, except the person who generated it. If you want your data, you can simply get in touch.

Our research is not about prying into individuals’ lives. We are looking for broad patterns, which emerge from many people doing the same thing (e.g., using an app). We really have no interest in looking at anything other than aggregate patterns in the data.

How is the data anonymised?

The app does a fair bit of data collection on your phone, but then anonymises it: we do not receive your raw data. In particular:

  • The app does not send us conversations, or any audio recordings at all. All it does is measure the ambient volume, which is a number (e.g., “23”). We do not and cannot track the web sites you visit, your eye movement, or how you touch your phone screen (in fact, other researchers have shown that this is impossible on a number of Android devices!).
  • The app does not record any text message content or clear-text phone numbers. In fact, it uses a one-way hash function to convert a phone number into an indecipherable string. So, for example, we will see that a phone texted another phone, identified as “abdjasdfkjqwercsdsdsaqt2”, and sent 3 words.
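
To make the two points above concrete, here is a minimal sketch of this kind of on-phone reduction. It is not the app's actual code: the choice of SHA-256, the salt, the digit normalisation, the root-mean-square volume measure, and all function names are assumptions made purely for illustration.

```python
import hashlib
import math
from typing import Sequence

# Hypothetical salt: the post does not say whether the app uses one.
SALT = b"example-salt"


def hash_phone_number(number: str, salt: bytes = SALT) -> str:
    """Return a one-way SHA-256 digest (hex string) of a phone number."""
    # Keep digits only, so that "+44 7700 900123" and "+447700900123" match.
    digits = "".join(ch for ch in number if ch.isdigit())
    return hashlib.sha256(salt + digits.encode("utf-8")).hexdigest()


def summarise_text_message(recipient: str, body: str) -> dict:
    """Keep a hashed recipient and a word count; the message body is dropped."""
    return {"recipient": hash_phone_number(recipient), "words": len(body.split())}


def ambient_volume(samples: Sequence[float]) -> float:
    """Reduce a buffer of audio samples to one root-mean-square number."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# A made-up message to a made-up number.
record = summarise_text_message("+44 7700 900123", "see you soon")
print(record["words"], record["recipient"][:12])
```

The record that would leave the phone in this sketch holds only an opaque identifier and a count. Neither the number nor the text is stored anywhere in it.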

How is this research funded?

We have not paid anyone to write/blog about our work, and our project has no commercial partners. Our work is funded by the Engineering and Physical Sciences Research Council: details of the project are available here.

We are aware that people may still have some questions about how the app works. If you or any of your readers has any questions, please feel free to contact me: neal.lathia@cl.cam.ac.uk

Or visit our FAQs page.

Update: Potential Reasons for this Confusion

As pointed out to me, some of the confusion about what we are doing with data may be due to this earlier research paper that was published in 2010 and used the same Emotion Sense name. I would like to point out that the app used in this paper is very different from the app that we released to the general public.

The publicly available app does not perform any speaker recognition or emotion detection. This is for a number of reasons:

  • Privacy. Since an audio recording should have informed consent, and not everyone around your phone may have given it, we chose not to store any audio that is recorded beyond (as above) the ambient volume.
  • Technical Challenges. While audio processing research is progressing, an earlier trial of the techniques we used in previous research was inconclusive and overly cumbersome: the sounds that people are surrounded with when their phones are in their bags, pockets, etc. go well beyond the controlled trial that was done in the lab.
  • Moving Beyond the Microphone. There is, naturally, more to a smartphone than its microphone. The design of the publicly available app therefore seeks to learn about daily life using a more holistic approach (i.e., combining the data from different sensors).

I was recently pointed to this excellent blog post that argues how the ideas of ‘big data’ and ‘quantified self’ do not fit well together. The title here comes directly from that post: “Big Data and Quantified Self, just like chocolate and champagne, do not pair together well.” In the true spirit of online blogging, I thought I’d reply here instead of via e-mail.

The key idea is that ‘big data’ tends to focus on the ‘average’ person: the aggregate of many noisy data points that, when put together, give an indication of behaviour that is the sum of everyone, but manifested by no one. Self-tracking, or quantified-self, data comes from a self-selecting sample of the population and therefore is not representative of everyone: “self-trackers are different from other people with regard to mentality, psychological traits, lifestyles, behaviors, etc. So even if we derive a certain pattern based on a data from a hundred, thousand or even five thousand self-trackers with diabetes, that pattern won’t necessarily hold for all other people with diabetes.”

I mostly agree with this: my thoughts only differ in terms of the conclusions.

First, this problem is increasingly emerging/actively discussed in all ‘big data’ research. Studying how people move around cities based on foursquare check-ins only looks at people who like foursquare, researching how twitter predicts elections only looks at the sample of people who use twitter, and 96% of brain research has been conducted on westerners. Psychologists agree that they have been mostly studying people who are WEIRDos (Western, Educated, Industrialized, Rich, and Democratic). While something certainly has to be done to address this, I would posit that throwing away everything we have learned is not one of those things: there are many domains (take, for example, medicine) where ‘small’ tests have led to methods that have successfully scaled to all. Instead, we need to increase our awareness about how much of a sub/self-selecting-sample we are dealing with when making our conclusions.

By being full of people, ‘big data’ also has one key advantage: it can help overcome the data sparsity that any single self tracker will face, and finding links between people’s behaviours is the only way to do that. While tracking my mood, I know that I cannot accurately record it every minute, since I am otherwise engaged. However, your actions and mood may have something to teach me.

Mathematically speaking (see the other blog post), I’m saying that while Y_me = f_me(X_me), and Y_you = f_you(X_you), since we are all human there are bound to be some people in the world for whom f_me ~ f_you: and we can learn from one another. So one of the goals of the quantified self movement should be to facilitate this process: putting people together in a room where they each talk about their lessons learned is a first step in this direction.
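
As a rough illustration of what f_me ~ f_you could mean in practice, the sketch below fits a straight line to each person's data and then looks for the person whose fitted line is closest to mine. Everything here is an assumption for the sake of the example: the linear form of f, the crude distance between coefficients, the names, and the made-up numbers (X as hours of sleep, Y as self-reported mood).

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one person."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def most_similar(me, others):
    """Return the name in `others` whose (slope, intercept) is closest to `me`."""
    def distance(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    return min(others, key=lambda name: distance(me, others[name]))


# Made-up data: X = hours of sleep, Y = self-reported mood.
f_me = fit_line([5, 6, 7, 8], [2, 3, 4, 5])
models = {
    "alice": fit_line([5, 6, 7, 8], [2.1, 3.0, 4.1, 5.0]),  # behaves like me
    "bob": fit_line([5, 6, 7, 8], [6, 5, 4, 3]),            # opposite trend
}
neighbour = most_similar(f_me, models)
print(neighbour)
```

Once a neighbour like this is found, their observations could be used to fill the gaps in my own sparse record. That is the sense in which we can learn from one another.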

The only difference I see between QS and big data? By looking at your own data, QS seems to encode the ideas of mindfulness (beyond just self-experimentation). When I look at my QS data, I stop and think about my life. When I’m running my ‘big data’ experiments, I don’t!

A couple of weeks ago I was invited to participate in a workshop at NYU’s CUSP, or Center for Urban Science and Progress. As they describe themselves:

The Center for Urban Science + Progress (CUSP) is a unique public-private research center that uses New York as its laboratory and classroom to help cities around the world become more productive, livable, equitable, and resilient. CUSP observes, analyzes, and models cities to optimize outcomes, prototype new solutions, formalize new tools and processes, and develop new expertise/experts. These activities will make CUSP the world’s leading authority in the emerging field of “Urban Informatics.”

The theme of the workshop was ‘mobile sensing’ – with, of course, a particular focus on how it may support urban science.

Talks. The invited speakers were from diverse backgrounds and institutions, making for a very interesting line up. I did not take any notes, so my summary here is vastly unfair to each talk:

  • Rob van Kranenburg (@robvank) naturally spoke about the Internet of Things, and how it fits into the broader ecosystem of cities.
  • Mischa Dohler (@mischadohler)’s talk covered urban sensors, and gave rise to a big debate about where the boundary between crowd-sourcing and urban sensing should lie.
  • Jacqueline Lu (NYC Parks and Recreation) spoke about how data is supporting efforts to maintain and promote green spaces in the city. This was particularly interesting since it made me realise how a seemingly ‘trivial’ problem (maintaining trees) is actually vastly complex when placed into urban settings.
  • Vivek Singh (MIT) spoke about ongoing mobile sensing experiments that investigate how behaviours can be promoted between social groups.
  • Margaret Martonosi (Princeton) spoke about her work using Call Detail Record data (e.g., see this paper). The data gives fantastic geographical coverage and potential to study many facets of mobility, while presenting very difficult challenges with regards to inference and privacy.
  • Weisi Guo (Warwick University), the workshop organiser, spoke about his research about understanding cities through mobile sensors.
  • Jarlath O’Neil-Dunne (University of Vermont, @jarlathond) gave a talk about geographic analysis using satellite data – I learned about how LiDAR data (e.g., this blog post) can be used to, for example, find trees that would otherwise be hidden by shade: a very challenging data feature extraction task.
  • Raz Schwartz (Rutgers, @razsc) is the co-creator of the Livehoods project, which is a great example of attempts to uncover the structure of the city via social media analysis.
  • Eiman Kanjo (King Saud University) discussed her work with mobility and affective sensing using smartphones (see her publications here)
  • Andrew Eckford (York University – Canada not UK!) gave a very interesting talk about molecular communication for harsh environments (say, flooded subway tunnels!) – something I had never heard of.
  • Graham Cormode (AT&T) discussed his work on distributed data monitoring and mining. See his personal page here.
  • Lin Zhang (Tsinghua University) talked about his work with sensors on Beijing’s taxi cabs for pollution monitoring (MobiSys paper here). The dataset is available on request.

Finally, I briefly talked (with very sparse slides) about open challenges in mobile sensing – ranging from energy efficiency to data inference and behaviour change measurement.

Open Ideas. The fact that this broad range of researchers all agreed to come to a workshop on ‘urban sensing’ shows how this field is still in its infancy; I very much enjoyed the fact that everyone spoke about very different things. In fact, we even differed on the basics:

  • What is urban? It is one of those words that could mean tunnels in a subway, parking sensors on a road, or community driven tree maintenance. The ‘where’ of all the research above certainly agreed on cities: but within this context, there is a hierarchy that ranges from metropolitan-scale analysis down to individual citizens’ sensors.
  • What is sensing? It seems that ‘sensor’ is quickly becoming a term to mean ‘a source of data;’ while this is consistent with the past, my gut tells me that historically this would not have been the case. Tweets, accelerometers, and satellites are all sensors, albeit very different in nature: and there is ample space for research both within the scope of individual sensors and in finding links/building systems that bridge between them.

 

I was invited to Gent recently to give a talk at a workshop on mobile research. Smartphones are at the intersection of a variety of domains… from hardware to usability and machine learning. The slides below tried to capture this, and look at some recent lessons and application areas that my research is supporting.

I recently attended and presented at a workshop on shared bicycle systems in Paris, France. The workshop, called “Spatio-temporal Data Mining for a Better Understanding of People’s Mobility: The Bicycle Sharing System (BSS) Case Study” was organised by Latifa Oukhellou from IFSTTAR; the program and slides are online here (and the presentation that I gave on my recent paper is embedded below).

I personally think that it was an excellent opportunity to bring together a variety of people who have been working with data from shared bicycle systems, particularly since the work spanning this ‘nascent’ field is from people with very diverse academic backgrounds. The day uncovered surprising similarities in the techniques that people are using to analyse data from a variety of cities (e.g., clustering stations), and, more broadly, in what the few (but growing number of) researchers in this field have been trying to solve.

The day also really served to expose a number of key problems:

  1. Data Acquisition. There is a blatant tension between researchers who have studied, and want to continue studying, these systems and the practitioners who run shared bicycle system web sites. A vast majority of researchers in the group have obtained their data by regularly crawling a shared bicycle map, like this one of London. This has allowed researchers to collect time-varying station capacity data, which is useful for training algorithms that seek to predict how many bikes a station will have. However, this is clearly not an ideal way to collect data, and I hope to see a closer collaboration between transport operators and researchers in this domain in the future.
  2. Data Quality. The data that is collected by scraping web sites is prone to inaccuracies and noise, and this can lead to errors in our analysis. For example, Ollie (who has made all the great online maps of shared bike systems) pointed out that one of the differences that I uncovered in my recent work was not due to a change in activity, but to the fact that the station that I thought had changed patterns had, instead, simply been moved closer to a train station. While, in hindsight, I don’t think this completely deconstructs the work I did (!), I wonder how much of the broader research is somewhat affected by similar hidden changes (or, at least, changes that a web scraper would not be looking for).
  3. Data Granularity. More importantly, web-scraped data does not capture important features of the system, such as origin-destination pairs or the actual habits of the systems’ users. As researchers, we know that the value of data is often proportional to its granularity. For example, all of the recent work that I have done using Oyster card data would not have been possible if all I had was station gate counts, which is the rough equivalent of the data that most shared bicycle researchers have. How are people responding to incentives? What is the variety of behaviours that the system users are exhibiting? All these questions are currently beyond our (data’s) reach.
  4. Limits of the Data. A very important point was raised during the day: any data that the transport authority holds will inherently only capture the “satisfied” part of the travel demand. Public transport operators currently have no means of gauging how many passengers they have failed to transport, whether that be because the person has made the (healthy) choice to walk, or (in the shared bicycle case) because that person has found an empty station when they sought a bicycle, or a station where all the bikes’ tires have been punctured.
  5. Motivation for Mining Shared Bike Data. As researchers, I don’t think that we have fully uncovered the entire family of problems that data from shared bicycle systems can address; I felt that some propositions were lacking a grounded motivation. There is a wide range of problems that could be addressed, if the right data were at hand. For example, can the data be used to discover bicycles that are broken? Can real-time data mining guide a load balancing truck to best suit current and predicted travel demand? This is where perhaps a closer relationship with transport operators may again be helpful.
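
To give a feel for what can (and cannot) be done with web-scraped station data, here is a minimal sketch of two simple summaries: an hour-of-day average that could serve as a naive baseline for predicting how many bikes a station will have, and the share of snapshots in which a station was empty. The snapshot format, station names, and numbers are all made up for illustration. Note that an empty station only hints at unsatisfied demand; it does not measure it.

```python
from collections import defaultdict

# Hypothetical scraped snapshots: (station id, hour of day, bikes available).
snapshots = [
    ("station_a", 8, 2), ("station_a", 8, 0),
    ("station_a", 18, 12), ("station_a", 18, 10),
    ("station_b", 8, 15), ("station_b", 8, 13),
]


def hourly_profile(rows):
    """Mean number of available bikes for each (station, hour) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of bikes, snapshot count]
    for station, hour, bikes in rows:
        totals[(station, hour)][0] += bikes
        totals[(station, hour)][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def empty_fraction(rows, station):
    """Share of a station's snapshots in which it had no bikes at all."""
    counts = [bikes for s, _, bikes in rows if s == station]
    return sum(1 for b in counts if b == 0) / len(counts)


profile = hourly_profile(snapshots)
print(profile[("station_a", 8)], empty_fraction(snapshots, "station_a"))
```

Nothing in these snapshots says where a bike went or who took it, which is exactly the granularity problem described above.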

Overall, I think it was a great workshop, and I encourage you to look at the presentations that are online. If you have an interest in this area, I would also encourage you to join the Google Group that I set up for researchers in this domain to share their findings.

London Shared Bicycles: Measuring Intervention Impact from Neal Lathia

Data Science London hosted a meetup on recommender systems at the end of the Strata London Conference. To kick off the presentations, I was asked to give an overview of what is happening in the #recsys research community, with a particular focus on what happened recently in Dublin at ACM RecSys 2012 (which was then followed by talks by Tamas Jambor, Dinesh Vadhia, and Sean Owen). It was certainly a daunting task to give a 15-minute summary of a week-long conference, so I chose to do so by giving as many pointers to people and research topics as possible, within some kind of coherent story line.

This is what I went for (slides below):

  1. Why do we need recommender systems? While the older papers in the field talk about information overload, a recent alternative idea is that the web, which has facilitated the quick and large-scale publication and distribution of all kinds of goods, has removed the financial, editorial, or other kinds of filters that we previously used (filter failure). But, really, this is old news too: we now implement recommender systems to foster engagement and community, and the web has become an ecosystem of personalisation (see Daniel Tunkelang’s talk about LinkedIn recommendations).
  2. What are recommender systems? They are collaborative, query-less discovery engines. They are machine learning applied to preference signals. And while the Netflix prize always comes to mind in this context, there is actually a thriving and growing research community that meets annually at the ACM RecSys conference (and I managed to find a photo of me at RecSys 2007!)
  3. Don’t Reinvent the Wheel. Often, when I talk to start-ups that are building recommender systems, they tell me about problems that have already been regularly addressed in the rich research literature (cold-start, scalability, etc.). So, what are some of the problems that the research community is looking at today? I made up 5 points, based on looking at the sessions and program from this year’s conference:
  4. Problem 1: Predictions. The research community has become very aware of the fact that there is more to recommendation than predicting ratings. There was an entire workshop dedicated to evaluation beyond accuracy (proceedings are now online here). How can you make recommendations novel, diverse and serendipitous? How do you deal with conflicting objectives?
  5. Problem 2: Algorithms. Related to the above, there was a nice discussion on the balance between the effort required (imposed) on users to rate things in order to improve recommendations vs. improving algorithms that can deal with few ratings. This topic fits well into the general theme of defining just what algorithms need to do: as above, while the traditional focus has been on prediction, recent shifts (including the best paper at the conference) were about ranking.
  6. Problem 3: Users and Ratings. The traditional mode of thinking about recommender systems has been “users” and “items,” which are linked by “ratings.” This paradigm is slowly being shown to be incomplete. What about context? What about groups of users? What about the platform you are delivering recommendations on (tablets, mobiles, PCs, televisions)? There was a related discussion at the mobile workshop I co-organised: is there a difference between capturing preference (what I like) vs. capturing intent (what I want)? As a side note, whenever the Netflix prize comes up in conversation, people echo the widely publicised fact that the challenge solutions were not implemented due to their engineering constraints. But it is worth reinforcing a broader point that Xavier presented: Netflix has moved on to “other issues” that are more important.
  7. Problem 4: Items. The idea of having tangible “things” that you recommend is also slowly shifting. There was a whole workshop dedicated to recommender systems for lifestyle change, which I sadly missed. If “items” can now subsume decisions, behaviours, and processes – what are they, and are they worth thinking about as items?
  8. Problem 5: Measurement. The most recurring conversation at ACM RecSys is about understanding how to measure progress. I’ve already touched on it above; however, this year there were three clear groups: (a) algorithm-people, who present their results with empirical metrics performed on offline experiments, (b) usability-people, who perform experiments by means of user studies, and (c) the industry – which was clearly advocating online, large-scale A/B testing (see this great keynote). Sadly, academic researchers don’t have access to (c). Moreover, the real problem is that nobody really knows how (a), (b), and (c) relate to one another.
  9. So, to end: 3 key take-aways. First, recommender systems are an ensemble… of disciplines. This is clearly recognised as not being exclusively a machine learning topic. Second, the idea of black-box recommenders is slowly fading. Long live the domain! (and check out Paul’s keynote on music). Finally, the recsys research community clearly differentiates itself from others by having always been highly involved with the industry and start-ups who are building and running these systems, and there are tons of great open-source projects (e.g., MyMediaLite and Lenskit), backed by open, intelligent, and collaborative people, that are there for you to explore and learn from.
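
The shift from prediction to ranking (Problems 1, 2 and 5 above) can be seen in a toy offline comparison. The sketch below is purely illustrative: the ratings, the two hypothetical predictors, and the choice of a pairwise-agreement score as the ranking measure are assumptions, not anything presented at the conference. It shows that the predictor with the lower rating error is not necessarily the one that orders items better.

```python
from itertools import combinations
from math import sqrt


def rmse(truth, predicted):
    """Root-mean-square error between true and predicted ratings."""
    return sqrt(sum((truth[i] - predicted[i]) ** 2 for i in truth) / len(truth))


def pairwise_agreement(truth, predicted):
    """Fraction of item pairs (with different true ratings) put in the right order."""
    pairs = [(i, j) for i, j in combinations(truth, 2) if truth[i] != truth[j]]
    agree = sum(
        1 for i, j in pairs
        if (truth[i] - truth[j]) * (predicted[i] - predicted[j]) > 0
    )
    return agree / len(pairs)


# Made-up ratings for four items by one user.
truth = {"a": 5, "b": 4, "c": 2, "d": 1}
# Predictor A: numerically close, but swaps the top two items.
pred_a = {"a": 4.0, "b": 4.2, "c": 2.0, "d": 1.0}
# Predictor B: systematically too low, but the order is exactly right.
pred_b = {"a": 3.0, "b": 2.5, "c": 1.0, "d": 0.5}

print(rmse(truth, pred_a), pairwise_agreement(truth, pred_a))
print(rmse(truth, pred_b), pairwise_agreement(truth, pred_b))
```

Here predictor A wins on error while predictor B wins on ordering, so which one is "better" depends entirely on what is being measured. That is a small-scale version of the measurement problem described above.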

20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (which will happen in November 2012 in California). Geographic Info Systems now also subsume recommender systems! Naturally, no movies here: these papers are all about location and mobility.

  • Location-based and Preference-Aware Recommendation Using Sparse Geo-Social Networking Data. Jie Bao; Yu Zheng; Mohamed Mokbel
  • Personalized Trip Recommendation with Multiple Constraints by Mining User Check-in Behaviors. Eric Hsueh-Chan Lu; Ching-Yu Chen; Vincent Shin-Mu Tseng
  • Probabilistic Sequential POIs Recommendation via Check-In Data. Jitao Sang; Tao Mei; Jian-Tao Sun; Changsheng Xu; Shipeng Li

Reminds me of a talk I gave on location vs. people. Be sure to also check out all the other interesting papers that don’t have recommendation in the title.

Titles from the INTERACT 2011: 13th IFIP TC13 Conference on Human-Computer Interaction

  • Looking for “Good” Recommendations: A Comparative Evaluation of Recommender Systems Paolo Cremonesi, Franca Garzotto, Sara Negro, Alessandro Vittorio Papadopoulos and Roberto Turrin
  • All the News That’s Fit to Read: Finding and Recommending News Online Juha Leino, Kari-Jouko Räihä and Sanna Finnberg
  • Helping Users Sort Faster with Adaptive Machine Learning Recommendations Steven M. Drucker, Danyel Fisher and Sumit Basu

Titles from SocialCom 2012: The International Conference on Social Computing:

  • Epidemic Trust-based Recommender Systems Stefan Magureanu, Nima Dokoohaki, Shahab Mokarizadeh and Mihhail Matskin
  • A Random Walk Around the City: New Venue Recommendation in Location-Based Social Networks. Anastasios Noulas, Salvatore Scellato, Neal Lathia and Cecilia Mascolo

Looking through the accepted papers at KDD 2012 (Beijing, China). As always, recommendation and personalization are a great application for all sorts of data mining work. Here is a list of titles that caught my eye:

  • Circle-based Recommendation in Online Social Networks
    Author(s): Xiwang Yang*, ECE department, Polytechnic In; Harald Steck, Bell Labs, Alcatel-Lucent Murray Hill, NJ; Yong Liu, ECE department, Polytechnic Institute of New York University
  • Cross-domain Collaboration Recommendation
    Author(s): Jie Tang*, Tsinghua University; Sen Wu, Tsinghua University; Jimeng Sun, IBM; Hang Su, Beihang University
  • Incorporating Heterogenous Information for Personalized Tag Recommendation in Social Tagging Systems
    Author(s): Wei Feng*, Tsinghua University; Jianyong Wang, Tsinghua University
  • Learning Binary Codes for Collaborative Filtering
    Author(s): Ke Zhou*, Georgia Tech; Hongyuan Zha, Georgia Tech
  • Learning Personal+Social Latent Factor Model for Social Recommendation
    Author(s): Yelong Sheng*, Kent State University; Ruoming Jin, Kent State University
  • RecMax: Exploiting Recommender Systems for Fun and Profit
    Author(s): Laks Lakshmanan, The University of British Columbia; Amit Goyal*, University of British Columbia
  • Transparent User Models for Personalization
    Author(s): Khalid El-Arini*, Carnegie Mellon University; Ulrich Paquet, Microsoft Research; Ralf Herbrich, Facebook, Inc.; Jurgen Van Gael, Rangespan Ltd.; Blaise Aguera y Arcas, Microsoft Corp.
  • Finding Trending Local Topics in Search Queries for Personalization of a Recommendation System
    Author(s): Ziad Al Bawab; George Mills; Jean-Francois Crespo
  • GetJar Mobile Application Recommendations with Very Sparse Datasets
    Author(s): Kent Shi ; Kamal Ali
  • [Tutorial] Factorization Models for Recommender Systems and Other Applications (slides: link)
    (Lars Schmidt-Thieme, Steffen Rendle)

Happy reading!

I gave a talk at DTU Copenhagen about recent work we have been doing on the EmotionSense system.
