The attractive and risky realm of Data Analytics

June 2, 2016

The data hype
Few topics dominate business news these days as much as analytics and its applications. Data is big, fast, real-time and on the cloud, and it is the starting point for insights that can improve decision making and, subsequently, people’s lives. The hype is also noticeable in the abundance of ‘become a data scientist’ courses and the emphasis on the unique skill of ‘data story-telling’. More recently, in the UK, leading universities and the private and public sectors have pooled impressive financial and human resources in the Alan Turing Institute [1].

The appeal explained
Several reasons can explain this great appeal. Firstly, analytics is based on figures and data, so it can be seen as promoting the meritocracy that comes with solid facts and provable arguments. Secondly, it has benefited from technical and scientific innovation and from a collaborative ecosystem that allows open technologies and datasets to be combined and reused in iterative improvement cycles. Moreover, analytics means quantifying and optimising. There is something uniquely attractive about the idea of such advanced measuring and ‘optimal’ resolution, perhaps because it is reminiscent of, and holds the promise of, discovering the “fair value” of things, as economists would say, and justice will always be a worthwhile ideal.

The good, the bad
Many applications of analytics spring to mind where people have experienced positive change. Take, for example, the ridiculously low cost of off-season airfares that lets you fly to Rome for the price of a cappuccino, tube trains that run every 90 seconds at peak times, health insurance schemes that reward you for being active and staying fit, and highly desirable residential areas with all amenities and popular stores close at hand. But at the same time you have to ask yourself: how much do you have to pay for a holiday airfare when everyone knows well in advance that they want to travel (e.g. over the Christmas period)? How long does the occasional passenger have to wait on the platform late on a weekday evening? What is the insurance premium for people who are just starting their effort to lose weight?

Therein lie some of the dangers of analytics, in the practices known as price discrimination and statistical discrimination. If we abide too much by the 80-20 rule and over-exploit data patterns, we run the risk of proliferating services targeted at the most affluent (and lucrative) segments of society while neglecting the rest. Alternatively, we risk penalising the ‘outlier’, the user outside of the norm. An additional risk is the failure to protect the privacy of personal data, which is now being used on a massive scale to produce offerings and digital content customised to individuals (or at least to categories of them).

Randomness for a non-random win
One potential remedy can come from introducing randomness at different levels. For example, the end users of an analytics application can be more conscious about diversifying their preferences, demand and behaviour when they can conveniently afford to do so. So the next time you are coming back from a weekend escape, consider taking some time off to catch a morning flight. Randomness is also a crucial building block in cryptography and privacy-enhancing technologies. These can protect against both discrimination and deanonymisation, and they have received increasing academic attention in the context of participatory sensing, where self-selected participants contribute data to build up collective knowledge. The success of such initiatives depends largely on the level of participation, which is in turn greatly affected by the privacy concerns of candidate participants.
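As one illustration of how randomness can protect individual privacy while still yielding useful collective knowledge, consider randomised response, a classic survey technique. It is not mentioned in the sources cited here and is offered only as a minimal sketch. The function names, the 75% truth probability and the simulated 30% true rate are all assumptions made for the example.

```python
import random


def randomized_response(truth: bool, rng: random.Random, p_truth: float = 0.75) -> bool:
    """Report the true answer with probability p_truth, otherwise a fair coin flip.

    Any single report is deniable, because it may have come from the coin.
    """
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(reports: list[bool], p_truth: float = 0.75) -> float:
    """Recover the population rate from noisy reports.

    Uses E[reported yes] = p_truth * rate + (1 - p_truth) * 0.5, solved for rate.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


def simulate(n: int = 100_000, true_rate: float = 0.30, seed: int = 42) -> float:
    """Simulate n participants (hypothetical data) and return the estimated rate."""
    rng = random.Random(seed)
    truths = [rng.random() < true_rate for _ in range(n)]
    reports = [randomized_response(t, rng) for t in truths]
    return estimate_true_rate(reports)


if __name__ == "__main__":
    print(f"Estimated rate: {simulate():.3f}")
```

No individual answer can be taken at face value, yet with enough participants the aggregate estimate lands close to the true rate. This is the kind of trade-off that can make contributing data more acceptable to privacy-conscious participants.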

The non-negotiable answer to analytics-related risks, however, has to be ethical conduct by the field’s practitioners. They are ultimately responsible for putting in place processes that protect anonymity, for striking a balance between poorly fitted and overfitted models, and also for framing the right problem to solve. In transport, for example, instead of simply designing services to satisfy current demand patterns, taking their rush-hour peaks and off-peak troughs as a given, a better answer is to attempt to change the underlying behaviour with proper incentives such as adjusted prices. A current trend in congested urban centres is for city authorities and policy makers to incentivise companies to relax or adjust employee schedules [2]. Along the same lines, it is reassuring to notice that professional certifications like the CAP by INFORMS [3] include explicit code of ethics requirements.

In conclusion, analytics is a fast-growing and attractive domain with the promise of producing great value for businesses and society. To deliver on that promise, the risks of discrimination and of privacy breaches will have to be addressed by deliberately exploiting randomness and, above all, by promoting the importance of ethical conduct among its professionals.

At Movement Strategies we work with strictly anonymised data and only disseminate aggregate data. As a diverse and buzzy team of transport planners, engineers, computer scientists and psychologists, we challenge traditional approaches to crowd movement while advising transport operators and public authorities on how to improve operations and investment in infrastructure.

[1] "Alan Turing Institute," [Online].
[2] Land Transport Authority, Singapore, "Travel Smart measures for organisations," [Online].
[3] "INFORMS CAP Ethics," [Online].