Oxford Answers is pleased to bring you an exclusive excerpt from the recent book, The Business of Big Data: How To Create Lasting Value in the Age of AI, by Martin Schmalz and Uri Bram. The Business of Big Data explains the impact of big data on business models based on real economic logic instead of wild speculation, to help you understand its immediate impact on your decisions as an investor, an entrepreneur, a manager or a regulator, and enable you to thrive in the age of AI.
It’s probably not controversial to note that human beings are incredibly good at judging each other. The moment you set eyes on someone you can tell all kinds of things about them from subtle cues you might not even notice yourself noticing. Their choice of clothes, shoes, jewellery, makeup, tattoos, or hairstyle gives you lots of information not only about their taste but also about their personality, their income, their work and more.
In the age of Big Data, however, humans have lost their edge when it comes to judginess. Machine learning really excels at finding patterns, regularities and predictive factors in very large data sets. Wherever such data sets are available, AI can be applied to make highly accurate forecasts.
Think about your reaction when you first see somebody’s email address. You’re inevitably going to make subtle inferences based on what you see, and will judge her differently if her handle is email@example.com versus firstname.lastname@example.org. To be clear, those inferences are not necessarily inappropriate: if you’re hiring for a job, the fact that someone would list jenmeistergeneral on her CV really might reveal relevant information about her as a candidate. But you’ll also, more subtly, think differently about email@example.com versus firstname.lastname@example.org, or email@example.com: you might not be able to articulate the differences, but in each case you’ll associate the user of a service with a vast trove of cumulative experience with it.
So it’s probably not a surprise that computers, guided by smart humans, can be even better than humans at judging people based on their email host. A study of consumer default rates on purchases from an e-commerce platform showed that people with Hotmail and Yahoo addresses really are different:
“Customers [whose email host was] T-online, a service that mainly sells to affluent customers at higher prices but with better service ... are significantly less likely to default (0.51% versus the unconditional average of 0.94%). Customers from shrinking platforms like Hotmail (an old Microsoft service) and Yahoo exhibit default rates of 1.45% and 1.96%, almost twice the unconditional average.”
For a machine learning algorithm, though, judging you by your email address is only the beginning. What about judging you on the way you type your email address?
“There are [few] customers who make typing mistakes while inputting their email addresses (roughly 1% of all orders), but these customers are much more likely to default (5.09% versus the unconditional mean of 0.94%)... Customers who use only lower case when typing their name and shipping address are more than twice as likely to default as those writing names and addresses with first capital letters.”
Finally, it goes without saying that the algorithms, like humans, are judging you by the device you use:
"For example, orders from mobile phones (default rate 2.14%) are three times as likely to default as orders from desktops (default rate 0.74%) and two-and-a-half times as likely to default as orders from tablets (default rate 0.91%). Orders from the Android operating systems (default rate 1.79%) are almost twice as likely to default as orders from iOS systems (1.07%) — consistent with the idea that consumers purchasing an iPhone are usually more affluent than consumers purchasing other smartphones."
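To put the quoted figures side by side, here is a minimal sketch that turns each default rate into a simple ratio against its comparison group. The rates are exactly those quoted from the study above; the variable names and the relative_risk helper are our own illustrative choices, not anything taken from the study.

```python
# Default rates (in percent) exactly as quoted from the study above.
UNCONDITIONAL_AVERAGE = 0.94

quoted_rates = {
    "T-online": 0.51,
    "Hotmail": 1.45,
    "Yahoo": 1.96,
    "email typing error": 5.09,
}


def relative_risk(rate, baseline):
    """Hypothetical helper: how many times as likely a group is to default as the baseline."""
    return rate / baseline


# Each email-related group compared with the unconditional average.
vs_average = {
    group: round(relative_risk(rate, UNCONDITIONAL_AVERAGE), 2)
    for group, rate in quoted_rates.items()
}

# Device comparisons, using the pairs of rates quoted above.
device_ratios = {
    "mobile vs desktop": round(relative_risk(2.14, 0.74), 2),
    "mobile vs tablet": round(relative_risk(2.14, 0.91), 2),
    "Android vs iOS": round(relative_risk(1.79, 1.07), 2),
}

for label, ratio in {**vs_average, **device_ratios}.items():
    print(f"{label}: {ratio}x")
```

These are raw ratios of group averages, nothing more. They do not show how the factors overlap with one another or with traits such as age and income.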
The important thing to note is that these kinds of factors can have predictive power over and above the traditional characteristics they correlate with. Sure, your email provider is partly just a proxy for your age, and companies already know your age anyway. But the power of machine learning is that it can identify weird, non-linear interactions between (for example) age and email address and everything else under the sun. Your email host has additional predictive power because it helps the system get at all kinds of personal traits that aren’t revealed by existing data. Not only are Hotmail users and Gmail users different in general, but two 50-year-old neighbours are (on average) going to be different from each other if one is on Hotmail and one is on Gmail.
While we humans judge each other haphazardly, the best computers are judging us with precision and nuance. To thrive in the age of AI, humans need to find ways to complement the machines’ superpowers with our own specifically human skills. “Judge not, lest ye be judged” can perhaps be modernised: “Judge not, because the AI is judging better.”
Source: Berg, Tobias and Burg, Valentin and Gombović, Ana and Puri, Manju, On the Rise of FinTechs – Credit Scoring Using Digital Footprints (July 15, 2019). Michael J. Brennan Irish Finance Working Paper Series Research Paper No. 18-12. Available at SSRN: https://ssrn.com/abstract=3163781 or http://dx.doi.org/10.2139/ssrn.3163781