Fintech’s dirty little secret? Lenddo, Facebook and the challenge of identity
Photo Credit: Max Pixel
The fintech sector, with its data-intensive approach to financial services, faces a looming problem. Scandals such as Cambridge Analytica have raised public awareness of abuses involving personal data from Facebook and other sources; many of these are the same data sets that the fintech sector uses. As the fintech industry grows in power and influence, it becomes essential to interrogate its use of data, to see how it has been making use of the data of both customers and non-customers, today and in the past.
Concerns are developing over the use of data by fintechs. For example, the “alternative credit scoring” field purports to further financial inclusion by using new, multiple, and ever-updating data sources to offer financial services to those who had little credit history in traditional credit files. The consequences of these systems are increasingly being critiqued: the volume of data collected, and its analysis, may reinforce existing discrimination, or push people towards unanticipated changes in behaviour. This increase in the scope of data use also applies to other risk-based calculations in the financial sphere, for example the use of social media posts to price insurance.
However, the problematic use of new data sources, particularly social media, extends beyond these decisions to more fundamental aspects of our being and our relationships with financial services: our identity. The sources of data used by some fintechs to establish whether we are who we claim to be are surprisingly broad, and can include Facebook profiles and the entire contents of people’s Gmail inboxes. Using data in this way runs a number of risks, including discrimination, and people looking to game the system by changing their behaviour. Just as these problems apply to credit scoring, so they apply to the use of this data to establish identity.
Because of this, we need to be sure that the sources of data the fintech industry is using are gathered, stored and processed legitimately. Furthermore, in a post-Cambridge Analytica world, questions also arise about whether social media data was misused in the past, and whether the analysis companies do today still draws on that data.
Privacy International has written to one of the long-standing players in the industry, LenddoEFL, who we believe have questions to answer about their current and historic use of data from sources including Facebook. LenddoEFL operates in the development space, with the goal of offering financial services to populations who previously did not have access. Financial inclusion means, as the World Bank describes it, that “individuals and businesses have access to useful and affordable financial products and services that meet their needs … delivered in a responsible and sustainable way.”
New technologies provide an opportunity to broaden access to financial products; they also pose a risk of creating new exclusions and discrimination. As Privacy International’s research in this area has shown, fintechs can be blind to these risks. That is why it is essential to interrogate the industry and its use of data. Privacy is essential not only because it’s a human right, and necessary to the fulfilment of other rights: it’s also essential for a fair, inclusive, and sustainable financial system.
For Lenddo, Privacy International’s questions are:
- To what extent have they been gathering, storing and analysing information about people who were not their customers?
- Have they been generating “shadow profiles” of others?
- Is their current business, the algorithms and techniques they use to establish people’s identities, a product of this?
We wrote to Lenddo asking them to answer these questions, but have yet to receive a response.
Who are Lenddo?
Lenddo (who form one half of LenddoEFL, following a merger) was founded in 2011, making it one of the pioneers of the still-young fintech industry. Headquartered in Singapore, Lenddo began offering loans in the Philippines in March 2011, soon spreading to Mexico and Colombia. Lenddo initially made loans itself, but its long-term strategy was to develop its algorithmic decision-making through the gathering of data.
By 2014, they were offering the “world’s first” Facebook-only loan platform. At the start of 2015 - four years and £20 million later - they began to offer a credit scoring and identity verification service to other financial institutions, and no longer offered loans themselves. Lenddo claims to have scored more than five million people and to operate in over 15 markets, including Kenya, South Africa, India, and Australia.
Their current offering is twofold, with both parts based on a broad range of data. Lenddo’s products are integrated into the ‘onboarding’ process of a lender: when an individual applies to a lender for a product, Lenddo’s scoring forms part of the application process. The LenddoScore offers credit scoring based on data from many sources, including broad categories such as “telecom data”, “browser data”, “mobile data”, and “social networks”. This data is used to, as Lenddo puts it, “make lending scalable, profitable and improve loan economics using non-traditional data”. The second product, Lenddo Verification, is a service that uses data sources, including social media, to verify the name, date of birth, and certain personal details (employer, university) of a potential borrower.
At the end of 2017, Lenddo merged with the Harvard-founded scoring company The Entrepreneurial Finance Lab (EFL). One of the tools EFL uses for credit scoring is psychometric testing – for example, analysing applicants' answers to an online quiz, as well as factors such as how long they take to answer the questions. EFL describes this as “collateraliz[ing] individuals’ personalities” to enable them to obtain loans.
Concerns surrounding psychometric profiling to offer financial products have previously been raised by Privacy International. EFL has a similar audience to Lenddo’s, and their #include1billion project indicates the scale of their joint ambition. It also illustrates how essential it is that Lenddo’s use of data is questioned.
The data Lenddo uses
The range of data that feeds into Lenddo’s decision-making process is massive. For developers at a lender, Lenddo provides an API (Application Programming Interface) that allows the lender’s app to make use of Lenddo’s products, i.e. by sending data from the user’s phone (and social media accounts) to Lenddo. The documentation of Lenddo’s API describes the data that is gathered:
- Contacts
- SMS
- Call History
- User's Location
- User's Browsing History
- User’s Installed Apps
- Calendar Events
- Phone Number, Brand and Model
Thus, the data sent back to Lenddo can include the applicant’s precise location, the content of their text messages, and their browsing history.
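To make concrete what such an integration involves, here is a hypothetical sketch of the kind of payload a lender’s app might assemble for a scoring API. All field names and values are illustrative assumptions on our part; Lenddo’s actual schema is not public.

```python
# Hypothetical payload a lender's app might send to a scoring API.
# Field names and values are illustrative only, not Lenddo's real schema.
payload = {
    "contacts": [{"name": "A. Friend", "phone": "+10000000000"}],
    "sms": [{"from": "+10000000000", "body": "See you at 6", "ts": 1528000000}],
    "call_history": [{"number": "+10000000000", "duration_s": 124}],
    "location": {"lat": 14.5995, "lon": 120.9842},  # precise coordinates
    "browsing_history": ["https://example.com/loan-comparison"],
    "installed_apps": ["com.example.bank", "com.example.chat"],
    "calendar_events": [{"title": "Payday", "date": "2018-06-15"}],
    "device": {"phone_number": "+10000000000", "brand": "ExampleCo", "model": "X1"},
}
```

Even this toy structure makes the sensitivity plain: every category in the list above maps to a field containing intimate detail about the applicant’s life.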
On top of this come the other sources of data Lenddo can tap by accessing users’ accounts with services including Facebook, LinkedIn, Yahoo, Google, Twitter and the Korean messaging app KakaoTalk. In other words, Lenddo wants access to your most personal information from other sources as well, including your Gmail emails.
All of this data gathering, from phone data to the contents of social media and email accounts, takes place with ‘user permission’. But it is hard to argue that this amounts to true “informed consent”. These products are marketed to people with little or no alternative: they can either hand over access to vast amounts of data about themselves, or be denied the service. Consider that, in a survey in Colombia and Kenya, the content of emails was considered the most private category of data. Combine that with the difficulty of understanding the full meaning of terms and conditions: research by the LSE on UK consumers found that very few of them read the terms and conditions of financial apps, and even fewer remember or understand the consequences of the app’s use of data.
Services like Lenddo pose a major challenge for informed user understanding of company policies. The user faces multiple actors to make sense of: the lender, Lenddo, and the social media platforms, each with separate permissions and privacy policies, as well as policies governing how they interact with one another. The onboarding process in fintech is touted as ‘frictionless’, i.e. producing a simple and smooth experience for the customer, to avoid potential customers dropping out of an onerous paperwork process. But this smooth and easy customer experience also serves to hide the complexity behind the scenes. Having multiple actors and data sources involved makes it far harder for the customer to understand what data is held by whom, and how it is used.
However, there is an additional dimension that we must begin to address: when someone has access to your email, they don’t just learn about you, but also about the people you communicate with. They can read the emails of the people who contact you, and potentially the messages sent to you via social media. If we seek to understand how Lenddo makes decisions, this raises the question: what data is Lenddo gathering and analysing about people who are not Lenddo customers?
The company you keep: how Lenddo judges you for credit scoring and identity verification
Lenddo’s method of decision-making was described by its founder and CEO, Jeff Stewart: “At the end of the day, I think back to my mother who told me growing up that you will be judged by the company you keep and I think that what we’ll see in the data and as society evolves that just like hundreds of years ago or just like when I grew up, who you hang out with and how you interact with them is going to be part of how you’re judged.”
Stewart referred to this as “PageRank for people”, a reference to Google’s method of determining the importance of a webpage by analysing the other pages that link to it. Thus, just as a webpage that has many other high-quality pages linking to it would appear high in Google search results, so would a person with many ‘high-quality’ friends receive a higher score from Lenddo. The company argued that this could be graphed, with bad borrowers linked together by the strength of their connection.
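The analogy can be made concrete with a toy implementation. The sketch below runs standard PageRank over a small, invented friendship graph; it is our illustration of the idea Stewart describes, not Lenddo’s actual (unpublished) algorithm, and a real system would of course use far richer features.

```python
# A minimal "PageRank for people" sketch: score each person by the
# scores of those connected to them, via power iteration.
# Illustrative only; Lenddo's real scoring method is not public.
def pagerank(graph, damping=0.85, iterations=50):
    """graph: dict mapping each person to a list of friends they link to."""
    n = len(graph)
    scores = {person: 1.0 / n for person in graph}
    for _ in range(iterations):
        new_scores = {person: (1 - damping) / n for person in graph}
        for person, friends in graph.items():
            if not friends:
                continue
            # Each person passes a share of their score to their friends
            share = damping * scores[person] / len(friends)
            for friend in friends:
                new_scores[friend] += share
        scores = new_scores
    return scores

friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["alice", "bob"],
    "dave": ["carol"],  # dave lists a friend, but nobody lists dave
}
scores = pagerank(friends)
```

In this toy graph, people with inbound connections (alice, bob, carol) end up with higher scores than dave, whom nobody lists as a friend. A deployment along the lines Lenddo describes would presumably weight edges by interaction strength rather than treating all friendships equally.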
But it wasn’t just the existence of a connection that underlay Lenddo’s analysis; it was also how your contacts behaved towards you. As Stewart put it: “What... is [also] important is the behaviour of the community and how they interact with you, so you and your friends and your friends’ friends and even your friends’ friends’ friends, are predictive”.
Lenddo also offers to verify customers’ identity. To verify a person’s identity, and the information that they provide (name, date of birth, employer, university, phone number), Lenddo makes use of social media data. Verifying a person’s identity can seem like a yes/no type of question, but in fact it’s more complicated than that.
Let us leave aside the more complex issues of identity, such as how something like a person’s name is more complicated than it may appear. Consider how Lenddo determines whether a person has told the truth about their identity. To verify a person’s birthday using social media, for example, Lenddo doesn’t just check the date of birth on the person’s profile; doing so would assume that people are honest on social media. Rather, they check when the person is wished “happy birthday”, and what relationship the person has with those wishing it. Are they wished “happy birthday” every year, for instance?
Furthermore, Lenddo has developed some sophistication in analysing what Facebook accounts are real. They claim that, while it is very easy to create a Facebook account, it is very difficult to create the long history of interactions between that account and potentially hundreds of others. For example, being wished “happy birthday” on the same day every year is also an indicator that an account is genuine.
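The birthday heuristic described above can be sketched in a few lines. This is our reconstruction of the described logic for illustration, not Lenddo’s code: instead of trusting the profile’s date-of-birth field, look for “happy birthday” messages recurring on the same date across multiple years.

```python
# Illustrative reconstruction of the described heuristic, not Lenddo's code:
# infer a birthday from recurring "happy birthday" wall posts.
from datetime import date

def inferred_birthday(posts, min_years=2):
    """posts: list of (date, text) wall posts.
    Returns a (month, day) that attracts birthday wishes in at least
    `min_years` distinct years, or None if no consistent pattern exists."""
    wishes = [d for d, text in posts if "happy birthday" in text.lower()]
    # Group the years in which each calendar date received wishes
    per_day = {}
    for d in wishes:
        per_day.setdefault((d.month, d.day), set()).add(d.year)
    candidates = {day: years for day, years in per_day.items()
                  if len(years) >= min_years}
    if not candidates:
        return None
    # Prefer the date with the most years of recurring wishes
    return max(candidates, key=lambda day: len(candidates[day]))

posts = [
    (date(2016, 6, 15), "Happy Birthday!"),
    (date(2017, 6, 15), "happy birthday, hope it's a good one"),
    (date(2018, 6, 15), "HAPPY BIRTHDAY"),
    (date(2017, 3, 2), "great seeing you yesterday"),
]
# A date wished on in several distinct years is treated as the likely birthday
```

The same recurrence signal doubles as the authenticity check described above: a freshly fabricated account has no multi-year history of wishes to exhibit.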
The question becomes: How did they learn this? How have they developed the sophistication in the algorithms that they use today?
Lenddo’s original sin?
Given Lenddo’s model of analysing an individual’s social network to assess their identity or creditworthiness, it’s no surprise that Facebook has played an important part. While the platform has become more restrictive about what information third-party apps can access, it is important to ask: what was Lenddo gathering in the past, in particular from the profiles of people who were not customers of Lenddo but were Facebook friends of customers? What data was Lenddo gathering about these people; what was stored and analysed; was this data ever deleted; and what role does it play in Lenddo’s algorithmic decision-making today? There is reason for concern about what Lenddo was gathering.
As was made prominent in the Cambridge Analytica case, prior to April 2014 Facebook apps could gain access to vast amounts of information from the profiles of friends of people who signed up to those apps. This is why the thisisyourdigitallife app, with its 270,000 downloads, enabled Cambridge Analytica to access 87 million Facebook profiles. Of course, this case seems different from Lenddo’s, insofar as there is no suggestion of fraud on the part of Lenddo, or that Lenddo ever used someone’s data for a purpose other than the one they signed up for. Indeed, Lenddo has always made it clear that it does not share Facebook data with the lending institution. Yet this is hardly the central issue: after all, the friends of a user of the app could have had their profiles accessed and processed without their permission.
There are, however, strong indications that Lenddo was collecting not only the names of people’s Facebook friends, but also further information about their accounts. In 2015, Lenddo claimed that it was storing and analysing “about 113 million relationships between more than 120 million profiles”. It seems very likely that this 120 million figure includes non-customers: as of 2017, Lenddo had processed only around 5 million credit applications, well over 100 million fewer than the number of profiles it claimed to hold.
There is further evidence of the extent of Lenddo’s storage needs. In 2014, Naveen Agnihotri, Lenddo’s Chief Technology Officer, gave a technical account of how the company’s internal databases were set up. As Agnihotri described it, user data and social data were formerly in the same database, which worked fine for tens of thousands of members. As they grew and got more members, “the social data just grows a lot faster exponentially because especially in our markets and in the countries where we operate, more than half of our members have a thousand-plus friends on Facebook. Right, so the amount of data is just huge, and far dwarfs the classic data, the member data that you would store.” If the data held in the social database was indeed increasing exponentially, this suggests that Lenddo was storing data gathered from people’s friends’ profiles, not just the profiles of Lenddo customers.
In this context, it is important to ask: what data was collected and stored for people who were not customers of the company? What analysis was done on these profiles? What happened to that data? Is it still being held today?
Conclusion
The mentality of the fintech industry seems to be to collect as much data as possible about people, without problematizing the nature of the data they’re holding, and how it is used. There is a challenge here that is essential for fintechs to address.
With the fintech sector growing in importance, and the techniques developed by fintech firms being adopted by mainstream financial institutions, it becomes increasingly important that the sector addresses its use of data. A Cambridge Analytica-style scandal in this area would not only damage particular players in the sector, but also potentially damage trust in the fintech sector as a whole and the financial sector more generally. The worst-affected would be those in the most vulnerable financial situations that the fintechs often claim to help.
Of course, credit scoring must remain a key concern, from both the perspective of the huge amounts of data gathered as well as the associated harms: the risks of discrimination, changes in customer behaviour, and more. Yet the use of these data sources extends beyond credit scoring to other aspects of the fintech industry as well: particularly identity.
The regulatory challenge of fintech, with the convergence of financial and data protection regulation, must be met in a way that places the right to privacy at its heart. This is the only way that we can begin to create the future that fintech promises: one in which we have an open, inclusive financial system that is a tool for empowerment rather than exploitation.