Why EU regulators are pushing for more explainable AI

Written by Christian Mangold, co-CEO, Fraugster
19th May 2021

EU regulators are seeking to establish stricter rules around the use of artificial intelligence (AI) in areas like crime prediction, credit scoring, employee performance management and border control systems. In particular, the bloc is seeking to mitigate undesirable outcomes and risks arising from AI-generated decisions.

Draft legislation recently published proposes that AI systems will need to meet specific “transparency obligations” that allow human beings reviewing a decision made by an AI to establish how that decision was reached and what data points were used in the process. This is what AI developers call “explainability” developed along “white box” principles. (These are the opposite of so-called AI black boxes where decision processes are hard to understand, to the point of being opaque.)

So, what does this mean for online retailers and payment providers using AI models? And what are the implications for technology vendors developing AI systems that approve or decline customers transacting online?

The short answer is threefold:

Expect more scrutiny. Regulators will start looking more closely, but so will your employees and customers.
Expect higher standards of transparency.
Expect the end-users of your product to be more clued up and to demand transparency and explainability to be baked into your AI solutions.

All three of these, in our opinion, are positive developments that create more trust and a better understanding of a much-misunderstood technology. For companies who have not designed explainability into their systems, this will require additional investment to establish white box standards, and in many cases, a lot of reverse engineering.

Greater transparency creates trust

For those already operating on white box principles, it is an opportunity to communicate with partners and customers on how important decisions – like whether to approve or decline a transaction – are made and what factors are considered in this process. Although e-commerce and payments are unlikely to be designated as high-risk services (in the same bracket as crime prediction algorithms), they play a more important role than we may think.

Everyone should have the right to buy what they want, when they want and how they want – and any technology preventing this, based on opaque reasons, should be scrutinised.

And both analysts and customers need to know why an AI has declined or approved a transaction and what signals influence the AI score.

How an AI risk score works

At the moment of a transaction attempt, an advanced AI engine looks at more than 2,500 indicators and behavioural identifiers to generate a score, and it is critical to know what these are and to be able to trace them back. The score narrative (the text that appears below the score itself) has explainability hardwired into it.

From this AI Risk Score, it is now possible to establish the factors that make a transaction more likely to be fraudulent. In this example, the score can be traced back to mismatches between the email address, shipping address and names listed on the payment instrument. In addition, there is a location mismatch suggesting the transaction is coming from an IP address in a different location from where the card is registered.
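To make the idea of a traceable score concrete, here is a minimal sketch of an additive risk score in which every signal's contribution can be read back out. It is an illustration only: the signal names, the weights, the baseline and the logistic form are all assumptions made for this example, not a description of how any production fraud engine works.

```python
import math

# Hypothetical weights (log-odds contributions); not taken from any real system.
WEIGHTS = {
    "email_name_mismatch": 1.2,
    "shipping_name_mismatch": 0.9,
    "ip_country_mismatch": 1.5,
}
BIAS = -3.0  # hypothetical baseline log-odds of fraud when no signal fires


def score_transaction(signals):
    """Return (fraud probability, per-signal contributions sorted by impact)."""
    contributions = {
        name: WEIGHTS[name] * float(signals.get(name, False))
        for name in WEIGHTS
    }
    log_odds = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return probability, ranked


def narrative(ranked):
    """Turn the ranked contributions into a short human-readable explanation."""
    fired = [name.replace("_", " ") for name, value in ranked if value > 0]
    if not fired:
        return "No risk signals fired"
    return "Risk raised by: " + ", ".join(fired)


probability, ranked = score_transaction({
    "email_name_mismatch": True,
    "shipping_name_mismatch": True,
    "ip_country_mismatch": True,
})
print(round(probability, 3), "|", narrative(ranked))
```

Because the score is a sum of named contributions, an analyst can see which signals pushed a transaction towards a decline and by how much. Real models are far more complex, but the principle of keeping each decision traceable to its inputs is the same.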

More and more merchants are asking for greater explainability and transparency. Here are some common questions that require reliable answers:

Why has a potentially high-value customer been declined? False positives represent the biggest source of value loss in e-commerce; being able to investigate individual transactions where an AI score has made the decision and find answers is key to boosting revenues.
What additional data points can make AI decisioning more accurate? We are seeing that device data, AML sanction lists and IP address data are playing an increasingly important role in data enrichment and decision accuracy – but this can only be established if machine learning and AI models are explainable, and we can measure the uplift in performance these additional data points deliver.
Should I make all payment methods available to a buyer at the point of checkout? Different payment methods carry different risk profiles. Understanding how risk profiling and AI scores are calculated is vital to being able to determine what payment methods to offer.
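The second question, measuring the uplift that additional data points deliver, can be sketched as a comparison of two sets of scores on the same labelled transactions. Everything below is hypothetical: the labels and scores are made-up illustrative values, and AUC (the chance that a fraudulent transaction scores above a legitimate one) is just one of several metrics a team might choose.

```python
def auc(scores, labels):
    """Probability that a random fraud case scores above a random legitimate one."""
    fraud = [s for s, y in zip(scores, labels) if y == 1]
    legit = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum((f > g) + 0.5 * (f == g) for f in fraud for g in legit)
    return wins / (len(fraud) * len(legit))


# Hypothetical labelled transactions: 1 = confirmed fraud, 0 = legitimate.
labels = [1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical scores from a model without and with enrichment data
# (for example device or IP address signals).
baseline_scores = [0.70, 0.40, 0.55, 0.60, 0.30, 0.45, 0.20, 0.10]
enriched_scores = [0.85, 0.65, 0.70, 0.50, 0.25, 0.35, 0.15, 0.10]

baseline_auc = auc(baseline_scores, labels)
enriched_auc = auc(enriched_scores, labels)
uplift = enriched_auc - baseline_auc

print(f"baseline={baseline_auc:.2f} enriched={enriched_auc:.2f} uplift={uplift:.2f}")
```

The comparison only means something if both models are evaluated on the same transactions and the contribution of the new data can be isolated, which is exactly where explainable models help.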

It was only a matter of time

This development should not come as a surprise. The moral hazards and adverse consequences of AI-based decisions have been known for years, and recent examples have put pressure on public institutions to respond. Here are just a few:

Goldman Sachs and Apple were found to be extending up to 20 times more credit to men than women after it was discovered “latent variables” were inadvertently discriminating based on gender.

Source: Twitter

In the UK, an exam prediction algorithm led to 40% of students receiving exam scores downgraded from their teachers’ predictions. Thankfully, the algorithm was explainable, which allowed developers to establish that it was trained on the past entrance-exam performance of the student’s school, thereby downgrading good students from historically poor-performing schools. A Fail for the Department for Education, which eventually scrapped the algorithm and faced an embarrassing climbdown. An A+ for explainability.

In the USA, facial recognition accuracy was proven to be lower amongst people of colour. Although these algorithms boast high classification accuracy on average (over 90%), the poorest accuracy was consistently found in subjects who are female, Black, and 18-30 years old. Why? Because predominantly white male developers had not given enough consideration to testing cohorts with darker skin tones or factored in the underrepresentation of these cohorts in training data sets.

Source: Gender Shades Project (2018)

PredPol, a predictive policing algorithm used in a number of American states, was found to be fatally flawed. Its model, borrowed from earthquake prediction, created runaway feedback loops that led police officers to be repeatedly dispatched to the same neighbourhoods – typically ones with a high number of racial minorities – regardless of the true crime rate in that area. An extensive study by Cornell University was able to uncover its flaws.

Fit for the future

As a European company whose core values are integrity, trust, excellence, and care (for our people, customers and communities), we fully support the spirit of this draft legislation. Ensuring that AI applications remain trustworthy, even after they have been placed on the market, is a hallmark of responsible technology adoption. So is ongoing quality and risk management by providers, with the necessary framework in place to explain how important decisions are being made.

One thing is clear: now is not the time to rest on our laurels. We all need to ensure that we are fit for a future where more AI explainability will be demanded not only by our customers but also by our governments and regulators.

About the author

Christian Mangold is the co-CEO of Fraugster, a fraud prevention and payment security company based in Berlin.

He is a lawyer by background and has spent the last 20 years in payments and cybersecurity as managing director of Sofort and later Klarna (which acquired Sofort) in DACH.

In his spare time he is an outdoorsman who likes to ski and sail with his wife and three children.