Why Data is to be Blamed for AI Bias

AI bias occurs when a computer algorithm makes prejudiced decisions based on its data or programming rules. Because the data is collected by humans, it is often subjective and unrepresentative.

AI helps us find insurance policies, work out whether we are creditworthy, and even diagnose our illnesses. In many ways, these systems have improved our lives, saving us time on repetitive data entry and making accurate recommendations. In the last few years, however, organizations from Apple to Google have come under scrutiny for apparent bias in their AI systems. Since most AI technology is designed to take the human guesswork out of decision-making, this bias is an ironic turn, and one that can have sinister results. As more and more organizations begin to implement AI, it is going to be crucial for tomorrow’s managers to recognize and stamp out unfair biases in these systems.

“It is becoming more and more important to make certain that AI applications are unbiased, because AI is taking over and making decisions about people’s lives.”
– Dr. Anton Korinek, Associate Professor of Business Administration at Darden School of Business

AI Bias: It’s in the Data, Not the Algorithm

According to Gartner Research, approximately 85% of all AI projects will deliver erroneous outputs because the data is entered and maintained by humans.

And if you take a deep dive into the concept of bias in machine learning, you will realize that the answer to your question is the data.

There are four types of bias:

  • Sample Bias/Selection Bias:

Sample bias occurs when the distribution of the training data fails to reflect the actual environment in which the machine learning model will run. If the training data covers only a small set of cases, the model is likely to get things wrong when it is tested on something outside that set. It is biased by the sample it was given. So the algorithm is not wrong; the data fed to the system is wrong, or, put another way, there was not enough of it to cover the space in which the model is going to be applied.

One of the biggest causes of poor performance in machine learning algorithms is failing to get the data right.
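One simple way to look for sample bias is to compare how often each category appears in the training data with how often it appears where the model will actually be used. The sketch below is only an illustration: the function names, the day/night photo data and the 10% tolerance are all assumptions made up for this example, not part of any particular tool.

```python
from collections import Counter


def category_shares(values):
    """Return the fraction of items that fall into each category."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def coverage_gaps(training_values, deployment_values, tolerance=0.10):
    """Return categories whose training share differs from the deployment
    share by more than `tolerance` (an arbitrary illustrative threshold)."""
    train = category_shares(training_values)
    deploy = category_shares(deployment_values)
    gaps = {}
    for category in sorted(set(train) | set(deploy)):
        diff = train.get(category, 0.0) - deploy.get(category, 0.0)
        if abs(diff) > tolerance:
            gaps[category] = round(diff, 2)
    return gaps


# Hypothetical data: the training set is mostly daytime photos,
# but the deployed system sees day and night equally often.
training = ["day"] * 90 + ["night"] * 10
deployment = ["day"] * 50 + ["night"] * 50

gaps = coverage_gaps(training, deployment)
print(gaps)  # positive = over-represented in training, negative = under-represented
```

A large negative gap for a category suggests the model has seen too few examples of it, which is exactly the situation in which it is likely to fail once deployed.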

  • Prejudices and Stereotypes:

Even if the data has been selected correctly, errors can still occur because of prejudices and stereotypes that emerge while the model learns to differentiate between examples. For instance, if the data contains more women as homemakers and more men as computer programmers, and the model takes that distribution into account, you will end up with biased results. Likewise, the indiscriminate acceptance of any speech as equally appropriate can lead to the propagation of hate speech. This is one of the more insidious types of bias, and it is hard to track.

  • Systematic Value Distortion:

This kind of bias happens when a device returns measurements or observations that are inaccurate, like a camera whose color filter is off. The measuring device causes the data to be systematically skewed in a way that doesn’t represent reality, and the outcome is biased.
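To make the idea of a systematic skew concrete, here is a minimal sketch that compares a device’s readings against trusted reference values. The temperature numbers are invented for illustration, and it assumes that reference measurements are available at all, which is not always the case in practice.

```python
from statistics import mean

# Hypothetical reference temperatures and the readings a
# miscalibrated sensor returned for the same conditions.
reference = [20.0, 22.0, 25.0, 30.0]
sensor = [21.5, 23.5, 26.5, 31.5]

# Error of each reading relative to the trusted reference.
errors = [s - r for s, r in zip(sensor, reference)]

# A non-zero average error means the readings are shifted in one direction.
offset = mean(errors)

# A small spread means the shift is consistent rather than random noise.
spread = max(errors) - min(errors)

# Subtracting the estimated offset removes the systematic part of the error.
corrected = [s - offset for s in sensor]

print(offset, spread, corrected)
```

Random noise tends to average out over many samples, but a consistent offset like this one ends up in every training example and is passed straight on to the model.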

  • Model Insensitivity:

Model insensitivity arises from the data and the algorithm itself. It is the result of the way an algorithm is trained on a set of data, and it can occur even when that set is unbiased. Increasingly, this bias is noticed by people who use models, or who are interested in how models perform, without being in charge of creating them.

There are many examples of this in our day-to-day lives, whether we are aware of them or not. A few prominent cases are worth pointing out:

  • In 2019, tech giant Apple was accused of sexism when the company’s new credit card seemed to offer men more credit than women, even though women had better credit scores.
  • In 2016, it came to public knowledge that a computer program used to calculate the likelihood of prisoners reoffending was unfairly biased against African-American defendants. The program predicted that black people would reoffend at twice the rate of white people, despite evidence to the contrary.
  • In 2015, another case marred AI’s history: Google’s photo app labeled black people as ‘gorillas’. According to reports, Google has not resolved the issue; it has simply blocked its image recognition algorithms from identifying gorillas altogether, preferring, presumably, to limit the service rather than risk another miscategorization.
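Cases like the reoffending example can be surfaced by comparing error rates across groups. The sketch below computes a false positive rate per group, meaning the share of people who did not reoffend but were still flagged as high risk. The records and the group labels "A" and "B" are entirely made up, and this is just one of several possible fairness checks, not the method used in any of the cases above.

```python
from collections import defaultdict


def false_positive_rates(records):
    """Compute the false positive rate for each group.

    `records` is an iterable of tuples:
    (group, predicted_high_risk, actually_reoffended).
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted, actual in records:
        if not actual:  # only people who did not reoffend count here
            negatives[group] += 1
            if predicted:
                false_positives[group] += 1
    return {g: false_positives[g] / negatives[g] for g in negatives}


# Hypothetical predictions for two groups of 15 people each.
records = (
    [("A", True, False)] * 4 + [("A", False, False)] * 6 + [("A", True, True)] * 5
    + [("B", True, False)] * 2 + [("B", False, False)] * 8 + [("B", True, True)] * 5
)

rates = false_positive_rates(records)
print(rates)
```

In this invented data, group A is wrongly flagged twice as often as group B even though the same number of people in each group actually reoffended, which is the kind of gap that signals a biased system.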

How to Tackle AI Bias?

One of the best potential solutions for keeping AI bias in check is to use interdisciplinary teams. Psychologists, social scientists and political scientists receive a lot of training in producing unbiased outcomes. In a similar way, AI-based software and programs need to be trained and then checked, after certain exercises, to make sure the process is working toward getting the bias out of the data.

According to data scientists and experts in the field, biased algorithms are created unintentionally. One reason is that the bias is not easily spotted, and for the most part, high-profile algorithms are designed by teams of white men. According to a 2019 study by New York University, 80% of AI professors are men, while AI giants Google, Microsoft and Facebook all employ a workforce that is over 90% white.

To minimize or adjust for possible bias in an AI system’s predictions, special care needs to be taken to supply data that reflects the diversity of the phenomena under study. As AI technology grows rapidly and organizations adopt it at great speed, it is going to be crucial that developers and implementers make a conscious effort to compensate for biased data. It is advisable to consult a diverse group of people while developing the algorithms and to have the algorithms tested by them.
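One common way to compensate for unbalanced data, when collecting more of it is not an option, is to give examples from under-represented groups more weight during training. The sketch below shows one such heuristic, with weights inversely proportional to group frequency. The function name and the group labels are assumptions for illustration, and reweighting is only one of several possible techniques.

```python
from collections import Counter


def balancing_weights(groups):
    """Return one weight per example so that every group contributes
    the same total weight, however many examples it has."""
    counts = Counter(groups)
    n_groups = len(counts)
    total = len(groups)
    return [total / (n_groups * counts[g]) for g in groups]


# Hypothetical training set in which one group is heavily under-represented.
groups = ["men"] * 8 + ["women"] * 2
weights = balancing_weights(groups)

# Total weight carried by each group after reweighting.
weight_per_group = {
    g: sum(w for w, label in zip(weights, groups) if label == g)
    for g in set(groups)
}
print(weight_per_group)
```

Many training libraries accept per-example weights of this kind. Reweighting does not add any new information, though, so it is a complement to gathering more representative data rather than a replacement for it.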
