What Is ‘Equity As Code,’ And How Can It Eliminate AI Bias?

This article was originally published in Forbes.

Engineers unleashed artificial intelligence (AI) bias, and it will be engineers who design the solutions that eliminate it. Authors of an article published by McKinsey Global Institute assert that “more human vigilance is needed to critically analyze the unfair biases that can become baked in and scaled by AI systems.” That’s an important start. The industry can also adopt a proactive, process-oriented approach to addressing AI bias. We have the tools to create data analytics workflows that address AI bias. When our work processes for creating and monitoring analytics contain built-in controls against bias, data analytics organizations will no longer be dependent on individual social awareness or heroism.

What Is AI Bias?

Machine learning (ML) models are computer programs that draw inferences from data, usually lots of data. ML is part of artificial intelligence (AI), which is a broader term used to describe computers and software that perform tasks in a way that we humans consider “intelligent.” ML and AI are being applied to tasks in nearly every industry to help companies more effectively and efficiently execute tasks and achieve goals. You’ve probably encountered AI innumerable times in the course of being an average consumer.

ML models are being used to aid disease diagnosis (for example, by detecting cancer cells), tag photographs in social media, understand speech, identify credit card fraud, increase customer engagement with movie/TV recommendations on video streaming services and much more. The global AI market is projected to grow at a compound annual growth rate (CAGR) of 33% through 2027, drawing upon strength in cloud-computing applications and the rise in connected smart devices.

One way to think of ML models is that they instantiate an algorithm (a decision-making procedure often involving math) in software and then, at relatively low cost, deploy it on a large scale. The problem is that algorithms can absorb and perpetuate racial, gender, ethnic and other social inequalities. There are, unfortunately, many examples. Here’s a well-known one that was caught early:

Amazon developers disclosed that an AI model, designed to screen job candidates, favored men over women. The algorithm had been trained using a database of the company’s engineering hires over a 10-year period. Since the training data contained a majority of male developers, the AI model taught itself that men were preferable and downgraded references such as “women’s team captain” or mentions of an all-female educational institution in a resume. If Amazon had not recognized the problem, the AI algorithm might have been deployed on a large scale, further perpetuating existing gender biases.
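An imbalance like the one described above is the kind of thing a simple check on training data could surface before a model is ever built. The following is a minimal sketch, not a description of anything Amazon ran: the function names, the 30% minimum share, and the sample data are all assumptions made for illustration.

```python
from collections import Counter

def group_shares(groups):
    """Return each group's share of the training rows."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def underrepresented(groups, min_share=0.3):
    """List groups whose share falls below min_share (an assumed cutoff)."""
    shares = group_shares(groups)
    return sorted(g for g, share in shares.items() if share < min_share)

# Made-up training set: one group label per historical hire.
training_groups = ["m"] * 8 + ["f"] * 2
flagged = underrepresented(training_groups)
print(flagged)
```

A check like this only sees groups that appear in the data at least once; a group that is missing entirely would need to be compared against an expected list of groups supplied by stakeholders.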

Addressing AI Bias With DataOps

Many in the data industry recognize the serious impact of AI bias and seek to take active steps to mitigate it. As the industry’s understanding of AI bias matures, model developers are getting better at defining and measuring bias. Data teams should formulate equity metrics in partnership with stakeholders. Once targets are defined, data professionals can iterate on eliminating bias from machine learning models. Armed with a comprehensive set of metrics and target goals, data scientists can address AI bias like other performance requirements.
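To make the idea of an equity metric concrete, here is one possible formulation, sketched under assumptions: the model makes a binary decision, there is a single group attribute, and the metric is the gap in selection rates between groups (often called demographic parity difference). The function names and the sample data are hypothetical, and the right metric for a given system is something to agree on with stakeholders.

```python
from collections import defaultdict

def selection_rates(decisions, groups):
    """Return the share of positive decisions (1) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {group: positives[group] / totals[group] for group in totals}

def demographic_parity_difference(decisions, groups):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())

# Made-up screening decisions: 1 = advanced, 0 = rejected.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["m", "m", "m", "m", "f", "f", "f", "f"]
gap = demographic_parity_difference(decisions, groups)
print(f"selection-rate gap: {gap:.2f}")
```

Once a metric like this has a number attached, a target (for example, a maximum acceptable gap) can be tracked the same way as accuracy or latency.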

The data industry can begin the process of mitigating bias by viewing AI systems from a manufacturing process perspective. Machine learning systems receive data (raw materials), process data (work in progress), make decisions or predictions, and output analytics (finished goods). We call this process flow the “data factory,” and like other manufacturing processes, it should be subject to quality controls. The data industry needs to treat AI bias as a quality problem.

If you walk into any modern manufacturing facility, you will see automation and quality controls at every step. When you buy a car, you can be sure that the factory has tested every component and subsystem. Additionally, the vehicle contains built-in computers that diagnose issues and control dashboard warning alerts. The car is tested before it is sold and then monitored while in operation.

AI systems should be subject to this same level of process control. The data industry employs a new term, “DataOps,” when describing the application of manufacturing quality methods like lean manufacturing and Six Sigma to data and analytics. Let’s discuss how DataOps can address AI bias.

Equity As Code

In a traditional software development lifecycle process, new code undergoes DevOps automated testing before deployment. Tests defined in the continuous integration and deployment pipeline check if the code is ready for production.

AI models can be tested for AI bias as part of their pre-deployment testing. In the example above, Amazon developed a test showing that its model favored male resumes. The Amazon example also specifically illustrates the value of testing training data for bias before model development. An AI model and training data should undergo a battery of equity tests and measurements at every lifecycle stage. Anti-bias controls and metrics can be instantiated in tests applied to AI model performance to determine whether the AI model is adhering to equity requirements. A quality test suite may enforce “equity,” like any other performance metric.
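As an illustration of what an equity test in a pre-deployment suite might look like, the sketch below expresses a gate as an ordinary test function that a CI pipeline could run. Everything here is an assumption for the sake of the example: the helper names, the made-up predictions, and the 0.8 minimum ratio, which is borrowed from the "four-fifths" rule of thumb used in employment selection and is not a universal standard.

```python
def selection_rate(decisions, groups, group):
    """Share of positive decisions (1) among rows belonging to `group`."""
    picked = [d for d, g in zip(decisions, groups) if g == group]
    if not picked:
        raise ValueError(f"no rows for group {group!r}")
    return sum(picked) / len(picked)

def disparate_impact_ratio(decisions, groups, protected, reference):
    """Selection rate of the protected group divided by the reference group's."""
    reference_rate = selection_rate(decisions, groups, reference)
    if reference_rate == 0:
        raise ValueError("reference group has no positive decisions")
    return selection_rate(decisions, groups, protected) / reference_rate

MIN_RATIO = 0.8  # assumed target; agree on the real value with stakeholders

def test_model_meets_equity_target():
    # Hypothetical model predictions on a held-out evaluation set.
    decisions = [1, 1, 0, 1, 1, 0, 1, 1]
    groups = ["m", "m", "m", "m", "f", "f", "f", "f"]
    ratio = disparate_impact_ratio(decisions, groups, protected="f", reference="m")
    assert ratio >= MIN_RATIO, f"equity gate failed: ratio={ratio:.2f}"

test_model_meets_equity_target()
```

Written this way, the equity requirement fails the build just like a broken unit test would, so a model that misses the target does not reach production by default.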

Machine learning systems differ from traditional software applications in that ML systems depend on data, and data changes continuously. As data flows, a deployed model may drift out of the target range of accuracy. A deployed model must be continuously monitored for bias and other quality issues while in operation. Each time a model is updated, it must undergo testing before being deployed. Continuous testing, monitoring and observability prevent biased models from being deployed or continuing to operate. We call this new approach to mitigating AI bias “equity as code” because the tests that enforce equity are built into automated software applications that test, deploy and monitor the model 24/7.
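Continuous monitoring of a deployed model could be sketched along the following lines. This is an illustrative example only: the `EquityMonitor` class, its rolling window, the 0.2 maximum gap, and the minimum sample size per group are all hypothetical choices, and a production system would feed real decisions into it and route alerts to whatever observability tooling is in use.

```python
from collections import deque

class EquityMonitor:
    """Tracks recent decisions and flags a widening selection-rate gap."""

    def __init__(self, window=100, max_gap=0.2, min_per_group=5):
        self.recent = deque(maxlen=window)  # oldest decisions fall off
        self.max_gap = max_gap
        self.min_per_group = min_per_group

    def record(self, decision, group):
        """Store one live decision (1 = positive, 0 = negative)."""
        self.recent.append((decision, group))

    def current_gap(self):
        """Gap between highest and lowest group rates, or None if too little data."""
        totals, positives = {}, {}
        for decision, group in self.recent:
            totals[group] = totals.get(group, 0) + 1
            positives[group] = positives.get(group, 0) + decision
        rates = [positives[g] / n for g, n in totals.items()
                 if n >= self.min_per_group]
        if len(rates) < 2:
            return None
        return max(rates) - min(rates)

    def alert(self):
        """True when the gap in the current window exceeds max_gap."""
        gap = self.current_gap()
        return gap is not None and gap > self.max_gap

monitor = EquityMonitor(window=20, max_gap=0.2, min_per_group=5)
for _ in range(5):            # balanced traffic
    monitor.record(1, "m")
    monitor.record(1, "f")
balanced_alert = monitor.alert()
for _ in range(5):            # decisions start to diverge
    monitor.record(1, "m")
    monitor.record(0, "f")
drifted_alert = monitor.alert()
```

The rolling window means the monitor reacts to recent behavior rather than the model's lifetime average, which is the point of watching for drift after deployment.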

DataOps “equity as code” provides the approach and methodological tools to impose equity controls on AI algorithms. A program of automated testing and continuous monitoring can help avoid deploying AI systems that instantiate and perpetuate inequities at scale.

About the Author

Chris Bergh
Chris is the CEO and Head Chef at DataKitchen. He is a leader of the DataOps movement and is the co-author of the DataOps Cookbook and the DataOps Manifesto.
