How to prevent AI bias in university systems: Best practices

Here are a few best practices that have emerged for preventing AI bias in these early days of the technology.
Reinhard Landes
Reinhard Landes is the global solution director for higher education and research at SAP.

Say I hop into a large language model (LLM) to do cursory research for a University Business piece and the query, “What can universities do to eliminate bias in AI systems they use?” returns results that are irrelevant, off the mark or inherently biased.

No harm done, probably.

But what if a university uses a customized LLM-based AI to filter thousands of student applications and even a small percentage of the AI’s assessments are irrelevant, off the mark or inherently biased? It can cost an institution some excellent students, and it can cost some excellent students an educational opportunity.

My LLM search on eliminating AI bias wasn’t harmful because a human being—I, the writer—vetted the results, recognizing nuggets of potential value to follow up on while disregarding the irrelevant (actually, the ChatGPT results were pretty good). That same sort of human vetting must happen with any university AI system. But who are the humans? And when and where does the AI vetting happen?


These are ultimately questions of governance, and governing AI behaviors is new for everyone. Here are a few best practices that have emerged in these early days.

AI policies and leadership committees

First, establish an institutional artificial intelligence policy. It serves as a constitution for the ethical application of AI of all sorts. Having a document with guiding principles is critical in setting expectations and providing a foundation for the organizations and activities associated with AI development, implementation, and operation.

The key tenets in a global AI policy include:

  • A mandate for human oversight throughout both development and operations of AI systems
  • Transparency that AI is in play and explainability of AI-derived results
  • The avoidance of AI involvement in surveillance, deception, and environmental damage
  • Keeping bias and discrimination out of AI systems

On the governance end, establish an AI ethics office. It should convene an AI advisory board for big-picture direction on matters concerning AI. In addition, it should be home to an AI ethics steering committee of leaders to consult regularly on AI matters escalated from across the institution.

Red-line versus high-risk cases

So, what sorts of matters get escalated? Your institution’s AI ethics policy helps determine that. What we call “red-line” cases shouldn’t need escalation—they should be well understood as nonstarters throughout the organization. These include AI involving human surveillance, discrimination, the deanonymization of already-anonymized data, deception and manipulation, the undermining of human debate, and environmental harm.

In contrast, “high-risk” cases can rise to the attention of the AI ethics steering committee. These include cases where AI may drive automated decision-making or process personal data. Other high-risk cases involve possible intrusion on fundamental rights or freedoms, or damage to individuals’ social wellbeing. The building of AI into high-risk IT applications also fits this rubric, including, among others, applications involved in the management and operation of critical infrastructure, employment and HR, and health care.

Your AI ethics policy should cover human validation and interventions that should take place throughout the AI development process, from ideation through operations. Universities also must be able to vet the ethics of vendor-built AI systems.

Questions to ask to identify AI bias

Some key questions to ask:

  • Was the data used to train the AI inclusive, representing a diverse cross-section of the population or of past situations, and as free as possible of historical or socially constructed biases, inaccuracies, and errors—and can the vendor demonstrate as much?
  • Were prospective users involved in the development process?
  • What technical or organizational steps did the vendor take to prevent prejudice, discrimination, or marginalization of groups or individuals by, for example, reducing bias in training data?

The focus on data is not accidental. The black-box algorithms of LLM developers that enable generative AI get a lot of attention, and these LLMs are the rocket engines of today’s AI boom. But data is the rocket fuel, and the “garbage-in, garbage-out” truism still applies regardless of how sophisticated the subsequent digital manipulation and refinement may be. Said differently, unbiased AI depends overwhelmingly on unbiased data sets, and, for the foreseeable future, human vetting will be indispensable to ensuring that data sets are unbiased.
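To make the data point concrete: one of the simplest checks a review team or vendor can run is comparing each group’s share of the training set against a reference population. The sketch below is illustrative only, not any institution’s actual tooling; the record structure, the `region` field, and the reference shares are hypothetical assumptions.

```python
from collections import Counter

def representation_gaps(records, group_field, reference_shares, tolerance=0.05):
    """Flag groups whose share of the training data deviates from a
    reference population by more than the tolerance.

    records          -- list of dicts, one per training example
    group_field      -- the key holding the group label (hypothetical)
    reference_shares -- expected share per group, summing to ~1.0
    """
    counts = Counter(r[group_field] for r in records)
    total = sum(counts.values())
    flags = {}
    for group, expected in reference_shares.items():
        actual = counts.get(group, 0) / total
        if abs(actual - expected) > tolerance:
            # Positive gap = over-represented; negative = under-represented.
            flags[group] = round(actual - expected, 3)
    return flags

# Hypothetical example: a training set skewed toward urban applicants
# when the real applicant pool is an even split.
records = [{"region": "urban"}] * 70 + [{"region": "rural"}] * 30
print(representation_gaps(records, "region", {"urban": 0.5, "rural": 0.5}))
# → {'urban': 0.2, 'rural': -0.2}
```

A check like this only surfaces crude representation gaps, which is exactly why the article stresses that human reviewers, not the metric alone, must judge whether the data is fit for use.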

For human vetting to be done as extensively and systematically as it will need to be, given the broad application of AI in higher education, universities must establish deliberate governance based on clearly delineated ethical principles. The provenance and quality of training data should be the primary focus.

ChatGPT didn’t write that, but it would surely agree.
