Data ethics, an introduction
In a world where technology is ever more tightly intertwined with human life, legislation is more important than ever. It allows us to harness the full potential of technology while safeguarding our core values and standards. However, to properly arm ourselves against the risks of new innovations, we must understand what they are capable of.
The main drawback of artificial intelligence (AI) is that most algorithms and methods operate as a so-called "black box". This means that while we can study the incoming and outgoing flows of information, we cannot study the processes and mechanisms that tie the two together (e.g. we know the outcome of an AI decision, but not how that decision came about). On top of that, continued innovation is expected to make computers exponentially faster in the coming years, which allows those internal processes to grow even more complex and opaque.
Nevertheless, AI is becoming an integral part of our society, so regulation logically follows. However, this regulation is forced into a reactive role: partly because of this increasing speed, we cannot possibly foresee the future potential and risks of AI. As a result, regulation always arrives too late, leaving room for abuse; technology companies can exploit AI without breaking the law. Fortunately, data ethics can provide us with timeless tools in this technological anarchy.
WHAT IS DATA ETHICS?
Data ethics focuses on studying and evaluating moral dilemmas related to data and AI. Consider, for example:
- Does my AI represent the interests of all parties, including those outside my organization?
- Are there biases in the data my AI uses to make predictions?
- Can I explain the process and outcomes to the user in an understandable way?
- Do users know they are being judged by AI, rather than a person?
Each situation is unique, so the list above could be extended indefinitely. Each context requires its own consideration and comes with a good deal of responsibility. These considerations are time-consuming and difficult, but nonetheless essential for protecting everyone's values and standards. This does not mean that in-depth knowledge of ethical theories is required to start the discussion; with simple, practical tools, anyone can facilitate this debate.
#1 Document all choices made, including motivation
Discussion about designing, building and deploying data and AI is crucial. However, this discussion is only possible when all the information is available. It is inevitable that your AI will make mistakes, but this only becomes a real problem if you cannot correct them. By documenting every choice that is made, you can more easily trace where an error came from and then fix it. In addition, this documentation gives users of the AI the means to explain its results.
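To make this tip concrete, here is a minimal sketch of what a decision log could look like in Python. The names `Decision` and `DecisionLog`, the fields, and the example entries are all hypothetical and chosen for illustration; the original article does not prescribe any particular format, and a shared document or spreadsheet could serve the same purpose.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Decision:
    """One design choice plus the motivation behind it (hypothetical schema)."""
    stage: str        # e.g. "data", "model", "deployment"
    choice: str       # what was decided
    motivation: str   # why it was decided
    decided_by: str   # who is accountable for the choice


class DecisionLog:
    """Append-only list of decisions that can be filtered and exported."""

    def __init__(self):
        self._entries = []

    def record(self, decision):
        self._entries.append(decision)

    def by_stage(self, stage):
        # Lets you trace an error back to the choices made in one stage.
        return [d for d in self._entries if d.stage == stage]

    def to_json(self):
        # Plain JSON so that non-developers can read and review the log.
        return json.dumps([asdict(d) for d in self._entries], indent=2)


# Illustrative entries only; not taken from a real project.
log = DecisionLog()
log.record(Decision(
    stage="data",
    choice="Removed postal code as an input feature",
    motivation="It may act as a proxy for sensitive attributes",
    decided_by="data team",
))
log.record(Decision(
    stage="model",
    choice="Chose a decision tree over a neural network",
    motivation="Outcomes are easier to explain to end users",
    decided_by="data team and domain expert",
))

print(log.to_json())
```

The point of the sketch is that every entry carries its motivation alongside the choice itself, so a later reviewer can see not only what was done but why.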
#2 Be aware that your model is never perfect
Data is the only information AI can work with, the only reality it knows. This information contains biases that are not always apparent. In addition, AI is made by humans, and humans make mistakes. Therefore, devise and implement mechanisms that encourage feedback from users, which can then be incorporated into the data or the AI.
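Biases that are not apparent at first sight can sometimes be surfaced with a very simple check: compare how often each group in the data receives a positive outcome. The sketch below is one possible way to do this in Python. The function names, the field names `group` and `approved`, and the toy records are assumptions made for illustration. A large gap does not prove that the data is unfair, but it is a signal worth discussing.

```python
from collections import defaultdict


def positive_rate_by_group(records, group_key, outcome_key):
    """Return the share of positive outcomes for each group in the records."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[outcome_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def largest_gap(rates):
    """Difference between the highest and the lowest positive rate."""
    return max(rates.values()) - min(rates.values())


# Toy data, invented for this example: 4 records per group.
toy_records = [
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": False},
    {"group": "B", "approved": True},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
]

rates = positive_rate_by_group(toy_records, "group", "approved")
print(rates)               # {'A': 0.75, 'B': 0.25}
print(largest_gap(rates))  # 0.5
```

A check like this is only a starting point; what counts as an acceptable gap depends on the context and should be decided together with the people affected.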
#3 Involve as many different stakeholders in the discussion as possible
The definition of terms such as "fairness" or "transparency" depends enormously on the perspective taken. A company will define certain things differently than a user, technologist or legislator, not only because interests differ, but also because everyone's considerations are shaped by a different background. As a company, base your design choices not only on the opinions of data scientists, but also on those of domain experts, end users and even randomly chosen citizens. That way, a wide range of perspectives feeds into an informed decision.
This article was written in Dutch by Max Roeters, co-founder of Brush AI.