We should not only train AI, but also educate it
The massive wave of investment in AI infrastructure is creating volatility in financial markets and putting pressure on electricity supplies. At the same time, the variable quality of AI output, from hallucinations and falsehoods to objectionable results, raises questions. Simply training AI does not seem to be enough.
Yet there is progress. DeepSeek trains models faster and with less energy. The quality of the data used to train models is improving. The controversy between "boomers" and "doomers", where the former see limitless possibilities and the latter believe computers will enslave humanity, is being superseded by "zoomers": they realize the genie is out of the bottle and are seeking a better framework for dealing with AI. For this, we need to go back to basics: training models.
From training to educating
Training means repetition to achieve a predictable outcome in a variable process. This leads to impressive performance, but it has limitations. Training is only one aspect of parenting, a broader concept for making better use of artificial intelligence. Values and norms could guide the output of AI models. Society has many parenting mechanisms to reward or discourage behavior, and we can introduce these for AI models as well. Robots already have a reward function: a clock that runs as long as they remain upright. This is how they learn to translate arbitrary movements into coordinated actions: they remember which movements maximized the time on the clock.
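The clock-based reward described above can be sketched in a few lines. This is an illustrative toy, not a real robotics controller: the movement names and the simulated world are assumptions, and the learning rule is a simple remember-the-best strategy.

```python
import random

def simulate_upright_time(movement):
    # Stand-in for the physical world: each movement has an intrinsic
    # quality plus noise. In reality, the robot's clock would measure this.
    base = {"lean": 0.5, "step": 2.0, "balance": 4.0}[movement]
    return base + random.uniform(-0.5, 0.5)

def average(times):
    return sum(times) / len(times) if times else 0.0

def train(trials=200, explore=0.1):
    # Remember the clock time achieved by every attempted movement.
    memory = {movement: [] for movement in ("lean", "step", "balance")}
    for _ in range(trials):
        if random.random() < explore:
            movement = random.choice(list(memory))  # occasionally explore
        else:
            # Exploit: prefer the movement that maximized time on the clock.
            movement = max(memory, key=lambda m: average(memory[m]))
        memory[movement].append(simulate_upright_time(movement))
    return memory
```

After enough trials, the learner spends most of its attempts on the movement with the best remembered clock times, which is the essence of the reward function described above.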
Generative models face a more complex challenge: there are more dimensions, and values and norms are subjective and evolve over time. Still, a multidimensional reward function for factual accuracy, appropriate language use, and so on provides a calibration of quality. A score, or even a full report, can then have consequences. A time delay for presenting results when scores decrease is one example of a corrective mechanism. High-quality models thus gain popularity, while lesser models lose support. This is analogous to what some electronic marketplaces do: they list suppliers in order of decreasing quality score.
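As a minimal sketch of such a multidimensional reward function, one could combine per-dimension ratings into a single score and translate any drop in that score into a response delay. The dimension names and weights below are assumptions for illustration, not values from the article.

```python
# Assumed quality dimensions and weights; a real scheme would be set
# by whoever calibrates the reward function.
WEIGHTS = {"factual_accuracy": 0.5, "language_use": 0.3, "safety": 0.2}

def reward_score(ratings):
    """Weighted average of per-dimension ratings, each in [0, 1]."""
    return sum(WEIGHTS[dim] * ratings[dim] for dim in WEIGHTS)

def response_delay(previous_score, current_score, step=2.0):
    """Corrective mechanism: delay (in seconds) grows with any score drop."""
    drop = max(0.0, previous_score - current_score)
    return step * drop
```

A model whose score falls from 0.9 to 0.8 would then wait 0.2 seconds longer before presenting results: a gentle but visible penalty.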
Users can evaluate AI models along the dimensions of the reward function, with a mathematical correction to eliminate extreme input. The design and monitoring of these mechanisms could be the responsibility of a public body, because the education of digital models serves the public interest, just as a government agency monitors the quality of food or medicine.
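One simple form of such a mathematical correction is a trimmed mean: discard a fraction of the highest and lowest user ratings before averaging, so a few extreme votes cannot dominate a dimension. A minimal sketch, where the trim fraction is an assumption:

```python
def trimmed_mean(ratings, trim_fraction=0.1):
    """Average the ratings after dropping the extremes at both ends."""
    if not ratings:
        raise ValueError("no ratings to aggregate")
    ordered = sorted(ratings)
    k = int(len(ordered) * trim_fraction)  # count dropped at each end
    kept = ordered[k:len(ordered) - k] or ordered  # never drop everything
    return sum(kept) / len(kept)
```

With nine honest ratings between 0.8 and 0.95 and one "review bomb" of 0.0, the trimmed mean stays close to the honest consensus, whereas a plain mean would be dragged noticeably lower.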
The digital confession
The evolution from training to nurturing also implies new concepts, such as digital confession. Confessing questionable results gives models the opportunity to reinitialize their reward function, a kind of digital absolution. The central entity that monitors the reward function acts as the confessor. This prevents investments in model building from being lost when a poor reward score pushes a model out of the market. The collected confessions are also a useful source of test cases for improving the reward algorithm.
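The confession flow above can be sketched as a small data structure. This is purely illustrative; the class and method names are invented for the example.

```python
class Confessor:
    """Hypothetical central entity that grants 'digital absolution'."""

    def __init__(self, initial_score=0.5):
        self.initial_score = initial_score
        self.test_cases = []  # confessed outputs, kept as test material

    def absolve(self, model_state, confessed_outputs):
        # Archive the questionable results for improving the reward algorithm.
        self.test_cases.extend(confessed_outputs)
        # Reinitialize the model's reward score: digital absolution.
        model_state["reward_score"] = self.initial_score
        return model_state
```

The key design point is that absolution is not free forgetting: every confession leaves a trace in the confessor's test set, which is exactly what makes the mechanism useful for improving the reward algorithm.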
Such an organization, acting as arbiter and confessor, naturally wields enormous power. History shows that steering behavior through assessments can sway large groups of people. Consider, for example, the 1930s, when extreme ideologies were taught in schools in some countries.
This problem becomes even more difficult because AI innovates quickly and profoundly, and the public debate on desirable regulation always lags behind the latest developments. Here, a process of peer review could help. Scientific research constantly publishes articles with new insights, and peer review determines their quality. Analogously, a panel of AI operators could evaluate new models and thereby determine their initial reward score, or suggest adjustments to promote appropriate behavior. As insights mature, traditional regulators and the bodies they charge with quality assurance can take over.
AI offers us unprecedented possibilities through extremely thorough training. In the future, we need to evolve from training to a broader developmental concept such as parenting. Developing the tools and institutions that steer AI models in the right direction will be one of the most important steps in truly realizing the promise of this technology.
Author: Johan Kestens, co-lead ADM Think Tank New Technology