Generative AI, the technology behind ChatGPT, is going supernova, as astronomers say, outshining other innovations for the moment. But despite alarmist predictions of AI overlords enslaving mankind, the technology still requires human handlers and will for some time to come.
While AI can generate content and code at a blinding pace, it still requires humans to oversee the output, which can be low quality or simply wrong. Whether the task is writing a report or writing a computer program, the technology cannot yet be trusted to deliver results accurate enough to rely on. It’s getting better, but even that process of improvement depends on an army of humans painstakingly correcting the AI model’s mistakes in an effort to teach it to ‘behave.’
Keeping humans in the loop is an old concept in AI. It refers to the practice of involving human experts in the process of training and refining AI systems to ensure that they perform correctly and meet the desired objectives.
In the early days of AI research, computer scientists were focused on developing rule-based systems that could reason and make decisions based on pre-programmed rules. However, these systems were tedious to construct – requiring experts to write down the rules – and were limited by the fact that they could only operate within the constraints of the rules that were explicitly programmed into them.
As AI technology advanced, researchers began to explore new approaches, such as machine learning and neural networks, that enabled computers to learn on their own from large volumes of training data.
But the dirty little secret behind the first wave of such applications, which are still the dominant form of AI used today, is that they depend on hand-labeled data. Tens of thousands of people continue to toil at the mind-numbing task of putting labels on images, text and sound to teach supervised AI systems what to look or listen for.
Then along came generative AI, which does not require labeled data. It teaches itself by consuming vast amounts of data and learning the relationships within that data, much as an animal does in the wild. Large language models, which use generative AI, learn about the world through the lens of text, and these models’ ability to compose human-like answers and even engage in human-like conversations has amazed the world.
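The label-free idea can be illustrated with the simplest possible "language model": a bigram counter that learns which word tends to follow which from raw text alone, with no human-supplied labels. Everything below (the tiny corpus, the `generate` helper) is hypothetical and purely illustrative; real large language models use neural networks trained on billions of tokens, but the training signal is the same — the next token in the data itself.

```python
import random
from collections import Counter, defaultdict

# Self-supervision in miniature: the text itself is the training signal.
# No one labels anything; the model learns relationships by counting
# which word follows which. (Toy corpus, for illustration only.)
corpus = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count word -> next-word transitions from the raw text alone.
follows = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    follows[cur][nxt] += 1

def generate(start, length, seed=0):
    """Sample a continuation by repeatedly picking a likely next word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        nexts = follows[out[-1]]
        if not nexts:
            break
        words, counts = zip(*nexts.items())
        out.append(rng.choices(words, weights=counts)[0])
    return " ".join(out)

print(generate("the", 5))
```

Scaled up by many orders of magnitude, this same trick — predict what comes next, then check against the data — is what lets generative models learn without an army of labelers, though, as the next sections note, humans re-enter the loop elsewhere.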
ChatGPT, a large language model trained by OpenAI, has awed the world with the depth of its knowledge and the fluency of its responses. Nevertheless, its utility is limited by so-called hallucinations, mistakes in the generated text that are semantically or syntactically plausible but are, in fact, incorrect or nonsensical.
The answer? Humans, again. OpenAI is working to address ChatGPT’s hallucinations through reinforcement learning with human feedback (RLHF), employing, yes, a large number of workers.
RLHF has been employed to shape ChatGPT’s behavior. Data collected during the model’s interactions is used to train a neural network that functions as a “reward predictor.” The reward predictor evaluates ChatGPT’s outputs and assigns a numerical score representing how well those outputs align with the system’s desired behavior. Human evaluators periodically review ChatGPT’s responses and select those that best reflect the desired behavior; their choices are used to adjust the reward-predictor network, which in turn is used to modify the behavior of the AI model.
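The heart of this loop is the reward predictor, trained from human preference judgments between pairs of responses. The sketch below is a minimal, simplified illustration of that idea — not OpenAI's implementation. It substitutes a simulated evaluator for real human raters, toy feature vectors for model responses, and a linear reward model trained with the standard pairwise (Bradley-Terry style) preference objective.

```python
import math
import random

random.seed(0)
DIM = 8

# A hidden "true" preference direction stands in for human judgment.
true_w = [random.gauss(0, 1) for _ in range(DIM)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def human_prefers_first(a, b):
    """Simulated evaluator: prefers the response scoring higher
    under the hidden preference direction."""
    return dot(a, true_w) > dot(b, true_w)

# Collect pairwise comparisons (winner, loser), as an RLHF pipeline would.
pairs = []
for _ in range(500):
    a = [random.gauss(0, 1) for _ in range(DIM)]
    b = [random.gauss(0, 1) for _ in range(DIM)]
    pairs.append((a, b) if human_prefers_first(a, b) else (b, a))

# Fit a linear reward predictor with the Bradley-Terry objective:
# maximize log sigmoid(r(winner) - r(loser)) by gradient ascent.
w = [0.0] * DIM
lr = 0.1
for _ in range(200):
    grad = [0.0] * DIM
    for winner, loser in pairs:
        diff = [x - y for x, y in zip(winner, loser)]
        p = 1.0 / (1.0 + math.exp(-dot(diff, w)))  # P(winner preferred)
        for i in range(DIM):
            grad[i] += (1.0 - p) * diff[i]
    for i in range(DIM):
        w[i] += lr * grad[i] / len(pairs)

# The learned reward model should agree with the simulated evaluator
# on most fresh comparisons.
trials = 200
correct = 0
for _ in range(trials):
    a = [random.gauss(0, 1) for _ in range(DIM)]
    b = [random.gauss(0, 1) for _ in range(DIM)]
    if human_prefers_first(a, b) == (dot(a, w) > dot(b, w)):
        correct += 1
print(f"agreement with simulated evaluator: {correct / trials:.0%}")
```

In a real system the reward model is itself a large neural network, and its scores then drive a reinforcement-learning step that nudges the language model toward higher-reward responses — but the logic is the one above: human comparisons in, a numerical reward out.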
Ilya Sutskever, OpenAI’s chief scientist and one of the creators of ChatGPT, believes that the problem of hallucinations will disappear with time as large language models learn to anchor their responses in reality. He suggests that the limitations of ChatGPT that we see today will diminish as the model improves. However, humans in the loop are likely to remain a feature of the amazing technology for years to come.
This is why generative AI coding assistants like GitHub’s Copilot and Amazon’s CodeWhisperer are just that: assistants, working in concert with experienced coders who can correct their mistakes or pick the best option among a handful of coding suggestions. While AI can generate code at a rapid pace, humans bring creativity, context, and critical thinking skills to the table.
True autonomy in AI depends on the trustworthiness and reliability of AI systems, which may come as those systems improve. But for now, humans are the overlords, and trusted results depend on collaboration between humans and AI.