Researchers try to make AI safer by having algorithms learn from human feedback

Researchers from OpenAI and DeepMind have been working on an artificial intelligence (AI) algorithm that learns from human feedback, as a way to make AI safer. The algorithm aims to address problems associated with reinforcement learning, an area of machine learning in which agents are rewarded for taking the right actions to complete a task in a given environment. As The Register explains, this approach can be dangerous if the reward function is badly specified: an agent may find ways to earn reward while behaving in unintended or harmful ways.

To prevent such problems, the researchers proposed a method in which the predicted reward is based on human judgement, which is fed back into the reinforcement learning algorithm to shape the agent's behaviour. Over time, the agent narrows in on the reward function that best explains the human's judgements, and in doing so learns its goal.
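To make that loop concrete, here is a minimal sketch in plain Python of how a reward model might be fitted to pairwise human preferences. The linear model, the feature vectors, and the simulated "human" judge are all illustrative assumptions for this sketch, not the researchers' actual implementation.

```python
import math
import random

def predict_reward(weights, features):
    # The learned reward model: here just a linear function of features.
    return sum(w * f for w, f in zip(weights, features))

def train_on_preference(weights, feats_a, feats_b, prefers_a, lr=0.1):
    # Bradley-Terry style update: raise the probability that the model
    # ranks the human-preferred behaviour higher.
    r_a = predict_reward(weights, feats_a)
    r_b = predict_reward(weights, feats_b)
    p_a = 1.0 / (1.0 + math.exp(r_b - r_a))   # model's P(A is preferred)
    grad = (1.0 if prefers_a else 0.0) - p_a  # log-likelihood gradient
    return [w + lr * grad * (fa - fb)
            for w, fa, fb in zip(weights, feats_a, feats_b)]

def simulated_human(feats_a, feats_b):
    # Stand-in for the human judge: prefers the behaviour with the
    # larger first feature. (Purely illustrative.)
    return feats_a[0] > feats_b[0]

weights = [0.0, 0.0]
for _ in range(2000):
    a = [random.random(), random.random()]
    b = [random.random(), random.random()]
    weights = train_on_preference(weights, a, b, simulated_human(a, b))

# The fitted model now ranks behaviours the way the human does; an RL
# agent would optimise predict_reward(...) in place of a hand-written
# reward function.
print(weights)  # first weight clearly positive, second near zero
```

In the published research, the reward predictor is a neural network trained on human comparisons of short clips of agent behaviour, and the agent is then trained with deep reinforcement learning against that predictor rather than against a hand-coded reward.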