Genima brings AI image-based learning to robotics

With the new Genima system, robots can learn new tasks faster and perform them more accurately, using AI-generated images to guide their movements.

Genima, developed in London, uses Stable Diffusion to help robots perform tasks like folding laundry and picking up objects.

A team of researchers from the Robot Learning Lab at Imperial College London has developed an innovative way to train robots using AI-generated images. The system, named Genima, fine-tunes the Stable Diffusion image-generation model to sketch out robot movements, guiding machines in both simulated and real-world environments. The research is set to be presented at the Conference on Robot Learning (CoRL) next month.

Genima aims to improve robots’ ability to complete everyday tasks, such as picking up objects or folding laundry. The system uses images as both its input and its output: it takes in a view of the scene and generates an image with the robot’s target positions drawn onto it, which a controller then translates into movements. Because the robot “sees” its goal visually, it can better understand the task at hand and avoid errors such as crashing into walls. It could revolutionise training for a wide range of robots, from mechanical arms to driverless cars.
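
To make the image-in, image-out idea concrete, here is a minimal sketch of what such a loop might look like, built on the Hugging Face diffusers library. The checkpoint name "genima-checkpoint" and the targets_from_image helper are hypothetical placeholders standing in for the fine-tuned model and controller, not part of the published system:

```python
# A minimal image-in, image-out sketch (hypothetical names marked below).
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline


def targets_from_image(img: Image.Image) -> list[tuple[int, int]]:
    """Placeholder controller step: a real system would detect the drawn
    target markers in the generated image and convert them into
    joint-space commands for the robot."""
    raise NotImplementedError


# Hypothetical checkpoint: Stable Diffusion fine-tuned to draw the
# robot's target positions onto the current camera view.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "genima-checkpoint", torch_dtype=torch.float16
).to("cuda")

observation = Image.open("camera_view.png").convert("RGB")

# Condition the generation on the task description; the output is the
# same scene with the desired robot positions drawn in visually.
target_image = pipe(
    prompt="fold the laundry",
    image=observation,
    strength=0.75,
).images[0]

joint_targets = targets_from_image(target_image)
```

The appeal of this design is that both halves of the loop operate on ordinary images, so the same recipe could in principle be reused across very different robot bodies.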

The researchers tested Genima on 25 simulated tasks and nine real-world tasks, achieving average success rates of 50% and 64% respectively. While those numbers leave room for improvement, the team is optimistic that video-generation AI models could boost the system’s speed and accuracy, making future applications more efficient.

Genima’s versatility is promising, with the potential to be applied to many different kinds of robots. Its ability to make decisions from image data could lead to smarter, more capable machines in everyday life and industry.