2 posts tagged with "ai"

Controlling Robots using a Large Language Model

· 22 min read
Michael Hart
Mike Likes Robots

In the past, controlling a robot has depended entirely on the commands a human could send it, whether through structured messages or user interfaces. That's changing with Large Language Models (LLMs): we can expose a few robot-control tools to an LLM, give it commands in natural language, and let the model work out which commands to send the robot. This gives us far more flexibility in how we interact with robots.
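The core of this pattern is a dispatch layer: the LLM is told which tools exist, replies with a tool name and JSON arguments, and our code routes that choice to a real robot command. Below is a minimal sketch of that layer; the tool name `move_to_room`, the `RobotBridge` class, and the schema wording are illustrative assumptions, not taken from the project's code.

```python
# Tool schema we would advertise to the LLM (Anthropic-style JSON schema).
TOOLS = [
    {
        "name": "move_to_room",
        "description": "Drive the robot to a named room in the house.",
        "input_schema": {
            "type": "object",
            "properties": {"room": {"type": "string"}},
            "required": ["room"],
        },
    }
]


class RobotBridge:
    """Stand-in for the layer that turns tool calls into robot commands
    (in a real system, e.g. ROS 2 navigation goals)."""

    def __init__(self):
        self.sent = []  # record of commands issued, for inspection

    def move_to_room(self, room: str) -> str:
        self.sent.append(("move_to_room", room))
        return f"Navigating to {room}"


def dispatch_tool_call(bridge: RobotBridge, name: str, args: dict) -> str:
    # The LLM's reply names a tool and supplies JSON arguments;
    # we look up the matching method on the bridge and invoke it.
    handler = getattr(bridge, name, None)
    if handler is None:
        return f"Unknown tool: {name}"
    return handler(**args)


bridge = RobotBridge()
result = dispatch_tool_call(bridge, "move_to_room", {"room": "kitchen"})
print(result)  # Navigating to kitchen
```

The result string would be fed back to the LLM as the tool's output, letting it confirm the action or chain further commands.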

In this post, I show an example project that runs a Gazebo simulation of a robot in a small house. The project uses Claude, an LLM from Anthropic, to interact with the robot, allowing the user to give natural language commands and watch the robot carry them out in simulation.

After the demo, I show how to set it up, then talk about how the code interfaces with Claude.

With Thanks

The demonstration project in this post comes from a university robotics course. One of the groups in the course kindly released their submission on GitHub so that I could show it in this post. I'd like to thank Senior Lecturer Giovanni Toffetti at Zurich University of Applied Sciences and his students Alexander Kolenaty, Jan Affeltranger, and Ilimea Gall.

This post is also available in video form. If you'd prefer to watch, click the link below:

Exploring AI on the RDK X3 Sunrise

· 16 min read
Michael Hart
Mike Likes Robots

We need robots that can react to change, perceive their environment rather than just sense it, and make their own decisions when they can't follow the normal program. Giving robots these capabilities is known as Artificial Intelligence (AI).

Recent years have seen steady improvement in vision models such as YOLO, which help a robot understand the objects in its environment. One of the biggest leaps in AI in the past few years is the advent of Large Language Models (LLMs), such as ChatGPT and Google Gemini. These models give remarkably realistic responses to text inputs, including answering questions and suggesting actions. By combining vision models and LLMs, we can build more intelligent robots than ever before.
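One simple way to combine the two is to turn a vision model's detections into text the LLM can reason over. The sketch below shows that bridging step; the detection format (label, confidence pairs) and the prompt wording are illustrative assumptions, not the post's actual pipeline.

```python
def summarise_detections(detections):
    """Turn (label, confidence) pairs, as a YOLO-style detector might
    produce, into a sentence suitable for an LLM prompt."""
    if not detections:
        return "The camera sees no recognised objects."
    parts = [f"{label} ({conf:.0%} confidence)" for label, conf in detections]
    return "The camera sees: " + ", ".join(parts) + "."


def build_prompt(detections, instruction):
    # The LLM receives the scene description plus the user's request,
    # so its suggested action is grounded in what the robot can see.
    return summarise_detections(detections) + " " + instruction


detections = [("person", 0.91), ("chair", 0.74)]
prompt = build_prompt(detections, "Suggest a safe next action for the robot.")
print(prompt)
```

The resulting prompt would be sent to the LLM as a normal text message, keeping the vision model and the language model loosely coupled.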

In this post, I'll show the performance of three different AI models running on a single-board computer: the RDK X3 Sunrise.

This post is also available in video form. If you'd prefer to watch, click the link below: