
4 posts tagged with "iot"


· 15 min read
Michael Hart

Congratulations! You have a whole lab full of robots running your latest software. Now you want an overall view of all of those robots. It's time to build a fleet overview, and this post will show you how to start one using Fleet Indexing from AWS IoT Device Management.

Fleet Indexing is a feature of AWS IoT Core that collects and indexes information about all of your selected Things and allows you to execute queries on them and aggregate data about them. For example, you can check which of your Things are online, giving you an easy way to determine which of your robots are connected.

I want to walk you through this process and show you what it looks like in the console and using Command Line Interface (CLI) commands. We'll be using the sample code from aws-iot-robot-connectivity-samples-ros2, but I've forked it to add some helper scripts that make setup a little easier for multiple robots. It also adds a launch script so multiple robots can be launched at the same time.

This guide is also available in video form - see the link below!

Fleet Indexing

Fleet Indexing is a feature of IoT Device Management that allows you to index, search, and aggregate your device data from multiple AWS IoT sources. Once the index has been enabled and built, you can run queries such as:

  • How many devices do I have online?
  • How many devices have less than 30% battery left?
  • Which devices are currently on a mission?
  • What is the average metres travelled for my fleet of robots at this location?

To perform these queries, AWS provides a query language that can be used in the console or in CLI commands. There is also the option to aggregate data over time, post that aggregated data to CloudWatch (allowing you to build dashboards), and enable alarms on the aggregate state of your fleet based on pre-defined thresholds.
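To give a flavour of the query language, here are a few illustrative query strings in the CLI syntax (the console syntax differs slightly, as noted later). The battery field is an assumption - it depends on what your devices actually report in their shadows:

connectivity.connected:true
thingName:robot* AND connectivity.connected:true
shadow.reported.batteryLevel < 30
shadow.name.robot1-shadow.hasDelta:true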

Device Management has a suite of features, including bulk registration, device jobs (allowing for Over The Air updates, for example), and secure tunnelling. I won't go into depth on these - Fleet Indexing is the focus of this post. If you are interested in these other features, you can read more in the docs, or let me know directly!

Pricing

Fleet Indexing is a paid service at AWS, and it's worth understanding where the costs come from. The Device Management pricing page has the most detail. In short, there is a very low charge for registering devices, then additional charges for updating the index and querying the index.

Fleet Indexing is opt-in. Every service added to the index, such as connectivity or shadow indexing, will increase the size of the index and the operations that will trigger an index update. For example, if connectivity is not enabled, then a device coming online or offline will not update the index, so there will be no additional fleet indexing charge. If shadow indexing is enabled, then every update of an indexed shadow will incur a charge. At time of writing, update charges are measured in a few USD per million updates, and queries cost a few cents per thousand queries.

Overall, we can keep this in mind when deciding which features to include in the Fleet Index given a particular budget. Frequent shadow updates from a large robot fleet will have a larger associated cost.
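As a rough worked example of that cost, consider a large fleet with frequently-updated indexed shadows. The price below is a stand-in for "a few USD per million updates" - an assumption for illustration only, so check the pricing page for your region's real figure:

# Hypothetical cost illustration - substitute your region's actual prices
robots = 100
updates_per_robot_per_second = 1
seconds_per_month = 60 * 60 * 24 * 30
updates = robots * updates_per_robot_per_second * seconds_per_month  # 259,200,000
assumed_price_per_million_usd = 1.25
print(f"~${updates / 1e6 * assumed_price_per_million_usd:.0f} per month")  # ~$324

The three sample robots in this post are a rounding error by comparison - the point is that update frequency multiplied by fleet size is what drives the bill.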

If you'd like to learn more about pricing estimates and monitoring your usage at AWS, take a look at my video on the topic:

Setup Guide

Now we know a bit more about what Fleet Indexing is, I want to show you how to set it up. I'll use a sample application with multiple ROS2 nodes posting frequent shadow updates so we can see the changes in the shadow index. Each pair of ROS2 nodes is acting as one robot. I'll show you how to enable Fleet Indexing first only with Connectivity data, then add named shadows.

Sample Application Setup

For the sample application, use a computer with Ubuntu 22.04 and ROS2 Humble installed. Install ros-dev-tools as well as ros-humble-desktop. Also install the AWS CLI and the AWS IoT SDK - instructions are in the repository's README and my video on setting up the sample application:

Once the necessary tools are installed, clone the repository:

cd ~
git clone https://github.com/mikelikesrobots/aws-iot-robot-connectivity-samples-ros2.git

Make sure that the application builds correctly by building the ROS2 workspace:

cd ~/aws-iot-robot-connectivity-samples-ros2/workspace
source /opt/ros/humble/setup.bash
colcon build --symlink-install
source install/setup.bash

Next, set the CERT_FOLDER_LOCATION so that the certificates are written to the right place:

export CERT_FOLDER_LOCATION=~/aws-iot-robot-connectivity-samples-ros2/iot_certs_and_config/
echo "export CERT_FOLDER_LOCATION=$CERT_FOLDER_LOCATION" >> ~/.bashrc

At this point we can set up a few robots for the sample application. For this, we can use the new script in the forked repository:

cd ~/aws-iot-robot-connectivity-samples-ros2
# Install python dependencies for the script
python3 -m pip install -r scripts/requirements.txt
# Create certificates, policies etc for robot1, robot2, and robot3
python3 scripts/make_robots.py robot1 robot2 robot3

This will create Things, certificates, policies and so on for three robots.

Cleaning Up Resources

Once you're finished working with the sample application, you can clean all of these resources up using the corresponding cleanup script:

cd ~/aws-iot-robot-connectivity-samples-ros2
python3 scripts/delete_robots.py robot1 robot2 robot3

With your robots created, you now need to launch the shadow service launch script. Execute the command:

ros2 launch iot_shadow_service crack_all_safes.launch.py certs_path:=$CERT_FOLDER_LOCATION

This should find all the robots you've created in the certificates folder and launch an instance of the sample application for each one. In this case, three "safe cracker" robots will start up. The logs will show shadow updates for each one. We now have three robots in our fleet, each connecting to AWS IoT Core to post their shadow updates.

You can now leave this application running while following the rest of the instructions.

Enabling Fleet Index with Connectivity

For each step, I'll show how to use the console and CLI for enabling Fleet Indexing.

Enabling in Console

In the console, navigate to the AWS Console IoT Core page, and scroll to the bottom of the navigation bar. Open the Settings page.

IoT Settings Navbar Menu

Fleet indexing is partway down the page. Click the Manage indexing button.

Manage Fleet Indexing

In the title bar of the Thing indexing box, there's a check box to enable indexing. Check this box.

Enable Fleet indexing

At this point, you could confirm the update and see the Fleet Index include the names of your Things. However, to get some utility from it, we can also include Connectivity to see whether devices are online or not. To do this, scroll down and check the box next to "Add thing connectivity".

Enable Connectivity

Now scroll to the bottom of the page and click Update.

Confirm Indexing Update

Once enabled, it may take a few minutes to build the Index. You can proceed to the next section, and if the searches are not returning results, wait for a few minutes before trying again.

Enabling using CLI

To enable Fleet Indexing with only Connectivity, execute the following command:

aws iot update-indexing-configuration --thing-indexing-configuration '{
    "thingIndexingMode": "REGISTRY",
    "thingConnectivityIndexingMode": "STATUS"
}'

The index may take a few minutes to build. If the following section does not immediately work, wait for a few minutes before trying again.
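You can check on the index from the CLI while you wait - get-indexing-configuration echoes back the configuration we just set, and describe-index reports whether the index is still building:

# Confirm the configuration took effect
aws iot get-indexing-configuration
# "indexStatus" should read ACTIVE once the index is ready to query
aws iot describe-index --index-name "AWS_Things"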

Querying the Fleet Index

Now that connectivity is enabled, we can query our Fleet Index to find out how many devices are online.

In the AWS Console, go to the IoT Core page and open the Things page.

Things Navigation Menu Item

In the controls at the top, two buttons are available that require Fleet Indexing to work. Let's start with Advanced Search.

Advanced Search Button

Within the Query box, enter connectivity.connected = true. Press enter to add it as a query, then click Search.

Query Connected Devices

This should give all three robots as connected devices.

Connected Device Results

Success! We have listed three devices as connected. You can also experiment with the query box to see what other options are available. We can also increase the options available by allowing other IoT services to be indexed.

caution

The console uses queries with a slightly different syntax to the CLI and examples page. If you use an example, be careful to translate the syntax!

As a next step, we can see how to aggregate data. Go back to the Things page and instead click the "Run aggregations" button.

Run Aggregations Button

Search for "thingName = robot*", and under aggregation properties, select "connectivity.connected" with aggregation type "Bucket". This search will return that 3 devices are connected. Explore the page - in particular the Fleet metrics section - to see what else this tool is able to do.

Count Connected Devices

You can use the console to explore the data available.

CLI Queries and Aggregations

To search for the names of connected devices, the CLI command is:

aws iot search-index \
  --index-name "AWS_Things" \
  --query-string "connectivity.connected:true"

This will return a JSON structure containing all of the Things that are currently connected. This can be processed further, for example with the tool jq:

sudo apt install jq
aws iot search-index \
  --index-name "AWS_Things" \
  --query-string "connectivity.connected:true" \
  | jq '.things[].thingName'

This returns the following lines:

"robot1"
"robot2"
"robot3"

Success! We can even wrap this whole command in watch to view the connected devices over time:

watch "aws iot search-index --index-name 'AWS_Things' --query-string 'connectivity.connected:true' | jq '.things[].thingName'"

We could get the number of connected devices by counting the lines in the response:

aws iot search-index \
  --index-name 'AWS_Things' \
  --query-string 'connectivity.connected:true' \
  | jq '.things[].thingName' \
  | wc -l

Or, we can use an aggregation. To search for Things with a name starting with robot that are connected, we can use this command:

aws iot get-buckets-aggregation \
  --query-string "thingName:robot*" \
  --aggregation-field connectivity.connected \
  --buckets-aggregation-type '{"termsAggregation": {"maxBuckets": 10}}'

This gives the following response:

{
    "totalCount": 3,
    "buckets": [
        {
            "keyValue": "true",
            "count": 3
        }
    ]
}

In both cases, we can see that 3 devices are connected. The latter command is more complicated, but can be altered to provide other aggregated data - you can now explore the different aggregation types to perform more queries on your Fleet Index!
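For example, if all you need is a count, get-statistics returns one without any bucket configuration, and get-cardinality estimates the number of distinct values of a field. Both sketches below assume the same three-robot fleet from this post:

# Simple count of matching things
aws iot get-statistics --query-string "connectivity.connected:true"

# Approximate number of distinct thing names matching the query
aws iot get-cardinality \
  --query-string "thingName:robot*" \
  --aggregation-field "thingName"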

Indexing Named Shadows

Next, we will see how to add Named Shadows to the Fleet Index. Again, we will see both the console and CLI methods of accomplishing this.

caution

As described in the pricing section, anything that increases the number of updates to the index will incur additional charges. In this sample application, each robot updates its shadow multiple times per second. If you plan to index your shadows, be careful about how many shadow updates your own system design requires!

Enabling named shadows is a two-step process. First, you must enable indexing named shadows; then, you must specify the shadows to be indexed. This is to allow you to optimize the fleet index for cost and performance - you can select which shadows should be included to avoid unnecessary updates.

Enabling Named Shadows using the Console

Back in the Fleet Indexing settings, check the box for Add named shadows, then add the shadow names robot1-shadow, robot2-shadow, robot3-shadow.

Adding Shadows to Index

Click Update. This will result in a banner on the page saying that the index is updating. Wait for a couple of minutes, and if the banner persists, continue with the next steps regardless - this banner can remain on screen past when the indexing is complete.

Enabling Named Shadows using the CLI

To add named shadows to the index, you need to both enable the named shadow indexing and specify the shadow names to be indexed. This must be done all in one command - the configuration in the command overwrites the previous configuration, meaning that if you omit the connectivity configuration argument, it will remove connectivity information from the index.

The command to add the shadows to the existing configuration is as follows:

aws iot update-indexing-configuration --thing-indexing-configuration '{
    "thingIndexingMode": "REGISTRY",
    "namedShadowIndexingMode": "ON",
    "thingConnectivityIndexingMode": "STATUS",
    "filter": {"namedShadowNames": ["robot1-shadow", "robot2-shadow", "robot3-shadow"]}
}'

The command keeps the indexing mode and connectivity mode the same, but adds named shadow indexing and the specific shadow names to the index.

Querying shadow data

Shadow data can be queried using the console and CLI - however, the use of wildcards with named shadows is not supported at the time of writing. It is possible to query data using a particular shadow name, but you currently can't query across all shadows in the fleet.

Querying Shadow via Console

To query for robot1-shadow, open the Advanced search page again. Enter the query shadow.name.robot1-shadow.hasDelta = true, then click search. It may take a few searches depending on the shadow's delta, but this should return robot1 as a Thing.

Shadow Query Results

You can then select any of the Things returned to see more information about that Thing, including viewing its shadow.

Querying Shadow via CLI

With the console, searching for your robot* things will return a table of links, which means clicking through each Thing to see the shadow. The query result from the CLI has more detail in one place, including the entire indexed shadow contents for all things matching the query. For example, the following query will return the thing name, ID, shadow, and connectivity status for all robot* things.

aws iot search-index --index-name 'AWS_Things' --query-string 'connectivity.connected:true'

Example output:

{
    "things": [
        {
            "thingName": "robot1",
            "thingId": "13032b27-e770-4146-8706-8ec1249b7015",
            "shadow": "{\"name\":{\"robot1-shadow\":{\"desired\":{\"digit\":59},\"reported\":{\"digit\":59},\"metadata\":{\"desired\":{\"digit\":{\"timestamp\":1718135446}},\"reported\":{\"digit\":{\"timestamp\":1718135446}}},\"hasDelta\":false,\"version\":55137}}}",
            "connectivity": {
                "connected": true,
                "timestamp": 1718134574557
            }
        },
        {
            "thingName": "robot2",
            "thingId": "7e396088-6d02-461e-a7ca-a64076f8a0ca",
            "shadow": "{\"name\":{\"robot2-shadow\":{\"desired\":{\"digit\":70},\"reported\":{\"digit\":70},\"metadata\":{\"desired\":{\"digit\":{\"timestamp\":1718135462}},\"reported\":{\"digit\":{\"timestamp\":1718135463}}},\"hasDelta\":false,\"version\":55353}}}",
            "connectivity": {
                "connected": true,
                "timestamp": 1718134574663
            }
        },
        {
            "thingName": "robot3",
            "thingId": "b5cf7c73-3b97-4d8e-9ccf-a1d5788873c8",
            "shadow": "{\"name\":{\"robot3-shadow\":{\"desired\":{\"digit\":47},\"reported\":{\"digit\":22},\"delta\":{\"digit\":47},\"metadata\":{\"desired\":{\"digit\":{\"timestamp\":1718135446}},\"reported\":{\"digit\":{\"timestamp\":1718135447}},\"delta\":{\"digit\":{\"timestamp\":1718135446}}},\"hasDelta\":true,\"version\":20380}}}",
            "connectivity": {
                "connected": true,
                "timestamp": 1718134574573
            }
        }
    ]
}

We can filter this again using jq to get all of the current desired digits for the shadows. The following command uses a few chained Linux commands to parse out that data. The purpose of the command is to show that the data returned by the query can be further parsed for particular fields, or used by a program to take further action.

aws iot search-index \
  --index-name 'AWS_Things' \
  --query-string 'connectivity.connected:true' \
  | jq -r '.things[].shadow | fromjson | .name | to_entries[] | {name: .key, desired: .value.desired.digit}'

The above command searches for all connected devices, then for each device retrieves the desired digit field from all of its shadows, printing an object in the following format per shadow:

{
    "name": "robot1-shadow",
    "desired": 24
}
{
    "name": "robot2-shadow",
    "desired": 54
}
{
    "name": "robot3-shadow",
    "desired": 8
}

We can again wrap this in a watch command to see how frequently the values update:

watch "aws iot search-index --index-name 'AWS_Things' --query-string 'connectivity.connected:true' | jq -r '.things[].shadow | fromjson | .name | to_entries[] | {name: .key, desired: .value.desired.digit}'"

Success! We can see the desired digit of our three safe cracking robots update live on screen, all from the fleet index.

CloudWatch Metrics and Device Management Alarms

Fleet Indexing is a launching point for sending specific data to CloudWatch Metrics or activating Device Management Alarms. AWS allows you to set up queries and aggregated data that you're interested in, such as the average battery level across a fleet of mobile robots, and perform actions based on that. You could set an alarm to get an alert when the average battery drops too low, emit metrics based on queries that are tracked in CloudWatch, and even start building a CloudWatch Dashboard to view a graph of the average battery level over time.
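As a sketch of how that starts, the CLI call below creates a fleet metric that periodically runs a query and emits the aggregated result to CloudWatch. It follows the pattern from the AWS fleet metrics documentation; the metric name is an arbitrary choice here, and registry.version is used as the aggregation field because every indexed Thing has one, which makes the count statistic a device count:

aws iot create-fleet-metric \
  --metric-name "ConnectedRobots" \
  --query-string "connectivity.connected:true" \
  --period 60 \
  --aggregation-field "registry.version" \
  --aggregation-type name=Statistics,values=count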

These are just a few examples of how valuable it is to connect robots to the cloud. Once they are connected and data starts flowing, you can set up monitoring, alarms, and actions to take directly in the cloud, without needing to deploy any of your own infrastructure.

Summary

Overall, Fleet Indexing from IoT Device Management is a useful tool for collecting data from your fleet into one place, then allowing you to query and aggregate data across the fleet or even subsections of the fleet. This post shows how to index device connectivity and named shadow data, plus how to query and aggregate data, both using the console and using the command line. Finally, I briefly mentioned some of the future possibilities with the Fleet Index data, such as setting up alarms, metrics, and dashboards - all from data already flowing into AWS!

Have a try for yourself using the sample code - but remember, frequent updates of the index will incur higher costs. Be careful with your system design and the IoT services you select to index to optimize costs and performance.

· 14 min read
Michael Hart

This post shows how to build two simple functions, running in the cloud, using AWS Lambda. The purpose of these functions is the same - to update the status of a given robot name in a database, allowing us to view the current statuses in the database or build tools on top of it. This is one way we could coordinate robots in one or more fleets - using the cloud to store the state and run the logic to coordinate those robots.

This post is also available in video form - check the video link below if you want to follow along!

What is AWS Lambda?

AWS Lambda is a service for executing serverless functions. That means you don't need to provision any virtual machines or clusters in the cloud - just trigger the Lambda with some kind of event, and your pre-built function will run. It runs on inputs from the event and could give you some outputs, make changes in the cloud (like database modifications), or both.

AWS Lambda charges based on the time taken to execute the function and the memory assigned to the function. The compute power available for a function scales with the memory assigned to it. We will explore this later in the post by comparing the memory and execution time of two Lambda functions.

In short, AWS Lambda allows you to build and upload functions that will execute in the cloud when triggered by configured events. Take a look at the documentation if you'd like to learn more about the service!

How does that help with robot co-ordination?

Moving from one robot to multiple robots helping with the same task means that you will need a central system to co-ordinate between them. The system may distribute orders to different robots, tell them to go and recharge their batteries, or alert a user when something goes wrong.

This central service can run anywhere that the robots are able to communicate with it - on one of the robots, on a server near the robots, or in the cloud. If you want to avoid standing up and maintaining a server that is constantly online and reachable, the cloud is an excellent choice, and AWS Lambda is a great way to run function code as part of this central system.

Let's take an example: you have built a prototype robot booth for serving drinks. Customers can place an order at a terminal next to the robot and have their drink made. Now that your booth is working, you want to add more booths with robots and distribute orders among them. That means your next step is to add two new features:

  1. Customers should be able to place orders online through a digital portal or webapp.
  2. Any order should be dispatched to an available robot at the given location, and the customer should be alerted when the order is complete.

Suddenly, you have gone from one robot capable of accepting orders through a terminal to needing a central database and ordering system. Not only that, but if you want to be able to deploy to a new location, having a single server per site makes it more difficult to route online orders to the right location. One central system in the cloud to manage the orders and robots is perfect for this use case.

Building Lambda Functions

Convinced? Great! Let's start by building a simple Lambda function - or rather, two simple Lambda functions. We're going to build one Python function and one Rust function. That's to allow us to explore the differences in memory usage and runtime, both of which increase the cost of running Lambda functions.

All of the code used in this post is available on GitHub, with setup instructions in the README. In this post, I'll focus on the relevant parts of the code.

Python Function

Firstly, what are the Lambda functions doing? In both cases, they accept a name and a status as arguments, attached to the event object passed to the handler; check the status is valid; and update a DynamoDB table for the given robot name with the given robot status. For example, in the Python code:

def lambda_handler(event, context):
    # ...
    name = str(event["name"])
    status = str(event["status"])

We can see that the event is passed to the lambda handler and contains the required fields, name and status. If valid, the DynamoDB table is updated:

ddb = boto3.resource("dynamodb")
table = ddb.Table(table_name)
table.update_item(
    Key={"name": name},
    AttributeUpdates={
        "status": {
            "Value": status
        }
    },
    ReturnValues="UPDATED_NEW",
)

Rust Function

Here is the equivalent for checking the input arguments for Rust:

#[derive(Deserialize, Debug, Serialize)]
#[serde(rename_all = "UPPERCASE")]
enum Status {
    Online,
}
// ...
#[derive(Deserialize, Debug)]
struct Request {
    name: String,
    status: Status,
}

The difference here is that Rust states its allowed arguments using an enum, so no extra code is required for checking that arguments are valid. The arguments are obtained by accessing event.payload fields:

let status_str = format!("{}", &event.payload.status);
let status = AttributeValueUpdate::builder().value(AttributeValue::S(status_str)).build();
let name = AttributeValue::S(event.payload.name.clone());

With the fields obtained and checked, the DynamoDB table can be updated:

let request = ddb_client
    .update_item()
    .table_name(table_name)
    .key("name", name)
    .attribute_updates("status", status);
tracing::info!("Executing request [{request:?}]...");

let response = request
    .send()
    .await;
tracing::info!("Got response: {:#?}", response);

CDK Build

To make it easier to build and deploy the functions, the sample repository contains a CDK stack. I've talked more about Cloud Development Kit (CDK) and the advantages of Infrastructure-as-Code (IaC) in my video "From AWS IoT Core to SiteWise with CDK Magic!":

In this case, our CDK stack is building and deploying a few things:

  1. The two Lambda functions
  2. The DynamoDB table used to store the robot statuses
  3. An IoT Rule per Lambda function that will listen for MQTT messages and call the corresponding Lambda function

The DynamoDB table comes from Amazon DynamoDB, another service from AWS that keeps a NoSQL database in the cloud. This service is also serverless, again meaning that no servers or clusters are needed.

There are also two IoT Rules, which are from AWS IoT Core, and define an action to take when an MQTT message is published on a particular topic filter. In our case, it allows robots to publish an MQTT message saying they are online, and will call the corresponding Lambda function. I have used IoT Rules before for inserting data into AWS IoT SiteWise; for more information on setting up rules and seeing how they work, take a look at the video I linked just above.

Testing the Functions

Once the CDK stack has been built and deployed, take a look at the Lambda console. You should have two new functions built, just like in the image below:

Two new Lambda functions in the AWS console

Great! Let's open one up and try it out. Open the function name that has "Py" in it and scroll down to the Test section (top red box). Enter a test name (center red box) and a valid input JSON document (bottom red box), then save the test.

Test configuration for Python Lambda function
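If you'd like a concrete event to paste in, something like the following matches the fields both handlers read - the name is arbitrary, and ONLINE is assumed to be a valid status based on the Rust enum shown earlier:

{
  "name": "TestRobotPy",
  "status": "ONLINE"
}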

Now run the test event. You should see a box pop up saying that the test was successful. Note the memory assigned and the billed duration - these are the main factors in determining the cost of running the function. The actual memory used is not important for cost, but can help optimize the right settings for cost and speed of execution.

Test result for Python Lambda function

You can repeat this for the Rust function, only with the test event name changed to TestRobotRs so we can tell them apart. Note that the memory used and duration taken are significantly lower.

Test result for Rust Lambda function

Checking the Database Table

We can now access the DynamoDB table to check the results of the functions. Access the DynamoDB console and click on the table created by the stack.

DynamoDB Table List

Select the button in the top right to explore items.

Explore Table Items button in DynamoDB

This should reveal a screen with the current items in the table - the two test names you used for the Lambda functions:

DynamoDB table with Lambda test items

Success! We have used functions run in the cloud to modify a database to contain the current status of two robots. We could extend our functions to allow different statuses to be posted, such as OFFLINE or CHARGING, then write other applications to work using the current statuses of the robots, all within the cloud. One issue is that this is a console-heavy way of executing the functions - surely there's something more accessible to our robots?

Executing the Functions

Lambda functions have a huge variety of ways that they can be executed. For example, we could set up an API Gateway that is able to accept API requests and forward them to the Lambda, then return the results. One way to check the possible input types is to access the Lambda, then click the "Add trigger" button. There are far too many options to list them all here, so I encourage you to take a look for yourself!

Lambda add trigger button

There's already one input for each Lambda - the AWS IoT trigger. This is an IoT Rule set up by the CDK stack, which is watching the topic filter robots/+/status. We can test this using either the MQTT test client or by running the test script in the sample repository:

./scripts/send_mqtt.sh
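Alternatively, you can publish straight from the AWS CLI. This sketch assumes the payload shape the Lambda handlers expect and a topic matching the robots/+/status filter:

aws iot-data publish \
  --topic "robots/FakeRobot/status" \
  --cli-binary-format raw-in-base64-out \
  --payload '{"name": "FakeRobot", "status": "ONLINE"}'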

One message published on the topic will trigger both functions to run, and we can see the update in the table.

DynamoDB Table Contents after MQTT

There is only one extra entry, and that's because both functions executed on the same input. That means "FakeRobot" had its status updated to ONLINE once by each function.

If we wanted, we could set up the robot to call the Lambda function when it comes online - it could make an API call, or it could connect to AWS IoT Core and publish a message with its ONLINE status. We could also set up more Lambda functions to take customer orders, dispatch them to robots, and so on - the Lambda functions and accompanying AWS services allow us to build a completely serverless robot co-ordination system in the cloud. If you want to see more about connecting ROS2 robots to AWS IoT Core, take a look at my video here:

Lambda Function Cost

How much does Lambda cost to run? For this section, I'll give rough numbers using the AWS Price Calculator. We will assume a rough estimate of 100 messages per minute - accounting for customer orders arriving, robots reporting status changes, and orders being distributed - with each message triggering one Lambda function invocation.

For our functions, we can run the test case a few times for each function to get a small spread of numbers. We can also edit the configuration in the console to set higher memory limits, to see if the increase in speed will offset the increased memory cost.

Edit Lambda general configuration

Edit Lambda memory setting

Finally, we will use an ARM architecture, as this currently costs less than x86 in AWS.

I will run a valid test input for each test function 4 times each for 3 different memory values - 128MB, 256MB, and 512MB - and take the latter 3 invocations, as the first invocation takes much longer. I will then take the median billed runtime and calculate the cost per month for 100 invocations per minute at that runtime and memory usage.

My results are as follows:

| Test | Python (128MB) | Python (256MB) | Python (512MB) | Rust (128MB) | Rust (256MB) | Rust (512MB) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 594 ms | 280 ms | 147 ms | 17 ms | 5 ms | 6 ms |
| 2 | 574 ms | 279 ms | 147 ms | 15 ms | 6 ms | 6 ms |
| 3 | 561 ms | 274 ms | 133 ms | 5 ms | 5 ms | 6 ms |
| Median | 574 ms | 279 ms | 147 ms | 15 ms | 5 ms | 6 ms |
| Monthly Cost | $5.07 | $4.95 | $5.17 | $0.99 | $0.95 | $1.06 |

There is a lot of information to pull out from this table! The first thing to notice is the monthly cost. This is the estimated cost per month for Lambda - 100 invocations per minute for the entire month costs a maximum total of $5.17. These are rough numbers, and other services will add to that cost, but that's still very low!
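To sanity-check these numbers, the arithmetic behind a Lambda estimate is simple enough to reproduce yourself. The sketch below assumes ARM rates of about $0.0000133334 per GB-second and $0.20 per million requests (published pricing at the time of writing - check the calculator for your region) and ignores the free tier:

# Rough Lambda monthly cost model for 100 invocations per minute
GB_SECOND_PRICE = 0.0000133334  # ARM, assumed rate
REQUEST_PRICE_PER_MILLION = 0.20  # assumed rate

def monthly_cost(memory_mb, billed_ms, invocations_per_min=100):
    invocations = invocations_per_min * 60 * 24 * 30  # one month
    gb_seconds = invocations * (billed_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * GB_SECOND_PRICE + invocations / 1e6 * REQUEST_PRICE_PER_MILLION

print(f"Python 128MB: ${monthly_cost(128, 574):.2f}")  # ~$5.00
print(f"Rust 256MB: ${monthly_cost(256, 5):.2f}")      # ~$0.94

Both land within a few cents of the table, which suggests GB-seconds and request count really are the whole story.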

Next, in the Python function, we can see that multiplying the memory will divide the runtime by roughly the same factor. The cost stays roughly the same as well. That means we can configure the function to use more memory to get the fastest runtime, while still paying the same price. In some further testing, I found that 1024MB is a good middle ground. It's worth experimenting to find the best price point and speed of execution.

If we instead look at the Rust function, we find that the execution time is pretty stable from 256MB onwards. Adding more memory doesn't speed up our function - it is most likely limited by the response time of DynamoDB. The optimal point seems to be 256MB, which gives very stable (and snappy) response times.

Finally, when we compare the two functions, we can see that Rust is much faster to respond (5ms instead of 279 ms at 256MB), and costs ~20% as much per month. That's a large difference in execution time and in cost, and tells us that it's worth considering a compiled language (Rust, C++, Go etc) when building a Lambda function that will be executed many times.

The main point to take away from this comparison is that memory and execution time are the major factors when estimating Lambda cost. If we can minimize these parameters, we will minimize cost of Lambda invocation. The follow-up to that is to consider using a compiled language for frequently-run functions to minimize these parameters.

Summary

Once you move from one robot working alone to multiple robots working together, you're very likely to need some central management system, and the cloud is a great option for this. What's more, you can use serverless technologies like AWS Lambda and Amazon DynamoDB to only pay for the transactions - no upkeep, and no server provisioning. This makes the management process easy: just define your database and the functions to interact with it, and your system is good to go!

AWS Lambda is a great way to define one or more of these functions. It can react to events like API calls or MQTT messages by integrating with other services. By combining IoT, DynamoDB, and Lambda, we can allow robots to send an MQTT message that triggers a Lambda, allowing us to track the current status of robots in our fleet - all deployed using CDK.

Lambda functions are charged by invocation, where the cost for each invocation depends on the memory assigned to the function and the time taken for that function to complete. We can minimize the cost of Lambda by reducing the memory required and the execution time for a function. Because of this, using a compiled language could translate to large savings for functions that run frequently. With that said, the optimal price point might not be the minimum possible memory - the Python function seems to be cheapest when configured with 1024MB.

We could continue to expand this system by adding more possible statuses, defining the fleet for each robot, and adding more functions to manage distributing orders. This is the starting point of our management system. See if you can expand one or both of the Lambda functions to define more possible statuses for the robots!

· 15 min read
Michael Hart

In this post, I'll show how to use two major concepts together:

  1. Docker images that can be privately hosted in Amazon Elastic Container Registry (ECR); and
  2. AWS IoT Greengrass components containing Docker Compose files.

These Docker Compose files can be used to run public Docker components, or pull private images from ECR. This means that you can deploy your own system of microservices to any platform compatible with AWS Greengrass.

This post is also available in video form - check the video link below if you want to follow along!

What is Docker?

Docker is very widely known in the DevOps world, but if you haven't heard of it, it's a way of taking a software application and bundling it up with all its dependencies so it can be easily moved around and run as an application. For example, if you have a Python application, you have two options:

  1. Ask the user to install Python, the pip dependencies needed to run the application, and give instructions on downloading and running the application; or
  2. Ask the user to install Docker, then provide a single command to download and run your application.

Either option is viable, but it's clear to see the advantages of the Docker method.

Docker Terminology

A container is a running instance of an application, and an image is the result of saving a container so it can be run again. You can have multiple containers based on the same image. Think of it as a movie saved on a hard disk: the file on disk is the "image", and playing that movie is the "container".

Docker Compose

On top of bundling software into images, Docker has a plugin called Docker Compose, which is a way of defining a set of containers that can be run together. With the right configuration, the containers can talk to each other or to the host computer. For instance, you might want to run a web server, API server, and database at the same time; with Docker Compose, you can define these services in one file and give them permission to talk to each other.
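Here's a minimal sketch of that three-service example. The image and service names are illustrative rather than taken from this post's application; the key idea is that each service name becomes a hostname the other containers can reach:

services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
  api:
    image: example-api:latest  # hypothetical application image
    environment:
      - DB_HOST=db  # reachable via the service name
  db:
    image: postgres:16
    environment:
      - POSTGRES_PASSWORD=example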

Building Docker Compose into Greengrass Components

We're now going to use Docker Compose by showing how to deploy and run an application using Greengrass and Docker Compose. Let's take a look at the code.

The code we're using comes from my sample repository.

Clone the Code

The first step is to check the code out on your local machine. The build scripts use Bash, so I'd recommend something Linux-based, like Ubuntu. Execute the following to check out the code:

git clone https://github.com/mikelikesrobots/greengrass-docker-compose.git

Check Dependencies

The next step is to make sure all of our dependencies are installed and set up. Execute all of the following and check the output is what you expect - a help or version message.

aws --version
gdk --version
jq --version
docker --version

The AWS CLI will also need credentials set up such that the following call works:

aws sts get-caller-identity

Docker will need to be able to run containers as a super user. Check this using:

sudo docker run hello-world

Finally, we need to make sure Docker Compose is installed. This is available either as a standalone script or as a plugin to Docker, where the latter is the more recent method of installing. If you have access to the plugin version, I would recommend using that, although you will need to update the Greengrass component build script - docker-compose is currently used.

# For the script version
docker-compose --version

# For the plugin version
docker compose --version

More information can be found for any of these components using:

  1. AWS CLI
  2. Greengrass Development Kit (GDK)
  3. jq: sudo apt install jq
  4. Docker
  5. Docker Compose

Greengrass Permissions Setup

The developer guide for Docker in Greengrass tells us that we may need to add permissions to our Greengrass Token Exchange Role to be able to deploy components using either/both of ECR and S3, for storing private Docker images and Greengrass component artifacts respectively. We can check this by navigating to the IAM console, searching for Greengrass, and selecting the TokenExchangeRole. Under this role, we should see one or more policies granting us permission to use ECR and S3.

Token Exchange Role Policies

For the ECR policy, we expect a JSON block similar to the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ecr:GetAuthorizationToken",
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer"
            ],
            "Resource": [
                "*"
            ],
            "Effect": "Allow"
        }
    ]
}

For the S3 policy, we expect:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "*"
            ],
            "Effect": "Allow"
        }
    ]
}

With these policies in place, any system we deploy Greengrass components to should have permission to pull Docker images from ECR and Greengrass component artifacts from S3.

Elastic Container Registry Setup

By default, this project builds and pushes a Docker image tagged python-hello-world:latest. It expects an ECR repository of the same name to exist. We can create this by navigating to the ECR console and clicking "Create repository". Set the name to python-hello-world and keep the settings default otherwise, then click Create repository. This should create a new entry in the Repositories list:

ECR Repository Created

Copy the URI from the repo and strip off the python-hello-world ending to get the base URI. Then, back in your cloned repository, open the .env file and replace the ECR_REPO variable with your base URI. It should look like the following:

ECR_REPO=012345678901.dkr.ecr.us-west-2.amazonaws.com

Any new Docker image you attempt to create locally will need a new ECR repository. You can follow the creation steps up to the base URI step - this is a one-time operation.
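If you prefer the CLI for this one-time setup, an equivalent sketch is below, assuming your credentials and default region are already configured:

# Create the repository and print its URI
aws ecr create-repository \
  --repository-name python-hello-world \
  --query "repository.repositoryUri" \
  --output text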

Building the Docker Images and Greengrass Components

At this point, you can build all the components and images by running:

./build_all.sh

Alternatively, any individual image/component can be built by changing directory into the component and running:

source ../../.env && ./build.sh

Publishing Docker Images and Greengrass Components

Just as with building the images/components, you can now publish them using:

./publish_all.sh

If any image/component fails to publish, it should prevent the script execution to allow you to investigate further.

Deploying Your Component

With the images and component pushed, you can now use the component in a Greengrass deployment.

First, make sure you have Greengrass running on a system. If you don't, you can follow the guide in the video below:

Once you have Greengrass set up, you can navigate to the core device in the console. Open the Greengrass console, select core devices, and then select your device.

Select Greengrass Core Device

From the deployments tab, click the current deployment.

Select current deployment

In the top right, select Actions, then Revise.

Revise Greengrass Deployment

Click Select components, then tick the new component to add it to the deployment. Skip to Review and accept.

Add Component to Deployment

This will now deploy the component to your target Greengrass device. Assuming all the setup is correct, you can access the Greengrass device and view the most recent logs using:

sudo vim /greengrass/v2/logs/com.docker.PythonHelloWorld.log

This will show all current logs. It may take a few minutes to get going, so keep checking back! Once the component is active, you should see some log lines similar to the following:

2024-01-19T21:46:03.514Z [INFO] (Copier) com.docker.PythonHelloWorld: stdout. python-hello-world_1  | Received new message on topic /topic/local/pubsub: Hello from local pubsub topic. {scriptName=services.com.docker.PythonHelloWorld.lifecycle.Run.Script, serviceName=com.docker.PythonHelloWorld, currentState=RUNNING}
2024-01-19T21:46:03.514Z [INFO] (Copier) com.docker.PythonHelloWorld: stdout. python-hello-world_1  | Successfully published 999 message(s). {scriptName=services.com.docker.PythonHelloWorld.lifecycle.Run.Script, serviceName=com.docker.PythonHelloWorld, currentState=RUNNING}
2024-01-19T21:46:03.514Z [INFO] (Copier) com.docker.PythonHelloWorld: stdout. python-hello-world_1  | Received new message on topic /topic/local/pubsub: Hello from local pubsub topic. {scriptName=services.com.docker.PythonHelloWorld.lifecycle.Run.Script, serviceName=com.docker.PythonHelloWorld, currentState=RUNNING}
2024-01-19T21:46:03.514Z [INFO] (Copier) com.docker.PythonHelloWorld: stdout. python-hello-world_1  | Successfully published 1000 message(s). {scriptName=services.com.docker.PythonHelloWorld.lifecycle.Run.Script, serviceName=com.docker.PythonHelloWorld, currentState=RUNNING}
2024-01-19T21:46:05.306Z [INFO] (Copier) com.docker.PythonHelloWorld: stdout. comdockerpythonhelloworld_python-hello-world_1 exited with code 0. {scriptName=services.com.docker.PythonHelloWorld.lifecycle.Run.Script, serviceName=com.docker.PythonHelloWorld, currentState=RUNNING}

From these logs, we can see both "Successfully published" messages and "Received new message" messages, showing that the component is running correctly and has all the permissions it needs.

This isn't the only way to check the component is running! We could also use the Local Debug Console, a locally-hosted web UI, to publish/subscribe to local topics. Take a look at this excellent video if you want to set this method up for yourself:

Congratulations!

If you got to this point, you have successfully deployed a Docker Compose application using Greengrass!

Diving into the code

To understand how to extend the code, we need to first understand how it works.

From the top-level directory, we can see a couple of important folders (components, docker) and a couple of important scripts (build_all.sh, publish_all.sh).

components contains all of the Greengrass components, where each component goes in a separate folder and is built using GDK. We can see this from the folder inside, com.docker.PythonHelloWorld. docker contains all of the Docker images, where each image is in a separate folder and is built using Docker. We have already seen build_all.sh and publish_all.sh, but if we take a look inside, we see that both scripts source the .env file, then go through all Docker folders followed by all Greengrass folders, executing the build.sh or publish.sh script inside each one. The only exception is for publishing Greengrass components, where the standard gdk component publish command is used directly instead of adding an extra file.

Let's take a deeper dive into the Docker image and the Greengrass component in turn.

Docker Image (docker/python-hello-world)

Inside this folder, we can see the LocalPubSub sample application from Greengrass (see the template), with some minor modifications. Instead of passing in the topic to publish on and the message to publish, we are instead using environment variables.

topic = os.environ.get("MQTT_TOPIC", "example/topic")
message = os.environ.get("MQTT_MESSAGE", "Example Hello!")

Passing command line arguments directly to Greengrass components is easy, but passing those same arguments through Docker Compose is more difficult. It's an easier pattern to use environment variables specified by the Docker Compose file and modified by Greengrass configuration - we will see more on this in the Greengrass Component deep dive.

Therefore, the component retrieves its topic and message from the environment, then publishes 1000 messages and listens for those same messages.

We also have a simple Dockerfile showing how to package the application. From a base image of python, we add the application code into the app directory, and then specify the entrypoint as the main.py script.
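As a sketch of the kind of Dockerfile described - the exact file is in the repository, and the Python version tag here is an assumption:

FROM python:3.11
COPY . /app
WORKDIR /app
ENTRYPOINT ["python3", "main.py"]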

Finally, we have the build.sh and publish.sh. The build simply uses the docker build command with a default tag of the ECR repo. The publish step does slightly more work by logging in to the ECR repository with Docker before pushing the component. Note that both scripts use the ECR_REPO variable set in the .env file.

If we want to add other Docker images, we can add a new folder with our component name and copy the contents of the python-hello-world image. We can then update the image name in the build and publish scripts and change the application code and Dockerfile as required. A new ECR repo will also be required, matching the name given in the build and publish scripts.

Greengrass Component (components/com.docker.PythonHelloWorld)

Inside our Greengrass component, we can see a build script, the Greengrass component files, and the docker-compose.yml that will be deployed using Greengrass.

The build script is slightly more complicated than the Docker equivalent, which is because the ECR repository environment variable needs to be replaced in the other files, but also needs to be reset after the component build to avoid committing changes to the source code. These lines...

find . -maxdepth 1 -type f -not -name "*.sh" -exec sed -i "s/{ECR_REPO}/$ECR_REPO/g" {} \;
gdk component build
find . -maxdepth 1 -type f -not -name "*.sh" -exec sed -i "s/$ECR_REPO/{ECR_REPO}/g" {} \;

...replace the ECR_REPO placeholder with the actual repo, then build the component with GDK, then replace that value back to the placeholder. As a result, the built files are modified, but the source files are restored to their original state.

Next we have the GDK configuration file, which shows that our build system is set to zip. We could push only the Docker Compose file, but this method allows us to zip other files that support it if we want to extend the component. We also have the version tag, which needs to be incremented with new component versions.

After that, we have the Docker Compose file. This contains one single service, some environment variables, and a volume. The service refers to the python-hello-world Docker image built by docker/python-hello-world by specifying the image name.

warning

This component references the latest tag of python-hello-world. If you want your Greengrass component version to be meaningful, you should extend the build scripts to give a version number as the Docker image tag, so that each component version references a specific Docker image version.

We can see the MQTT_TOPIC and MQTT_MESSAGE environment variables that need to be passed to the container. These can be overridden in the recipe.yaml by Greengrass configuration, allowing us to pass configuration through to the Docker container.

Finally, we can see some other parameters which are needed for the Docker container to be able to publish and subscribe to local MQTT topics:

environment:
  - SVCUID
  - AWS_GG_NUCLEUS_DOMAIN_SOCKET_FILEPATH_FOR_COMPONENT
volumes:
  - ${AWS_GG_NUCLEUS_DOMAIN_SOCKET_FILEPATH_FOR_COMPONENT}:${AWS_GG_NUCLEUS_DOMAIN_SOCKET_FILEPATH_FOR_COMPONENT}

These will need to be included in any Greengrass component where the application needs to use MQTT. Other setups are available in the Greengrass Docker developer guide.

If we want to add a new Docker container to our application, we can create a new service block, just like python-hello-world, and change our environment variables and image tags. Note that we don't need to reference images stored in ECR - we can also access public Docker images!
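For instance, a hypothetical second service could be added along these lines. The private image needs the {ECR_REPO} placeholder (and a matching Artifacts entry in the recipe, covered below), while the public image needs nothing extra:

services:
  python-hello-world:
    # ... existing service ...
  second-private-service:
    image: "{ECR_REPO}/my-other-image:latest"  # hypothetical ECR image
  public-helper:
    image: redis:7  # public Docker Hub images can be referenced freely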

The last file is the recipe.yaml, which contains a lot of important information for our component. Firstly, the default configuration allows our component to publish and subscribe to MQTT, but also specifies the environment variables we expect to be able to override:

ComponentConfiguration:
  DefaultConfiguration:
    Message: "Hello from local pubsub topic"
    Topic: "/topic/local/pubsub"

This allows us to override the message and topic using Greengrass configuration, set in the cloud.

The recipe also specifies that the Docker Application Manager and Token Exchange Service are required to function correctly. Again, see the developer guide for more information.

We also need to look at the Manifests section, which specifies the Artifacts required and the Lifecycle for running the application. Within Artifacts, we can see:

- URI: "docker:{ECR_REPO}/python-hello-world:latest"

This line specifies that a Docker image is required from our ECR repo. Each new private Docker image added to the Compose file will need a line like this to grant permission to access it. However, public Docker images can be freely referenced.

- URI: "s3://BUCKET_NAME/COMPONENT_NAME/COMPONENT_VERSION/com.docker.PythonHelloWorld.zip"
Unarchive: ZIP

This section specifies that the component files are in a zip, and the S3 location is supplied during the GDK build. We are able to use files from this zip by referencing the {artifacts:decompressedPath}/com.docker.PythonHelloWorld/ path.

In fact, we do this during the Run lifecycle stage:

Lifecycle:
  Run:
    RequiresPrivilege: True
    Script: |
      MQTT_TOPIC="{configuration:/Topic}" \
      MQTT_MESSAGE="{configuration:/Message}" \
      docker-compose -f {artifacts:decompressedPath}/com.docker.PythonHelloWorld/docker-compose.yml up

This uses privilege, as Docker requires super user privilege in our current setup. It is possible to set it up to work without super user, but this method is the simplest. We also pass MQTT_TOPIC and MQTT_MESSAGE as environment variables to the docker-compose command. With the up command, we tell the component to start the application in the Docker Compose file.

tip

If we want to change to use Compose as a plugin, we can change the run command here to start with docker compose.

And that's the important parts of the source code! I encourage you to read through and check your understanding of the parameters - not setting permissions and environment variables correctly can lead to some confusing errors.

Where to go from here

Given this setup, we should be able to deploy private or public Docker containers, which paves the path for deploying our robot software using Greengrass. We can run a number of containers together for the robot software. This method of deployment gives us the advantages of Greengrass, like having an easier route to the cloud, a more fault-tolerant deployment with roll-back mechanism and version tracking, and allowing us to deploy other components like the CloudWatch Log Manager.

In the future, we can extend this setup to build ROS2 containers, allowing us to migrate our robot software to Docker images and Greengrass components. We could install Greengrass on each robot, then deploy the full software with configuration options. We then also have a mechanism to update components or add more as needed, all from the cloud.

Give the repository a clone and try it out for yourself!

· 7 min read
Michael Hart

This post shows how to build a Robot Operating System 2 node using Rust, a systems programming language built for safety, security, and performance. In the post, I'll tell you about Rust - the programming language, not the video game! I'll tell you why I think it's useful in general, then specifically in robotics, and finally show you how to run a ROS2 node written entirely in Rust that will send messages to AWS IoT Core.

This post is also available in video form - check the video link below if you want to follow along!

Why Rust?

The first thing to talk about is, why Rust in particular over other programming languages? Especially given that ROS2 has strong support for C++ and Python, we should think carefully about whether it's worth travelling off the beaten path.

There are much more in-depth articles and videos about the language itself, so I'll keep my description brief. Rust is a systems-level programming language, the same class of language as C and C++, but with a very strict compiler that blocks you from doing "unsafe" operations. That means the language is built for high performance, but with a greatly diminished risk of the unsafe behaviour that C and C++ allow.

Rust is also steadily growing in traction. It is the only language other than C to make its way into the Linux kernel - and the Linux kernel was originally written in C! The Windows kernel is also rewriting some modules in Rust - check here to see what they have to say:

The major tech companies are adopting Rust, including Google, Facebook, and Amazon. This recent 2023 keynote from Dr Werner Vogels, Vice President and CTO of Amazon.com, had some choice words to say about Rust. Take a look here to hear this expert in the industry:

Why isn't Rust used more?

That's a great question. Really, I've presented the best parts in this post so far. Some of the drawbacks include:

  1. Being a newer language means less community support and less components provided out of the box. For example, writing a desktop GUI in Rust is possible, but the libraries are still maturing.
  2. It's harder to learn than most languages. The stricter compiler means some normal programming patterns don't work, which means relearning some concepts and finding different ways to accomplish the same task.
  3. It's hard for a new language to gain traction! Rust has to prove it will stand the test of time.

Having said that, I believe learning the language is worth it for safety, security, and sustainability reasons. Safety and security come from the strict compiler, and sustainability comes from being a low-level language that does the task faster and with fewer resources.

That's true for robotics as much as it is for general applications. Some robot software can afford to be slow, like high-level message passing and decision making, but a lot of it needs to be real-time and high-performance, like processing Lidar data. My example today is perfectly acceptable in Python because it's passing non-urgent messages, but it is a good use case to explore using Rust in.

With that, let's stop talking about Rust, and start looking at building that ROS2 node.

Building a ROS2 Node

The node we're building replicates the Python-based node from this blog post. The same setup is required, meaning the setup of X.509 certificates, IoT policies, and so on will be used. If you want to follow along, make sure to run through that setup to the point of running the code - at which point, we can switch over to the Rust-based node. If you prefer to follow instructions from a README, please follow this link - it is the repository containing the source code we'll be using!
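To give a flavour of what a node looks like in Rust before we start, here's a minimal publisher sketch along the lines of the examples in the ros2_rust repository. The node and topic names are illustrative - this is not code from the repository we're about to build:

use std::{env, thread, time::Duration};

fn main() -> Result<(), rclrs::RclrsError> {
    // Set up the ROS2 context and node
    let context = rclrs::Context::new(env::args())?;
    let node = rclrs::create_node(&context, "minimal_publisher")?;
    let publisher =
        node.create_publisher::<std_msgs::msg::String>("topic", rclrs::QOS_PROFILE_DEFAULT)?;

    // Publish a counting message every half second
    let mut message = std_msgs::msg::String::default();
    let mut count: u32 = 0;
    while context.ok() {
        count += 1;
        message.data = format!("Hello from Rust! {count}");
        publisher.publish(&message)?;
        thread::sleep(Duration::from_millis(500));
    }
    Ok(())
}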

Prerequisites

The first part of our setup is making sure all of our tools are installed. This node can be built on any operating system, but instructions are given for Ubuntu, so you may need some extra research for other systems.

Execute the following to install Rust using Rustup:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

There are further dependencies taken from the ROS2 Rust repository as follows:

sudo apt install -y git libclang-dev python3-pip python3-vcstool # libclang-dev is required by bindgen
# Install these plugins for cargo and colcon:
cargo install --debug cargo-ament-build # --debug is faster to install
pip install git+https://github.com/colcon/colcon-cargo.git
pip install git+https://github.com/colcon/colcon-ros-cargo.git

Source Code

Assuming your existing ROS2 workspace is at ~/ros2_ws, the following commands can be used to check out the source code:

cd ~/ros2_ws/src
git clone https://github.com/mikelikesrobots/aws-iot-node-rust.git
git clone https://github.com/ros2-rust/ros2_rust.git
git clone https://github.com/aws-samples/aws-iot-robot-connectivity-samples-ros2.git

ROS2 Rust then uses vcs to import the other repositories it needs:

cd ~/ros2_ws
vcs import src < src/ros2_rust/ros2_rust_humble.repos

That concludes checking out the source code.

Building the workspace

The workspace can now be built. It takes around 10 minutes to build ROS2 Rust, which should only need to be done once. Following that, changes to the code from this repository can be built very quickly. To build the workspace, execute:

cd ~/ros2_ws
colcon build
source install/setup.bash

The build output should look something like this:

Colcon Build Complete

Once the initial build has completed, the following command can be used for subsequent builds:

colcon build --packages-select aws_iot_node

Here it is in action:

build-only-iot

Now, any changes that are made to this repository can be built and tested with cargo commands, such as:

cargo build
cargo run --bin mock-telemetry

The cargo build log will look something like:

cargo-build-complete

Multi-workspace Setup

The ROS2 Rust workspace takes a considerable amount of time to build, and often gets built as part of the main workspace when it's not required, slowing down development. A different way of structuring workspaces is to separate the ROS2 Rust library from your application, as follows:

# Create and build a workspace for ROS2 Rust
mkdir -p ~/ros2_rust_ws/src
cd ~/ros2_rust_ws/src
git clone https://github.com/ros2-rust/ros2_rust.git
cd ~/ros2_rust_ws
vcs import src < src/ros2_rust/ros2_rust_humble.repos
colcon build
source install/setup.bash

# Check out application code into main workspace
cd ~/ros2_ws/src
git clone https://github.com/mikelikesrobots/aws-iot-node-rust.git
git clone https://github.com/aws-samples/aws-iot-robot-connectivity-samples-ros2.git
cd ~/ros2_ws
colcon build
source install/local_setup.bash

This method means that the ROS2 Rust workspace only needs to be updated with new releases for ROS2 Rust, and otherwise can be left. Furthermore, you can source the setup script easily by adding a line to your ~/.bashrc:

echo "source ~/ros2_rust_ws/install/setup.bash" >> ~/.bashrc

The downside of this method is that you can only source further workspaces using the local_setup.bash script, or it will overwrite the variables needed to access the ROS2 Rust libraries.

Running the Example

To run the example, you will need the IOT_CONFIG_FILE variable set from the Python repository.

Open two terminals. In each terminal, source the workspace, then run one of the two nodes as follows:

source ~/ros2_ws/install/setup.bash  # Both terminals
source ~/ros2_ws/install/local_setup.bash # If using the multi-workspace setup method
ros2 run aws_iot_node mqtt-telemetry --ros-args --param path_for_config:=$IOT_CONFIG_FILE # One terminal
ros2 run aws_iot_node mock-telemetry # Other terminal

Using a split terminal in VSCode, this looks like the following:

Both MQTT and Mock nodes running

You should now be able to see messages appearing in the MQTT test client in AWS IoT Core. This will look like the following:

MQTT Test Client

Conclusion

We've demonstrated that it's possible to build nodes in Rust just as with C++ and Python - although there's an extra step of setting up ROS2 Rust so our node can link to it. We can now build other nodes in Rust when we're on a resource-constrained system, such as a Raspberry Pi or other small dev kit, and we want the guarantees from the Rust compiler that the C++ compiler doesn't offer, while being more secure and sustainable than a Python-based version.

Check out the repo and give it a try for yourself!