
5 Lessons Learned in Scaling Autonomous Mobile Robots from 10 to 10K

Joe Wieciek, Software Ops Manager at Brain Corp, shares the five most important lessons learned from powering the world’s largest fleet of autonomous mobile robots (AMRs) operating in commercial indoor public spaces

AI software technology today is used to build autonomous robots for the retail industry, malls, airports, hospitals and more. At Brain Corp, our groundbreaking work with our manufacturing partners has helped us build and sell several autonomous mobile robots (AMRs) across several verticals and brands. But our work doesn’t end there.

Once the robots are deployed, our software operations team works diligently to ensure that every BrainOS®-enabled robot performs well in the field, collecting data and insights via the cloud that we use to improve our software and systems, and ultimately create better user experiences. However, managing a handful of robots in the field is drastically different from managing a large global fleet.

We learned that the hard way on our path to powering the world’s largest fleet of AMRs operating in commercial indoor public spaces. These are the five most important lessons we’ve learned while scaling our BrainOS-powered fleet from 10 robots to more than 10,000 over the last three years.

1. Build infrastructure early

We learned early on that we needed the ability to access the robots remotely. When you’re working with just a few robots, it’s easy enough to be hands-on with updates and fixes. But as your fleet passes 50 or so, it quickly becomes impossible to manually keep track of everything that’s happening with each robot. Collecting information and understanding precisely how, when, and where each robot is operating is crucial to providing good service and maintaining a good product. The solution is robust infrastructure.

Though we can’t be on the ground with every user, we can keep a close virtual eye on the state of the robots and quickly resolve any issues they are experiencing.

With global infrastructure, including proprietary robot performance telemetry, we can monitor every robot in near real-time and can deploy configuration changes or software updates in just a few hours. Proper infrastructure is what makes managing a high-performance fleet of robots possible.
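To make the monitoring idea concrete, here is a minimal sketch of fleet-level heartbeat tracking. All names (`Heartbeat`, `FleetMonitor`, the 300-second staleness window) are illustrative assumptions, not BrainOS APIs:

```python
from dataclasses import dataclass

# Hypothetical sketch of near-real-time fleet monitoring: each robot
# periodically reports a heartbeat, and the monitor flags robots that
# have gone quiet for longer than a staleness window.

@dataclass
class Heartbeat:
    robot_id: str
    timestamp: float   # seconds since epoch
    status: str        # e.g. "cleaning", "idle", "error"

class FleetMonitor:
    def __init__(self, stale_after_s: float = 300.0):
        self.stale_after_s = stale_after_s
        self.last_seen = {}  # robot_id -> most recent Heartbeat

    def ingest(self, hb: Heartbeat) -> None:
        self.last_seen[hb.robot_id] = hb

    def stale_robots(self, now: float) -> list:
        # Robots that have not reported within the staleness window.
        return [rid for rid, hb in self.last_seen.items()
                if now - hb.timestamp > self.stale_after_s]

monitor = FleetMonitor()
monitor.ingest(Heartbeat("amr-001", timestamp=1000.0, status="cleaning"))
monitor.ingest(Heartbeat("amr-002", timestamp=400.0, status="idle"))
print(monitor.stale_robots(now=1010.0))  # ['amr-002']
```

A real system would of course layer richer telemetry, alerting, and dashboards on top of this, but the core loop is the same: ingest events, compare against expectations, surface anomalies.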

2. Visibility is everything

In order to detect, investigate, and resolve issues with the robots in the field, as well as determine areas of improvement, we need full visibility at both individual robot and fleet levels. Without reliable performance monitoring tools, we wouldn’t be able to immediately understand how and why robots fail. This means users would be forced to wait for someone to diagnose the robot in person.

Infrastructure built for visibility also allows us to run analytics at scale and gather data from thousands of robots around the world in near real time. The insights we gain from that data help us continually improve our software and ensure that robot performance gets better with every release.

3. Traceable configuration management can save the day

The need for visibility extends to our internal processes. As our fleet grows, we need to be able to test how different features or configurations perform without inadvertently causing problems for our users. Manually connecting to individual robots to make updates or edits is not only inefficient, it’s also not transparent. It’s crucial that our infrastructure enables visibility around what, when, where, and by whom configuration changes or software updates are made so that we can track their impact on robot performance.

By incorporating traceability into our infrastructure, we can quickly and easily audit any issues that arise with our users’ robots, trace the issues back to their source, resolve them, and prevent them from happening again. This gives end-users a more consistent experience, and it lets us add new features much faster, roll back problematic changes quickly, and adjust robots to environmental conditions for better performance.
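The traceability described above can be sketched as a configuration store that records who changed what, when, for which robot, and can restore the previous value from the audit trail. The names and structure here are hypothetical, not Brain Corp's actual system:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Illustrative sketch of auditable configuration management: every
# change is recorded with enough context (what, when, who, which robot)
# to trace an issue back to its source and roll it back.

@dataclass(frozen=True)
class ConfigChange:
    robot_id: str
    key: str
    old_value: object
    new_value: object
    author: str
    applied_at: datetime

class ConfigStore:
    def __init__(self):
        self.current = {}    # (robot_id, key) -> value
        self.audit_log = []  # append-only list of ConfigChange

    def apply(self, robot_id, key, value, author):
        old = self.current.get((robot_id, key))
        change = ConfigChange(robot_id, key, old, value, author,
                              datetime.now(timezone.utc))
        self.current[(robot_id, key)] = value
        self.audit_log.append(change)  # every change is traceable
        return change

    def rollback(self, change: ConfigChange):
        # Restore the previous value recorded in the audit trail.
        self.apply(change.robot_id, change.key, change.old_value,
                   author="rollback:" + change.author)

store = ConfigStore()
store.apply("amr-001", "max_speed_mps", 0.9, author="factory")
bad = store.apply("amr-001", "max_speed_mps", 1.2, author="jdoe")
store.rollback(bad)
print(store.current[("amr-001", "max_speed_mps")])  # 0.9
```

Because the rollback itself goes through `apply`, it also lands in the audit log, so even corrective actions remain visible after the fact.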

4. Small frequent software updates are better than big occasional ones

When we were first starting out, we did feature-based releases, meaning we only updated the robots via the cloud when a new feature was ready to be released. This approach was not only frustrating for our developers, who had to wait months before seeing results, but also detrimental to our users who had to wait months for new features, improvements, or bug fixes. Each new release entailed significant changes and, despite rigorous pre-release testing, there was always a chance that those changes could have bugs or unexpected effects on other parts of the system.

Rethinking our release cycle allowed us to minimize that risk and make better iterative improvements. Just as cloud software providers push out constant tweaks, we began releasing minor software updates on a regular basis for the robots to automatically download via our distributed infrastructure. Our users now expect regular software updates that have very little chance of negatively impacting the performance of the robots. And, if there is a bug or a feature doesn’t work well, our infrastructure picks it up and we can deploy a fix in a matter of hours or days, often without the user ever noticing a performance issue. This means that the robots are incrementally improving without users needing to do any work.

5. Robots need to be understandable to be useful

End-users also need visibility into how the robots they rely on are performing. But robots are complex and inaccessible. How can we expect users to trust the robots when they don’t understand how they work?

The only way for robots to actually be useful at scale is if we translate what they do into easy-to-understand language. This doesn’t just mean that they should have an intuitive user interface – though, of course, that’s a must – they also need clear and easy-to-understand documentation, support tools, and manufacturing guidelines.

The robots’ accessibility impacts their serviceability and reliability for users, so we are constantly working to make the robots more accessible. For example, instead of displaying a coded error message, the robots display a pop-up that states the problem and what steps the user can take to resolve it. Because our robots are easy to understand, they are easier and faster to repair, more trustworthy, and more useful overall.
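The error-message translation described above can be sketched as a simple lookup from internal codes to a plain-language problem statement plus resolution steps. The codes and wording here are invented for illustration:

```python
# Hypothetical mapping from internal error codes to user-facing
# messages. Instead of showing "E102", the robot shows the problem
# and what the user can do about it.

ERROR_MESSAGES = {
    "E102": ("Path blocked",
             "Move the obstacle in front of the robot, then press Resume."),
    "E215": ("Low battery",
             "Return the robot to its dock to charge."),
}

def user_facing(code: str) -> str:
    # Unknown codes still produce an actionable message that preserves
    # the code for support staff.
    title, steps = ERROR_MESSAGES.get(
        code, ("Needs attention", "Contact support and mention code " + code))
    return title + ": " + steps

print(user_facing("E102"))
# Path blocked: Move the obstacle in front of the robot, then press Resume.
```

The key design choice is the fallback: even an unrecognized error yields a next step for the user rather than a bare code.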

These five lessons are just the tip of the iceberg. Over the past decade, it’s safe to say that we’ve learned countless lessons that have allowed us to build better robots that better serve our users. The culmination of everything we’ve learned so far is exemplified by our “people-first” approach to safety and robotics. Our robots support people and improve productivity by taking over labor-intensive, joyless tasks so that human workers can focus on other things. We’re proud of our progress, and we’ll continue to strive to make our robots even better and easier to use.

This article has been republished with permission from the Brain Corp blog.
