Software & Firmware

The brains of the operation! Main logic, RaspPi control, wireless communication, and the code architecture that brings it all together.

Software Architecture

System Overview

Web Interface (User Input) → Python Backend (RaspPi Server) → Firmware (MCU Control) → Hardware (Motors & Sensors)

The software architecture of Scribblz evolved over the course of the project as design changes called for overhauls and pivots. Our goal stayed consistent throughout, however: develop a pipeline where a user inputs an image and the robot draws it.

Original Design Goal

Originally, the project vision was for user input to happen through a website, where someone could either upload an image from their device or draw one with simple tools, much like a barebones Microsoft Paint. For the robot to draw the image, we would need a custom path-following algorithm that interpreted the image as a physical entity and performed various calculations to optimize the robot's traversal of the window.

Initial Development and Path Finding

To work toward our original goal, we needed to establish a connection between the user and the robot so that an uploaded image could be turned into a drawing. On the drawing side, the software had to optimize for two things: minimal erasure, since we determined that driving over our lines would smudge and ruin the drawing, and minimal time, since the robot was battery-operated at this point and needed to finish each drawing before running out of charge. With that in mind, we developed this architecture design:

Initial Software Architecture

At this stage, we had also developed part of the path-finding element. This simple script consisted of two parts: transforming an input image into a series of controllable paths, and executing robot movement along those paths. For this, we had a visualization that was semi-physically accurate, shown below: Initial Pathfinding Visualization

Challenges Encountered

Here, the user inputs an image of a square, and the visualization shows the path the robot will follow and eventually draw over. This implementation used Python libraries like Shapely to define polylines and OpenCV to extract contours from images. However, the physics were inaccurate: the robot's shape simply teleported between waypoints, with thousands of waypoints depicting each line.
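
A minimal sketch of this image-to-polyline step, assuming a simple black-on-white input image; the function and variable names here are illustrative, not the project's actual code:

```python
import cv2
from shapely.geometry import LineString

def image_to_polylines(path, min_points=10):
    """Extract contours from an image and convert them to Shapely polylines."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Simple fixed threshold here; the later pipeline moved to Otsu's method.
    _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    polylines = []
    for c in contours:
        pts = c.reshape(-1, 2)
        if len(pts) >= min_points:
            polylines.append(LineString(pts))
    return polylines

if __name__ == "__main__":
    for line in image_to_polylines("square.png"):
        print(line.length, len(line.coords))
```
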
After showing this demonstration in our design review, we received feedback that changed several of our choices. The original cloud-computing design for calculating the path added too much overhead for our project. Additionally, the lack of intelligence on the docking station, which at first seemed like a good idea, would make our localization efforts difficult. With this new information, we planned to modify our architecture layout.

Simulation Development

Meanwhile, we developed the physical aspects of the simulator. We used Webots to develop and visualize features of our code, such as the inverse kinematics functionality and the wheel movement. With some hardcoded physical parameters, such as the length of the robot, maximum wheel velocities, and marker placement, we were able to visualize and confirm the vehicle's movement from given inputs.

Initial Pathfinding Visualization

Experimenting with some software decisions, and with advice from the PIE faculty, we realized we could classify the polylines before making path-following decisions in order to gain more information. This let us determine the containment of lines and more consistently formalize the inside-out drawing behavior we had only intuitively described.

With this, we were also able to drive the robot.

Learnings and Pivot

However, all of this was done via dead reckoning: we simply assumed the robot's position was wherever we commanded it to be. Since wheel slippage was expected to be a large issue on the surface we would eventually drive on, we looked for new ways to do localization, which required a technological overhaul and introduced a Raspberry Pi 4 to our docking station. Because adding another computer requires additional communication, we made a slight pivot to a ROS architecture, as we determined it aligned with our learning goals and would give us experience with something we expected to be important in the future.

Localization

Now that we had a processor on the docking station, we had to decide on the best method for localization. We initially tried a camera on the docking station and IR LEDs on the robot. The camera had a fisheye lens to capture the full 90 degrees required to see all parts of the window, and an IR-pass filter in front of it to keep natural light from affecting the sensing. Eventually, we got this working well enough to produce results like this:

IR LEDs on the Camera
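
The project does not document the exact detection code, but a minimal sketch of picking bright IR LED blobs out of such a filtered frame could look like this (the threshold, minimum area, and function names are our assumptions):

```python
import cv2

def find_led_centroids(gray_frame, threshold=200, min_area=5):
    """Find bright blobs (candidate IR LEDs) in an IR-filtered grayscale frame."""
    _, mask = cv2.threshold(gray_frame, threshold, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] >= min_area:
            centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centroids
```
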

However, given the placement of the slip ring, which became necessary after we pivoted to a fully tethered system, we had to rely on IMU data to determine which LED was being seen at which time. As a team, we decided this was not an elegant solution and concluded that something else needed to handle our localization.

Acting quickly, we borrowed a LiDAR from another team that was no longer using it. We placed the LiDAR on the docking station, added a uniformly circular element to the slip ring of the robot, and elevated the LiDAR so that its sensing plane intersected this circular component. We then ran a RANSAC circle-fitting algorithm on the points detected by the LiDAR, which accurately calculated the center of the robot when it was placed on the window. The data looked like this: LiDAR Scan with Circle Fitting
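
A minimal sketch of the RANSAC circle fit on the 2D LiDAR points, using NumPy only; the iteration count and inlier tolerance here are illustrative assumptions, not the project's tuned values:

```python
import numpy as np

def fit_circle_3pts(p1, p2, p3):
    """Return (cx, cy, r) of the circle through three points, or None if collinear."""
    ax, ay = p1; bx, by = p2; cx, cy = p3
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-9:
        return None
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay) + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx) + (cx**2 + cy**2) * (bx - ax)) / d
    return ux, uy, np.hypot(ax - ux, ay - uy)

def ransac_circle(points, iters=200, tol=0.01):
    """RANSAC: sample 3 points, fit a circle, keep the fit with the most inliers."""
    best, best_inliers = None, 0
    for _ in range(iters):
        sample = points[np.random.choice(len(points), 3, replace=False)]
        fit = fit_circle_3pts(*sample)
        if fit is None:
            continue
        cx, cy, r = fit
        dist = np.abs(np.hypot(points[:, 0] - cx, points[:, 1] - cy) - r)
        inliers = int((dist < tol).sum())
        if inliers > best_inliers:
            best, best_inliers = fit, inliers
    return best  # (cx, cy, r): the estimated center of the robot's circular collar
```
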

Together with the IMU, we would have orientation and position, which was enough to utilize our inverse kinematics.

Microcontroller and RaspPi

Next, we had to set up communication between our microcontroller and the Raspberry Pi. Since we were using ROS, we decided to install and run microROS on the ESP32, and we eventually got a test setup working with the ESP32 publishing and subscribing to test topics. However, with the failure of the PCB, we no longer had certain electrical components in stock, such as the motor drivers that only worked with our PCB. We therefore switched to an Arduino with a motor shield, as used in previous PIE projects. Because the Arduino did not have enough computing power to run microROS, we switched to direct UDP communication. During this, we set up the Raspberry Pi as an access point for the Arduino to connect to, which minimized our WiFi communication latency.

System Components and Files: Docking Station, ROS Architecture, State Machine

As we developed our ROS architecture, we used development tools like Foxglove and Tailscale. Tailscale let us connect to the Raspberry Pi via a static IP and view the machine details online. Foxglove provided real-time visualizations of certain ROS components, like a 3D view of our waypoints, global frame, and robot frame. Here is a screenshot of our Foxglove interface showing important topics such as LiDAR data, IMU data, current robot state, logging, and keyboard input: Foxglove Interface

At this stage, we were able to run the script that waits for an image to be uploaded via the terminal, transforms it into waypoints, and exports those waypoints into the world frame. Waypoints
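
As an illustration of the direct UDP link (not the project's exact code; the address, port, and test message are placeholders), a quick round-trip check from the Raspberry Pi side might look like this:

```python
import socket

ARDUINO_IP = "192.168.4.2"   # placeholder address on the Pi's access-point network
PORT = 8888                  # placeholder port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", PORT))
sock.settimeout(1.0)

# Send a zero-velocity test command and wait briefly for any sensor packet coming back.
sock.sendto(b"CMD 0.0 0.0 0.0", (ARDUINO_IP, PORT))
try:
    data, addr = sock.recvfrom(1024)
    print(f"received from {addr}: {data.decode().strip()}")
except socket.timeout:
    print("no packet received; check the access-point connection")
```
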

Pivot

However, this development process resulted in inaccurate docking positions, since we had assumed docking could be folded into the docking and undocking states from within the waypoint extraction. Additionally, the rate of our robot's localization updates did not mesh well with the large number of waypoints generated. Thus, for the final demonstration, we did not use the image-to-path pipeline and instead used only a few predefined waypoints. Throughout this debugging process, our ability to diagnose problems was consistently hampered by networking crashes and instability in our SSH connection; we estimate roughly four total hours of downtime over the project.

Final Design Overview

At the end of our project, we had programmed and developed:

  1. An erasure-aware image-to-path pipeline that generated a series of waypoints representing the process of docking, drawing, and undocking.
  2. A ROS robotics architecture that allowed the robot to function in several states, including a teleoperation and autonomous drawing mode, utilizing inverse kinematics and localization from sensor feedback.
  3. A communication network between our robot’s Arduino Nano microcontroller and the docking station’s Raspberry Pi over UDP that relays sensor information and commands.

For the physical integration, we continued with the kiwi drive design. Each wheel essentially lives on a corner of a triangle, allowing holonomic motion, meaning that translation and rotation can occur independently of one another.

Path Pipeline

To construct a path that traces out an image, we had a strict requirement: the robot must not drive over any ink that has already been placed. We decided this requirement would be handled in the path-following phase rather than in path generation. As such, we adopted the following procedure: Path Finding Procedure

For the first segment of the pipeline, we used Otsu's method to binarize the image, separating what we would draw from the background. This had varying levels of success, but proved to be an interesting method for extracting foreground and background in images. Operating on the now binarized (black-and-white) image, we used OpenCV to skeletonize it. Skeletonization collapses thick lines into a single-pixel representation, so line widths converge to a thin, marker-sized representation. The process can be visualized with a sample image here: Skeletonization Process

Note that via Otsu's method we can extract the border of the square we wish to draw, squish the multi-pixel lines down to one, and extract a line for the robot to follow. Because we relied on Otsu's method, we were limited to outlines and could not shade in images.
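
A minimal sketch of this binarize-and-skeletonize step, here using OpenCV for Otsu's threshold and scikit-image for the skeletonization (both are in our dependency list; the exact calls in the real pipeline may differ):

```python
import cv2
import numpy as np
from skimage.morphology import skeletonize

def binarize_and_skeletonize(path):
    """Otsu-threshold an image and collapse its strokes to one-pixel-wide lines."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Otsu's method picks the threshold automatically; invert so ink becomes foreground.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    skeleton = skeletonize(binary > 0)          # boolean array of one-pixel-wide strokes
    return skeleton.astype(np.uint8) * 255      # back to an 8-bit image for OpenCV use
```
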
Once the finalized polylines have been extracted, the algorithm solves the traversal ordering for the robot, minimizing overlap between the robot and drawn lines. Before any robot movement, the polylines themselves are ordered based on containment: for the robot to intelligently draw lines that are contained within other lines, it must follow an inside-out pattern. This can be visualized below: Containment Ordering

To draw this image without erasure, the robot cannot draw the outer circle first, because it would then have to go inside to draw the inner circle, and since it must eventually dock, it would have to return home over a line it had already drawn. Visually, we determine that a line is contained within another if it lies within the bounding box of that polyline, as shown above, where the blue bounding box contains two sub-polylines, and so on. Once containment has been determined and each polyline has been labeled, the program solves a Traveling Salesperson Problem for each hierarchical level, minimizing drive time by greedily choosing the lowest-cost pen-up travel, i.e. the shortest distance with minimal rotation.
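
A sketch of the containment ordering using Shapely bounding boxes; the nesting-depth sort below is our simplification of the hierarchy described above:

```python
from shapely.geometry import LineString, box

def containment_depth(polylines):
    """For each polyline, count how many other polylines' bounding boxes contain it."""
    boxes = [box(*pl.bounds) for pl in polylines]
    depths = []
    for i, pl in enumerate(polylines):
        depth = sum(1 for j, b in enumerate(boxes) if j != i and b.contains(pl))
        depths.append(depth)
    return depths

def order_inside_out(polylines):
    """Draw the most deeply nested polylines first so outer lines are never crossed later."""
    depths = containment_depth(polylines)
    return [pl for _, pl in sorted(zip(depths, polylines), key=lambda t: -t[0])]

# Example: an inner and an outer square. The inner one should be drawn first.
inner = LineString([(2, 2), (3, 2), (3, 3), (2, 3), (2, 2)])
outer = LineString([(0, 0), (5, 0), (5, 5), (0, 5), (0, 0)])
print(order_inside_out([outer, inner])[0] is inner)   # True
```
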

Since our wheels slip, we wanted to minimize rotation, hence adding it to the cost of travel. Alongside the cost of rotation, the cost of erasure is far higher, since avoiding erasure is our main goal. We therefore account for the optimal rotation at each point so as to avoid overlapping any drawn line. To do this, we represented the robot as a triangle in Shapely, and erasure was detected via an overlap check between our Shapely objects: the polylines being drawn and the robot itself. We also determined it was optimal for the robot to draw tangent to the polyline so as to avoid future rotations.
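
A sketch of that erasure check, with the robot footprint as a Shapely triangle; the footprint size and pose helper here are illustrative assumptions:

```python
import math
from shapely.geometry import Polygon, LineString
from shapely.affinity import rotate, translate

ROBOT_SIDE = 0.20  # assumed footprint side length in meters

def robot_footprint(x, y, theta):
    """Equilateral-triangle footprint of the robot at pose (x, y, theta)."""
    h = ROBOT_SIDE * math.sqrt(3) / 2
    tri = Polygon([(-ROBOT_SIDE / 2, -h / 3), (ROBOT_SIDE / 2, -h / 3), (0, 2 * h / 3)])
    return translate(rotate(tri, theta, use_radians=True), x, y)

def erases(x, y, theta, drawn_lines):
    """True if the footprint at this pose crosses any already-drawn polyline."""
    footprint = robot_footprint(x, y, theta)
    return any(footprint.intersects(line) for line in drawn_lines)

drawn = [LineString([(0, 0), (1, 0)])]
print(erases(0.5, 0.0, 0.0, drawn))   # True: the robot sits on the drawn line
print(erases(0.5, 1.0, 0.0, drawn))   # False: well clear of it
```
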

With all of this taken into account, the pipeline runs its algorithms and outputs the waypoints created for each polyline. In addition to an x and y position relative to the canvas, each waypoint carries features like its orientation and a boolean representing whether the marker is up or down. The result can be stored as a JSON file or output directly, which is what we chose for our ROS topic.
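
An example of what one waypoint could look like under this scheme (the field names are illustrative; the project does not document the exact schema):

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class Waypoint:
    x: float        # meters, relative to the canvas origin
    y: float
    theta: float    # desired heading in radians
    pen_down: bool  # True while the marker should be touching the window

path = [Waypoint(0.10, 0.10, 0.0, False), Waypoint(0.40, 0.10, 0.0, True)]
print(json.dumps([asdict(w) for w in path], indent=2))
```
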

For this section, we have these dependencies:
Python 3 Environment:
  • NumPy
  • Shapely
  • NetworkX
  • Scikit-Image
  • OpenCV

If given more time, we would modify the algorithm to include more intelligent features. We feel that treating translational and rotational movement separately was an inelegant solution, but we kept it in order to keep the project moving. Additionally, solving a global problem via a map-wide cost function, rather than classifying the lines themselves, would be a more efficient approach to explore. Ultimately, the path following was developed quickly and given lower priority than finalizing the system and the firmware and communication functionality.

ROS Architecture

Our system functions as a ROS 2 graph, where our nodes create the robot's functionality by publishing and subscribing to topics.

The state_machine_node functions as the central part of the organization, as it controls the five main states of our system: teleop, docking, docked, undocking, and drawing. The system attempts to decouple each element so that it can be visualized and debugged at intermediate steps.

The state_machine_node publishes a string corresponding to one of the five states to the /robot_state topic. Nearly all other nodes subscribe to this topic and adjust their behavior based on the current state. ROS Architecture Nodes The system diagram above shows the nodes in different colors, the topics they publish in lighter colors, and subscriptions as arrows connecting topics to nodes.
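
A minimal sketch of such a state publisher in rclpy (the topic name matches the one above; the transition logic is omitted and everything else is illustrative):

```python
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

STATES = ["teleop", "docking", "docked", "undocking", "drawing"]

class StateMachineNode(Node):
    def __init__(self):
        super().__init__("state_machine_node")
        self.state = "docked"
        self.pub = self.create_publisher(String, "/robot_state", 10)
        # Re-publish the current state a few times per second so late subscribers catch up.
        self.create_timer(0.2, self.publish_state)

    def publish_state(self):
        msg = String()
        msg.data = self.state
        self.pub.publish(msg)

def main():
    rclpy.init()
    rclpy.spin(StateMachineNode())

if __name__ == "__main__":
    main()
```
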

The other nodes, which define each feature of the states, can be classified as localization and runtime nodes.

For our localization, we use the lidar_pose node, which listens to the UDP packets sent from the LiDAR for each 360-degree scan. We process this data with the RANSAC circle-fitting algorithm described above to determine the center of the robot, extracting an x and y position. This node also subscribes to the /imu/data topic published by the udp_command_sender node, and it publishes to /robot_pose, which contains the physical details of the robot, like location and orientation.
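
A sketch of how the LiDAR-derived center and the IMU yaw could be packed into the /robot_pose output, assuming the message is a geometry_msgs PoseStamped (the yaw-to-quaternion conversion is standard; the rest is illustrative):

```python
import math
from geometry_msgs.msg import PoseStamped

def make_robot_pose(x, y, yaw, stamp, frame_id="map"):
    """Pack the circle-fit center (x, y) and IMU yaw into a PoseStamped."""
    msg = PoseStamped()
    msg.header.stamp = stamp          # e.g. node.get_clock().now().to_msg()
    msg.header.frame_id = frame_id
    msg.pose.position.x = x
    msg.pose.position.y = y
    # Planar robot: orientation is a pure rotation about z.
    msg.pose.orientation.z = math.sin(yaw / 2.0)
    msg.pose.orientation.w = math.cos(yaw / 2.0)
    return msg
```
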

For runtime, our movement is based on inverse kinematics over our waypoints. We need to compute the robot's desired velocity as a direction, then apply the inverse kinematics matrix to obtain wheel velocities, shown below: Inverse Kinematics Matrix To do this, the path_follower node subscribes to /robot_pose and /execute_drawing_path and calculates the positional error in both orientation and translation, using the tf2_ros buffer to transform the error from the map frame into the robot body frame. The velocities themselves come from a simple PID control loop.
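
A sketch of a kiwi-drive inverse kinematics step; the exact matrix is in the figure above, while the wheel angles, radius R, and the proportional-only controller standing in for the full PID below are our assumptions:

```python
import numpy as np

# Assumed geometry: three omni wheel drive directions at 120-degree spacing, distance R from center.
WHEEL_ANGLES = np.radians([90.0, 210.0, 330.0])
R = 0.10  # meters, assumed

# Each row maps a body velocity (vx, vy, omega) to one wheel's linear speed.
IK = np.array([[-np.sin(a), np.cos(a), R] for a in WHEEL_ANGLES])

def wheel_speeds(vx, vy, omega):
    """Inverse kinematics: body-frame velocity command -> three wheel speeds."""
    return IK @ np.array([vx, vy, omega])

def p_control(error_xy, error_theta, kp_lin=1.5, kp_ang=2.0):
    """Proportional controller on body-frame pose error (the real node uses PID)."""
    vx, vy = kp_lin * np.asarray(error_xy)
    return wheel_speeds(vx, vy, kp_ang * error_theta)

print(wheel_speeds(0.1, 0.0, 0.0))  # pure translation along x
print(wheel_speeds(0.0, 0.0, 1.0))  # pure rotation: all wheel speeds equal
```
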

Each state has special behavior:
  • When the state becomes undocking, the undocking waypoints are published and the system activates the impeller ramp-up sequence by sending PWM signals to the motor through the udp_command_sender node.
  • When the state becomes docking, the docking waypoints are published and the path_follower node, which subscribes to /execute_drawing_path, begins waypoint following.
  • When the state becomes drawing, the drawing waypoints are published and the same occurs as in docking.
  • When the state becomes teleop, we use a Foxglove toolkit that lets us use our keyboard as a joystick for teleoperation commands.

This stack is a comprehensive sensor feedback loop that lets our robot draw given waypoints effectively and accurately.

In our development, we used Foxglove, which does not affect our system architecture but does add a dependency. It did not modify the behavior of any program; it simply let us view the stream of information in real time.

In this section, we have these new dependencies:
  • rclpy
  • std_msgs
  • geometry_msgs
  • sensor_msgs
  • nav_msgs
  • tf2_ros
  • ament_index_python
  • socket
  • threading

For the future, we would plan certain design decisions out further and connect with experienced ROS users who could provide insight into approaches we did not take. We would implement ROS actions instead of reusing certain segments over and over. We understand that our current implementation was quite messy (as seen in the diagram) and would spend more time cleaning it up.

Communication and Firmware

Our firmware lived mainly on the Arduino Nano on the robot. It ran a single file acting as an unsophisticated bridge that executes and sends low-level commands, leaving intense computation to the docking station's Raspberry Pi.

For networking, we used UDP. This is ideal for our case because the robot should keep moving even if a packet is lost, unlike with a protocol such as TCP, which would stall to retransmit.

We hardcoded the Raspberry Pi's IP into the Arduino. This was fine because we were using Tailscale during development, meaning that unless someone intentionally meddled with the settings, the IP would not change.

For the packet structure, we employed a text-based parsing protocol. All messages had identifying tags at the beginning: motor commands looked like "CMD 1.0 -0.5 0.0", fan commands looked like "FAN 1500", marker commands looked like "MARKER 2000", and outgoing sensor data looked like "IMU 0.12 -0.04 0.98 0.1 0.2 -0.1".
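
A sketch of the Raspberry Pi side of this protocol, built only from the tags shown above (the CMD values are assumed here to be the three wheel commands, and the IMU values are assumed to be accelerometer followed by gyroscope readings; the function names are ours):

```python
def encode_motor_cmd(w1, w2, w3):
    """Format a wheel-command packet in the text protocol described above."""
    return f"CMD {w1:.2f} {w2:.2f} {w3:.2f}"

def encode_fan(pwm_us):
    return f"FAN {int(pwm_us)}"          # e.g. "FAN 1500"

def encode_marker(pwm_us):
    return f"MARKER {int(pwm_us)}"       # e.g. "MARKER 2000"

def parse_imu(packet):
    """Parse an incoming 'IMU ax ay az gx gy gz' packet into two tuples."""
    tag, *values = packet.split()
    if tag != "IMU" or len(values) != 6:
        return None
    ax, ay, az, gx, gy, gz = map(float, values)
    return (ax, ay, az), (gx, gy, gz)

print(encode_motor_cmd(1.0, -0.5, 0.0))                    # CMD 1.00 -0.50 0.00
print(parse_imu("IMU 0.12 -0.04 0.98 0.1 0.2 -0.1"))
```
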

We then had to convert the wheel commands into speeds for the motor drivers, which simply used the PWM pins and a function mapping each value onto the corresponding PWM signal value we derived. Similarly, the impeller spins via its ESC, which also expects an electrical signal for speed, with 1000 being off and 2000 being 100% speed. Additionally, we precalibrated the values at which the marker would be in its on and off states and sent those values; this calibration was done experimentally.

These variables were all controlled over the UDP link between the Raspberry Pi and the Arduino Nano, with the Raspberry Pi set up as an access point. The Nano connected to the Raspberry Pi's WiFi, which we configured with a custom name and password. This (theoretically) ensured a smooth setup and a consistent connection.

For this section, we had these dependencies:
  • WiFiNINA.h
  • SPI.h
  • Wire.h
  • Servo.h
  • MPU6050.h

If given more time, we could have looked further into certain networking features and optimized the communication for minimal latency; most of our research was done quickly and was not fully fleshed out. It also would have been interesting to see whether microROS could have worked better, despite our initial hiccups and incompatibility.
