Power wheelchair navigation can be difficult for those with fine motor or cognitive disabilities. Some people are not able to use a powered wheelchair on their own and must depend on a caregiver. The design proposed here uses a prototyping platform to demonstrate a computer vision algorithm for identifying sidewalks and determining whether the user is on path. The set-up uses color conversion and morphological methods to manipulate live video. The system costs under $100 and uses entirely open-source technology.
Powered wheelchairs can be a necessary improvement in an individual's life and a path to greater independence. Greater independence in daily mobility can provide opportunities for improved living conditions and a return to the workforce, as well as a reduction in required caregiver assistance. Caregivers are generally only available at scheduled times and can be cost prohibitive for many potential users. If users were able to go about their errands or activities without needing the assistance of a caregiver, they could enjoy greater freedom. This research aims to develop a simple camera-based system that can help navigate the user through unfamiliar public areas.
Previous Work
There have been a variety of ideas like the design presented here, but few have been implemented in commercial wheelchair set-ups. The cost of such technologies is often not covered by insurance; therefore, these systems are not widely available in the marketplace and see limited use among the people who would benefit from them. As indicated in the literature, up to 40% of patients using power wheelchairs found it "difficult or impossible" to steer, and up to half of patients would find an added navigation system beneficial [1]. This demonstrates the demand for such a system. The research also indicates that people with spinal cord injuries or cognitive disabilities may take as long as two years to learn how to steer a power wheelchair effectively. Those with fine motor control difficulties may find themselves navigating very slowly in order to safely travel through unknown territory. With the addition of steering assistance, these individuals would be able to confidently carry out their routines with less impact on their day.
Horn has summarized designs of "smart wheelchairs" through 2012. Technologies such as infrared, sonar, laser, radar, and physical sensors are combined to help the wheelchair gain awareness of its surroundings [2]. After the initial offering of chairs mounted on top of robots, researchers branched out to wheelchairs with embedded computation and, more recently, modular components. These systems are often expensive and must be coupled with a laptop or large computer. In addition, each sensor has drawbacks such as sample rate, coverage volume, or error rate. With the integration of all these technologies, a robust navigational system could certainly be produced. However, the more components in a system, the more complex and expensive it becomes, making it more difficult to implement [3]. This is especially true if such designs are installed aftermarket. None of the early designs have made it to market due to their complicated design and cost.
Only recently have consumer wheelchair navigation systems appeared. In summer 2017, Haneda Airport in Tokyo enlisted the help of autonomous wheelchairs to transport users to their destinations [4]. Using LIDAR and smartphone control, the robot chooses a route and avoids obstacles. The author notes that the price is still out of range for the market, especially for those who rely on the support of insurance [5]. Additionally, the robot must have pretrained navigation and knowledge of its surroundings and destinations. Applying a generalized control strategy to someone with unique needs may not serve them in the long run.
As users likely spend the majority of their day in their powered wheelchair, they must be comfortable with its operation. This means that the interaction must be intuitive and flexible. To investigate how a user interacts with the wheelchair and surroundings, Wenlong et al. developed a virtual prototype to analyze movements and reactions of wheelchair users [6]. By modeling the physical parameters of a smart wheelchair and its sensors, they were able to create a realistic simulation of wheelchair navigation. The user sat in a real chair, but the output of the chair's control moved the user through a virtual world. Since the computation was done off the chair, the researchers could tune the control laws to fit the user's preference. This way, unlike the chairs in Tokyo's airport, the user will feel comfortable and connect with their device.
Interaction with the device is crucial for efficient transportation. Ruiz-Serrano et al. designed an interface that leverages ultrasonic sensors and voice control [7]. This is beneficial for people who do not have sensorimotor control. Because the sensors provide obstacle avoidance, voice control does not need to handle fine steering. However, ultrasonic sensors work best indoors, with prominent vertical features like walls, doorways, and ramps. Outdoors, the desired path is not so clear. Sidewalks may end abruptly or take tight turns; crosswalks offer little for an ultrasonic sensor to recognize. Again, we see that individual technologies cannot encapsulate the many requirements of mobile users.
With the size and cost of computers shrinking, wheelchair designers have a broad range of technologies to use. No longer do they rely on tethered computers and bulky additions. Prototyping tools like Arduino and Raspberry Pi play a crucial role in the development of robotic systems. These computers have a negligible footprint compared to the size of the wheelchair. Additionally, their power requirements are easily satisfied by the on-board battery. With computing and battery technology improving rapidly, smart wheelchairs may soon become commonplace.
The paths that an average wheelchair user navigates are likely to be relatively consistent in appearance, so a vision-based control strategy such as the lane keeping systems found in modern cars may be employed. In general, these systems use ordinary cameras and processors that analyze lane markings [8]. They are frequently limited to highway use because of consistent lane markings and unlikely path blockages. The output of these systems rarely controls the car for the user, as this would introduce a control system that may have false readings and steer the car off course. Rather, they provide passive alerts and haptic feedback. The design discussed in this paper employs a similar strategy, only limiting the user from steering off course. This minimizes the effect of any error in the system while allowing for direct intervention in the future if appropriate.
Colorspace and Morphology
Computer vision is the backbone of this design. It relies on the translation of the physical world to arrays of values. These values represent the color of a pixel on a screen. These pixels come together to form shapes that can be modified using morphological transformations.
Images are typically stored in three-dimensional arrays. The indices of the first two dimensions are the coordinates of the pixel in the image, and the third dimension selects the channel of the colorspace. The RGB colorspace is often used because it translates easily to digital screens and is popular for defining colors in graphic design. However, RGB is not well suited for measuring differences in color because its values are not perceptually related: two colors that appear similar may have RGB values that are far apart.
The Lab colorspace was developed to address this. It also uses three channels: "lightness", "a", and "b". As its name suggests, lightness describes how bright the color is, from 0 to 100, where 0 is black and 100 is white. The "a" channel spans a range from -128 to 127, where negative values are green and positive values are red. The "b" channel is similar, except negative values are blue and positive values are yellow. These values can be expressed in a three-dimensional space, as shown in Fig. 2. In this space, colors can be considered "similar" if they have a small Euclidean distance. This idea is used in this design to determine which pixels of the image are the same color and identify them as the sidewalk, the pathway that defines the desired trajectory of the wheelchair.
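As a rough illustration of this similarity test (a minimal sketch, assuming OpenCV and NumPy are available; the file name and pixel coordinates are hypothetical), two pixels can be compared by converting the image to Lab and taking the Euclidean distance between their channel values:
import cv2
import numpy as np
image = cv2.imread('sidewalk.jpg') #OpenCV loads images as BGR arrays
lab = cv2.cvtColor(image, cv2.COLOR_BGR2Lab) #convert to the Lab colorspace
p1 = lab[400, 320].astype(float) #hypothetical pixel near the bottom center
p2 = lab[100, 50].astype(float) #hypothetical pixel near the top left
distance = np.linalg.norm(p1 - p2) #Euclidean distance in Lab space
print(distance) #a small distance suggests the two pixels have similar colors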
In image processing, morphology is important for altering general shapes in an overall picture. Two important functions are "erosion" and "dilation". They shrink or grow the boundaries of shapes in the image, respectively. Used in conjunction, they are valuable for removing noise or filling in holes in binary images. Applying erosion then dilation "opens" an image, while applying them in the opposite order "closes" it. Figure 3 shows the result of opening and closing and how each is effective.
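A minimal sketch of these operations on a binary mask, assuming OpenCV and NumPy are available (the kernel size, iteration counts, and file name are arbitrary illustrations, not the values used later in the script):
import cv2
import numpy as np
mask = cv2.imread('mask.png', cv2.IMREAD_GRAYSCALE) #hypothetical binary image (0 or 255)
kernel = np.ones((5, 5), np.uint8)
#opening: erosion followed by dilation, removes small speckles of noise
opened = cv2.dilate(cv2.erode(mask, kernel, iterations=2), kernel, iterations=2)
#closing: dilation followed by erosion, fills small holes inside shapes
closed = cv2.erode(cv2.dilate(mask, kernel, iterations=2), kernel, iterations=2)
#OpenCV also provides these as single calls
opened2 = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
closed2 = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)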
System Configuration
The Raspberry Pi 3 (RPi) was chosen as the processing unit for this design because the system needed to be small and portable, yet powerful enough to manipulate images in real time. The board, including the case, is 6 x 9 x 2.5 cm. It runs a 1.2 GHz 64-bit quad-core ARMv8 CPU. Other relevant features include 1 GB of RAM, GPIO pins, and a camera interface. The camera is an RPi-designed 8 MP sensor that interfaces directly with the board, so the processor can handle the frames directly. The camera sensor fits snugly inside the case, so no additional space is used (see the small hole in Fig. 4). The case is mounted with a 3D-printed bracket (see Appendix A2) to a rail attached to the side of the wheelchair's power base (Fig. 4). The RPi runs the Raspbian operating system "Jessie", which provides both a terminal and a GUI. The whole system is powered by a 7.8 Ah, 2.5 A battery bank, which can be upgraded with any off-the-shelf cell phone charger.
Python was chosen as the coding language because it comes pre-installed on Raspbian and is supported by the OpenCV library. OpenCV provides many computer vision functions, optimized to work with the chosen camera, for the manipulations that were desired.
To demonstrate the concept, an LED array was constructed in a horseshoe pattern (Fig. 5). When in transit, the LED in the direction of the intended path lights up. There is also a tactile button that controls a menu and turns the system off. The array is mounted to the arm of the wheelchair with Velcro tape and connected to the RPi's designated GPIO pins with jumper cables.
Methods
The full code and updated versions can be found on GitHub. The script is organized into functions operated from a menu that is displayed via the LED array. First, the necessary packages need to be imported.
import RPi.GPIO as GPIO #used to access GPIO pins
import time #functions that construct a timer
import io #used to set up stream from raspi camera
import numpy as np #optimized array functions
import cv2 #OpenCV computer vision functions
import picamera #retrieves stream from camera.
import logging #allows logging of various errors and info
To finish initialization, the GPIO pins are set up. These use the board numbering system, which identifies each pin by its physical position (Fig. 5).
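A minimal sketch of this setup step follows; the pin numbers are placeholders for illustration, and the actual assignments follow Fig. 5 and the GitHub repository:
GPIO.setmode(GPIO.BOARD) #use physical pin positions rather than BCM numbering
GPIO.setwarnings(False)
led_pins = [11, 13, 15, 16, 18] #hypothetical pins for the LED horseshoe array
button_pin = 22 #hypothetical pin for the menu button
for pin in led_pins:
    GPIO.setup(pin, GPIO.OUT, initial=GPIO.LOW)
#the tactile button is assumed to pull the input low when pressed
GPIO.setup(button_pin, GPIO.IN, pull_up_down=GPIO.PUD_UP)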
Next, the LEDs flash in a pattern to indicate the system is ready, and the script enters the menu loop. The loop offers options to start the navigation, start the navigation with recording, or run a calibration method. Calibration runs through the navigation process but collects the centroid position and then exits. This compensates for off-center mounting on the wheelchair.
def menu():
    center = 320 #the center of the screen if the resolution is VGA, the default
    global menuv #menu variable: controlled by the button
    global buttonhold
    buttonhold = 0
    while(True):
        while(menuv == 0):
            if(buttonhold): #start
                buttonhold = 0
                logging.info('running...')
                ledflash(.1)
                main(center,0)
            blink(red_r)
        while(menuv == 1): #start, with recording
            if(buttonhold):
                buttonhold = 0
                logging.info('running (recorded)...')
                ledflash(.1)
                main(center,1)
            blink(yellow_r2)
        while(menuv == 2): #calibrate center
            if(buttonhold):
                center = 0
                while(center == 0):
                    buttonhold = 0
                    center = calibrate()
            blink(yellow_r)
        while(menuv == 3): #exit the script
            blink(green_r)
            if(buttonhold):
                logging.info('exiting...')
                ledflash(.01)
                ledflash(.01)
                menuv = 10
        if(buttonhold):
            break
        else:
            menuv = 0
When the main method is entered, it starts by defining constants and then setting up video recording.
#video recording set up
fourcc = cv2.VideoWriter_fourcc(*'XVID')
filename = '/dir/path/Output/' + time.strftime('%Y%m%d_%H%M%S') + '.avi'
out = cv2.VideoWriter(filename,fourcc, 6.0, (640,480))
The script then enters a loop, capturing a still image each iteration and converting it to an array.
buttonpress = 0
stream = io.BytesIO()
with picamera.PiCamera(resolution=resolution) as camera:
    #camera.start_preview() #enable to see camera on screen, doesn't work with CLI
    while(buttonpress == 0):
        camera.capture(stream, format='jpeg')
        e1 = cv2.getTickCount() #timer begins to record frames/second
        # Construct a numpy array from the stream
        data = np.fromstring(stream.getvalue(), dtype=np.uint8)
        # "Decode" the image from the array
        image = cv2.imdecode(data, 1)
        stream = io.BytesIO()
To manipulate the image according to human perception, it is first converted to Lab format. Next, the area directly in front of the wheelchair is sampled. The rest of the image is then filtered depending on how far its color values are from the mean of the sample. The acceptable range was determined beforehand in the "mode" variable, according to manual tests (see Appendix A3). Finally, an opening operation removes the smaller pieces from the image.
lab = cv2.cvtColor(image, cv2.COLOR_RGB2Lab)
rows, columns, c = lab.shape
middleroi = lab[(int(rows*.8)):rows , (int(columns*.33)):(int(columns*.66))]
#establish range of acceptable colors
mean = cv2.mean(middleroi)
lowermean = np.array([mean[0]-mode[0],mean[1]-mode[1],mean[2]-mode[2]])
uppermean = np.array([mean[0]+mode[0],mean[1]+mode[1],mean[2]+mode[2]])
#check which part of image is in range, then filter out smaller pieces
mask = cv2.inRange(lab, lowermean, uppermean)
res = cv2.bitwise_and(image,image, mask= mask)
emask1 = cv2.erode(mask,None, iterations=8)
emask = cv2.dilate(emask1,None, iterations=12)
The largest of the remaining shapes is found and its centroid is calculated. The ledarray() function blinks the LED corresponding to the horizontal position of the centroid (Fig. 6).
areas = [] #clears the previous areas array
_, contours, hierarchy = cv2.findContours(emask.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for i, c in enumerate(contours): #find the area of each shape
    area = cv2.contourArea(c)
    areas.append(area)
biggest = np.argmax(areas)
# compute the center of the contour
M = cv2.moments(contours[biggest][:])
cX = int(M["m10"] / M["m00"])
cY = int(M["m01"] / M["m00"])
ledarray(cX,cY,center)
The loop concludes by calculating the average framerate over 10 seconds. If the recording option was chosen, the frame is written to the video file.
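That bookkeeping is not reproduced in the listing above; a minimal sketch is given here, continuing from the e1 timer started at the top of the loop (the frametimes list and record flag are assumed names, with frametimes initialized to an empty list before the loop):
e2 = cv2.getTickCount()
frametimes.append((e2 - e1) / cv2.getTickFrequency()) #seconds spent on this frame
if sum(frametimes) >= 10: #roughly every 10 seconds
    logging.info('average fps: %.2f', len(frametimes) / sum(frametimes))
    frametimes = []
if record: #recording option selected from the menu
    out.write(image) #append the current frame to the .avi file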
Finally, the objects are closed so that they initialize smoothly on the next run. An example image showing a typical sidewalk with surrounding features is shown in Fig. 7. After applying the image manipulation script described above, the result is shown in Fig. 8. Here it can be seen that the majority of the features other than the sidewalk have been removed from the image. The sidewalk is identified as the largest remaining feature and its centroid location is marked.
The device would be deemed successful if the user could navigate around the block following only the LED output. In addition, the device should record at a framerate high enough to be reliable. A test of the image processing is shown in Fig. 9. The unmasked areas were considered when analyzing the sidewalk area. The pink circle shows the centroid of the sidewalk shape. The x coordinate of this point determines which LED to light, signaling which way the wheelchair needs to turn. Some parts of the video contain large shadowed areas that clutter the available data. However, this occurs only briefly and will be addressed in subsequent versions. In the second test (Fig. 10), the wheelchair was repeatedly driven towards the edge of the sidewalk. This was done to ensure the LEDs lit with enough reaction time. The closer the wheelchair was to the edge of the sidewalk, the further from center the LEDs should light. This result is seen in the video of Fig. 11, which was recorded simultaneously with the clip from Fig. 10.
The device reported more than 6 frames per second on average. Moving at a wheelchair speed of 1.3 meters/second, that allows 4.4 frames per meter.
Conclusion
This navigation system represents a tool that can be integrated into a wheelchair to improve steering capabilities. By taking the output from the RPi and diverting it to a control system rather than the LED array, it can ensure the user stays on track. Implementing passive control, where the wheelchair limits undesirable paths, can give the user the freedom to choose which route to take. It is important that the user retains primary control of the route, because a false positive could lead to a dangerous situation such as driving over curbs, into traffic, or into other undesirable locations.
There are some drawbacks to this vision algorithm, however. It will not work at night or over inconsistent terrain. The current prototyping platform is not very robust in terms of user friendliness and performance. For commercial viability, future versions will require a better user interface and more dedicated hardware. A boost in hardware capability would enable the user to travel faster while maintaining a high framerate. In addition, the system could be more inconspicuous when mounted on the chair, as the protruding rail extends the width of the chair and could hinder movement through door frames and other small spaces (Fig. 12). In the future, other features such as object detection and route planning could turn a simple technology into an autonomous navigation system. With optimization of the hardware and software, this design can help those who need it get to where they need to go.
Acknowledgements
This work was completed as part of the Interdisciplinary Research Experiences in Robotics for Assistive Technology REU funded through NSF Award # CNS-1560219.
Citations
Appendices
A1 Bill of Materials
A2 Bracket Design
Multiple views of the bracket for attaching the RPi to this specific wheelchair. A SolidWorks file can be found in the GitHub repository.
A3 Parameter Calibration
We hypothesized that the standard deviation of the sampled part of the image (the region of interest, or ROI) would correlate with the parameter bounds used to control the masking of the sidewalk. There is little correlation with outliers included and only slightly better correlation without them (r² ≈ 0.42, 0.22, 0.37). The values in the L, a, and b columns were chosen subjectively: we adjusted the settings until the sidewalk appeared as small as possible while remaining a single shape with no regions missing. Since this was a subjective process, removing outliers seems reasonable. In addition, there was a wide range of acceptable values that made the sidewalk clear, so the outliers may fall into that range anyway. Additional testing may be done to pinpoint an acceptable range. Although an acceptable function could not be found, static ranges can be used since the data is relatively consistent.
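A minimal sketch of the correlation check described above, assuming the ROI standard deviations and the manually chosen bounds have already been collected (the arrays below contain hypothetical placeholder values, not measured data):
import numpy as np
roi_std_L = np.array([4.1, 6.3, 5.2, 7.8, 3.9]) #hypothetical std. dev. of the ROI's L channel, one entry per test image
chosen_L = np.array([10.0, 14.0, 12.0, 18.0, 9.0]) #hypothetical manually chosen L bound for the same images
r = np.corrcoef(roi_std_L, chosen_L)[0, 1] #Pearson correlation coefficient
print(r**2) #coefficient of determination, compare with the r^2 values reported above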
Review
The article presents a camera-based system that can help navigate powered wheelchair users through unfamiliar public areas. The design focuses on helping people who are not able to use a powered wheelchair on their own and must depend on a caregiver. The proposal is relevant for disabled people, offers a low cost (under $100), and uses entirely open-source technology. On this basis, the paper is suitable for publication in The Journal of Open Engineering (TJOE).
TECHNICAL SOUNDNESS
The technical information appears precise, showing all the steps necessary to build the assistance device. The main idea is well supported by the computer vision approach, such as the translation of the physical world to arrays of values.
CLARITY
The writing is clear and concise. Even so, I highly recommend a grammatical revision to eliminate any remaining errors. I am not a native English speaker, so some details may have escaped me; as the authors appear to be native English speakers, this should not require much effort.
The paper has eleven figures, all of which are, in my opinion, essential for resolving any technical doubts the readers may have.
The programming lines presented should be numbered as figures as well. Presenting them as tables is an alternative way to solve this issue. As presented, the code is spread through the text without any reference, which is not adequate. If the current form is acceptable to TJOE, please disregard this comment.
COMPLETENESS
The section "Previous Work" cites only four previous investigations on wheelchair navigation systems. This is very limited, and the authors should add more citations. Typically, I have seen 25 to 30 references as a guideline for an adequate literature review. The authors should take care to add information concisely, grouping the main ideas (three or more citations per paragraph, for example).
"Colorspace and Morphology" and "System Configuration" could be placed after "Methods". The first describes how the system reads and interprets the real world; the second presents the system architecture.
The authors should rewrite the "Methods" section. I suggest describing how they developed the system, presenting the choice of programming language and the steps needed to build the navigation device. The programming lines presented could form a new section called, for example, "System Description and Programming", appearing just after "Methods".
OPENNESS AND REPRODUCIBILITY
As previously stated, the technical information appears precise, showing all the steps necessary to build the assistance device. Openness and reproducibility are, in my opinion, the high point of this paper.