Saturday 30 January 2016

Development of Machine Vision Module for Small Robots (Updated 25 Dec 2021)

First Prototype - Dec 2015
Currently working on a machine vision module for robotic systems, on a part-time basis.  I am inspired by the Pixy CMUcam5 Image Sensor ( http://www.cmucam.org/projects/cmucam5). Pixy uses an asymmetric dual-core ARM micro-controller, NXP's LPC4300, which contains an ARM Cortex M4 core and a Cortex M0 core.  The ARM Cortex M4 core is used for the digital image processing tasks, while the M0 core in Pixy is probably used to stream the image data through the USB port to a computer for real-time viewing.  I am thinking of something simpler, a system with only a single-core micro-controller.  Using a single-core controller requires some compromise in the image resolution and frame rate the micro-controller can handle, and in the speed with which it can stream an image to a computer.  The machine vision module should be able to process a low-resolution color image at a rate of 10 fps (frames per second).

- Selecting the Image Sensor -
I obtained a CMOS VGA camera (TCM8230) from Sparkfun back in 2013.  As of 2015 there are a lot of online resources on using the TCM8230 image sensor, for example here.  This camera can also output QVGA, QQVGA, CIF and sub-CIF image formats (by sub-sampling the native VGA image), which are ideal for small robots. I reason that the image resolution need not be high for tasks such as color recognition, edge or shape detection.  A low-resolution image frame allows the use of a 'small' micro-controller as the image processor.   This camera is now quite hard to find, but in principle any VGA camera can be used, for instance those based on the OV7670 image sensor, which is popular with Arduino users.

- Selecting a Suitable Micro-Controller - 
I settled on using an ARM Cortex M4 as the image processor after experimenting for more than a year with the dsPIC33 and PIC32MX families from Microchip Technology, and also with Altera's Cyclone IV FPGA.  The latest PIC32MZ series from Microchip is probably suitable too, but the earlier chip (PIC32MZ1024ECG) that I tested in Dec 2014 was so full of bugs that I decided it was not worth investing the time (the newer revisions of the PIC32MZ chips in 2015 should be a better choice).  Many ARM Cortex M4 chips are available, and after looking at the offerings from ST Microelectronics, Freescale (now part of NXP), Atmel and Texas Instruments, I decided on Atmel's SAM4S family of micro-controllers (http://www.atmel.com/products/microcontrollers/arm/sam4s.aspx), mainly due to the availability of a good IDE for software development, Atmel Studio 7 (as of this writing) (http://www.atmel.com/microsite/atmel-studio/), which is based on Microsoft Visual Studio. Atmel Studio 7 supports the free GCC ARM toolchain and boots up pretty fast.  I also reckon that using Atmel's chips might allow me to hack into Arduino boards with ARM micro-controllers in future.  Like most micro-controller manufacturers nowadays, Atmel provides a software framework, the Atmel Software Framework (ASF), a collection of C-compatible routines for accessing the micro-controller peripherals.  ASF helps a user build working software for the ARM Cortex quickly. Moreover, for advanced users Atmel Studio also supports ARM's CMSIS (Cortex Microcontroller Software Interface Standard), a set of C-style declarations for accessing the device pins and registers of the Cortex-M family processors, plus start-up code.  Note that ASF actually builds on top of the CMSIS framework, as illustrated in the sketch below.
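To give a feel for the difference between the two layers, here is a minimal sketch of driving a pin at the CMSIS register level on a SAM4S.  The pin choice (PA8) and the header file name are purely illustrative; ASF simply wraps this kind of register access in higher-level driver routines (e.g. its ioport/GPIO services), which is what makes it quicker to get a working program going.

#include "sam4s.h"          /* CMSIS device header (exact name depends on the toolchain). */

#define LED_PIN_MASK  (1u << 8)          /* PA8, chosen only for illustration. */

void LED_Init(void)
{
    PIOA->PIO_PER = LED_PIN_MASK;        /* Hand control of PA8 to the PIO controller. */
    PIOA->PIO_OER = LED_PIN_MASK;        /* Configure PA8 as an output.                */
}

void LED_On(void)
{
    PIOA->PIO_SODR = LED_PIN_MASK;       /* Set Output Data Register: drive PA8 high.  */
}

void LED_Off(void)
{
    PIOA->PIO_CODR = LED_PIN_MASK;       /* Clear Output Data Register: drive PA8 low. */
}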

- Prototype Development - 

I started by ordering Atmel's SAM4S Xplained Pro evaluation board to try out.  Eventually, after I was convinced that the SAM4S micro-controller is suitable, I ordered the Atmel-ICE Basic debugger and programming tool together with a few SAM4SD16 chips, and proceeded to build a first prototype as shown below.  The Atmel-ICE supports both the JTAG and SWD protocols for programming the micro-controller.  SWD (Serial Wire Debug) is a serial protocol, a subset of JTAG that needs fewer pins; its standard connector is the 2x3 (i.e. 6-way) header visible on the veroboard in the picture.  The QFP adapter PCB can be obtained from many sources; in this part of the world I got it from Cytron Technologies. I am planning to stream the image data to a computer through one of the micro-controller's UART ports, at 115.2 kbps or a higher data rate, using either the famous HC-05 Bluetooth module or a standard serial-to-USB converter. The interface between the VGA camera and the micro-controller is an 8-bit parallel interface, with the direct memory access (DMA) function activated.

Figure 1 - Prototype, the SAM4SD16 chip is soldered on a QFP Adapter.

Figure 2 - Schematic of the TCM8230 Camera Board


- Buffering camera image on micro-controller SRAM -

On power-up the micro-controller initializes the camera's internal control registers through the I2C interface. Once fully initialized, the camera streams out the pixels of each image frame line-by-line over the 8-bit parallel interface plus three control lines: VD (vertical strobe, one pulse per frame), HD (horizontal strobe, one pulse per line of pixel data) and PCLK (pixel clock, two pulses per pixel). Each pixel consists of 2 bytes (16 bits) in RGB565 format. The micro-controller performs this capture automatically using DMA (direct memory access) and stores the pixel data as 32-bit words in the on-chip SRAM for further processing. The image is thus held as a two-dimensional array of 32-bit unsigned integers. From the RGB565 data we can convert to RGB888 and other color spaces. I will share the schematic and software details in future updates of this post.
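To make the pixel format concrete, below is a minimal sketch of how one 16-bit RGB565 pixel can be unpacked into 8-bit R, G and B components.  The function name is illustrative, not an identifier from the actual firmware.

#include <stdint.h>

/* Unpack one RGB565 pixel (as captured from the TCM8230) into 8-bit R, G, B.
   The 5-bit and 6-bit fields are expanded to 8 bits by replicating the top
   bits into the low bits, so that full-scale values map to 255. */
static void RGB565ToRGB888(uint16_t unPixel, uint8_t *pR, uint8_t *pG, uint8_t *pB)
{
    uint8_t unR5 = (unPixel >> 11) & 0x1F;   /* Bits 15..11: red (5 bits).   */
    uint8_t unG6 = (unPixel >> 5)  & 0x3F;   /* Bits 10..5 : green (6 bits). */
    uint8_t unB5 = unPixel & 0x1F;           /* Bits 4..0  : blue (5 bits).  */

    *pR = (uint8_t)((unR5 << 3) | (unR5 >> 2));
    *pG = (uint8_t)((unG6 << 2) | (unG6 >> 4));
    *pB = (uint8_t)((unB5 << 3) | (unB5 >> 2));
}

If the DMA packs two 16-bit pixels into each 32-bit word of the frame buffer, each word would simply be split into its lower and upper halves before calling a routine like this.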


Second Prototype - Jan 2016
A second prototype, with a proper PCB and casing, is shown below.  It is now more compact.  I also developed a simple program on the computer using Visual Basic .NET to receive the serial data transmitted by the ARM Cortex M4 micro-controller over Bluetooth and display the image pixel-by-pixel on a Windows Form!  Due to the limited bandwidth of the serial link (the driver for the Bluetooth module that I used, the HC-05, can only support 115.2 kilobits per second, more on this later), I chose to transmit only the luminance data (gray scale) of the image to the computer. Basically I extract the luminance value from the RGB565 data. Since a luminance value fits into a single byte, it takes less time to transmit one frame of gray scale image than a color image.
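For reference, an 8-bit luminance value can be derived from an RGB565 pixel with an integer approximation of the usual Y = 0.299R + 0.587G + 0.114B weighting.  The sketch below is one such approximation; the exact formula used in the firmware may differ.

#include <stdint.h>

/* Approximate 8-bit luminance from a 16-bit RGB565 pixel, using the integer
   weights (77, 150, 29)/256, a common fixed-point approximation of
   Y = 0.299R + 0.587G + 0.114B. */
static uint8_t RGB565ToLuminance(uint16_t unPixel)
{
    uint8_t unR5 = (unPixel >> 11) & 0x1F;
    uint8_t unG6 = (unPixel >> 5)  & 0x3F;
    uint8_t unB5 = unPixel & 0x1F;

    /* Expand the 5/6-bit fields to full 8-bit components first. */
    uint8_t unR = (uint8_t)((unR5 << 3) | (unR5 >> 2));
    uint8_t unG = (uint8_t)((unG6 << 2) | (unG6 >> 4));
    uint8_t unB = (uint8_t)((unB5 << 3) | (unB5 >> 2));

    return (uint8_t)((77U * unR + 150U * unG + 29U * unB) >> 8);
}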

Also shown below is the simple PC software, written in Visual Basic .NET and called RobotEyeMon1, which grabs the image transmitted from the machine vision module.

- Frame-Rate, Image Resolution and Real-Time Image Processing -
The SAM4S ARM Cortex M4 runs at 120 MHz and has 160 kB of SRAM.  With this I am able to achieve a camera frame rate of 10 fps (frames per second) at an image resolution of 160x120 pixels (QQVGA).  As you can see the resolution is pretty low, but I reckon it is sufficient for a small robot.  There is some spare time between frames to perform simple real-time image processing.  Basically the frame is updated every 100 msec; within this 100 msec I extract the luminance, hue and saturation values for each pixel, which leaves around 8 msec for executing image processing routines!  Thus far I have managed to squeeze two simple image processing algorithms into the processor firmware within this 8 msec limit; these are:
   (1) Finding the 'bright spots' in the gray scale image.
   (2) Edge detection using the Laplacian or Sobel operator.

As you can see, both operations work only on the luminance values of the pixels.
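For illustration, here is a minimal sketch of a Sobel pass over an 8-bit luminance buffer, with outputs below a threshold forced to zero (as in Figure 2 of the Oct 2016 update further down).  The buffer layout, dimensions and names are illustrative; the actual firmware may organize the data differently.

#include <stdint.h>
#include <stdlib.h>

#define IMG_WIDTH   160
#define IMG_HEIGHT  120

/* Apply the 3x3 Sobel operator to the luminance image and threshold the
   gradient magnitude (approximated as |Gx| + |Gy|).  Border pixels are
   not written and should be cleared by the caller. */
void SobelEdge(const uint8_t unLuma[IMG_HEIGHT][IMG_WIDTH],
               uint8_t unEdge[IMG_HEIGHT][IMG_WIDTH],
               uint8_t unThreshold)
{
    for (int y = 1; y < IMG_HEIGHT - 1; y++)
    {
        for (int x = 1; x < IMG_WIDTH - 1; x++)
        {
            int nGx = -unLuma[y-1][x-1] + unLuma[y-1][x+1]
                      - 2*unLuma[y][x-1] + 2*unLuma[y][x+1]
                      - unLuma[y+1][x-1] + unLuma[y+1][x+1];

            int nGy = -unLuma[y-1][x-1] - 2*unLuma[y-1][x] - unLuma[y-1][x+1]
                      + unLuma[y+1][x-1] + 2*unLuma[y+1][x] + unLuma[y+1][x+1];

            int nMag = abs(nGx) + abs(nGy);
            if (nMag > 255) nMag = 255;                 /* Clamp to 8 bits. */
            unEdge[y][x] = (nMag < unThreshold) ? 0 : (uint8_t)nMag;
        }
    }
}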

- Transmitting the Image to a Remote Device (Personal Computer) -
As we are using the Bluetooth link to send the image data to the computer at 115.2 kbps, not every frame captured by the camera can be transmitted.  Essentially only 1 frame is sent to the PC for every 40 frames captured by the vision module.  At present the frame rate of the CMOS camera in the machine vision module is 10 fps, so only 1 frame reaches the PC every 4 seconds!!!  (At 115.2 kbps, a 160x120 gray scale frame of 19,200 bytes already needs well over a second on the wire, and the protocol overhead pushes this to around 4 seconds.)  One limiting factor is that I cannot raise the data rate beyond 115.2 kbps: the Bluetooth driver on my computer (running Windows) only supports up to 115.2 kbps, even though the HC-05 module itself can send at up to 1 Mbps.  With a wired USB-to-serial cable I am able to send at a higher data rate of 234 kbps or more.  The image transmitted to the PC (or to a smartphone/tablet in future) is not meant to turn the PC into an FPV (first-person view) device; it is mainly to let me see the view from the camera on the machine vision module while developing the image processing algorithms.

The screenshot of RobotEyeMon1 software:



Saving and Retrieving Image Frame from Computer Hard Disk - Oct 2016
I have added a feature to the RobotEyeMon1 Visual Basic .NET code to export a still image from the camera to a binary file on the computer hard disk.  This allows me to try out standard image processing routines in MATLAB, Scilab or other mathematical computing software.  I added a button to the RobotEyeMon1 interface which, when clicked, saves the contents of the auxiliary frame buffer to a file on the hard disk.  The Visual Basic .NET code is shown below:

Private Sub ButtonSaveImage_Click(sender As Object, e As EventArgs) Handles ButtonSaveImage.Click
    'Note: requires Imports System.IO at the top of the source file.
    Dim nYindex As Integer
    Dim nXindex As Integer
    Dim strString As String

    'Note: The image frame is stored as a 2D array of pixels, each with a value
    'from 0 to 255.  The x index corresponds to the column and the y index to the row.
    'Here we write the pixel data to the file row-by-row.

    Using sw As StreamWriter = File.CreateText(mFilePath)
        For nYindex = 1 To mintImageHeight - 1
            strString = ""                          'Clear the string first.
            For nXindex = 1 To mintImageWidth - 1
                'ChrW() stores the raw byte value in mbytPixel() directly as a
                'character, so strString builds up one line of pixel data.
                strString = strString & ChrW(mbytPixel(nXindex, nYindex))
            Next
            'Write the line of pixel data to the file; a NL/CR character is
            'appended at the end of each line.
            sw.WriteLine(strString)
        Next
    End Using

    '---Alternative approach using the WriteAllBytes() method---
    'Dim bytData(0 To 160) As Byte

    '-Write the 1st line to the file-
    'nYindex = 1
    'For nXindex = 1 To mintImageWidth - 1
    '    bytData(nXindex) = mbytPixel(nXindex, nYindex)
    'Next
    'bytData(160) = 255  'A value to indicate end-of-line (EOL).
    'My.Computer.FileSystem.WriteAllBytes(mFilePath, bytData, False)

    '-Write the remaining lines, appending to the file-
    'For nYindex = 2 To mintImageHeight - 1
    '    For nXindex = 1 To mintImageWidth - 1
    '        bytData(nXindex) = mbytPixel(nXindex, nYindex)
    '    Next
    '    bytData(160) = 255  'A value to indicate end-of-line (EOL).
    '    My.Computer.FileSystem.WriteAllBytes(mFilePath, bytData, True)
    'Next

End Sub
Listing 1 – Visual Basic .NET code to save an image frame.

The 8-bit grayscale pixel data is saved into the file line-by-line, with each line terminated by a NL/CR (newline/carriage return) character.  Assuming the filename is “testimage.txt”, the following Scilab (www.scilab.org) code reads the image file and renders the gray scale image in a graphic display window.

clear;
ImageWidth = 160;
ImageHeight = 120;
Hgraf = scf();                          // Get the handle to the current graphic window.
path = cd("D:\Scilab");
Hfile = mopen("testimage.txt",'rb');    // Open the image file for reading in binary mode
                                        // (so the NL/CR character 0x0D is not skipped).
M = zeros(ImageHeight+1,ImageWidth+1);  // Matrix to hold the gray scale image data.
Mt = zeros(ImageWidth+1,ImageHeight+1); // Another matrix to hold the transposed image.

for i=1:ImageHeight-1                   // ImageHeight-1 because the last line is not
                                        // exported by the camera monitor software.
   for j=1:ImageWidth
     M(i,j) = mget(1,'c',Hfile);        // Read 1 pixel value, converted to double.
   end
   mgeti(1,'c',Hfile);                  // Read (and discard) the end-of-line character.
end

// Optional - rotate the image to the correct orientation.
for i=1:ImageHeight-1                   // Transpose and flip the image so that
   for j=1:ImageWidth                   // it appears in the correct orientation.
     Mt(j,ImageHeight-i) = M(i,j);
   end
end

mclose(Hfile);                          // Release the file handle.

//Hgraf.color_map = graycolormap(127);  // Set the current graphic window color map
Hgraf.color_map = graycolormap(255);    // to gray scale, with 127 or 255 levels.
row = 1:ImageWidth + 1;
col = 1:ImageHeight + 1;
grayplot(row,col,Mt);                   // Plot the transposed and flipped image.
Listing 2 – Scilab code to read back and display the saved image frame.


Figure 1 – Sample image frame captured with the machine vision module using RobotEyeMon1.



Figure 2 – Another sample image, captured after running the Sobel edge detection algorithm (in the machine vision module firmware) on the frame.  Here any output value below 25 is set to 0.


Fully Integrated Version 1.0  -  April 2017
I designed a 4-layer PCB to neatly integrate all the components into a compact form factor, combining the camera board with the micro-controller board described above; thus this is called Version 1.0. The PCB has some errors, as can be seen from the jumper wires in Figure 2 below. If you are interested in working on this, the schematic and code for Version 1.0 can be found here.  In this version two new algorithms are added to the firmware (the numbering continues from above):

    (3) Detecting obstacles on the floor using contrast analysis.
    (4) Color object detection using the hue information of the pixels (a sketch of this idea is given below).
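As an illustration of item (4), the sketch below computes the hue of an RGB888 pixel and tests it against a target hue band.  The function names, the target range (roughly yellow-green) and the thresholds are illustrative only; the actual firmware's thresholds and color space handling may differ.

#include <stdint.h>

/* Compute the hue of an RGB888 pixel in degrees (0-359).
   Returns -1 for achromatic (gray) pixels with no dominant color. */
int PixelHue(uint8_t unR, uint8_t unG, uint8_t unB)
{
    int nMax = unR, nMin = unR;
    if (unG > nMax) nMax = unG;
    if (unB > nMax) nMax = unB;
    if (unG < nMin) nMin = unG;
    if (unB < nMin) nMin = unB;

    int nDelta = nMax - nMin;
    if (nDelta == 0) return -1;                    /* Achromatic pixel. */

    int nHue;
    if (nMax == unR)
        nHue = (60 * (unG - unB)) / nDelta;        /* Between yellow and magenta. */
    else if (nMax == unG)
        nHue = 120 + (60 * (unB - unR)) / nDelta;  /* Between cyan and yellow.    */
    else
        nHue = 240 + (60 * (unR - unG)) / nDelta;  /* Between magenta and cyan.   */

    if (nHue < 0) nHue += 360;
    return nHue;
}

/* Simple color-object test: flag a pixel whose hue falls inside a target band,
   e.g. roughly 40-80 degrees for a yellow-green object (values illustrative). */
int IsTargetColor(uint8_t unR, uint8_t unG, uint8_t unB)
{
    int nHue = PixelHue(unR, unG, unB);
    return (nHue >= 40) && (nHue <= 80);
}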

Figure 1 - The back view

Figure 2 - The front view.




Version 1.5C - Dec 2019
In 2018 I started to explore using a more powerful micro-controller to replace the SAM4SD16 ARM Cortex-M4 processor.  The natural choice was to migrate the circuit and the code to the more powerful ARM Cortex-M7.  Again I am using the offering from Microchip/Atmel, as I am already familiar with the ecosystem and development environment.  I started off by making a prototype of the core micro-controller circuit using Microchip's ATSAMS70J19, as shown below.  This is an ARM Cortex-M7 micro-controller with a maximum clock frequency of 300 MHz, 512 kB of Flash program memory and 256 kB of SRAM (static RAM) in a 64-pin TQFP package.  The PCB of this prototype was made using a desktop CNC milling machine.
 
 

Figure 1 - Prototype ARM Cortex-M7 core circuit, V1.5A.

After spending a few months learning the intricacies of the ARM Cortex-M7 architecture (it is more complicated than the Cortex-M4 because of the instruction and data caches, which must be handled carefully, for instance when a peripheral uses direct memory access), I managed to port the code to the ATSAMS70J19.  Subsequently I upgraded the processor to the ATSAMS70J20, which has the same maximum clock frequency and pin-out but larger memory (1024 kB of Flash program memory and 384 kB of SRAM).  The final version is shown in Figures 2A to 2C.  I call this version V1.5C as I still retain the original TCM8230 CMOS camera; only the micro-controller is upgraded.  Perhaps in future, when I change to a better CMOS camera, I will call it V2.XX!  The schematic and PCB have also gone through three iterations, partly to make the circuit more robust against EMI (electromagnetic interference) and to correct some design errors, hence the suffix 'C'.  A sketch of the kind of cache maintenance needed around DMA buffers is shown below.
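The sketch below shows the general idea of data cache maintenance on the Cortex-M7 when a DMA peripheral writes into SRAM; it uses the standard CMSIS cache functions, but the buffer name, size and callback are illustrative and not taken from the actual firmware.

#include "sam.h"   /* Device header; pulls in the CMSIS core_cm7 cache functions. */

/* DMA receive buffer for one line of camera pixels.  On the Cortex-M7 the
   buffer should be aligned to a 32-byte cache line so that cache maintenance
   does not touch neighbouring variables.  Name and size are illustrative. */
#define LINE_PIXELS   160U
static volatile uint16_t gunLineBuffer[LINE_PIXELS] __attribute__((aligned(32)));

/* Called after the DMA controller signals that one line has been received.
   Because the DMA wrote directly to SRAM, the data cache may still hold stale
   copies of this region, so it is invalidated before the CPU reads the pixels. */
void Camera_LineReceived(void)
{
    SCB_InvalidateDCache_by_Addr((uint32_t *)gunLineBuffer, sizeof(gunLineBuffer));
    /* ... the pixels in gunLineBuffer can now be processed safely ... */
}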

Figure 2A - Front view of MVM V1.5C.


Figure 2B - Back view of MVM V1.5C. 
 
 
Figure 2C - Side view of MVM V1.5C.
 
As with Version 1.0, V1.5C can stream images via serial port UART2, either through a USB-to-serial converter or an HC-05 Bluetooth module, to monitoring software on a computer (which I now call the MVM Monitoring Software).  But be warned: as with Version 1.0, the streaming rate is really slow, since the processor has only a single core and priority is given to the image processing tasks.  The video streaming function is not meant to turn the module into a remote wireless camera; it is mainly for the user to see the camera's perspective while troubleshooting the image processing algorithms.  In addition to displaying the streamed image, the MVM Monitoring Software can also save the displayed image in raw binary or bitmap format, allowing the image to be retrieved by routines written in MATLAB, Scilab or Python.  Figures 3 and 4 illustrate the process.
 
 


Figure 3 - Saving the displayed image in MVM Monitoring Software onto computer hard disk.
 


 
Figure 4 - Retrieving the saved image using Scilab and Python.
 
- Design Files, Source Code -
I have shared the schematic and source code, which can be found at this Github link, as I no longer have the luxury of time to work on this project. Also note in Figure 2B that the back of V1.5C has a series of solder pads for the SPI communication protocol.  This allows the module to drive an external TFT LCD display in addition to streaming the video via the UART2 port. For the moment, due to the single-core nature of the processor, I am not able to stream video to the UART2 port and the SPI port simultaneously; it is "either or", as I am using two different sets of code!  Both versions of the code are in the Github link. The videos below demonstrate some of the MVM V1.5C's capabilities; more will be added in future. Video 1 runs the image processing routines at QQVGA resolution and 21 frames per second, and streams video to the UART2 port (at a lower frame rate).

Video 1: General introduction and image processing

Videos 2 and 3 below demonstrate the MVM V1.5C running the firmware that supports a TFT LCD display. The TFT LCD version supports two camera resolutions, QQVGA and QVGA.  Selecting which resolution to use is as simple as commenting or uncommenting the #define __QQVGA__ line in the LCD driver header file, along the lines of the sketch below; the details are provided in the MVM V1.5C quick start guide. Running image processing algorithms is only supported in QQVGA mode. In Video 2 the MVM V1.5C runs at QVGA resolution at 15 frames per second and sends the color image to a TFT LCD display; no image processing routine is run, and the raw color information of each pixel is sent to the display.  The TFT LCD display is sourced from Adafruit.  In Video 3 the same firmware is run, but the camera resolution has been set to QQVGA at 21 frames per second. The color recognition image processing algorithm (IPA) described previously is run to illustrate the speed of the system; the IPA detects yellow-green pixels and highlights them on the LCD display.
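For reference, the compile-time selection works along these lines.  The macro __QQVGA__ is the one mentioned above; the surrounding macro names and values are only an illustration of the idea, so refer to the quick start guide for the real header file.

/* In the LCD driver header file: comment this line out to build for QVGA. */
#define __QQVGA__

#ifdef __QQVGA__
  #define IMAGE_WIDTH   160   /* QQVGA: image processing algorithms supported. */
  #define IMAGE_HEIGHT  120
#else
  #define IMAGE_WIDTH   320   /* QVGA: display only, no image processing.      */
  #define IMAGE_HEIGHT  240
#endif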
 
Video 2: Driving an external TFT LCD display (320x240 pixels, LCD controller = ILI9430 or ILI9431) at camera resolution of QVGA with no image processing algorithm.
 
       



Video 3: Driving an external TFT LCD display (320x240 pixels, LCD controller = ILI9430 or ILI9431) at a camera resolution of QQVGA with an image processing algorithm.  Here I am running a sample image processing routine which detects yellow-green pixels; I use this to recognize a yellow tennis ball.
 


Streaming Video over WiFi with ESP8266, Version 1.5C - Dec 2021
In 2020-2021 I worked on this project on and off, whenever I had some spare time. One of the things I worked on was streaming the image from the camera over a WiFi link, which should speed up the frame rate a bit. I experimented with an ESP8266 module, the ESP-01 by Ai-Thinker, setting it up as a TCP (Transmission Control Protocol) server and using it to send the image line-by-line over TCP to a computer. After surmounting many obstacles, I am able to stream video from the MVM V1.5C to a computer over WiFi. At the moment I can get between 1.0 and 1.5 frames per second (fps), which, although still slow, is nevertheless better than the 0.25-0.3 fps I was getting with Bluetooth. The ESP8266 is not designed to stream large amounts of data continuously, but rather intermittent bursts of bytes for IoT devices, and there is some overhead in the communication between the MVM V1.5C micro-controller and the ESP8266 that slows the process down. Still, I think it should be possible to push the frame rate beyond 3 fps in the future. Another piece of progress is that I have been able to incorporate a convolutional neural network (CNN) into the MVM V1.5C firmware, using it to analyze the gray scale image for obstacles on the floor. The CNN is described in another post. The video below illustrates the latest progress.
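For reference, if the ESP-01 runs the stock Espressif AT-command firmware (an assumption on my part; the actual MVM firmware may drive the module differently), the TCP-server setup looks roughly like the sequence below.  The SSID, password and port number are placeholders.

/* Sketch of the AT-command sequence for setting the ESP-01 up as a TCP server.
   Each string is sent to the module over a UART, terminated with CR-LF, and
   the module's "OK" response should be checked before the next command. */
static const char *gacSetupCommands[] =
{
    "AT+CWMODE=1",                        /* Station mode.                          */
    "AT+CWJAP=\"MySSID\",\"MyPassword\"", /* Join the WiFi access point.            */
    "AT+CIPMUX=1",                        /* Enable multiple connections (required  */
                                          /* before a TCP server can be started).   */
    "AT+CIPSERVER=1,5000"                 /* Start a TCP server on port 5000.       */
};

/* To push one line of pixels to a connected client (connection id 0), the
   micro-controller first announces the payload length with
   "AT+CIPSEND=0,160", waits for the '>' prompt, then sends the raw bytes.
   This per-line handshake is part of the overhead mentioned above. */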