Computer Vision

From APL
Revision as of 12:45, 4 December 2018 by Aplstudent (talk | contribs) (Raspberry Pi Camera)



OpenCV is an open source computer vision library with Python bindings. It offers a wide range of functionality, including built-in image filters and AI-based image and object recognition. So far, we have used OpenCV and Python to identify stop signs in real time and to correct for lens distortion on the Raspberry Pi's camera with image filters. We have also written code that implements edge detection.

Object detection with OpenCV is not very difficult once you have a trained Haar cascade. Right now we have usable Haar cascades for stop signs, eyes, and faces. These are three commonly detected object types, so pre-made Haar cascades for them are available on GitHub and through Google searches. Making your own Haar cascade is also possible and is discussed in many tutorials. To learn more about using OpenCV and Haar cascades, I recommend the YouTube series OpenCV with Python for Image and Video Analysis by sentdex ([1]). The playlist contains several relevant videos that cover how to install OpenCV, how to write code that uses OpenCV, and how Haar cascades are made and used with OpenCV.
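
As a rough illustration of what detection with a trained cascade looks like in Python, here is a minimal sketch. It is not the project's actual code. The file name stop_sign_cascade.xml is a placeholder for wherever the cascade XML file is stored, the scaleFactor and minNeighbors values are common starting points rather than tuned settings, and largest_detection is a hypothetical helper added only for illustration. cv2.CascadeClassifier and detectMultiScale are the standard OpenCV calls.

```python
def detect_objects(image_path, cascade_path="stop_sign_cascade.xml"):
    """Return a list of (x, y, w, h) boxes found by a Haar cascade.

    cascade_path is a placeholder; point it at your own cascade XML file.
    """
    # Imported here because OpenCV may not be installed on every machine yet.
    import cv2

    cascade = cv2.CascadeClassifier(cascade_path)
    if cascade.empty():
        raise IOError("could not load cascade file: " + cascade_path)

    image = cv2.imread(image_path)
    if image is None:
        raise IOError("could not read image: " + image_path)

    # Haar cascades operate on grayscale images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # scaleFactor and minNeighbors are assumed starting values, not tuned ones.
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return [tuple(int(v) for v in box) for box in boxes]


def largest_detection(boxes):
    """Pick the box with the largest area (likely the closest object), or None."""
    if not boxes:
        return None
    return max(boxes, key=lambda box: box[2] * box[3])
```

Each box is the top-left corner plus the width and height of a detected object, in pixels. Raising minNeighbors generally reduces false detections at the cost of missing some real ones.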

Note that OpenCV is not currently installed on the Raspberry Pis. It will need to be installed on both your computer and the Raspberry Pis before the existing code can be used.

Basic Computer Vision

OpenCV is extremely useful and powerful, but it does most of its work under the hood. For those interested in learning more about computer vision or in implementing their own computer vision code, we recommend exploring those avenues as well. We've written some computer vision code of our own: a Gaussian filter to blur images and a Sobel operator to detect edges. We've found that this code works better for our purposes, but it is much slower than OpenCV's filters. Anyone interested in optimizing this code should first learn how Gaussian blur filters and Sobel operators work and then update the code accordingly. The current plan for edge detection is to use the camera to identify edges and approximate the distance from each edge to the car. Code has been written to approximate object distances from camera pixel locations, but it is only useful if we can identify objects or edges. The ultimate goal is to identify obstacles and walls and navigate around them.

Sobel Operator and Edge detection

The Wikipedia page [2] for the Sobel operator provides a much more extensive and mathematical explanation than can be given here, but the general idea is to compare the differences in local intensity at every point in an image. To do this, one 3x3 kernel for the x direction and a similar 3x3 kernel for the y direction are each convolved with a 3x3 block of image pixels. This step is repeated for every 3x3 neighborhood in the image, and the two results at each pixel are combined into a single gradient magnitude. The result is an image with brighter edges. The new image with enhanced edges can then be passed through a filter that sets the intensity of any pixel below a certain threshold to zero.
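
The procedure can be sketched in plain Python as follows. This is a minimal illustration using the standard Sobel kernels, not the project's own implementation. The function names and the choice to leave border pixels at zero are assumptions made for the example.

```python
import math

# Standard 3x3 Sobel kernels for the x and y directions.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_kernel(image, kernel, row, col):
    """Weighted sum of the 3x3 neighborhood centered on (row, col).

    The kernel is applied without flipping (cross-correlation). For these
    kernels that only changes the sign of the result, not the magnitude.
    """
    total = 0
    for i in range(3):
        for j in range(3):
            total += kernel[i][j] * image[row + i - 1][col + j - 1]
    return total


def sobel_edges(image, threshold):
    """Return the gradient magnitude image, with values below threshold set to 0.

    image is a list of rows of grayscale intensities. Border pixels are left
    at 0 because they have no complete 3x3 neighborhood (an assumed choice).
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for row in range(1, height - 1):
        for col in range(1, width - 1):
            gx = apply_kernel(image, SOBEL_X, row, col)
            gy = apply_kernel(image, SOBEL_Y, row, col)
            magnitude = math.hypot(gx, gy)
            if magnitude >= threshold:
                edges[row][col] = magnitude
    return edges
```

Nested Python loops like these are easy to follow but slow on full camera frames, which is one reason library filters run so much faster.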

Exploring parameters and different operators may help tailor the edge detection to the specific camera and lighting conditions it will be used under. Lowering the threshold of the final filter allows a wider range of edges to be accepted. Using a Scharr operator or a Prewitt operator may prove more effective for edge identification with later code. It may also be advisable to explore larger kernels, such as a 5x5 kernel. These kinds of changes give more control over the results of the edge detection and, once optimized, may work better than the filters provided by libraries like OpenCV.
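
Because these operators differ only in their kernel weights, trying a different one is a small change. The sketch below lists the standard 3x3 x-direction kernels for Sobel, Prewitt, and Scharr. The names KERNELS_X, transpose, and gradient_at are illustrative and do not come from the project's code.

```python
# x-direction kernels; each y-direction kernel is the transpose.
KERNELS_X = {
    "sobel":   [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    "prewitt": [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]],
    "scharr":  [[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]],
}


def transpose(kernel):
    """Turn an x-direction kernel into the matching y-direction kernel."""
    return [list(column) for column in zip(*kernel)]


def gradient_at(image, row, col, operator="sobel"):
    """Return (gx, gy) at an interior pixel using the named operator."""
    kx = KERNELS_X[operator]
    ky = transpose(kx)
    gx = gy = 0
    for i in range(3):
        for j in range(3):
            pixel = image[row + i - 1][col + j - 1]
            gx += kx[i][j] * pixel
            gy += ky[i][j] * pixel
    return gx, gy
```

The operators produce gradients on different scales (the Scharr weights are much larger than the Prewitt weights), so the threshold has to be readjusted whenever the operator is changed.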

Gaussian Filter

Once again, Wikipedia provides a better overview of this subject, but we'll go over the general idea here. The Gaussian filter is essentially a low-pass filter that reduces noise and fine detail. This kind of filtering is helpful as a precursor to edge detection because, with less noise, the remaining differences in local intensity are more likely to correspond to real edges. A combination of a Gaussian filter and a Sobel operator has been written in Python, and combining the two provides additional parameters that can be tuned to reach the ideal level of edge detection. For example, the standard deviation of the Gaussian filter controls the level of blurriness: a higher standard deviation increases the blur.
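
To make the role of the standard deviation concrete, here is a minimal plain-Python sketch of a Gaussian blur. It is an illustration rather than the project's code. The default kernel size of 5, the default sigma of 1.0, and the clamping used at the image borders are all assumptions chosen for the example.

```python
import math


def gaussian_kernel(size, sigma):
    """Build a normalized size x size Gaussian kernel (size must be odd)."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    # Normalize so the weights sum to 1 and overall brightness is preserved.
    return [[value / total for value in row] for row in kernel]


def gaussian_blur(image, size=5, sigma=1.0):
    """Blur a grayscale image (list of rows); borders are handled by clamping."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            acc = 0.0
            for i in range(size):
                for j in range(size):
                    # Clamp coordinates so border pixels reuse the nearest valid pixel.
                    r = min(max(row + i - half, 0), height - 1)
                    c = min(max(col + j - half, 0), width - 1)
                    acc += kernel[i][j] * image[r][c]
            blurred[row][col] = acc
    return blurred
```

With a larger sigma the kernel weights are spread more evenly across the neighborhood, so each output pixel is averaged over more of its surroundings and the image looks blurrier.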

Raspberry Pi Camera

Accessing the Raspberry Pi's camera is fairly straightforward once the camera is plugged in and enabled. To make sure the camera is enabled, SSH into the Raspberry Pi and enter the command "sudo raspi-config". This should pull up a menu that contains a camera option. Select the camera option and then select enable. If the camera is properly connected to the Raspberry Pi, you should now have access to it. Note that the camera is only attached to the top Raspberry Pi; if you are SSHed into the bottom Raspberry Pi, the camera will not work. To test the camera, navigate to the directory where the image should be saved and type the command "raspistill -o imagename.jpg". After a few seconds the camera should take the image and save it in the current directory as imagename.jpg.
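
If you want to trigger the same capture from a Python script, one option is to call raspistill through the subprocess module. The sketch below is only an illustration. The function names and the default width, height, and delay are assumptions, while -o, -w, -h, and -t are standard raspistill options.

```python
import subprocess


def raspistill_command(filename, width=640, height=480, delay_ms=2000):
    """Build the argument list for a single raspistill capture."""
    return [
        "raspistill",
        "-o", filename,         # output file
        "-w", str(width),       # image width in pixels
        "-h", str(height),      # image height in pixels
        "-t", str(delay_ms),    # time to wait before capturing, in ms
    ]


def capture_image(filename, **options):
    """Run raspistill and raise CalledProcessError if it fails."""
    subprocess.check_call(raspistill_command(filename, **options))
    return filename
```

Calling capture_image("imagename.jpg") on the top Raspberry Pi should behave like the command above, apart from the explicit image size and delay.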

Distance Calculation

Assuming that the car is on a flat surface, we can use trigonometry to calculate distances from monocular vision. Code has been written (the file) that calculates the distance to an object from the pixel value of the object in question, that is, its pixel location in the image. The full paper can be found here [3]. One important caveat is that this process needs to be recalibrated if the camera is bumped out of position. When calibrating the camera, it is best to calibrate for a far away distance rather than a close one. To do this, place the car on the ground with a meter stick in front of it, so that the meter stick starts at the base of the car and extends straight out in front of the car. Take a picture with the Raspberry Pi camera and find the pixel value of a far away point (1 meter or more away is sufficient). Now use that pixel value as the input parameter for the distance calculating function. If the output is accurate, the camera is calibrated. If it is not, adjust the v0 or tilt angle parameters until the function returns the correct distance. Next, check that other pixel values correspond to the correct distances using the same picture (this is easy, since you can read off the pixel values of different points on the meter stick). Lastly, to improve the calculation somewhat, you can undistort the image first. This applies a barrel distortion to correct for the camera's pincushion distortion. It doesn't change the values very much but can make slight improvements.
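
For readers who want to see the geometry, here is a minimal sketch of a flat-ground distance estimate under a simple pinhole camera model. It is not necessarily the formula used in the project's code or in the paper [3]. The parameters v0 and the tilt angle are the ones mentioned above; camera_height and focal_px (the focal length expressed in pixels) are additional assumptions of this sketch, as is the convention that pixel rows increase downward.

```python
import math


def ground_distance(v, v0, tilt_deg, camera_height, focal_px):
    """Estimate the horizontal distance to a floor point seen at pixel row v.

    v             -- pixel row of the point (rows increase downward)
    v0            -- pixel row of the image center along the optical axis
    tilt_deg      -- downward tilt of the camera from horizontal, in degrees
    camera_height -- height of the camera above the floor (assumed parameter)
    focal_px      -- focal length in pixels (assumed parameter)
    """
    # Angle of the viewing ray below the horizontal.
    angle = math.radians(tilt_deg) + math.atan((v - v0) / float(focal_px))
    if angle <= 0:
        raise ValueError("pixel row is at or above the horizon")
    # Right triangle: camera height is the opposite side, distance the adjacent.
    return camera_height / math.tan(angle)
```

The result is in the same units as camera_height. In this model, changing v0 or the tilt angle shifts which pixel row corresponds to a given distance, which is why those are the parameters adjusted during calibration.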