I have a class about neural nets at the university, so I tried to program one for number recognition in LabVIEW. Then I realized I didn't have enough knowledge for it, so I googled neural networks and found this awesome course: https://www.coursera.org/course/neuralnets
So how does it work?
It has a recognition and a learning mode. As input, the network gets x_ij, an 8x8 boolean array containing the pixels of the image. There is also w_kij, the weight: a 3D integer array. For each of the 10 digits (the first dimension) and each pixel (the second and third dimensions), it stores how strongly that pixel indicates that digit. If the pixel is always present in images of the digit, the weight is a large positive number; if it is never present, it is a large negative number.
In recognition mode it computes a sum of products for each digit: y_k = sum_{i=0..7} sum_{j=0..7} X_ij * w_kij, where X_ij is +1 if x_ij is true and -1 if it is false. The result is a 1D integer array y. The network then looks for the maximum of y, and the index of that maximum is the recognized digit.
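The recognition step can be sketched in a few lines of Python with NumPy (just an illustration of the math above, not the LabVIEW code):

```python
import numpy as np

def recognize(x, w):
    """Recognition mode: x is an 8x8 boolean pixel array, w is the 10x8x8
    integer weight array. Pixels map to +1 (True) / -1 (False); the result
    is the digit k whose weighted sum y[k] is the largest."""
    X = np.where(x, 1, -1)            # X_ij = +1 if pixel set, else -1
    y = (w * X).sum(axis=(1, 2))      # y[k] = sum_ij X_ij * w[k,i,j]
    return int(np.argmax(y))
```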
In learning mode it takes the image of a digit and the digit itself, and from this information it calculates and updates the weights. It iterates over all ten digits, and for each digit over every pixel. If the digit equals the given one and the pixel is true, it adds 10 to the weight; if the pixel is false, it subtracts 10. If the digit is NOT the given one, it subtracts 1 from the weight when the pixel is true and adds 1 when it is false. Train the system on at least 2-3 images per digit; the more images you train on, the better the results. But use the same number of images for each digit, otherwise the network will give back incorrect results.
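The learning rule above can be sketched the same way (again an illustration, not the actual LabVIEW implementation):

```python
import numpy as np

def train_one(w, x, digit):
    """Learning mode: update the 10x8x8 weight array w in place from one
    training image x (8x8 booleans) labelled with its digit."""
    X = np.where(x, 1, -1)    # +1 where the pixel is set, -1 where it is not
    for k in range(10):
        if k == digit:
            w[k] += 10 * X    # correct digit: +10 for set pixels, -10 otherwise
        else:
            w[k] -= X         # other digits: -1 for set pixels, +1 otherwise
    return w
```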
For the English version click here.
[Update] The results have since been announced: I received 2nd prize in the biomechatronics section :P
----------------
I entered the 2013 TDK conference at BME with this paper. My paper is about implementing machine vision in the LabVIEW environment. I present a variety of image processing techniques, always keeping in mind that the methods should also be usable in lower-budget university and student projects.
First I review the basics of LabVIEW programming: what dataflow programming is, with its advantages and disadvantages. Only in enough depth that the later chapters are understandable for those who have never programmed in the LabVIEW environment.
Then I turn to the simple image processing methods. First I present color-based pattern matching, where the program searches the image for a specified color pattern. Then I describe shape-based pattern matching on black-and-white images, where the program searches for a predefined shape in an image that has been converted to monochrome by one of various methods.
I then continue with higher-level image processing techniques. Here I write about reading two-dimensional barcodes (QR Code and Data Matrix), which are becoming more and more widespread both in industry and in everyday life.
After that I show how the images of two side-by-side, horizontally offset webcams can be used to produce a depth image, similar to human vision. Proper calibration is important here; it is done by showing the cameras a black-and-white grid from several angles. The calibration can of course be saved, so it only needs to be done once per camera setup.
The program is then able to convert the webcam images into a depth image live (in real time) and display it on a colored graph. It immediately assigns a depth coordinate to every point as well, conveyed on the one hand by the colors, and on the other hand displayed separately when you hover the mouse over a given point.
Naturally, this technology has its limits too: a perfect three-dimensional image cannot be expected from two mid-range webcams, and the measurement range depends heavily on the camera arrangement. Cameras placed very close together work more accurately on nearer targets, while cameras placed farther apart give more usable results for more distant targets.
The VIs I made can be downloaded from here: https://www.dropbox.com/sh/3k30qofbyssvj01/SzJ0mVAZZ- and my paper is available here: https://www.dropbox.com/s/26ui2qnpywgg3yh/TDK.pdf
Special thanks to my advisor, Dr. Petra Aradi (BME-MOGI), and to Kl3m3n from ni.com.
[update] yaaaay, I got 2nd place in the biomechatronics section :P
----------
It was my university project, and now I'm sharing it with you. Unfortunately the paper is only in Hungarian, but I hope that at least the VIs can help non-Hungarian readers too.
My paper is about machine vision in the LabVIEW environment. Various image processing methods are presented, with the goal that they can also be used outside industrial settings, such as in low-budget university and student projects.
First I review the basics of programming in LabVIEW and the dataflow programming method, just briefly enough to make the following chapters clear for readers with no previous LabVIEW experience.
Then I start with the simpler image processing methods. First, color matching: the program looks for a predefined color pattern and returns its coordinates. Then, pattern matching on a black-and-white image: first the picture is made monochrome using one of various options, and then the program searches for the predefined shape or geometry pattern.
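As an illustration of that first step, a single global threshold is one of the simplest ways to make a grayscale image monochrome (the paper covers several options; this Python sketch is not taken from it):

```python
def binarize(gray, threshold=128):
    """Turn a grayscale image (list of rows of 0-255 values) into a
    black-and-white one: pixels at or above the threshold become white."""
    return [[255 if px >= threshold else 0 for px in row] for row in gray]
```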
After that come the more sophisticated image processing methods. I write about scanning two-dimensional barcodes (QR Code and Data Matrix), which are used more and more frequently in industry and everyday life.
Next I present how to use two horizontally offset webcams to create a depth image, similar to human vision. Correct calibration is very important; it can be done with a black-and-white grid shown to the cameras at different angles. The configuration can of course be saved, so it only has to be done once per camera setup. For more info, see this video:
After that, the program can process the images into a depth image in real time. The depth image is presented on a colored graph, where each color represents a depth, and the actual depth value can be read by hovering the mouse over a point.
Of course this technology has its limitations. A perfect 3D image can't be expected from the images of two mid-range webcams, but it can help us pick out the closer, and therefore more important, part of the image, where the previous image processing technologies should then be applied. This saves resources, and combined with a mobile robot, the robot will be able to turn its head towards the closest activity. The measurement range depends heavily on the configuration of the cameras: cameras placed very close together give better results for closer targets, while cameras farther apart work better for more distant targets.
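The baseline trade-off follows from the standard pinhole stereo relation Z = f·B/d; this is textbook stereo geometry rather than anything specific to my paper, but it shows why camera spacing matters:

```python
def depth_m(focal_px, baseline_m, disparity_px):
    """Pinhole stereo: depth Z = f * B / d, with f the focal length in
    pixels, B the camera baseline in metres, d the disparity in pixels.
    A one-pixel disparity error shifts Z by roughly Z**2 / (f * B), so a
    larger baseline gives finer depth resolution for far targets, while a
    small baseline keeps near targets inside the usable disparity range."""
    return focal_px * baseline_m / disparity_px
```

For example, with f = 800 px and B = 10 cm, a 40-pixel disparity corresponds to a target 2 m away; doubling the baseline doubles the disparity for the same target, making far-range measurements less noisy.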
Several years ago I saw a shooting gallery in the Palace of Miracles. They solved it with a laser gun (a laser at an invisible wavelength) and a photosensitive target. I wanted to do the same, but much cheaper, without a photosensitive target. So I thought of a visible laser dot and image processing with a webcam watching the target. That sounded much cheaper and easier, and I also wanted to learn the basics of image processing for other projects, so I started.
Purpose of the project:
So my project basically has 3 parts: a gun that emits a red laser dot for 0.1-0.2 seconds, a target made of paper, and a software part with a webcam that monitors the target, recognizes the laser dot and calculates your score.
some kind of timing IC with capacitors and resistors (more about it later)
bunch of wires
some kind of base; mine was made out of wood
for the target:
just design and print something like this on A3 paper (or bigger) (mine's here)
for the software part:
webcam
for the image processing, LabVIEW with NI-IMAQ and NI Vision Acquisition. I used LabVIEW 2009, but I suppose other versions work too.
First step: the gun
It's not so hard to build a gun with a button and a laser diode where the laser lights up when you press the button. But I needed something more: I wanted the laser dot to show up for only 0.1-0.2 seconds, because that's more realistic, and otherwise people could just slide the dot to the middle of the target. Of course it wouldn't be hard with an Arduino, but I wanted to do it more simply, with only IC(s), resistors and capacitors.
First I looked around on the internet and found that almost every timing problem has a solution called the 555 IC. I made some nicely blinking LEDs, and I could even make an LED that lights up when you press the button and turns off after a configurable time. The only problem was that this time had to be longer than the button press itself. So I asked about it on a mailing list and got several answers; they recommended a site with a description of this, but it just didn't want to work. Then I asked my digital electronics teacher, and he recommended the 74122 IC (actually he recommended the 74121, but I could only buy a 74122; they are almost the same). With this I achieved that when you press and release the button, the laser shows up for an adjustable time. So here is the wiring:
For debugging I connected another button in a simple press-to-light circuit. After testing on a breadboard, I built and soldered it together:
It looks good, but the laser dot was just too dim; from 5 m I couldn't even see it. So I used the debug button for the tests, and I'm still trying to find a good solution.
Another problem: I forgot to unplug it for the night, the battery discharged by morning, and I had to buy a new one. So I'll add a switch between the battery and the rest of the circuit.
Second step: the LabVIEW programming
As I mentioned in one of my previous posts, I found a YouTube tutorial on image processing. I edited it a little to pattern-match the middle of the target and the laser dot:
When the program starts processing the webcam image, it first makes the picture black-and-white, then starts looking for the patterns. If it finds them, it returns the coordinates of the enclosing rectangles. From those it computes the coordinates of the midpoints of the target and the dot (for this we only need the top-left and bottom-right corners of each rectangle; it also shows these coordinates as an array). Sometimes the matching algorithm loses the match for a moment and returns 0, so I made a subVI to filter this out and return the previous value:
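In Python terms, the midpoint calculation and the hold-last-value filter look roughly like this (an illustration of the logic, not the actual subVI):

```python
def rect_midpoint(top_left, bottom_right):
    """Centre of a match's bounding rectangle from two opposite corners."""
    (x1, y1), (x2, y2) = top_left, bottom_right
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def make_hold_filter():
    """When the matcher momentarily loses the pattern and reports (0, 0),
    keep returning the last valid coordinates instead."""
    last = [(0, 0)]
    def filt(coord):
        if coord != (0, 0):
            last[0] = coord   # remember the most recent real match
        return last[0]
    return filt
```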
So it knows the coordinates of the two midpoints. It then computes the distance between them and subtracts it from 100; that is your score. I store it as an unsigned integer, so if it would be less than 0, it becomes 0.
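The scoring step, in the same sketch form (the clamping at 0 mirrors what the unsigned-integer coercion does in my LabVIEW code):

```python
import math

def score(target_mid, dot_mid):
    """100 minus the distance between the target centre and the laser dot,
    clamped at 0 for shots landing more than 100 pixels away."""
    return max(0, round(100 - math.dist(target_mid, dot_mid)))
```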
But I got a strange error: if I cover the camera, there is no pattern, and all of the coordinates become the same, the previous value of the first coordinate. After 10 minutes of googling I found that all my subVI instances were sharing the same memory, so I had to clone them to solve the problem. Open the subVI, go to File > VI Properties, and change it like this:
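A rough Python analogy of why this happened: a non-reentrant subVI behaves like a function whose state lives in one shared place, so every call site reads and writes the same memory. (The names below are mine, just to illustrate the idea.)

```python
def hold_shared(value, _state={"last": 0}):
    """Shared mutable default: every call site reads and writes the same
    state, like a single non-reentrant subVI serving all callers."""
    if value != 0:
        _state["last"] = value
    return _state["last"]

def make_hold():
    """One state per instance: the equivalent of marking the subVI as a
    reentrant (cloned) VI, so each call site gets its own memory."""
    state = {"last": 0}
    def hold(value):
        if value != 0:
            state["last"] = value
        return state["last"]
    return hold
```

With hold_shared, a second caller passing 0 gets whatever the first caller stored, which is exactly the "all coordinates become the same" symptom; with make_hold, each caller keeps its own clone.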
After that it worked fine; here is the result:
Summary
As you can see in the video, sometimes it doesn't recognize the dot. This may be because the target's lines are too thick and the dot gets lost between them. I showed it to my teacher, and he recommended detecting the edge of the target and only looking for the dot inside it. He said that would also make the program run faster and smoother. I also have to rethink the gun, so version 2.0 is coming soon.
I have started learning LabVIEW at the uni. I searched around a bit and found some interesting stuff. First of all, it has an add-on for Arduino, and it isn't so hard to use, thanks to this video:
So now I have a nice interface for my Arduino, but programming in LabVIEW is still a bit new and complicated for me, so I'm going to stay with the original C++ code. But it will make a cool example when my friend and I give a presentation about serial communication in LabVIEW.
Image processing with LabVIEW
I have a big project about a shooting gallery, and for it I need an image processing method with a webcam and a computer. First I thought about MATLAB and found some very interesting videos, but then I wondered whether LabVIEW could also do this for me. And I found a video that makes my project a lot simpler:
With this tutorial I finished half of my project in one day. There were only 2 problems along the way. First, I couldn't find a VI he had used, so I googled it and found out it's part of an add-on, which I downloaded. The other problem came up when I started to configure the Vision Assistant: it complained about my screen size/resolution. I tried changing the resolution, but that didn't help; in the end I had to set the taskbar to auto-hide to make it work. I ended up here:
As you can see, it finds and marks the middle of the target and the laser dot.
Now I only have to make the gun. It's a little complicated, because when you press the button the light should show up for only 0.1-0.2 seconds, and I'd like to solve it without an Arduino, with simple circuit elements only. We'll see what I can do.