Feature Request: color-valued mask files?
Description
Maybe this request ranks way too high on Kenneth's Nerdometer, because only a
few people use color cameras, or because implementing it would give motion
too big a footprint in RAM and CPU... or because my description is too
lengthy or boring...
Anyway, I will try to describe an even smarter "smart mask", which takes
more advantage of what we know from the image data.
Summary of current mask types
Motion currently has two types of mask:
- [1] a fixed (grey-valued) mask (defined by an external *.pgm file)
- [2] a smart mask (which is built up and changed dynamically in RAM while motion is running)
From "SupportQuestion2006x05x23x081859" we know that the internal smart mask
is binary-valued and ANDed with the fixed mask (whatever that means in the case
where a fixed mask pixel is grey: will it count as 0 or 1?).
This gives a nice combination for tailoring the motion detection to one's
needs: the fixed mask cuts out uninteresting areas, and the smart mask prevents
motion from detecting motion in areas that are more or less "constantly"
changing, like the leaves of trees or bushes in the wind. This is really great!! But...
Problem:
The drawback is that motion becomes completely blind in those tree areas,
because we haven't taken advantage of all our knowledge about the
disturbing motion.
Two examples:
(1) When we have leaves moving, these leaves are green (or red), so it would be
sufficient to reduce sensitivity in green (or red) and keep full sensitivity
in all other colors. This way we could still detect small animals, e.g.
birds (black, grey, bluish) or squirrels (brown), or a walking human being
moving in front of the "constantly moving" bush, since all those objects
have non-green components. (Assuming that the human is hopefully not wearing
a green army uniform.)
(2) In windy conditions with cumulus clouds passing during the whole day, your
lawn area will rapidly change its color between "dark green" (cloud in front
of sun), yellowish-green and maybe even yellow-white or plain white (when
sunlight hits the lawn unobstructed). This could partly be handled by the existing
lightswitch feature of motion, but situations are often more complex, so that
a single percentage number will not cover all your cases.
The newly proposed "colored smart mask" would become insensitive to those
changes, but motion would not become blind on the lawn (or report detected motion
all day) and would still be able to do a good surveillance job on that changing area.
Benefit:
The "colored smart mask" is, so to speak, still transparent to things which
might be of interest, while blocking an annoying foreground or background in
the image.
E.g. a bigger orange flag somewhere in front of your image moving
in the wind will not prevent you from detecting a person moving behind it...
It would be a great benefit for motion to be able to distinguish between changes
in different colors and allow motion users to filter colors individually with a
colored fixed mask and colored smart mask.
I hope I do not receive a 10 on Kenneth's Nerdometer for this idea...
--
OliverLudwig - 24 May 2006
Follow up
Suggesting an advanced/complex feature is one thing. Having a proposal for how to implement it is another.
Since I am not a video expert, I cannot implement the idea myself or describe in detail what should be coded. But what follows is a general idea of what has to be computed.
We need basically to compute two things and then use them to define where "significant real motion" has taken place. In the case of color images we do all operations on each channel R, G, and B individually.
First, if not already done in motion, we need to have a so-called "moving mean" or "exponential moving average" (MA), sometimes called "reference image".
This could be computed by:
MA_t = alpha * img_t + (1-alpha) * MA_{t-1};
with some alpha between 0 and 1. (This is analogous to the smart_mask_speed for defining how fast past images should be neglected from the current mean image.)
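The moving-mean update above can be sketched in a few lines. This is only an illustration of the formula, not code from motion: the function name update_moving_average and the flat-list image layout are assumptions made for the example, and it handles one color channel at a time.

```python
def update_moving_average(ma_prev, img, alpha):
    """Return MA_t = alpha * img_t + (1 - alpha) * MA_{t-1}, pixel by pixel.

    ma_prev and img are flat lists holding one colour channel each
    (a hypothetical layout chosen for this sketch).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return [alpha * p + (1.0 - alpha) * m for p, m in zip(img, ma_prev)]


# One channel of a tiny 4-pixel "image".
ma = [100.0, 100.0, 100.0, 100.0]
frame = [100, 120, 80, 100]
ma = update_moving_average(ma, frame, alpha=0.25)
```

A small alpha makes the reference image follow the scene slowly; a large alpha makes it forget past frames quickly.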
Secondly, we need an "exponential moving variance" (MV) as well; the "exponential moving standard deviation" is its square root. The variance could be computed by the same principle:
MV_t = beta * (img_t - MA_t)*(img_t - MA_t) + (1-beta) * MV_{t-1};
with some beta between 0 and 1.
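A minimal sketch of this update, treating MV as a variance, so that the previous value is carried over linearly with weight (1-beta) and only the deviation from the mean is squared. The function name update_moving_variance and the flat-list layout are assumptions for illustration, not part of motion.

```python
def update_moving_variance(mv_prev, ma, img, beta):
    """Exponentially weighted variance per pixel, for one colour channel.

    mv_prev: previous variance estimate MV_{t-1}
    ma:      current moving mean MA_t
    img:     current frame img_t
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must be between 0 and 1")
    return [
        beta * (p - m) ** 2 + (1.0 - beta) * v
        for p, m, v in zip(img, ma, mv_prev)
    ]


mv = [4.0, 4.0]
ma = [100.0, 100.0]
frame = [100, 110]
mv = update_moving_variance(mv, ma, frame, beta=0.5)
```

The first pixel did not deviate from its mean, so its variance decays; the second pixel jumped by 10, so its variance grows.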
When computed with too low precision, the above formula is numerically unstable, but there exist long-known stable "one-pass" methods, if it turns out to be necessary -- see e.g. the following article: Richard J. Hanson: Stably Updating Mean and Standard Deviation of Data. Commun. ACM 18(1): 57-58 (1975).
So, now we have a mean and a variance. The last step is rather easy. From statistics we know that an outlier can be defined as a value which is off the mean by a distance of more than f * sigma, where f is a factor between 1.5 and 3 and sigma is the standard deviation (which is just the square root of the variance).
For each pixel value, we now check whether it is in the range [MA_t - f*sqrt(MV_t)...MA_t + f*sqrt(MV_t)]. If so, it is well inside the expected range of changes and is ignored. If it lies outside the interval, it is an unexpectedly large difference and hence has to be considered a significant change.
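The interval test could look roughly like the following sketch, applied to each of the R, G and B channels separately. The names is_significant and changed_channels and the sample numbers are made up for this example; f=2.0 is just one value from the 1.5 to 3 range mentioned above.

```python
import math


def is_significant(pixel, ma, mv, f=2.0):
    """True if pixel lies outside [ma - f*sigma, ma + f*sigma]."""
    sigma = math.sqrt(mv)  # standard deviation from the moving variance
    return abs(pixel - ma) > f * sigma


def changed_channels(pixel_rgb, ma_rgb, mv_rgb, f=2.0):
    """Run the outlier test on each colour channel individually."""
    return tuple(
        is_significant(p, m, v, f)
        for p, m, v in zip(pixel_rgb, ma_rgb, mv_rgb)
    )


# A "leaf" pixel: the green channel fluctuates a lot (sigma = 30),
# red and blue are fairly steady (sigma = 5 and 4).
ma_rgb = (60.0, 120.0, 50.0)
mv_rgb = (25.0, 900.0, 16.0)

leaf_in_wind = changed_channels((62, 170, 52), ma_rgb, mv_rgb)
passing_object = changed_channels((120, 110, 40), ma_rgb, mv_rgb)
```

In this made-up case the large green swing of the leaf stays inside its expected range, while an object with unusual red and blue values is flagged even though its green value looks normal.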
Computing MA and MV could be done on the fly and should not put much load on the motion program. Whether it is feasible to do the interval checking in real time, I actually don't know. This is probably the hard part.
The method should work well with color, since it is possible to have a large variance in, say, the R channel, a low variance in G and a medium variance in B...
--
OliverLudwig - 31 May 2006
One small detail.
Motion is internally based entirely on the YUV420P format.
And all Motion detection is based on the Y channel only, i.e. the black and white image. So implementing what you suggest would be a complete rewrite of Motion, and I truly doubt it would really work, since real world objects consist of a mix of the colours red, green and blue.
The original author of Motion wrote the detection in RGB, had poor practical results from it, and decided to base the detection on the light intensity part of each pixel, i.e. the B/W contents. And this is why Motion is based on YUV420P. Quite many webcams have YUV420P as their native format. This also means that converting to RGB will cause great changes in colour contents, since the colour resolution is 1/4 of the B/W in YUV420.
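To illustrate the 1/4 figure: in planar YUV 4:2:0 the U and V planes are subsampled by two both horizontally and vertically, so each holds a quarter as many samples as the Y plane. The helper below is a hypothetical illustration, not a function from Motion, and it assumes even frame dimensions.

```python
def yuv420p_plane_sizes(width, height):
    """Return the sample counts of the (Y, U, V) planes of a YUV420P frame."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")
    y_size = width * height
    chroma_size = (width // 2) * (height // 2)  # one sample per 2x2 pixel block
    return y_size, chroma_size, chroma_size


y, u, v = yuv420p_plane_sizes(640, 480)
total_bytes = y + u + v  # 8 bits per sample, i.e. 1.5 bytes per pixel
```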
--
KennethLavrsen - 31 May 2006
[...]One small detail.[...] This turns out to be essential
[...]The original author of Motion wrote the detection in RGB [...]
OK, I didn't know about this detail, that trying to use color has already
"been proven" to yield poor practical results...
[...]colour resolution is 1/4 of the B/W in YUV420[...]
OK, I've googled a bit to understand more about YUV and YUV420P. So, I guess,
forget the whole idea about color... Sorry for the inconvenience due to my lack of real-time video knowledge...
--
OliverLudwig - 01 Jun 2006
I hacked in some use of color information and found it very effective - I have lots of green that moves (trees mostly) and I could filter it out effectively. On the other hand, there is little in nature that is red and high sensitivity to it is fine.
Converting to RGB, doing a mask and converting back takes some CPU but is not a rewrite at all. Nor is there any need for color resolution loss.
--
JonZeeff - 12 Jul 2011