Brief analysis of YAP (against Muskrat) score surface
In this article I'll use an offset, flipped grayscale image of the
score surface <a href= "http://dcsnake.narod.ru/img_score.jpg" > </a>.
I'll deal with the score, but most of the argument also applies to the
separate w/l/t surfaces. pStep1 will be denoted as x1 and pStep2 as x2.
Score distribution <a href= "http://dcsnake.narod.ru/img_hg.png"> </a>
It has one intense peak; however, there are three other peaks near low
values plus one more distant peak. These peaks correspond to the
darkest lines. As one would expect, the distribution is skewed towards
low values. I'll discuss its form and origin later. Now some stats (in
points and relative to the mean):
Variable    Value [pts]   Value/Mean
Mean        182.1         1.000
Median      195.0         1.071
Mode        209.5         1.150
Max         254.3         1.396
Std. Dev.   43.2          0.237
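The Value/Mean column is simply each value divided by the mean. A tiny sketch that recomputes it from the point values in the table (the dictionary name is my own, nothing else is assumed):

```python
# Point values copied from the table above.
stats = {"Mean": 182.1, "Median": 195.0, "Mode": 209.5,
         "Max": 254.3, "Std. Dev.": 43.2}

mean = stats["Mean"]
# Each ratio is the value expressed in units of the mean, rounded to 3 digits.
ratios = {name: round(value / mean, 3) for name, value in stats.items()}

for name, ratio in ratios.items():
    print(f"{name:<10} {stats[name]:>6.1f} {ratio:>6.3f}")
```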
Things worth mentioning are the relatively high standard deviation and
the high max/mean ratio. Now I'll state the main thesis of this article
and then try to justify it.
Statement: The score surface is a superposition of two factors:
independent step effects and step interactions.
It can be written as: S[x1,x2] = c0*f1[x1]*f2[x2]*f12[x1,x2].
I obtained approximate f1 and f2 by averaging the image along the
corresponding axes.
<a href= "http://dcsnake.narod.ru/img_f1.png"> </a>
<a href= "http://dcsnake.narod.ru/img_f2.png"> </a>
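A minimal sketch of this averaging, assuming the surface is stored as a list of rows indexed surface[x1][x2] with strictly positive scores. Normalizing f1 and f2 so that each has mean 1 (with c0 as the overall mean) is my own convention, and the function names are hypothetical. Dividing the surface by c0*f1*f2 then leaves an estimate of f12.

```python
def marginals(surface):
    """Return (c0, f1, f2): overall mean and axis averages scaled to mean 1."""
    n1, n2 = len(surface), len(surface[0])
    c0 = sum(map(sum, surface)) / (n1 * n2)
    # f1[x1]: average over x2, relative to the overall mean.
    f1 = [sum(row) / n2 / c0 for row in surface]
    # f2[x2]: average over x1, relative to the overall mean.
    f2 = [sum(surface[i][j] for i in range(n1)) / n1 / c0 for j in range(n2)]
    return c0, f1, f2


def interaction(surface, c0, f1, f2):
    """Residual factor f12 = S / (c0 * f1 * f2)."""
    return [[surface[i][j] / (c0 * f1[i] * f2[j]) for j in range(len(f2))]
            for i in range(len(f1))]


# Toy surface that is a pure product a[x1] * b[x2] (not real score data),
# so the recovered f12 should be 1 everywhere.
toy = [[a * b for b in (1, 2, 3)] for a in (10, 20)]
c0, f1, f2 = marginals(toy)
f12 = interaction(toy, c0, f1, f2)
```

On a real surface f12 is not flat, so the averaged f1 and f2 also pick up part of the line pattern.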
f1 is just what it's supposed to be: a fractal function (similar in
some aspects to Dirichlet's function) with depressions around
x=coresize/n. The smaller n, the deeper the depression. This is
evidently caused by self-overwriting. f2 is much simpler: it's almost
constant on 90% of the interval. There is no overwriting effect because
in the worst case the paper will overwrite its already dead copies
(against scanners, this may have a positive effect). However, two
things are worth mentioning. First, this function has a wide depression
near low values (x = 50..230). This probably happens because larger
values could lead to better spreading. Anyway, the depression isn't too
deep: its magnitude relative to the mean is only 6 percent. Second, the
Fourier spectrum of this function is quite smooth, but it has a single
strange peak at the frequency corresponding to a size of 8.25. I have
no explanation for this (Muskrat peculiarities?). Anyway, this effect
is very small: this harmonic's amplitude relative to the mean is only
2%.
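The way such a spectral peak can be located is sketched below. The signal is a synthetic stand-in for f2, not the measured curve: a constant plus a single harmonic with the period 8.25 and relative amplitude 2% quoted above. The length 330 and the function name are my own choices, and a plain DFT is used to stay free of dependencies (an FFT library would be the usual tool).

```python
import cmath
import math


def amplitude_spectrum(values):
    """Amplitude of harmonics k = 1..n//2, each divided by the mean of values."""
    n = len(values)
    mean = sum(values) / n
    spectrum = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(values))
        scale = 1 if 2 * k == n else 2  # the Nyquist term has no mirror image
        spectrum[k] = scale * abs(coeff) / n / mean
    return spectrum


# Synthetic stand-in for f2: constant level plus one weak harmonic.
n = 330
f2 = [1 + 0.02 * math.cos(2 * math.pi * t / 8.25) for t in range(n)]

spectrum = amplitude_spectrum(f2)
peak_k = max(spectrum, key=spectrum.get)  # harmonic number of the peak
peak_period = n / peak_k                  # its period in units of x
```

Harmonic k corresponds to a period of n/k, so the peak position converts directly into a "size" along the x2 axis.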
Conclusion: x1 is MUCH more important than x2. You can have good
constants with almost any x2, but not with just any x1.
Actually, f1 and f2 are only approximations (f12 was not taken into
account when averaging), and I think that the real f2 is just a
constant; there should be no f2 multiplier. The depression near x=0 is
actually a property of f12.
f12 is created by interactions between the constants and is visible as
a set of lines. We can see a bundle of lines passing through the zero
point. They have the form x1=k*x2, k = 1..+inf, plus other lines,
namely x1=0, x2=0 and x1=-x2. The origin of these lines is clear:
self-overwriting. There are no lines of the form x2=k*x1, probably
because (correct me if I'm wrong) the paper would overwrite copies that
have already been executing for some time, and the overwriting is only
partial (the last part that is about to execute is not modified). But
there are many other lines, one may say.
Statement: there are NO lines other than those mentioned above.
They just wrap around because we have a circular core. Look at this:
<a href= "http://dcsnake.narod.ru/img_tile.jpg"> </a>
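The wrap-around can be sketched in a few lines: on a circular core the line x1 = k*x2 is really x1 = (k*x2) mod coresize, so one line through zero shows up as k parallel pieces. The toy size 12 and the function names are mine, chosen only to keep the example small.

```python
def line_points(k, size):
    """Cells (x1, x2) of the line x1 = k*x2 on a core that wraps at size."""
    return [((k * x2) % size, x2) for x2 in range(size)]


def segment_count(k, size):
    """Number of separate straight pieces the wrapped line shows (k >= 1)."""
    # Each time k*x2 passes a multiple of size the line re-enters the image.
    return len({(k * x2) // size for x2 in range(size)})


size = 12  # toy core size, not a real coresize
points = line_points(3, size)
pieces = segment_count(3, size)
```

With k = 3 the points run (0,0), (3,1), (6,2), (9,3) and then restart at (0,4), which is exactly the "other lines" effect seen in the tiled picture.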
Now we're about to draw a very important conclusion: f12 is fully
described as a superposition of these lines. At this point, I think
that I've described the nature of the surface. As you can see, there
are no good paper constants, there are only bad paper constants.
Consequences:
- This score surface is affected little by Muskrat and originates from
just two mathematical factors: f1 and f12 (an envelope plus a set of
lines). A benchmark would give an almost identical result (maybe with
the exception of the x2=k*x1 lines).
- The factors that cause such a surface are scalable -> the surface is
scalable. The surface for coresize 8000 looks almost identical (the
relative line width is smaller). One can use this surface to optimize a
paper for any coresize (one only has to renormalize the values).
- The surface could be calculated theoretically.
To illustrate the last statement, I've applied the Hough transform to
the adjusted score surface (the only adjustment was inversion; it's
needed for the transform to work properly). HT is quite simple: each
point of the image adds to every cell of an accumulator array whose
shift and angle describe a line through that point. All points of a
line add to one common cell of the array -> if we have a line in the
image, we'll see a bright point in the array. Then I quantized the
array (only the main lines were left) and drew these lines. The result
is quite similar to the original surface. I used only a few lines, and
the fact that they wrap around was not used! Thus, more precise
modelling is possible. <a
href= "http://dcsnake.narod.ru/img_ht.jpg"> </a>
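A minimal sketch of the idea follows. It is not the code I used, just the textbook form of the transform on a toy image. Lines are parametrized as x*cos(t) + y*sin(t) = rho, pixel values act as vote weights (hence the inversion, so dark lines vote strongly), and the "quantization" is modelled as a plain threshold. The 5-degree angle step, the names hough and main_lines, and the 20x20 test image are all assumptions for illustration.

```python
import math


def hough(image, n_angles=36):
    """Vote for lines x*cos(t) + y*sin(t) = rho.

    Returns {(angle_index, rho): accumulated weight}.
    """
    trig = [(math.cos(math.pi * a / n_angles), math.sin(math.pi * a / n_angles))
            for a in range(n_angles)]
    acc = {}
    for y, row in enumerate(image):
        for x, weight in enumerate(row):
            if not weight:
                continue  # empty pixels cast no votes
            for a, (c, s) in enumerate(trig):
                rho = round(x * c + y * s)
                acc[(a, rho)] = acc.get((a, rho), 0) + weight
    return acc


def main_lines(acc, threshold):
    """Keep only accumulator cells whose weight reaches the threshold."""
    return sorted(cell for cell, weight in acc.items() if weight >= threshold)


# Toy 20x20 image containing one diagonal line y = x (not the real surface).
size = 20
image = [[1 if x == y else 0 for x in range(size)] for y in range(size)]

acc = hough(image)
peak = max(acc, key=acc.get)     # brightest accumulator cell
lines = main_lines(acc, size)    # cells that collected every pixel of a line
```

The brightest cell gives the angle index and shift of the line; drawing the surviving cells back as lines yields the reconstructed picture.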
So, is it smooth? (We're dealing with discrete stuff, so don't try to
apply mathematical definitions of smoothness.) Neither f1 nor f12 is
smooth. The result isn't as smooth as I'd like. Local maxima are
chaotically scattered around. <a href=
"http://dcsnake.narod.ru/img_lmax.png"> </a> Gradient optimization is
hardly applicable: take a look at abs(grad) and arcsin(angle(grad)):
<a href= "http://dcsnake.narod.ru/img_gabs.jpg"> </a>
<a href= "http://dcsnake.narod.ru/img_gang.jpg"> </a>
But it's not that bad: it's still quite coherent (similar values are
grouped together). A bomber's score surface is much less smooth, I
guess. Taking the scalability hypothesis into account, my coherence
length estimate was correct.
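For anyone who wants to repeat the smoothness check, here is a rough sketch of the two ingredients: local maxima and the discrete gradient. It assumes the surface wraps around in both directions (circular core), uses central differences, and reports the plain atan2 direction of the gradient rather than the exact quantity shown in the angle picture. Function names and the toy data are hypothetical.

```python
import math


def gradient(surface):
    """Central-difference gradient on a wrap-around grid.

    Returns two grids: gradient magnitude and direction (radians).
    """
    n1, n2 = len(surface), len(surface[0])
    mag = [[0.0] * n2 for _ in range(n1)]
    ang = [[0.0] * n2 for _ in range(n1)]
    for i in range(n1):
        for j in range(n2):
            # Index -1 wraps to the last row/column, matching a circular core.
            d1 = (surface[(i + 1) % n1][j] - surface[i - 1][j]) / 2
            d2 = (surface[i][(j + 1) % n2] - surface[i][j - 1]) / 2
            mag[i][j] = math.hypot(d1, d2)
            ang[i][j] = math.atan2(d2, d1)
    return mag, ang


def local_maxima(surface):
    """Cells strictly greater than all 8 wrap-around neighbours."""
    n1, n2 = len(surface), len(surface[0])
    peaks = []
    for i in range(n1):
        for j in range(n2):
            neighbours = [surface[(i + di) % n1][(j + dj) % n2]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di, dj) != (0, 0)]
            if surface[i][j] > max(neighbours):
                peaks.append((i, j))
    return peaks


# Toy 5x5 surface with a single spike (not real score data).
toy = [[0] * 5 for _ in range(5)]
toy[2][2] = 9
mag, ang = gradient(toy)
peaks = local_maxima(toy)
```

On a surface made of sharp lines the gradient direction flips from cell to cell, which is why following it is of little use for optimization.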
That's all for now. I guess KOTH will now be flooded with papers having
constants lying in good areas.
