Parallelization Tests
Bezier Surfaces
Results over 20 runs with:
npts = 50
mpts = 50
udpts = 2500
wdpts = 2500
1 proc | min | avg | max |
---|---|---|---|
serial | 3.817236 | 4.302450 | 4.587352 |
parallel 1 | 3.803109 | 4.229924 | 4.662597 |
parallel 2 | 3.813949 | 4.231862 | 4.651251 |
parallel 3 | 3.813311 | 4.274616 | 4.871468 |
10 proc | min | avg | max |
---|---|---|---|
serial | 3.839719 | 4.609300 | 6.655391 |
parallel 1 | 3.826293 | 4.541800 | 5.381051 |
parallel 2 | 3.843635 | 4.517371 | 5.319194 |
parallel 3 | 3.834900 | 4.452138 | 5.040315 |
the test made are:
serial: no parallelization esplicitly made.
parallel 1: with
@sync
and@async
and matrices defined inside@sync
block.parallel 2: with
@sync
and@async
and matrices defined before@sync
block.
Result over 50 runs with:
npts = 50
mpts = 50
udpts = 2500
wdpts = 2500
1 proc | average |
---|---|
serial | 4.44101856 |
parallel 1 | 4.45837448 |
parallel 2 | 4.47456267 |
parallel 3 | 4.43402378 |
––––––– | ––––––– |
10 proc | average |
---|---|
serial | 4.61977279 |
parallel 1 | 4.60410667 |
parallel 2 | 4.55359172 |
parallel 3 | 4.51270721 |
The Code
The initialisation of the calculus:
begin
npts = 50
mpts = 50
udpts = 2500
wdpts = 2500
bplus = zeros(3, npts * mpts)
for i = 1 : npts
ki = i/2
for j = 1 : mpts
kj = j /2
r = sqrt((ki)^2+(kj)^2)
if (r == 0)
r = 0.000001
end
ij = (i-1)*npts + j
bplus[1,ij] = i
bplus[2,ij] = j# + i/10
bplus[3,ij] = i+j
end
end
end
and the execution:
begin
minor = 10.0
medium = 0.0
major = 0.0
for i = 1 : 20
x = @elapsed bezsurf(npts, mpts, bplus, 1000, 1000)
if (x < minor)
minor = x
end
if (x > major)
major = x
end
medium += x
end
medium /= 20
println("minimum = $(minor)")
println("average = $(medium)")
println("maximum = $(major)")
end
Conclusions
New tests have been made on small and medium data sets.
All of them have enhanced the same results: over "not so big" data sets no particular changes have been discovered.
At this state of art Julia parallelism seems to be the best choice as manage memory allocation of Activation Registers in the most efficient way. Usually Julia's default parallelization seems to be a little slowler. However this virtual time cost is totally overcome by the smaller amount of memory leack that, in user parallelizations, come at the cost of a high number of Garbage Collector calls.
More test should be made in order to have a precise results sets but it seems useless at the current state of art to waste time over its analysis.