[Back to MATH SWAG index] [Back to Main SWAG index] [Original]
{
>I need to find the minimum and maximum values in large arrays (~20,000
>points) of type word. Looking for a faster way using TP7 (assembler?)
>to do it than:
>
>min := data^[1];
>max := data^[1];
>
>for i := 2 to num_points do
>begin
> data_value := data^[i];
> if data_value < min then min := data_value;
> if data_value > max then max := data_value;
>end;
>
Lets try some asm here:
From: terjem@hda.hydro.com (Terje Mathisen)
}
Procedure FindMinMax(var data; num_points : word; var min, max : word);
Assembler;
asm
push ds
lds si,[data]
mov cx,[num_points]
sub cx,1
jc @done {Empty array! }
mov dx,[si] {Min value}
lea si,[si+2] {Point at second table entry}
mov bx,dx {Max value}
jz @store {Single-entry array!}
@loop:
mov ax,[si]
add si,2
cmp ax,dx
jb @new_min
cmp ax,bx
ja @new_max
dec cx
jnz @loop
jmp @store
@new_min:
mov dx,ax {Save new min value}
dec cx
jnz @loop
jmp @store
@new_max:
mov bx,ax {Save new max value}
dec cx
jnz @loop
@store: {Return the values found!}
lds si,[min]
mov [si],dx
lds si,[max]
mov [si],bx
@done:
pop ds
end;
{
This was written from scratch, so no testing whatsoever! It should be quite
well optimized for both 486 and Pentium-class machines, running in about 10
cycles/word on a 486, and just 4 cycles/word on a Pentium, since the
8 inner-loop instructions will pair perfectly. With 20,000 points in your
array, this should correspond to 6ms on a 486-33, and less than a milli-
second on a Pentium-90.
The most important feature to note is that new extremal values will be quite
rare, averaging just O(log(n)) for a random n-element array. That's why I
jump out of the loop to handle these cases, making the normal case much
faster. With a worst-case, already sorted array, we will find a new max
value on each iteration, which will increase the running time to 12 cycles
on the 486, while the Pentium will stay constant at 4 cycles.
If the array is pre-sorted in reverse (declining) order, the 486 is back to
10 cycles, while the P5 is actually faster, at just 3 cycles/word.
I think the only way to improve on this code is by unrolling it, which will
save up to 4 cycles/word for the 486, and just a single P5 cycle.
PS. Make sure that your array is naturally aligned (16-bit word), if not it
will run a lot slower, esp. on a P5.
}
[Back to MATH SWAG index] [Back to Main SWAG index] [Original]