I wrote an N-section algorithm for finding roots of a function. The working principle is exactly the same as in the bisection method, only the range is divided into N equal parts instead. This is my C code:
/*
    Compile with: clang -O3 -o nsect nsect.c -Wall -DCOUNT=5000000 -DNSECT=? -funroll-loops
*/
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

#define N 6 /* number of coefficients (degree-5 polynomial) */

#ifndef COUNT
#error COUNT not defined!
#endif

#ifndef NSECT
#define NSECT 2
#endif

float polynomial[N];

/* Evaluate the polynomial at x using Horner's method */
float horner( const float poly[N], float x )
{
    float val = poly[N-1];
    for ( int i = N - 2; i >= 0; i-- )
        val = poly[i] + x * val;
    return val;
}

float f( float x )
{
    return horner( polynomial, x );
}

/* Find a root of f in [a, b] by repeatedly splitting the bracket into NSECT equal parts */
float nsect( float a, float b, const float eps_x )
{
    float fa = f( a );
    float fb = f( b );
    if ( fa == 0 ) return a;
    else if ( fb == 0 ) return b;
    else if ( fa * fb > 0 ) return 0; /* no sign change over [a, b] - give up */

    float x[NSECT];
    float fx[NSECT];

    while ( b - a > eps_x )
    {
        x[0] = a;
        if ( ( fx[0] = f( x[0] ) ) == 0 ) return x[0];

        int found = 0;
        for ( int i = 0; i < NSECT - 1; i++ )
        {
            x[i + 1] = a + ( b - a ) * (float)( i + 1 ) / NSECT;
            if ( ( fx[i + 1] = f( x[i + 1] ) ) == 0 )
                return x[i + 1];
            else if ( fx[i] * fx[i + 1] < 0 ) /* sign change - narrow the bracket */
            {
                a = x[i];
                b = x[i + 1];
                found = 1;
                break;
            }
        }

        if ( !found ) /* the sign change must be in the last subinterval */
            a = x[NSECT - 1];
    }

    return ( a + b ) * 0.5f;
}

int main( int argc, char **argv )
{
    struct timeval t0, t1;

    /* Read COUNT polynomials (N coefficients each) from stdin */
    float *polys = malloc( COUNT * sizeof( float ) * N );
    float *p = polys;
    for ( int i = 0; i < COUNT * N; i++ )
        scanf( "%f", p++ );

    float xsum = 0; // So the code isn't optimized away when we don't print the roots
    p = polys;

    gettimeofday( &t0, NULL );
    for ( int i = 0; i < COUNT; i++, p += N )
    {
        memcpy( polynomial, p, N * sizeof( float ) );
        float x = nsect( -100, 100, 1e-3 );
        xsum += x;
#ifdef PRINT_ROOTS
        fprintf( stderr, "%f\n", x );
#endif
    }
    gettimeofday( &t1, NULL );

    fprintf( stderr, "xsum: %f\n", xsum );
    printf( "%f ms\n", ( ( t1.tv_sec - t0.tv_sec ) * 1e6 + ( t1.tv_usec - t0.tv_usec ) ) * 1e-3 );

    free( polys );
}
EDIT: This is the command I used to compile the code: clang -O3 -o nsect nsect.c -Wall -DCOUNT=5000000 -DNSECT=? -funroll-loops. I ran everything on an i7-8700k.
I decided to test the algorithm's performance for different N values. The test consists of measuring the time needed to find any root in the range (-100; 100) for each of 5,000,000 polynomials of degree 5. The polynomials are randomly generated and have real coefficients ranging from -5 to 5. The polynomial values are calculated using Horner's method.
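For completeness, input of this form can be generated with something like the following sketch (only a sketch: I'm assuming coefficients drawn uniformly from [-5, 5]; the fixed seed and the name gen.c are arbitrary):

/* gen.c - sketch of an input generator: prints COUNT * N coefficients,
   drawn uniformly from [-5, 5], one per line, for piping into nsect. */
#include <stdio.h>
#include <stdlib.h>

#ifndef COUNT
#define COUNT 5000000
#endif
#define N 6 /* 6 coefficients per degree-5 polynomial */

int main( void )
{
    srand( 1234 ); /* arbitrary fixed seed so runs are repeatable */
    for ( long i = 0; i < (long)COUNT * N; i++ )
        printf( "%f\n", -5.0 + 10.0 * rand() / RAND_MAX );
    return 0;
}

Something like ./gen | ./nsect then runs the benchmark (binary names assumed).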
These are the results I got from running the code 10 times for each N (x = N, y = time [ms]):
My understanding of the worst-case performance here is that the amount of work done in each iteration of the main while loop is proportional to N, and that the loop needs log_N(C) iterations to complete (where C > 1 is a constant: the ratio of the initial search range to the requested accuracy). This yields the following cost function:

f(N) = N * log_N(C) = N * ln(C) / ln(N)
The plot of this function looks very similar to the violet curve I used above to roughly match the data.
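Just to show the shape of this function numerically, here is a throwaway snippet (not part of the benchmark; C = 200 / 1e-3 matches the search range and eps_x used in nsect above):

/* Tabulate f(N) = N * log_N(C) = N * ln(C) / ln(N) for small N. */
#include <math.h>
#include <stdio.h>

int main( void )
{
    const double C = 200.0 / 1e-3; /* initial range / requested accuracy */
    for ( int n = 2; n <= 16; n++ )
        printf( "N=%2d  f(N)=%.1f\n", n, n * log( C ) / log( n ) );
    return 0;
}

For these parameters it gives f(2) ≈ 35.2 and f(3) ≈ 33.3, with the minimum over the integers at N = 3.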
Now, I have some (hopefully correct) conclusions and questions:
- Firstly, is my approach valid at all?
- Does the function I came up with really describe the relation between the number of operations and N?
- I think this is the most interesting one: what causes such a significant difference between N=2 and all the others? I consistently get this pattern for all my test data.
Moreover:
- That function has its minimum at e ≈ 2.718..., which is closer to 3 than to 2. Also, f(3) < f(2) holds (see the short derivation after this list). Given that the equation I came up with is correct, I think that would imply the trisection method may actually be more efficient than bisection. It seems a bit counterintuitive, but the measurements seem to confirm it. Is this right?
- If so, why does the trisection method seem to be so unpopular compared to bisection?
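For reference, here is where that minimum comes from, treating N as a continuous variable (a simplification, since N is really an integer):

f(N) = N \log_N C = \frac{\ln C}{\ln N}\, N,
\qquad
f'(N) = \ln C \cdot \frac{\ln N - 1}{(\ln N)^2} = 0
\;\Longrightarrow\; \ln N = 1
\;\Longrightarrow\; N = e \approx 2.718

Since N has to be an integer, the nearest candidates are 2 and 3, and f(3)/f(2) = 3\ln 2 / (2\ln 3) \approx 0.95 independently of C, so f(3) < f(2).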
Thank you
EDIT 2: I used the clang compiler and -O3 optimization. I didn't, however, try using -ffast-math or enabling vectorization. My CPU is an i7-8700k.

Note also that inside the while loop fa == f(x[0]) (and, conceptually, fb == f(x[NSECT])), so the value at the left end of the bracket is recomputed on every iteration even though it is already known. This has a massive influence on the runtime in the bisection and trisection cases, not so much in the higher divisions.
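Here is a minimal sketch of how that redundant evaluation could be removed by carrying the left endpoint's value across iterations (nsect_reuse is just an illustrative name; it assumes the same f() and NSECT as above and is not the code that produced the timings):

/* Sketch: same bracketing scheme, but f() is never re-evaluated at a point
   whose value is already known. */
float nsect_reuse( float a, float b, const float eps_x )
{
    float fa = f( a );
    float fb = f( b );
    if ( fa == 0 ) return a;
    else if ( fb == 0 ) return b;
    else if ( fa * fb > 0 ) return 0;

    while ( b - a > eps_x )
    {
        float xl = a, fl = fa; /* left end of the subinterval being tested */
        for ( int i = 1; i < NSECT; i++ )
        {
            float xr = a + ( b - a ) * (float)i / NSECT;
            float fr = f( xr );
            if ( fr == 0 ) return xr;
            if ( fl * fr < 0 ) /* sign change: the root is in [xl, xr] */
            {
                b = xr;
                fb = fr;
                break;
            }
            xl = xr; /* no sign change yet: slide one subinterval to the right */
            fl = fr;
        }
        a = xl; /* left end of the bracketing subinterval, or x[NSECT-1]
                   when no sign change was found among the inner points */
        fa = fl;
    }
    return ( a + b ) * 0.5f;
}

Per while-loop iteration this does NSECT - 1 evaluations of f() instead of NSECT, which matters most for small NSECT: it removes 1 of 2 evaluations for bisection and 1 of 3 for trisection, but only 1 of 6 for N = 6.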