We don't have sincos in the Stdlib?

If I am right, we have sin and cos but not the combination of the two.
On modern hardware, I think there are special instructions to compute both at once.
This matters because you sometimes need both the sin and the cos of an angle.

There is an ongoing discussion here: "Stdlib.sincos [feature request]", issue #12463 in ocaml/ocaml on GitHub.

AFAIK such instructions exist only on the x87 FPU (i.e. when compiling with ‘-m32’, which is not the default), and even there they are so inaccurate that the compiler would never use them by default. See D36344, "[X86] Don't use fsin/fcos/fsincos instructions ever", and "Intel Underestimates Error Bounds by 1.3 quintillion" on Random ASCII, Bruce Dawson's tech blog.

Also, when using SSE there are no HW instructions to compute these AFAIK; they are computed in software in libm. That software implementation is apparently faster than calling fsincos (see the Stack Overflow question "Calling fsincos instruction in LLVM slower than calling libc sin/cos functions?").

Having said that, even the software implementation of sincos (if available) might be faster than calling sin and cos separately.
YMMV, so I’d suggest doing some measurements (the results may also differ depending on which libm implementation is used and which architecture it is compiled for).


I will try to benchmark a binding to the C sincos function.

That is the point. Most modern software implementations of sine and cosine rely on the following trigonometric identities: sin (a + b) = sin a * cos b + cos a * sin b and cos (a + b) = cos a * cos b - sin a * sin b. So, if your mathematical library has computed one (meaning it has already computed the four values cos a, sin a, cos b, and sin b), it is only two multiplications and one addition away from computing the other.

That is why both GCC and Clang optimize the following code into a single call to sincos(x) followed by an addition. (For Clang, you might need to pass -ffast-math.)

#include <math.h>
double foo(double x) { return sin(x) + cos(x); }

Perhaps a way to implement this would be to add a configure test for sincos (it seems to be available at least on GNU libc, musl libc and FreeBSD):
https://man7.org/linux/man-pages/man3/sincos.3.html
https://man.freebsd.org/cgi/man.cgi?query=sincos&sektion=3&format=html
(Some other platforms like Android implement it too, but just as calls to sin and cos in sequence).

If the function is not available then an ifdef macro could choose to call sin and cos sequentially. To make ‘noalloc’ direct C calls from OCaml possible (without intermediate stubs) perhaps there could be an ‘ml_sincos’ function that is either defined to be equal to ‘sincos’ or the above fallback implementation. This could be prototyped as a small library on ‘opam’ initially and the performance benefits and usefulness measured.
(Although ‘sincos’ needs to return 2 values, so it is not immediately obvious to me whether a ‘noalloc’ implementation would be possible here, unless you’ve preallocated an array or bigarray to store the results in)

So if FreeBSD has it, does that mean that Mac OS X also has it?
And any BSD, in fact?

On macOS 13.4, the function is called __sincos.

What about Windows?

If you have any numerical code dealing with rotations, as soon
as you have an angle, you are usually interested in both its sine and its cosine.
This is useful for computer graphics but also molecular simulation software.

One recent example I plan to use:
https://openaccess.thecvf.com/content/CVPR2022/html/Alexa_Super-Fibonacci_Spirals_Fast_Low-Discrepancy_Sampling_of_SO3_CVPR_2022_paper.html

@UnixJunkie for anything complex (not as in complex numbers, lol) I would recommend just using more advanced C libraries and OCaml bindings to them: