This is not directly related to CUDA, but if I understood @jrzhao42 correctly, one can use owl-symbolic to build computation graphs for any(?) Owl computation and export them to the ONNX format, which can then be loaded and run on a GPU using performant libraries.
I'm taking this opportunity to ping @jrzhao42, who may correct me or give you more accurate information.