Out-of-Process Execution

Neuropod can run models in different processes using an optimized shared-memory implementation with extremely low overhead (~100 to 500 microseconds per inference).

To run a model in another process, set the use_ope option in RuntimeOptions when loading the model:

#include "neuropod/neuropod.hh"

// Enable out-of-process execution; the model runs in a worker process
neuropod::RuntimeOptions opts;
opts.use_ope = true;

neuropod::Neuropod model(neuropod_path, opts);

Nothing else should need to change.
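
For example, running inference against an OPE-loaded model looks exactly like the in-process case. The sketch below assumes a model with a single float input named "x" and a single float output named "y" (both hypothetical names), using Neuropod's tensor allocator API:

// Build an input tensor; the data is transparently transferred to the
// worker process over shared memory
auto allocator = model.get_tensor_allocator();
auto tensor = allocator->allocate_tensor<float>({4});
tensor->copy_from(std::vector<float>{1.0f, 2.0f, 3.0f, 4.0f});

// Run inference (this blocks while the worker executes the model)
const auto outputs = model.infer({{"x", tensor}});
const auto y = outputs->at("y")->as_typed_tensor<float>();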

There are many potential benefits of this approach:

  • Run models that require different versions of Torch or TensorFlow from the same "master" process (in progress)
  • Pin the worker process to a specific core to reduce variability in inference time (in progress)
  • Isolate models from each other and from the rest of your program
  • Avoid sharing the GIL between multiple Python models in the same process (see the sketch after this list)
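
For example, two Python-backed models loaded with OPE each get their own worker process, and therefore their own Python interpreter and GIL (the model paths below are hypothetical):

neuropod::RuntimeOptions opts;
opts.use_ope = true;

// Each load spawns a separate worker process, so the two Python
// interpreters never contend for a shared GIL
neuropod::Neuropod model_a("/models/python_model_a", opts);
neuropod::Neuropod model_b("/models/python_model_b", opts);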

The worker process can also be run in a Docker container to provide even more isolation.
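
One way to set this up is to start the worker yourself (for example, inside the container) and attach to it when loading the model. The sketch below assumes the control_queue_name field of OPEOptions behaves as its source comments describe (empty means "spawn a new worker"; non-empty means "attach to an existing worker listening on that queue"); the queue name here is hypothetical:

neuropod::RuntimeOptions opts;
opts.use_ope = true;

// Attach to a worker that is already running (e.g., inside a Docker
// container) instead of spawning a new one. "neuropod_worker_1" is a
// hypothetical queue name the worker was started with.
opts.ope_options.control_queue_name = "neuropod_worker_1";

neuropod::Neuropod model(neuropod_path, opts);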

For more details and options, see the OPEOptions struct inside RuntimeOptions.
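
As a rough sketch of the kind of tuning available there, assuming the free_memory_every_cycle field present in current Neuropod sources (check the field names against your release):

neuropod::RuntimeOptions opts;
opts.use_ope = true;

// Whether unused shared-memory blocks are freed after every inference
// cycle instead of being kept for reuse; see the OPEOptions
// documentation for the performance tradeoffs
opts.ope_options.free_memory_every_cycle = true;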