miniosl.inference¶
inference modules
- class miniosl.inference.InferenceModel(device)[source]¶
interface for inference using trained models
- eval(input: ndarray, *, take_softmax: bool = False) Tuple[ndarray, float, ndarray] [source]¶
return a (move, value, aux) tuple; softmax is applied to the move output only when take_softmax is True
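The eval contract above can be illustrated with a minimal NumPy stand-in. The class name `DummyModel` and all tensor shapes are illustrative assumptions, not part of miniosl; only the (move, value, aux) return shape and the `take_softmax` behavior mirror the documented interface.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class DummyModel:
    """Illustrative stand-in for the InferenceModel.eval contract:
    returns a (move, value, aux) tuple; move logits pass through
    softmax only when take_softmax=True."""

    def eval(self, input: np.ndarray, *, take_softmax: bool = False):
        move_logits = np.zeros(2187)   # policy head output (size illustrative)
        value = 0.0                    # scalar game-value estimate
        aux = np.zeros(81)             # auxiliary head (size illustrative)
        move = softmax(move_logits) if take_softmax else move_logits
        return move, value, aux
```

With `take_softmax=True` the move output is a probability distribution (it sums to 1); otherwise raw logits are returned unchanged.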
- class miniosl.inference.OnnxInfer(path: str, device: str)[source]¶
- infer_iobinding(inputs: Tensor)[source]¶
work in progress: runs correctly on small data, but diverges when batch size × number of batches exceeds a threshold (e.g., 1500)
api: https://onnxruntime.ai/docs/api/python/api_summary.html#data-on-device
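The IOBinding path linked above can be sketched as follows. This is a generic ONNX Runtime sketch, not miniosl's actual implementation; it binds a CPU input buffer and lets the runtime allocate the output, and the import is deferred so defining the helper does not require onnxruntime.

```python
def run_with_iobinding(session, x):
    """Sketch of inference via ONNX Runtime IOBinding.

    `session` is an onnxruntime.InferenceSession and `x` a NumPy array;
    tensor names are queried from the session rather than hard-coded.
    """
    binding = session.io_binding()
    # Bind the input from a CPU buffer; for device-resident data one
    # would bind an OrtValue already on the GPU instead.
    binding.bind_cpu_input(session.get_inputs()[0].name, x)
    # Let ONNX Runtime allocate the output on its chosen device.
    binding.bind_output(session.get_outputs()[0].name)
    session.run_with_iobinding(binding)
    return binding.copy_outputs_to_cpu()
```

Keeping tensors bound on the device between calls is what avoids per-batch host↔device copies; the divergence noted above suggests the bound buffers are not yet managed correctly across many batches.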
- miniosl.inference.load(path: str, device: str = '', torch_cfg: dict = {}, *, compiled: bool = False, strict: bool = True, remove_aux_head: bool = False) InferenceModel [source]¶
factory method to load a model from file
- Parameters:
path – filepath of the model to load
device – torch device such as ‘cuda’ or ‘cpu’
torch_cfg – network specification needed for TorchInfer
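A typical call might look like `model = miniosl.inference.load('model.onnx', device='cuda')` (path and device are illustrative). The sketch below shows one plausible way such a factory could dispatch on the file suffix; the helper name `backend_for` and the suffix rules are hypothetical, and miniosl's actual dispatch logic may differ.

```python
from pathlib import Path

def backend_for(path: str) -> str:
    """Hypothetical helper: pick an inference backend by file suffix.
    miniosl's real load() may use different rules."""
    if Path(path).suffix == '.onnx':
        return 'OnnxInfer'          # ONNX Runtime backend
    return 'TorchInfer'             # torch checkpoints also need torch_cfg
```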