Python bindings
TL;DR
# install vAccel
wget https://s3.nbfc.io/nbfc-assets/github/vaccelrt/main/`uname -m`/Release-deb/vaccel-0.6.0-Linux.deb
sudo dpkg -i vaccel-0.6.0-Linux.deb
# install python bindings
wget https://s3.nbfc.io/nbfc-assets/github/python-vaccel/main/x86_64/vaccel-python-0.0.1.tar.gz
pip3 install vaccel-python-0.0.1.tar.gz
# Run an example
cat << EOF > cat.py
from vaccel.session import Session
from vaccel.image import ImageClassify
source = "cat.jpeg"
def main():
    ses = Session(flags=3)
    print(f'Session id is {ses.id()}')
    res = ImageClassify.classify_from_filename(session=ses, source=source)
    print(res)

if __name__=="__main__":
    main()
EOF
wget https://i.imgur.com/aSuOWgU.jpeg -O cat.jpeg
export VACCEL_BACKENDS=/usr/local/lib/libvaccel-noop.so
export LD_LIBRARY_PATH=/usr/local/lib
#export PYTHONPATH=.
python3 cat.py
Initial Setup
Install vAccel
To build the Python bindings for vAccel, we first need a vAccel installation. We can either build it from source or get the latest binary release.
The relevant libs & plugins should be in /usr/local/lib, along with the include files in /usr/local/include.
# install vAccel
wget https://s3.nbfc.io/nbfc-assets/github/vaccelrt/main/`uname -m`/Release-deb/vaccel-0.6.0-Linux.deb
sudo dpkg -i vaccel-0.6.0-Linux.deb
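You can sanity-check the installation from Python before moving on. A minimal sketch, assuming the library is installed as /usr/local/lib/libvaccel.so (adjust the paths if your layout differs):
import ctypes
import pathlib

# Paths assumed from the layout described above; adjust if your install differs.
lib = pathlib.Path("/usr/local/lib/libvaccel.so")
plugin = pathlib.Path("/usr/local/lib/libvaccel-noop.so")

assert lib.exists(), f"missing {lib}"
assert plugin.exists(), f"missing {plugin}"
ctypes.CDLL(str(lib))  # raises OSError if the library cannot be loaded
print("vAccel installation looks OK")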
Install Python
To build and use the Python bindings, we need Python 3 installed.
sudo apt-get install python3 python3-venv python3-pip
Install from binaries
We provide experimental builds (as a pip wheel and binary package). Get them through the binaries table or just run the following commands:
wget https://s3.nbfc.io/nbfc-assets/github/python-vaccel/main/x86_64/vaccel-python-0.0.1.tar.gz
pip3 install vaccel-python-0.0.1.tar.gz
You should see output similar to the following:
$ pip3 install vaccel-python-0.0.1.tar.gz
Processing ./vaccel-python-0.0.1.tar.gz
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: cffi>=1.0.0 in /usr/local/lib/python3.8/dist-packages (from vaccel-python==0.0.1) (1.15.1)
Requirement already satisfied: pycparser in /usr/local/lib/python3.8/dist-packages (from cffi>=1.0.0->vaccel-python==0.0.1) (2.21)
Building wheels for collected packages: vaccel-python
Building wheel for vaccel-python (pyproject.toml) ... done
Created wheel for vaccel-python: filename=vaccel_python-0.0.1-cp38-cp38-linux_x86_64.whl size=44484 sha256=f0a9e056367207690f08e78cf771f15d00b5e2d7b67fac1d56b17bbe9b6b9509
Stored in directory: /root/.cache/pip/wheels/2e/2e/a0/f07c8ed8d59a2cb16825ef23a4ef15b34d452a3bab962fea61
Successfully built vaccel-python
Installing collected packages: vaccel-python
Attempting uninstall: vaccel-python
Found existing installation: vaccel-python 0.0.1
Uninstalling vaccel-python-0.0.1:
Successfully uninstalled vaccel-python-0.0.1
Successfully installed vaccel-python-0.0.1
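As a quick, illustrative check that the wheel landed correctly, you can query its metadata (the distribution name vaccel-python comes from the pip output above):
from importlib.metadata import version

# The distribution name matches the pip output above.
print(version("vaccel-python"))  # expected: 0.0.1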
Go ahead and run an example!
Build from Source
Prerequisites
On Debian-based systems, you need the following packages to build the Python bindings for vAccel:
- cmake
- build-essential
- python3-dev
- python3-venv
You can install them using the following command:
sudo apt-get install -y cmake build-essential python3-dev python3-venv
Install tools to build the bindings
Additionally, to build the bindings we need the following packages (installable via pip3). To avoid polluting the host, we can use a virtual environment:
python3 -m venv .venv
. .venv/bin/activate
and install the required packages:
pip3 install datestamp cffi wheel setuptools cmake_build_extension
Get the source code
Get the source code for python-vaccel:
git clone https://github.com/nubificus/python-vaccel.git
cd python-vaccel
Build the Python package
We will create a virtual environment inside the root directory of python-vaccel and install the python-vaccel package in it.
python3 -m venv venv
Now, go ahead and activate the newly created environment:
. venv/bin/activate
Update pip and install the Python dependencies:
python3 -m pip install --upgrade pip
python3 -m pip install wheel
python3 -m pip install flake8 build setuptools \
cffi pytest pytest-cov datestamp cmake_build_extension
Now let's build the package:
export VACCEL_BACKENDS=/usr/local/lib/libvaccel-noop.so
export LD_LIBRARY_PATH=/usr/local/lib
export PYTHONPATH=.
python3 builder.py
python3 setup.py install
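As a quick smoke test that the build produced an importable package (illustrative):
import vaccel

# A failed build typically raises ImportError here; otherwise print
# where the package was installed.
print(vaccel.__file__)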
[Optional] Run the tests to make sure everything was built correctly:
python3 -m pytest
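For reference, a minimal test in the same spirit might look like the sketch below; it only uses the Session API shown on this page and assumes the noop plugin is set in VACCEL_BACKENDS, as above:
# test_session_smoke.py - illustrative only; the real tests live under tests/
from vaccel.session import Session

def test_session_gets_an_id():
    ses = Session(flags=3)
    assert ses.id() > 0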
Test the installation
To run the tests:
export VACCEL_BACKENDS=/usr/local/lib/libvaccel-noop.so
export LD_LIBRARY_PATH=/usr/local/lib
export PYTHONPATH=$PYTHONPATH:.
pytest
# Test coverage
export VACCEL_BACKENDS=/usr/local/lib/libvaccel-noop.so
export LD_LIBRARY_PATH=/usr/local/lib
export PYTHONPATH=$PYTHONPATH:.
pytest --cov=vaccel tests/
The output should be something like the following:
$ pytest --cov=vaccel tests
=================================================== test session starts ===================================================
platform linux -- Python 3.10.7, pytest-7.2.0, pluggy-1.0.0
rootdir: /home/ananos.linux/develop/python-vaccel
plugins: cov-4.0.0
collected 13 items
tests/test_general.py .. [ 15%]
tests/test_image.py ..... [ 53%]
tests/test_image_genop.py ..... [ 92%]
tests/test_tf.py . [100%]
---------- coverage: platform linux, python 3.10.7-final-0 -----------
Name Stmts Miss Cover
-------------------------------------------
vaccel/__init__.py 9 0 100%
vaccel/error.py 7 3 57%
vaccel/genop.py 111 12 89%
vaccel/image.py 127 15 88%
vaccel/image_genop.py 58 1 98%
vaccel/noop.py 10 2 80%
vaccel/resource.py 11 11 0%
vaccel/session.py 26 5 81%
vaccel/tensorflow.py 206 39 81%
vaccel/test.py 102 102 0%
-------------------------------------------
TOTAL 667 190 72%
=================================================== 13 passed in 0.04s ====================================================
Loading libvaccel
Loading plugins
Loading plugin: /usr/local/lib/libvaccel-noop.so
Loaded plugin noop from /usr/local/lib/libvaccel-noop.so
[noop] Calling no-op for session 2
[noop] Calling Image classification for session 2
[noop] Dumping arguments for Image classification:
[noop] len_img: 79281
[noop] will return a dummy result
[noop] Calling Image detection for session 2
[noop] Dumping arguments for Image detection:
[noop] len_img: 79281
[noop] Calling Image segmentation for session 2
[noop] Dumping arguments for Image segmentation:
[noop] len_img: 79281
[noop] Calling Image pose for session 2
[noop] Dumping arguments for Image pose:
[noop] len_img: 79281
[noop] Calling Image depth for session 2
[noop] Dumping arguments for Image depth:
[noop] len_img: 79281
[noop] Calling Image classification for session 2
[noop] Dumping arguments for Image classification:
[noop] len_img: 79281
[noop] will return a dummy result
[noop] Calling Image detection for session 2
[noop] Dumping arguments for Image detection:
[noop] len_img: 79281
[noop] Calling Image segmentation for session 2
[noop] Dumping arguments for Image segmentation:
[noop] len_img: 79281
[noop] Calling Image pose for session 2
[noop] Dumping arguments for Image pose:
[noop] len_img: 79281
[noop] Calling Image depth for session 2
[noop] Dumping arguments for Image depth:
[noop] len_img: 79281
[noop] Run options -> (nil), 0
[noop] Number of inputs: 1
[noop] Node 0: serving_default_input_1:0
[noop] #dims: 2 -> {1 30}
[noop] Data type: 1
[noop] Data -> 0xaaaaf5135600, 120
[noop] Number of outputs: 1
[noop] Node 0: StatefulPartitionedCall:0
Shutting down vAccel
Run a simple Python application
To see the vAccel Python bindings in action, let's try the following example:
Simple Example
Download an adorable kitten photo:
wget https://i.imgur.com/aSuOWgU.jpeg -O cat.jpeg
Create a new Python file called cat.py and add the following lines:
from vaccel.session import Session
from vaccel.image import ImageClassify
source = "cat.jpeg"
def main():
    ses = Session(flags=3)
    print(f'Session id is {ses.id()}')
    res = ImageClassify.classify_from_filename(session=ses, source=source)
    print(res)

if __name__=="__main__":
    main()
Now, when you run the Python file, you should see the dummy classification tag for the image:
$ export VACCEL_BACKENDS=/usr/local/lib/libvaccel-noop.so
$ export LD_LIBRARY_PATH=/usr/local/lib
$ export PYTHONPATH=.
$ python3 cat.py
Loading libvaccel
Loading plugins
Loading plugin: /usr/local/lib/libvaccel-noop.so
Loaded plugin noop from /usr/local/lib/libvaccel-noop.so
Session id is 1
[noop] Calling Image classification for session 1
[noop] Dumping arguments for Image classification:
[noop] len_img: 54372
[noop] will return a dummy result
This is a dummy classification tag!
Shutting down vAccel
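A single session can serve multiple operations (the test output earlier issues several calls on session 2), so you can reuse one Session across images. A minimal sketch built from the same two calls used in cat.py; dog.jpeg is a stand-in for any local JPEG:
from vaccel.session import Session
from vaccel.image import ImageClassify

# One session, many classifications.
ses = Session(flags=3)
for image in ["cat.jpeg", "dog.jpeg"]:
    res = ImageClassify.classify_from_filename(session=ses, source=image)
    print(f"{image}: {res}")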
Jetson example
To try vAccel on a more realistic example, we'll use the jetson-inference framework. This way we can perform image inference on a GPU and get something more useful than a dummy classification tag ;-)
Let's reuse the Python program from the simple example above.
x86_64
We will need a host with an NVIDIA GPU (ours is just an RTX 2060 SUPER) and jetson-inference installed. To facilitate dependency resolution, we use a container image on a host with nvidia-container-runtime installed. So, assuming our code is in /data/code, let's spawn our container and see this in action:
docker run --gpus 0 --rm -it -v/data/code:/data/ -w /data nubificus/jetson-inference-updated:x86_64 /bin/bash
Afterwards, the steps are more or less the same as above. Install the vAccel package:
root@32e90efe86b9:/data/code# wget https://s3.nbfc.io/nbfc-assets/github/vaccelrt/main/x86_64/Release-deb/vaccel-0.6.0-Linux.deb
--2022-11-05 13:43:43-- https://s3.nbfc.io/nbfc-assets/github/vaccelrt/main/x86_64/Release-deb/vaccel-0.6.0-Linux.deb
Resolving s3.nbfc.io (s3.nbfc.io)... 84.254.1.240
Connecting to s3.nbfc.io (s3.nbfc.io)|84.254.1.240|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2124230 (2.0M) [application/x-debian-package]
Saving to: 'vaccel-0.6.0-Linux.deb'
vaccel-0.6.0-Linux.deb 100%[=======================================================>] 2.03M --.-KB/s in 0.06s
2022-11-05 13:43:43 (33.8 MB/s) - 'vaccel-0.6.0-Linux.deb' saved [2124230/2124230]
root@32e90efe86b9:/data/code# dpkg -i vaccel-0.6.0-Linux.deb
Selecting previously unselected package vaccel.
(Reading database ... 60677 files and directories currently installed.)
Preparing to unpack vaccel-0.6.0-Linux.deb ...
Unpacking vaccel (0.6.0) ...
Setting up vaccel (0.6.0) ...
Get and install the jetson plugin:
root@32e90efe86b9:/data/code# wget https://s3.nubificus.co.uk/nbfc-assets/github/vaccelrt/plugins/jetson_inference/master/x86_64/vaccelrt-plugin-jetson-0.1-Linux.deb
--2022-11-05 14:45:53-- https://s3.nubificus.co.uk/nbfc-assets/github/vaccelrt/plugins/jetson_inference/master/x86_64/vaccelrt-plugin-jetson-0.1-Linux.deb
Resolving s3.nubificus.co.uk (s3.nubificus.co.uk)... 84.254.1.240
Connecting to s3.nubificus.co.uk (s3.nubificus.co.uk)|84.254.1.240|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13146 (13K) [application/x-debian-package]
Saving to: 'vaccelrt-plugin-jetson-0.1-Linux.deb'
vaccelrt-plugin-jetson-0.1-Linu 100%[=======================================================>] 12.84K --.-KB/s in 0.001s
2022-11-05 14:45:53 (8.39 MB/s) - 'vaccelrt-plugin-jetson-0.1-Linux.deb' saved [13146/13146]
root@32e90efe86b9:/data/code# dpkg -i vaccelrt-plugin-jetson-0.1-Linux.deb
(Reading database ... 60748 files and directories currently installed.)
Preparing to unpack vaccelrt-plugin-jetson-0.1-Linux.deb ...
Unpacking vaccelrt-plugin-jetson (0.1) over (0.1) ...
Setting up vaccelrt-plugin-jetson (0.1) ...
Install the bindings:
root@32e90efe86b9:/data/code# .vaccel-venv/bin/pip3 install https://s3.nbfc.io/nbfc-assets/github/python-vaccel/main/x86_64/vaccel-python-0.0.1.tar.gz
Collecting https://s3.nbfc.io/nbfc-assets/github/python-vaccel/main/x86_64/vaccel-python-0.0.1.tar.gz
Downloading https://s3.nbfc.io/nbfc-assets/github/python-vaccel/main/x86_64/vaccel-python-0.0.1.tar.gz (23 kB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing wheel metadata ... done
Collecting cffi>=1.0.0
Using cached cffi-1.15.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (442 kB)
Collecting pycparser
Using cached pycparser-2.21-py2.py3-none-any.whl (118 kB)
Building wheels for collected packages: vaccel
Building wheel for vaccel (PEP 517) ... done
Created wheel for vaccel: filename=vaccel-python-0.0.1-cp38-cp38-linux_x86_64.whl size=44494 sha256=a63bd263ba219e985821fc34416dc8f12ced508eb8e265b78a896ac2ed375f72
Stored in directory: /root/.cache/pip/wheels/a6/e6/1c/4c91a42c1cad7e5e4ca86acd006bcded10cba25e85268e81ef
Successfully built vaccel
Installing collected packages: pycparser, cffi, vaccel
Successfully installed cffi-1.15.1 pycparser-2.21 vaccel-python-0.0.1
Now let's go ahead and run the example!
root@32e90efe86b9:/data/code# export LD_LIBRARY_PATH=/usr/local/lib/
root@32e90efe86b9:/data/code# export VACCEL_BACKENDS=/usr/local/lib/libvaccel-jetson.so
root@32e90efe86b9:/data/code# export VACCEL_IMAGENET_NETWORKS=/data/code/networks
root@32e90efe86b9:/data/code# .vaccel-venv/bin/python3 cat.py
Loading libvaccel
Loading plugins
Loading plugin: /usr/local/lib/libvaccel-jetson.so
Loaded plugin jetson-inference from ./libvaccel-jetson.so
imageNet -- loading classification network model from:
-- prototxt ./local_net//googlenet.prototxt
-- model ./local_net//bvlc_googlenet.caffemodel
-- class_labels ./local_net//ilsvrc12_synset_words.txt
-- input_blob 'data'
-- output_blob 'prob'
-- batch_size 1
[TRT] TensorRT version 8.5.1
[TRT] loading NVIDIA plugins...
[TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT] Registered plugin creator - ::BatchTilePlugin_TRT version 1
[TRT] Registered plugin creator - ::Clip_TRT version 1
[TRT] Registered plugin creator - ::CoordConvAC version 1
[TRT] Registered plugin creator - ::CropAndResizeDynamic version 1
[TRT] Registered plugin creator - ::CropAndResize version 1
[TRT] Registered plugin creator - ::DecodeBbox3DPlugin version 1
[TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_Explicit_TF_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_Implicit_TF_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[TRT] Could not register plugin creator - ::FlattenConcat_TRT version 1
[TRT] Registered plugin creator - ::GenerateDetection_TRT version 1
[TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 2
[TRT] Registered plugin creator - ::LReLU_TRT version 1
[TRT] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
[TRT] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
[TRT] Registered plugin creator - ::MultiscaleDeformableAttnPlugin_TRT version 1
[TRT] Registered plugin creator - ::NMSDynamic_TRT version 1
[TRT] Registered plugin creator - ::NMS_TRT version 1
[TRT] Registered plugin creator - ::Normalize_TRT version 1
[TRT] Registered plugin creator - ::PillarScatterPlugin version 1
[TRT] Registered plugin creator - ::PriorBox_TRT version 1
[TRT] Registered plugin creator - ::ProposalDynamic version 1
[TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT] Registered plugin creator - ::Proposal version 1
[TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT] Registered plugin creator - ::Region_TRT version 1
[TRT] Registered plugin creator - ::Reorg_TRT version 1
[TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT] Registered plugin creator - ::ROIAlign_TRT version 1
[TRT] Registered plugin creator - ::RPROI_TRT version 1
[TRT] Registered plugin creator - ::ScatterND version 1
[TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT] Registered plugin creator - ::Split version 1
[TRT] Registered plugin creator - ::VoxelGeneratorPlugin version 1
[TRT] detected model format - caffe (extension '.caffemodel')
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] [MemUsageChange] Init CUDA: CPU +307, GPU +0, now: CPU 320, GPU 223 (MiB)
[TRT] Trying to load shared library libnvinfer_builder_resource.so.8.5.1
[TRT] Loaded shared library libnvinfer_builder_resource.so.8.5.1
[TRT] [MemUsageChange] Init builder kernel library: CPU +262, GPU +74, now: CPU 636, GPU 297 (MiB)
[TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[TRT] native precisions detected for GPU: FP32, FP16, INT8
[TRT] selecting fastest native precision for GPU: FP16
[TRT] attempting to open engine cache file ./local_net//bvlc_googlenet.caffemodel.1.1.8501.GPU.FP16.engine
[TRT] loading network plan from engine cache... ./local_net//bvlc_googlenet.caffemodel.1.1.8501.GPU.FP16.engine
[TRT] device GPU, loaded ./local_net//bvlc_googlenet.caffemodel
[TRT] Loaded engine size: 15 MiB
[TRT] Trying to load shared library libcudnn.so.8
[TRT] Loaded shared library libcudnn.so.8
[TRT] Using cuDNN as plugin tactic source
[TRT] Using cuDNN as core library tactic source
[TRT] [MemUsageChange] Init cuDNN: CPU +576, GPU +236, now: CPU 977, GPU 477 (MiB)
[TRT] Deserialization required 488590 microseconds.
[TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +13, now: CPU 0, GPU 13 (MiB)
[TRT] Trying to load shared library libcudnn.so.8
[TRT] Loaded shared library libcudnn.so.8
[TRT] Using cuDNN as plugin tactic source
[TRT] Using cuDNN as core library tactic source
[TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 977, GPU 477 (MiB)
[TRT] Total per-runner device persistent memory is 94720
[TRT] Total per-runner host persistent memory is 147088
[TRT] Allocated activation device memory of size 3612672
[TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +3, now: CPU 0, GPU 16 (MiB)
[TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See `CUDA_MODULE_LOADING` in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
[TRT]
[TRT] CUDA engine context initialized on device GPU:
[TRT] -- layers 72
[TRT] -- maxBatchSize 1
[TRT] -- deviceMemory 3612672
[TRT] -- bindings 2
[TRT] binding 0
-- index 0
-- name 'data'
-- type FP32
-- in/out INPUT
-- # dims 3
-- dim #0 3
-- dim #1 224
-- dim #2 224
[TRT] binding 1
-- index 1
-- name 'prob'
-- type FP32
-- in/out OUTPUT
-- # dims 3
-- dim #0 1000
-- dim #1 1
-- dim #2 1
[TRT]
[TRT] binding to input 0 data binding index: 0
[TRT] binding to input 0 data dims (b=1 c=3 h=224 w=224) size=602112
[TRT] binding to output 0 prob binding index: 1
[TRT] binding to output 0 prob dims (b=1 c=1000 h=1 w=1) size=4000
[TRT]
[TRT] device GPU, ./local_net//bvlc_googlenet.caffemodel initialized.
[TRT] imageNet -- loaded 1000 class info entries
[TRT] imageNet -- ./local_net//bvlc_googlenet.caffemodel initialized.
class 0281 - 0.219604 (tabby, tabby cat)
class 0282 - 0.062927 (tiger cat)
class 0283 - 0.018173 (Persian cat)
class 0284 - 0.017746 (Siamese cat, Siamese)
class 0285 - 0.483398 (Egyptian cat)
class 0287 - 0.180664 (lynx, catamount)
imagenet: 48.33984% class #285 (Egyptian cat)
imagenet: attempting to save output image
imagenet: completed saving
imagenet: shutting down...
48.340% Egyptian cat
Shutting down vAccel
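The exports in the session above can also be set from inside Python, provided they are set before the bindings load libvaccel (i.e., before the first vaccel import). A hedged sketch using the same paths:
import os

# libvaccel reads these when it is loaded, so set them before importing vaccel.
os.environ["VACCEL_BACKENDS"] = "/usr/local/lib/libvaccel-jetson.so"
os.environ["VACCEL_IMAGENET_NETWORKS"] = "/data/code/networks"

from vaccel.session import Session
from vaccel.image import ImageClassify

ses = Session(flags=3)
print(ImageClassify.classify_from_filename(session=ses, source="cat.jpeg"))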
aarch64
For aarch64, things are more or less the same. We run the example on a Jetson Xavier AGX, where jetson-inference and the NVIDIA stack are included in the Jetson Linux variant (L4T).
The only steps needed are installing the jetson-inference libs, vAccel and the Python bindings. So, assuming a Jetson Linux distro with JetPack installed:
- install jetson-inference:
git clone --recursive https://github.com/dusty-nv/jetson-inference
cd jetson-inference
mkdir build
cd build
cmake ../
make install
- install vAccel:
wget https://s3.nubificus.co.uk/nbfc-assets/github/vaccelrt/main/aarch64/Release-deb/vaccel-0.6.0-Linux.deb
dpkg -i vaccel-0.6.0-Linux.deb
- install the jetson plugin:
wget https://s3.nubificus.co.uk/nbfc-assets/github/vaccelrt/plugins/jetson_inference/master/aarch64/vaccelrt-plugin-jetson-0.1-Linux.deb
dpkg -i vaccelrt-plugin-jetson-0.1-Linux.deb
- install the Python bindings in a virtual env:
python3 -m venv .vaccel-venv
.vaccel-venv/bin/pip3 install https://s3.nbfc.io/nbfc-assets/github/python-vaccel/main/aarch64/vaccel-python-0.0.1.tar.gz
- run the example:
# .vaccel-venv/bin/python3 cat.py
Loading libvaccel
2022.11.05-20:25:12.79 - <debug> Initializing vAccel
2022.11.05-20:25:12.79 - <debug> Created top-level rundir: /run/user/0/vaccel.G2ZhVr
Loading plugins
Loading plugin: /usr/local/lib/libvaccel-jetson.so
2022.11.05-20:25:12.91 - <debug> Registered plugin jetson-inference
2022.11.05-20:25:12.91 - <debug> Registered function image classification from plugin jetson-inference
2022.11.05-20:25:12.91 - <debug> Registered function image detection from plugin jetson-inference
2022.11.05-20:25:12.91 - <debug> Registered function image segmentation from plugin jetson-inference
Loaded plugin jetson-inference from /usr/local/lib/libvaccel-jetson.so
2022.11.05-20:25:12.96 - <debug> session:1 New session
Session id is 1
2022.11.05-20:25:12.96 - <debug> session:1 Looking for plugin implementing image classification
2022.11.05-20:25:12.96 - <debug> Found implementation in jetson-inference plugin
imageNet -- loading classification network model from:
-- prototxt /home/ananos/develop/jetson-inference/data/networks/googlenet.prototxt
-- model /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel
-- class_labels /home/ananos/develop/jetson-inference/data/networks/ilsvrc12_synset_words.txt
-- input_blob 'data'
-- output_blob 'prob'
-- batch_size 1
[TRT] TensorRT version 8.4.1
[TRT] loading NVIDIA plugins...
[TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[TRT] Registered plugin creator - ::NMS_TRT version 1
[TRT] Registered plugin creator - ::Reorg_TRT version 1
[TRT] Registered plugin creator - ::Region_TRT version 1
[TRT] Registered plugin creator - ::Clip_TRT version 1
[TRT] Registered plugin creator - ::LReLU_TRT version 1
[TRT] Registered plugin creator - ::PriorBox_TRT version 1
[TRT] Registered plugin creator - ::Normalize_TRT version 1
[TRT] Registered plugin creator - ::ScatterND version 1
[TRT] Registered plugin creator - ::RPROI_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[TRT] Registered plugin creator - ::BatchTilePlugin_TRT version 1
[TRT] Could not register plugin creator - ::FlattenConcat_TRT version 1
[TRT] Registered plugin creator - ::CropAndResize version 1
[TRT] Registered plugin creator - ::CropAndResizeDynamic version 1
[TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_Explicit_TF_TRT version 1
[TRT] Registered plugin creator - ::EfficientNMS_Implicit_TF_TRT version 1
[TRT] Registered plugin creator - ::ProposalDynamic version 1
[TRT] Registered plugin creator - ::Proposal version 1
[TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[TRT] Registered plugin creator - ::Split version 1
[TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[TRT] Registered plugin creator - ::InstanceNormalization_TRT version 2
[TRT] Registered plugin creator - ::CoordConvAC version 1
[TRT] Registered plugin creator - ::DecodeBbox3DPlugin version 1
[TRT] Registered plugin creator - ::GenerateDetection_TRT version 1
[TRT] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
[TRT] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
[TRT] Registered plugin creator - ::NMSDynamic_TRT version 1
[TRT] Registered plugin creator - ::PillarScatterPlugin version 1
[TRT] Registered plugin creator - ::VoxelGeneratorPlugin version 1
[TRT] Registered plugin creator - ::MultiscaleDeformableAttnPlugin_TRT version 1
[TRT] detected model format - caffe (extension '.caffemodel')
[TRT] desired precision specified for GPU: FASTEST
[TRT] requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT] [MemUsageChange] Init CUDA: CPU +187, GPU +0, now: CPU 211, GPU 3925 (MiB)
[TRT] [MemUsageChange] Init builder kernel library: CPU +131, GPU +123, now: CPU 361, GPU 4067 (MiB)
[TRT] native precisions detected for GPU: FP32, FP16, INT8
[TRT] selecting fastest native precision for GPU: FP16
[TRT] found engine cache file /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel.1.1.8401.GPU.FP16.engine
[TRT] found model checksum /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel.sha256sum
[TRT] echo "$(cat /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel.sha256sum) /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel" | sha256sum --check --status
[TRT] model matched checksum /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel.sha256sum
[TRT] loading network plan from engine cache... /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel.1.1.8401.GPU.FP16.engine
[TRT] device GPU, loaded /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel
[TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 245, GPU 4080 (MiB)
[TRT] Loaded engine size: 14 MiB
[TRT] Deserialization required 15317 microseconds.
[TRT] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +13, now: CPU 0, GPU 13 (MiB)
[TRT] Total per-runner device persistent memory is 75776
[TRT] Total per-runner host persistent memory is 110304
[TRT] Allocated activation device memory of size 5218304
[TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +5, now: CPU 0, GPU 18 (MiB)
[TRT]
[TRT] CUDA engine context initialized on device GPU:
[TRT] -- layers 72
[TRT] -- maxBatchSize 1
[TRT] -- deviceMemory 5218304
[TRT] -- bindings 2
[TRT] binding 0
-- index 0
-- name 'data'
-- type FP32
-- in/out INPUT
-- # dims 3
-- dim #0 3
-- dim #1 224
-- dim #2 224
[TRT] binding 1
-- index 1
-- name 'prob'
-- type FP32
-- in/out OUTPUT
-- # dims 3
-- dim #0 1000
-- dim #1 1
-- dim #2 1
[TRT]
[TRT] binding to input 0 data binding index: 0
[TRT] binding to input 0 data dims (b=1 c=3 h=224 w=224) size=602112
[TRT] binding to output 0 prob binding index: 1
[TRT] binding to output 0 prob dims (b=1 c=1000 h=1 w=1) size=4000
[TRT]
[TRT] device GPU, /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel initialized.
[TRT] loaded 1000 class labels
[TRT] imageNet -- /home/ananos/develop/jetson-inference/data/networks/bvlc_googlenet.caffemodel initialized.
class 0281 - 0.222134 (tabby, tabby cat)
class 0282 - 0.063147 (tiger cat)
class 0283 - 0.018521 (Persian cat)
class 0284 - 0.018234 (Siamese cat, Siamese)
class 0285 - 0.477663 (Egyptian cat)
class 0287 - 0.182722 (lynx, catamount)
imagenet: 47.76627% class #285 (Egyptian cat)
imagenet: attempting to save output image
imagenet: completed saving
imagenet: shutting down...
47.766% Egyptian cat
2022.11.05-20:25:16.39 - <debug> session:1 Free session
Shutting down vAccel
2022.11.05-20:25:16.45 - <debug> Shutting down vAccel
2022.11.05-20:25:16.45 - <debug> Cleaning up plugins
2022.11.05-20:25:16.45 - <debug> Unregistered plugin jetson-inference
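The <debug> lines in this run come from vAccel's logger. If your build doesn't print them, you can try raising the log level before libvaccel loads; the VACCEL_DEBUG_LEVEL variable and its range below are an assumption based on vaccelrt's logger, so treat this as a sketch:
import os

# Assumption: vaccelrt's logger reads VACCEL_DEBUG_LEVEL (1=error .. 4=debug).
os.environ["VACCEL_DEBUG_LEVEL"] = "4"

from vaccel.session import Session  # debug lines appear once libvaccel loads
ses = Session(flags=3)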