NAME

InferenceUsingTFHubMobileNetV2Model - Using TensorFlow to do image classification using a pre-trained model

SYNOPSIS

The following tutorial is based on the Image Classification with TensorFlow Hub notebook. It uses a pre-trained model based on the MobileNet V2 architecture trained on the ImageNet dataset. Running the code requires an Internet connection to download the model (from Google servers) and testing data (from Wikimedia servers).

Please read the SECURITY note about running models, because models are programs. If you would like to visualise a model, you can use Netron on the .pb file.

COLOPHON

The following document is either a POD file which can additionally be run as a Perl script or a Jupyter Notebook which can be run in IPerl (viewable online at nbviewer). If you are reading this as POD, there should be a generated list of Perl dependencies in the CPANFILE section.

If you are running the code, you may optionally install the tensorflow Python package in order to access the saved_model_cli command, but this is only used for informational purposes.

TUTORIAL

Load the library

First, we need to load the AI::TensorFlow::Libtensorflow library along with some helper modules. We then create an AI::TensorFlow::Libtensorflow::Status object and a helper function to make sure that the calls to the libtensorflow C library are working properly.

  use strict;
  use warnings;
  use utf8;
  use constant IN_IPERL => !! $ENV{PERL_IPERL_RUNNING};
  no if IN_IPERL, warnings => 'redefine'; # fewer messages when re-running cells
  
  use feature qw(say state);
  use Syntax::Construct qw(each-array);
  
  use lib::projectroot qw(lib);
  use AI::TensorFlow::Libtensorflow;
  
  use URI ();
  use HTTP::Tiny ();
  use Path::Tiny qw(path);
  
  use File::Which ();
  
  use List::Util ();
  
  use Data::Printer ( output => 'stderr', return_value => 'void', filters => ['PDL'] );
  use Data::Printer::Filter::PDL ();
  use Text::Table::Tiny qw(generate_table);
  
  use Imager;
  
  my $s = AI::TensorFlow::Libtensorflow::Status->New;
  sub AssertOK {
      die "Status $_[0]: " . $_[0]->Message
          unless $_[0]->GetCode == AI::TensorFlow::Libtensorflow::Status::OK;
      return;
  }
  AssertOK($s);

In this notebook, we will use PDL to store and manipulate the ndarray data before converting it to a TFTensor. The following functions help with copying the data back and forth between the two object types.

An important thing to note about the dimensions used by TensorFlow's TFTensors when compared with PDL is that the dimension lists are reversed. Consider a binary raster image with width W and height H stored in row-major format (meaning the pixels in the first row are stored next to each other followed by the second row and so on). With PDL, the dimension list for this will be [ W H ] and for TFTensor the dimension list will be [ H W ]. TensorFlow uses the same convention for the dimension list as NumPy with the faster changing dimensions at the end of the dimension list while PDL is the opposite (see Dima Kogan's library and talk for more on this).

This difference will be explained more concretely further in the tutorial.
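As a minimal sketch of the reversed dimension lists, the following plain Perl (no PDL or TensorFlow needed) computes the flat offset of one pixel in a row-major buffer under both conventions. The helper names offset_fastest_first and offset_fastest_last are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: flat offset when the dimension list is written with
# the fastest-changing dimension first (the PDL convention).
sub offset_fastest_first {
    my ($dims, $index) = @_;
    my ($offset, $stride) = (0, 1);
    for my $i (0 .. $#$dims) {
        $offset += $index->[$i] * $stride;
        $stride *= $dims->[$i];
    }
    return $offset;
}

# Hypothetical helper: the same offset when the dimension list is written
# with the fastest-changing dimension last (the TensorFlow/NumPy convention).
sub offset_fastest_last {
    my ($dims, $index) = @_;
    return offset_fastest_first( [ reverse @$dims ], [ reverse @$index ] );
}

my ($W, $H) = (4, 3);    # an image 4 pixels wide and 3 pixels tall
my ($x, $y) = (1, 2);    # the pixel in column 1 of row 2

my $pdl_style = offset_fastest_first( [ $W, $H ], [ $x, $y ] );
my $tf_style  = offset_fastest_last(  [ $H, $W ], [ $y, $x ] );

print "PDL-style dims [W H], index [x y]: offset $pdl_style\n";
print "TF-style  dims [H W], index [y x]: offset $tf_style\n";
```

Both calls address the same element of the same buffer; only the order in which the dimensions and indices are written differs.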

Future work will provide more convenient wrappers with an option to either copy or share the same data (if possible).

  use PDL;
  use AI::TensorFlow::Libtensorflow::DataType qw(FLOAT);
  
  use FFI::Platypus::Memory qw(memcpy);
  use FFI::Platypus::Buffer qw(scalar_to_pointer);
  
  sub FloatPDLTOTFTensor {
      my ($p) = @_;
      return AI::TensorFlow::Libtensorflow::Tensor->New(
          FLOAT, [ reverse $p->dims ], $p->get_dataref, sub { undef $p }
      );
  }
  
  sub FloatTFTensorToPDL {
      my ($t) = @_;
  
      my $pdl = zeros(float,reverse( map $t->Dim($_), 0..$t->NumDims-1 ) );
  
      memcpy scalar_to_pointer( ${$pdl->get_dataref} ),
          scalar_to_pointer( ${$t->Data} ),
          $t->ByteSize;
      $pdl->upd_data;
  
      $pdl;
  }
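Both helpers above move raw bytes between two buffers of native single-precision floats. As a rough sketch of what such a buffer looks like, the following uses only core Perl pack and unpack; the plain scalar copy stands in for memcpy and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Sample values chosen to be exactly representable as single-precision floats.
my @values = ( 0, 0.25, 0.5, 1 );

# Pack into one contiguous buffer of native floats, similar in spirit to
# the packed data behind a float ndarray or a FLOAT tensor.
my $buffer = pack 'f*', @values;
my $bytes  = length $buffer;

# Copy the raw bytes (a stand-in for memcpy) and decode the copy.
my $copy    = $buffer;
my @decoded = unpack 'f*', $copy;

printf "%d floats occupy %d bytes\n", scalar @values, $bytes;
print  "decoded copy: @decoded\n";
```

The byte count is the element count times the size of one float, which is the quantity that a call such as ByteSize reports for a tensor.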

The following is just a small helper to generate an HTML <table> for output in IPerl.

  use HTML::Tiny;
  
  sub my_table {
      my ($data, $cb) = @_;
      my $h = HTML::Tiny->new;
      $h->table( { style => 'width: 100%' },
          [
              $h->tr(
                  map {
                      [
                          $h->td( $cb->($_, $h) )
                      ]
                  } @$data
              )
          ]
      )
  }

This is a helper to display images in Gnuplot for debugging, but those debugging lines are commented out.

  sub show_in_gnuplot {
      my ($p) = @_;
      require PDL::Graphics::Gnuplot;
      PDL::Graphics::Gnuplot::image( square => 1, $p );
  }

Fetch the model and labels

We are going to use an image classification model from TensorFlow Hub based on the MobileNet V2 architecture. We download both the model and ImageNet classification labels.

  # image_size => [width, height] (but usually square images)
  my %model_name_to_params = (
      mobilenet_v2_100_224 => {
          handle     => 'https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/classification/5',
          image_size => [ 224, 224 ],
      },
      mobilenet_v2_140_224 => {
          handle => "https://tfhub.dev/google/imagenet/mobilenet_v2_140_224/classification/5",
          image_size => [ 224, 224 ],
      },
  );
  
  my $model_name = 'mobilenet_v2_100_224';
  
  say "Selected model: $model_name : $model_name_to_params{$model_name}{handle}";

STREAM (STDOUT):

  Selected model: mobilenet_v2_100_224 : https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/classification/5

RESULT:

  1

We download the model and labels to the current directory then extract the model to a folder with the name given in $model_base.

  my $model_uri = URI->new( $model_name_to_params{$model_name}{handle} );
  $model_uri->query_form( 'tf-hub-format' => 'compressed' );
  my $model_base = substr( $model_uri->path, 1 ) =~ s,/,_,gr;
  my $model_archive_path = "${model_base}.tar.gz";
  
  use constant IMAGENET_LABEL_COUNT_WITH_BG => 1001;
  my $labels_uri = URI->new('https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt');
  my $labels_path = ($labels_uri->path_segments)[-1];
  
  my $http = HTTP::Tiny->new;
  
  for my $download ( [ $model_uri  => $model_archive_path ],
                     [ $labels_uri => $labels_path        ]) {
      my ($uri, $path) = @$download;
      say "Downloading $uri to $path";
      next if -e $path;
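      my $response = $http->mirror( $uri, $path );
      die "Could not download $uri: $response->{status} $response->{reason}"
          unless $response->{success};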
  }
  
  use Archive::Extract;
  my $ae = Archive::Extract->new( archive => $model_archive_path );
  die "Could not extract archive" unless $ae->extract( to => $model_base );
  
  my $saved_model = path($model_base)->child('saved_model.pb');
  say "Saved model is in $saved_model" if -f $saved_model;
  
  my @labels = path($labels_path)->lines( { chomp => 1 });
  die "Labels should have @{[ IMAGENET_LABEL_COUNT_WITH_BG ]} items"
      unless @labels == IMAGENET_LABEL_COUNT_WITH_BG;
  say "Got labels: ", join( ", ", List::Util::head(5, @labels) ), ", etc.";

STREAM (STDOUT):

  Downloading https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/classification/5?tf-hub-format=compressed to google_imagenet_mobilenet_v2_100_224_classification_5.tar.gz
  Downloading https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt to ImageNetLabels.txt
  Saved model is in google_imagenet_mobilenet_v2_100_224_classification_5/saved_model.pb
  Got labels: background, tench, goldfish, great white shark, tiger shark, etc.

RESULT:

  1
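The archive and folder names in the log above are derived from the path of the model handle. A minimal sketch of that derivation in core Perl follows; it uses a regular expression in place of the URI module, which is a simplification for illustration only.

```perl
use strict;
use warnings;
use 5.014;    # for the /r substitution modifier

my $handle = 'https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/classification/5';

# Take the path part of the URL (everything after the host name).
my ($url_path) = $handle =~ m{\Ahttps?://[^/]+(/.*)\z}
    or die "Could not parse $handle";

# Drop the leading slash and turn the remaining slashes into underscores.
my $model_base         = substr( $url_path, 1 ) =~ s,/,_,gr;
my $model_archive_path = "${model_base}.tar.gz";

say "Model folder:  $model_base";
say "Model archive: $model_archive_path";
```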

Load the model and session

Now we can load the model from that folder with the tag set [ 'serve' ] by using the LoadFromSavedModel constructor to create a ::Graph and a ::Session for that graph.

  my $opt = AI::TensorFlow::Libtensorflow::SessionOptions->New;
  
  my @tags = ( 'serve' );
  my $graph = AI::TensorFlow::Libtensorflow::Graph->New;
  my $session = AI::TensorFlow::Libtensorflow::Session->LoadFromSavedModel(
      $opt, undef, $model_base, \@tags, $graph, undef, $s
  );
  AssertOK($s);

STREAM (STDERR):

  2022-12-21 15:44:50.158520: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: google_imagenet_mobilenet_v2_100_224_classification_5
  2022-12-21 15:44:50.165765: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }
  2022-12-21 15:44:50.165804: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: google_imagenet_mobilenet_v2_100_224_classification_5
  2022-12-21 15:44:50.166232: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
  To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
  2022-12-21 15:44:50.193676: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:357] MLIR V1 optimization pass is not enabled
  2022-12-21 15:44:50.202488: I tensorflow/cc/saved_model/loader.cc:229] Restoring SavedModel bundle.
  2022-12-21 15:44:50.523651: I tensorflow/cc/saved_model/loader.cc:213] Running initialization op on SavedModel bundle at path: google_imagenet_mobilenet_v2_100_224_classification_5
  2022-12-21 15:44:50.593023: I tensorflow/cc/saved_model/loader.cc:305] SavedModel load for tags { serve }; Status: success: OK. Took 434509 microseconds.

By running saved_model_cli, we can examine the computations contained in the graph in terms of the names of the inputs and outputs of its operations.

  if( File::Which::which('saved_model_cli')) {
      local $ENV{TF_CPP_MIN_LOG_LEVEL} = 3; # quiet the TensorFlow logger for the following command
      system(qw(saved_model_cli show),
          qw(--dir)           => $model_base,
          qw(--tag_set)       => join(',', @tags),
          qw(--signature_def) => 'serving_default'
      ) == 0 or die "Could not run saved_model_cli";
  } else {
      say "Install the tensorflow Python package to get the `saved_model_cli` command.";
  }

STREAM (STDOUT):

  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 224, 224, 3)
        name: serving_default_inputs:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['logits'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1001)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict

RESULT:

  1

The above saved_model_cli output shows that the model input is at serving_default_inputs:0, which means the operation named serving_default_inputs at index 0, and that the output is at StatefulPartitionedCall:0, which means the operation named StatefulPartitionedCall at index 0.

It also shows the type and shape of the TFTensors for those inputs and outputs. Together this is known as a signature.
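The name:index convention can be sketched in a few lines of core Perl. The helper parse_tensor_name is hypothetical; it is not part of AI::TensorFlow::Libtensorflow and only illustrates how the two parts are read.

```perl
use strict;
use warnings;

# Hypothetical helper: split "operation_name:index" into its two parts.
sub parse_tensor_name {
    my ($name) = @_;
    my ($op, $index) = $name =~ /\A(.+):(\d+)\z/
        or die "Unexpected tensor name: $name";
    return ( $op, $index + 0 );
}

for my $name (qw(serving_default_inputs:0 StatefulPartitionedCall:0)) {
    my ($op, $index) = parse_tensor_name($name);
    print "operation=$op index=$index\n";
}
```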

For the input, we have (-1, 224, 224, 3), which is a common input image specification for TensorFlow Hub. This is known as the channels_last (or NHWC) layout: the TensorFlow dimension list is [batch_size, height, width, 3], where 3 represents the RGB channels and each element is normalised to the range [0, 1]. The -1 in the batch_size dimension represents an unknown dimension size, so the model can accept any number of images. Note that the TFTensor dimension list has the dimension that changes the fastest in memory at the end of the list, so the float32_t channels for a single pixel are stored next to each other, followed by the next pixel in the same row.

For the output, we have (-1, 1001) which is [batch_size, num_classes] where the elements are scores that the image received for that ImageNet class.
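To make the shapes concrete, here is a small sketch in core Perl that fills in the unknown batch dimension and counts elements. The helper resolve_shape and the batch size of 12 are assumptions made for this example.

```perl
use strict;
use warnings;
use List::Util qw(product);

# Hypothetical helper: replace the unknown (-1) dimension of a signature
# shape with a concrete batch size.
sub resolve_shape {
    my ($shape, $batch_size) = @_;
    return [ map { $_ == -1 ? $batch_size : $_ } @$shape ];
}

my @input_signature  = ( -1, 224, 224, 3 );    # NHWC, from saved_model_cli
my @output_signature = ( -1, 1001 );           # [batch_size, num_classes]

my $batch_size   = 12;                         # assumed batch of 12 images
my $input_shape  = resolve_shape( \@input_signature,  $batch_size );
my $output_shape = resolve_shape( \@output_signature, $batch_size );

my $input_elems  = product(@$input_shape);
my $output_elems = product(@$output_shape);
my $float_bytes  = length pack( 'f', 0 );      # size of one native float

printf "input  [%s]: %d elements, %d bytes\n",
    join( ' ', @$input_shape ), $input_elems, $input_elems * $float_bytes;
printf "output [%s]: %d elements\n",
    join( ' ', @$output_shape ), $output_elems;
```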

So let's use those names to create our ::Output ArrayRefs.

  my %ops = (
      in  => $graph->OperationByName('serving_default_inputs'),
      out => $graph->OperationByName('StatefulPartitionedCall'),
  );
  
  die "Could not get all operations" unless List::Util::all(sub { defined }, values %ops);
  
  my %outputs = map { $_ => [ AI::TensorFlow::Libtensorflow::Output->New( { oper => $ops{$_}, index => 0 } ) ] }
      keys %ops;
  
  p %outputs;
  
  say "Input: " , $outputs{in}[0];
  say "Output: ", $outputs{out}[0];

STREAM (STDOUT):

  Input: serving_default_inputs:0
  Output: StatefulPartitionedCall:0

STREAM (STDERR):

{
    in    [
        [0] AI::TensorFlow::Libtensorflow::Output {
                index   0,
                oper    AI::TensorFlow::Libtensorflow::Operation {
                    Name         "serving_default_inputs",
                    NumInputs    0,
                    NumOutputs   1,
                    OpType       "Placeholder"
                }
            }
    ],
    out   [
        [0] AI::TensorFlow::Libtensorflow::Output {
                index   0,
                oper    AI::TensorFlow::Libtensorflow::Operation {
                    Name         "StatefulPartitionedCall",
                    NumInputs    263,
                    NumOutputs   1,
                    OpType       "StatefulPartitionedCall"
                }
            }
    ]
}

RESULT:

  1

Now we can get the following testing images from Wikimedia.

  my %images_for_test_to_uri = (
      "tiger" => "https://upload.wikimedia.org/wikipedia/commons/b/b0/Bengal_tiger_%28Panthera_tigris_tigris%29_female_3_crop.jpg",
      #by Charles James Sharp, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons
      "bus" => "https://upload.wikimedia.org/wikipedia/commons/6/63/LT_471_%28LTZ_1471%29_Arriva_London_New_Routemaster_%2819522859218%29.jpg",
      #by Martin49 from London, England, CC BY 2.0 <https://creativecommons.org/licenses/by/2.0>, via Wikimedia Commons
      "car" => "https://upload.wikimedia.org/wikipedia/commons/4/49/2013-2016_Toyota_Corolla_%28ZRE172R%29_SX_sedan_%282018-09-17%29_01.jpg",
      #by EurovisionNim, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons
      "cat" => "https://upload.wikimedia.org/wikipedia/commons/4/4d/Cat_November_2010-1a.jpg",
      #by Alvesgaspar, CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons
      "dog" => "https://upload.wikimedia.org/wikipedia/commons/archive/a/a9/20090914031557%21Saluki_dog_breed.jpg",
      #by Craig Pemberton, CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons
      "apple" => "https://upload.wikimedia.org/wikipedia/commons/1/15/Red_Apple.jpg",
      #by Abhijit Tembhekar from Mumbai, India, CC BY 2.0 <https://creativecommons.org/licenses/by/2.0>, via Wikimedia Commons
      "banana" => "https://upload.wikimedia.org/wikipedia/commons/1/1c/Bananas_white_background.jpg",
      #by fir0002  flagstaffotos [at] gmail.com         Canon 20D + Tamron 28-75mm f/2.8, GFDL 1.2 <http://www.gnu.org/licenses/old-licenses/fdl-1.2.html>, via Wikimedia Commons
      "turtle" => "https://upload.wikimedia.org/wikipedia/commons/8/80/Turtle_golfina_escobilla_oaxaca_mexico_claudio_giovenzana_2010.jpg",
      #by Claudio Giovenzana, CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons
      "flamingo" => "https://upload.wikimedia.org/wikipedia/commons/b/b8/James_Flamingos_MC.jpg",
      #by Christian Mehlführer, User:Chmehl, CC BY 3.0 <https://creativecommons.org/licenses/by/3.0>, via Wikimedia Commons
      "piano" => "https://upload.wikimedia.org/wikipedia/commons/d/da/Steinway_%26_Sons_upright_piano%2C_model_K-132%2C_manufactured_at_Steinway%27s_factory_in_Hamburg%2C_Germany.png",
      #by "Photo: © Copyright Steinway & Sons", CC BY-SA 3.0 <https://creativecommons.org/licenses/by-sa/3.0>, via Wikimedia Commons
      "honeycomb" => "https://upload.wikimedia.org/wikipedia/commons/f/f7/Honey_comb.jpg",
      #by Merdal, CC BY-SA 3.0 <http://creativecommons.org/licenses/by-sa/3.0/>, via Wikimedia Commons
      "teapot" => "https://upload.wikimedia.org/wikipedia/commons/4/44/Black_tea_pot_cropped.jpg",
      #by Mendhak, CC BY-SA 2.0 <https://creativecommons.org/licenses/by-sa/2.0>, via Wikimedia Commons
  );
  
  my @image_names = sort keys %images_for_test_to_uri;
  
  
  if( IN_IPERL ) {
      IPerl->html(
          my_table( \@image_names, sub {
              my ($image_name, $h) = @_;
              (
                  $h->tt($image_name),
                  $h->a( { href => $images_for_test_to_uri{$image_name} },
                      $h->img({
                          src => $images_for_test_to_uri{$image_name},
                          alt => $image_name,
                          width => '50%',
                      })
                  ),
              )
          })
      );
  }

DISPLAY:

[Table of the twelve test images, one thumbnail per name: apple, banana, bus, car, cat, dog, flamingo, honeycomb, piano, teapot, tiger, turtle]

Download the test images and transform them into suitable input data

We now fetch these images and prepare them in the needed format by using Imager to resize and add padding. Then we turn the Imager data into a PDL ndarray. Since the Imager data is stored as 32 bits per pixel with 4 channels in the order ARGB, we create a uint32_t PDL ndarray and use bit manipulation to create a uint8_t ndarray (which gives a PDL dimension list that starts with 3 for the RGB channels). Then we create a float32_t ndarray by normalising the values to the range [0, 1] as the model specifies.

We then take all the PDL ndarrays and concatenate them. Again, note that the dimension lists for the PDL ndarray and the TFTensor are reversed.
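To make the bit manipulation concrete for a single pixel, here is a minimal sketch in core Perl. It assumes that the four samples sit in memory in the order R, G, B, A and that the word is read as a little-endian 32-bit integer; the sample values are made up.

```perl
use strict;
use warnings;

# One pixel's four 8-bit samples (assumed memory order: R, G, B, A).
my ($red, $green, $blue, $alpha) = ( 200, 100, 50, 255 );
my $pixel_bytes = pack 'C4', $red, $green, $blue, $alpha;

# Read the same four bytes as one little-endian unsigned 32-bit integer.
my ($word) = unpack 'V', $pixel_bytes;

# Mask and shift out the three colour channels, one shift per channel.
my @shifts   = map { 8 * $_ } 0 .. 2;
my @channels = map { ( $word & ( 0xFF << $_ ) ) >> $_ } @shifts;

# Normalise from 0..255 to the range [0, 1].
my @scaled = map { $_ / 255 } @channels;

printf "word     = 0x%08X\n", $word;
print  "channels = @channels\n";
printf "scaled   = %s\n", join ' ', map { sprintf '%.3f', $_ } @scaled;
```

The tutorial code applies the same mask and shift to a whole ndarray at once instead of looping over pixels.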

  sub imager_paste_center_pad {
      my ($inner, $padded_sz, @rest) = @_;
  
      my $outer = Imager->new( List::Util::mesh( [qw(xsize ysize)], $padded_sz ),
          @rest
      );
  
      $outer->paste(
          left => int( ($outer->getwidth  - $inner->getwidth ) / 2 ),
          top  => int( ($outer->getheight - $inner->getheight) / 2 ),
          src  => $inner,
      );
  
      $outer;
  }
  
  sub imager_scale_to {
      my ($img, $image_size) = @_;
      my $rescaled = $img->scale(
          List::Util::mesh( [qw(xpixels ypixels)], $image_size ),
          type => 'min',
          qtype => 'mixing', # 'mixing' seems to work better than 'normal'
      );
  }
  
  sub load_image_to_pdl {
      my ($uri, $image_size) = @_;
  
      my $http = HTTP::Tiny->new;
      my $response = $http->get( $uri );
      die "Could not fetch image from $uri" unless $response->{success};
      say "Downloaded $uri";
  
      my $img = Imager->new;
      $img->read( data => $response->{content} )
          or die "Could not read image from $uri: " . $img->errstr;
  
      my $rescaled = imager_scale_to($img, $image_size);
  
      say sprintf "Rescaled image from [ %d x %d ] to [ %d x %d ]",
          $img->getwidth, $img->getheight,
          $rescaled->getwidth, $rescaled->getheight;
  
      my $padded = imager_paste_center_pad($rescaled, $image_size,
          # ARGB fits in 32-bits (uint32_t)
          channels => 4
      );
  
      say sprintf "Padded to [ %d x %d ]", $padded->getwidth, $padded->getheight;
  
      # Create PDL ndarray from Imager data in-memory.
      my $data;
      $padded->write( data => \$data, type => 'raw' )
          or die "could not write ". $padded->errstr;
  
      # $data is packed as PDL->dims == [w,h] with ARGB pixels
      #   $ PDL::howbig(ulong) # 4
      my $pdl_raw = zeros(ulong, $padded->getwidth, $padded->getheight);
      ${ $pdl_raw->get_dataref } = $data;
      $pdl_raw->upd_data;
  
      # Split uint32_t pixels into first dimension with 3 channels (R,G,B) with values 0-255.
      my @shifts = map 8*$_, 0..2;
      my $pdl_channels = $pdl_raw->dummy(0)
          ->and2(ulong(map 0xFF << $_, @shifts)->slice(':,*,*') )
          ->shiftright( ulong(@shifts)->slice(':,*,*') )
          ->byte;
  
      my $pdl_scaled = (
              # Scale to [ 0, 1 ].
              ( $pdl_channels / float(255) )
          );
  
      ## flip vertically to see image right way up
      #show_in_gnuplot( $pdl_channels->slice(':,:,-1:0')         ); #DEBUG
      #show_in_gnuplot(   $pdl_scaled->slice(':,:,-1:0') * 255.0 ); #DEBUG
  
      $pdl_scaled;
  }
  
  my @pdl_images = map {
      load_image_to_pdl(
          $images_for_test_to_uri{$_},
          $model_name_to_params{$model_name}{image_size}
      );
  } @image_names;
  
  my $pdl_image_batched = cat(@pdl_images);
  my $t = FloatPDLTOTFTensor($pdl_image_batched);
  
  p $pdl_image_batched;
  p $t;

STREAM (STDOUT):

  Downloaded https://upload.wikimedia.org/wikipedia/commons/1/15/Red_Apple.jpg
  Rescaled image from [ 2418 x 2192 ] to [ 224 x 203 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/1/1c/Bananas_white_background.jpg
  Rescaled image from [ 1600 x 1067 ] to [ 224 x 149 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/6/63/LT_471_%28LTZ_1471%29_Arriva_London_New_Routemaster_%2819522859218%29.jpg
  Rescaled image from [ 3840 x 2560 ] to [ 224 x 149 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/4/49/2013-2016_Toyota_Corolla_%28ZRE172R%29_SX_sedan_%282018-09-17%29_01.jpg
  Rescaled image from [ 4152 x 2252 ] to [ 224 x 121 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/4/4d/Cat_November_2010-1a.jpg
  Rescaled image from [ 1795 x 2397 ] to [ 168 x 224 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/archive/a/a9/20090914031557%21Saluki_dog_breed.jpg
  Rescaled image from [ 543 x 523 ] to [ 224 x 216 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/b/b8/James_Flamingos_MC.jpg
  Rescaled image from [ 3000 x 1999 ] to [ 224 x 149 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/f/f7/Honey_comb.jpg
  Rescaled image from [ 800 x 600 ] to [ 224 x 168 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/d/da/Steinway_%26_Sons_upright_piano%2C_model_K-132%2C_manufactured_at_Steinway%27s_factory_in_Hamburg%2C_Germany.png
  Rescaled image from [ 2059 x 2080 ] to [ 222 x 224 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/4/44/Black_tea_pot_cropped.jpg
  Rescaled image from [ 900 x 838 ] to [ 224 x 209 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/b/b0/Bengal_tiger_%28Panthera_tigris_tigris%29_female_3_crop.jpg
  Rescaled image from [ 4500 x 3000 ] to [ 224 x 149 ]
  Padded to [ 224 x 224 ]
  Downloaded https://upload.wikimedia.org/wikipedia/commons/8/80/Turtle_golfina_escobilla_oaxaca_mexico_claudio_giovenzana_2010.jpg
  Rescaled image from [ 2000 x 1329 ] to [ 224 x 149 ]
  Padded to [ 224 x 224 ]

STREAM (STDERR):

PDL {
    Data     : too long to print
    Type     : float
    Shape    : [3 224 224 12]
    Nelem    : 1806336
    Min      : 0
    Max      : 1
    Badflag  : No
    Has Bads : No
}
AI::TensorFlow::Libtensorflow::Tensor {
    Type            FLOAT
    Dims            [ 12 224 224 3 ]
    NumDims         4
    ElementCount    1806336
}
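The resize and padding arithmetic behind the log above can be sketched in core Perl. The helpers fit_within and centre_offset are hypothetical, and rounding to the nearest pixel is an assumption that happens to reproduce the logged sizes; Imager's own rounding rules may differ in edge cases.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper: size after an aspect-preserving scale that fits
# inside the target box (rounding to nearest is an assumption).
sub fit_within {
    my ($w, $h, $target_w, $target_h) = @_;
    my $scale = min( $target_w / $w, $target_h / $h );
    return ( int( $w * $scale + 0.5 ), int( $h * $scale + 0.5 ) );
}

# Hypothetical helper: top-left offset that centres the inner image
# inside the outer canvas.
sub centre_offset {
    my ($inner_w, $inner_h, $outer_w, $outer_h) = @_;
    return (
        int( ( $outer_w - $inner_w ) / 2 ),
        int( ( $outer_h - $inner_h ) / 2 ),
    );
}

# The cat image from the log above is 1795 x 2397 pixels.
my ($scaled_w, $scaled_h) = fit_within( 1795, 2397, 224, 224 );
my ($left, $top)          = centre_offset( $scaled_w, $scaled_h, 224, 224 );

printf "Rescaled to [ %d x %d ], pasted at left=%d top=%d\n",
    $scaled_w, $scaled_h, $left, $top;
```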

Run the model for inference

We can use the Run method to run the session and get the output TFTensor.

First, we send a single random input to warm up the model.

  my $RunSession = sub {
      my ($session, $t) = @_;
      my @outputs_t;
  
      $session->Run(
          undef,
          $outputs{in}, [$t],
          $outputs{out}, \@outputs_t,
          undef,
          undef,
          $s
      );
      AssertOK($s);
  
      return $outputs_t[0];
  };
  
  say "Warming up the model";
  use PDL::GSL::RNG;
  my $rng = PDL::GSL::RNG->new('default');
  my $image_size = $model_name_to_params{$model_name}{image_size};
  my $warmup_input = zeros(float, 3, @$image_size, 1 );
  $rng->get_uniform($warmup_input);
  
  p $RunSession->($session, FloatPDLTOTFTensor($warmup_input));

STREAM (STDOUT):

  Warming up the model

STREAM (STDERR):

AI::TensorFlow::Libtensorflow::Tensor {
    Type            FLOAT
    Dims            [ 1 1001 ]
    NumDims         2
    ElementCount    1001
}

Then we send the batched image data. The returned scores need to be normalised using the softmax function with the following formula (taken from Wikipedia):

$$ \sigma(\mathbf{z})_{i} = \frac{e^{z_{i}}}{\sum_{j=1}^{K} e^{z_{j}}} \quad \text{for } i = 1, \dotsc, K \text{ and } \mathbf{z} = (z_{1}, \dotsc, z_{K}) \in \mathbb{R}^{K}. $$

  my $output_pdl_batched = FloatTFTensorToPDL($RunSession->($session, $t));
  my $softmax = sub { ( map $_/sumover($_)->dummy(0), exp($_[0]) )[0] };
  my $probabilities_batched = $softmax->($output_pdl_batched);
  p $probabilities_batched;

STREAM (STDERR):

PDL {
    Data     : too long to print
    Type     : float
    Shape    : [1001 12]
    Nelem    : 12012
    Min      : 2.73727380317723e-07
    Max      : 0.980696022510529
    Badflag  : No
    Has Bads : No
}
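For a single list of scores, the softmax normalisation can be sketched in core Perl as follows. Subtracting the maximum score before exponentiating is an addition for numerical stability that does not change the result; the scores are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max sum);

# Softmax over a plain list of scores.
sub softmax {
    my @z   = @_;
    my $max = max(@z);                        # stability shift only
    my @exp = map { exp( $_ - $max ) } @z;
    my $sum = sum(@exp);
    return map { $_ / $sum } @exp;
}

my @scores = ( 2.0, 1.0, 0.1 );               # made-up scores
my @probs  = softmax(@scores);

printf "probabilities: %s\n", join ' ', map { sprintf '%.4f', $_ } @probs;
printf "sum: %.6f\n", sum(@probs);
```

The PDL version in the tutorial does the same per row of the batch, dividing by the sum over the class dimension.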

Results summary

We then select the top 5 probabilities for each image and look up their class labels.

  my $N = 5; # number to select
  
  my $top_batched = $probabilities_batched->qsorti->slice([-1, -$N]);
  
  my @top_lists   = dog($top_batched);
  
  my $includes_background_class = $probabilities_batched->dim(0) == IMAGENET_LABEL_COUNT_WITH_BG;
  
  if( IN_IPERL ) {
      my $html = IPerl->html(
          my_table( [0..$#image_names], sub {
              my ($batch_idx, $h) = @_;
              my $image_name = $image_names[$batch_idx];
              my @top_for_image = $top_lists[$batch_idx]->list;
              (
                      $h->tt($image_name),
                      $h->a( { href => $images_for_test_to_uri{$image_name} },
                          $h->img({
                              src => $images_for_test_to_uri{$image_name},
                              alt => $image_name,
                              width => '50%',
                          })
                      ),
                      do {
                          my @tr;
                          push @tr, [ $h->th('Rank', 'Label No', 'Label', 'Prob') ];
                          while( my ($i, $label_index) = each @top_for_image ) {
                              my $class_index = $includes_background_class ? $label_index : $label_index + 1;
                              push @tr, [ $h->td(
                                      $i + 1,
                                      $class_index,
                                      $labels[$class_index],
                                      $probabilities_batched->at($label_index,$batch_idx),
                              ) ];
  
                          }
                          $h->table([$h->tr(@tr)])
                      },
                  )
          })
      );
      IPerl->display($html);
  } else {
      for my $batch_idx (0..$#image_names) {
          my $image_name = $image_names[$batch_idx];
          my @top_for_image = $top_lists[$batch_idx]->list;
          say "Image name: `$image_name`";
          my $header = [ ('Rank', 'Label No', 'Label', 'Prob') ];
          my @rows;
          while( my ($i, $label_index) = each @top_for_image ) {
              my $class_index = $includes_background_class ? $label_index : $label_index + 1;
              push @rows, [ (
                      $i + 1,
                      $class_index,
                      $labels[$class_index],
                      $probabilities_batched->at($label_index,$batch_idx),
              ) ];
          }
          say generate_table( rows => [ $header, @rows ], header_row => 1 );
          print "\n";
      }
  }

DISPLAY:

  apple
  Rank  Label No  Label                    Prob
  1     958       pomegranate              0.764890849590302
  2     949       Granny Smith             0.0557115897536278
  3     951       orange                   0.0294644851237535
  4     955       banana                   0.0140652684494853
  5     952       lemon                    0.0104219866916537

  banana
  Rank  Label No  Label                    Prob
  1     955       banana                   0.980696022510529
  2     941       spaghetti squash         0.00609391508623958
  3     940       zucchini                 0.000924494117498398
  4     942       acorn squash             0.000428267841925845
  5     988       corn                     0.000371591129805893

  bus
  Rank  Label No  Label                    Prob
  1     706       passenger car            0.503800392150879
  2     875       trolleybus               0.334556519985199
  3     655       minibus                  0.0483399331569672
  4     830       streetcar                0.0060268952511251
  5     556       fire engine              0.00416999123990536

  car
  Rank  Label No  Label                    Prob
  1     437       beach wagon              0.653993427753448
  2     480       car wheel                0.0596377961337566
  3     582       grille                   0.0583300851285458
  4     512       convertible              0.0284444093704224
  5     469       cab                      0.0261545460671186

  cat
  Rank  Label No  Label                    Prob
  1     283       tiger cat                0.406540781259537
  2     286       Egyptian cat             0.217931881546974
  3     282       tabby                    0.162566006183624
  4     288       lynx                     0.0042705861851573
  5     284       Persian cat              0.003710369579494

  dog
  Rank  Label No  Label                    Prob
  1     177       Saluki                   0.956575691699982
  2     215       Gordon setter            0.0140251601114869
  3     166       black-and-tan coonhound  0.0014705111971125
  4     170       borzoi                   0.00110328639857471
  5     159       toy terrier              0.000861090025864542

  flamingo
  Rank  Label No  Label                    Prob
  1     131       flamingo                 0.959895372390747
  2     130       spoonbill                0.0161255765706301
  3     128       white stork              0.00166661932598799
  4     129       black stork              0.000550497847143561
  5     773       safety pin               0.000548230716958642

  honeycomb
  Rank  Label No  Label                    Prob
  1     600       honeycomb                0.877344310283661
  2     411       apiary                   0.0387796945869923
  3     62        boa constrictor          0.0105699924752116
  4     63        rock python              0.00116742600221187
  5     552       face powder              0.000891718140337616

  piano
  Rank  Label No  Label                    Prob
  1     882       upright                  0.935266852378845
  2     852       television               0.0038805918302387
  3     599       home theater             0.00317473593167961
  4     652       microwave                0.00226501375436783
  5     528       desktop computer         0.00212761503644288

  teapot
  Rank  Label No  Label                    Prob
  1     850       teapot                   0.934735298156738
  2     506       coffeepot                0.0514399670064449
  3     969       cup                      0.000874492223374546
  4     551       espresso maker           0.000661507365293801
  5     726       pitcher                  0.000382225960493088

  tiger
  Rank  Label No  Label                    Prob
  1     293       tiger                    0.648495197296143
  2     283       tiger cat                0.181235983967781
  3     341       zebra                    0.0170289929956198
  4     291       jaguar                   0.00196429016068578
  5     353       impala                   0.00144677574280649

  turtle
  Rank  Label No  Label                    Prob
  1     35        leatherback turtle       0.535791635513306
  2     347       water buffalo            0.0588447786867619
  3     148       grey whale               0.0354104563593864
  4     345       hippopotamus             0.0267104711383581
  5     34        loggerhead               0.0217578746378422

  my $p_approx_batched = $probabilities_batched->sumover->approx(1, 1e-5);
  p $p_approx_batched;
  say "All probabilities sum up to approximately 1" if $p_approx_batched->all->sclr;

STREAM (STDOUT):

  All probabilities sum up to approximately 1

STREAM (STDERR):

PDL {
    Data     : [1 1 1 1 1 1 1 1 1 1 1 1]
    Type     : double
    Shape    : [12]
    Nelem    : 12
    Min      : 1
    Max      : 1
    Badflag  : No
    Has Bads : No
}

RESULT:

  1
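The top-N selection used in the results summary can also be sketched without PDL. The helper top_n_indices is hypothetical, and the probabilities below are made up for illustration; the labels are the first five entries of the label file shown earlier.

```perl
use strict;
use warnings;

# Hypothetical helper: indices of the $n largest values, best first.
sub top_n_indices {
    my ($n, @values) = @_;
    my @sorted = sort { $values[$b] <=> $values[$a] } 0 .. $#values;
    $n = @sorted if $n > @sorted;
    return @sorted[ 0 .. $n - 1 ];
}

my @example_labels = ( 'background', 'tench', 'goldfish', 'great white shark', 'tiger shark' );
my @example_probs  = ( 0.01, 0.05, 0.70, 0.04, 0.20 );    # made-up values

my @top = top_n_indices( 3, @example_probs );
for my $rank ( 0 .. $#top ) {
    my $label_index = $top[$rank];
    printf "%d  %-18s %.2f\n",
        $rank + 1, $example_labels[$label_index], $example_probs[$label_index];
}
```

Sorting indices rather than values keeps the link back to the label list, which is the same reason the tutorial uses qsorti instead of qsort.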

DEBUGGING

The following images can be used to test the load_image_to_pdl function.

  my @solid_channel_uris = (
      'https://upload.wikimedia.org/wikipedia/commons/thumb/6/62/Solid_red.svg/480px-Solid_red.svg.png',
      'https://upload.wikimedia.org/wikipedia/commons/thumb/1/1d/Green_00FF00_9x9.svg/480px-Green_00FF00_9x9.svg.png',
      'https://upload.wikimedia.org/wikipedia/commons/thumb/f/ff/Solid_blue.svg/480px-Solid_blue.svg.png',
  );
  undef;

CPANFILE

  requires 'AI::TensorFlow::Libtensorflow';
  requires 'AI::TensorFlow::Libtensorflow::DataType';
  requires 'Archive::Extract';
  requires 'Data::Printer';
  requires 'Data::Printer::Filter::PDL';
  requires 'FFI::Platypus::Buffer';
  requires 'FFI::Platypus::Memory';
  requires 'File::Which';
  requires 'HTML::Tiny';
  requires 'HTTP::Tiny';
  requires 'Imager';
  requires 'List::Util';
  requires 'PDL';
  requires 'PDL::GSL::RNG';
  requires 'Path::Tiny';
  requires 'Syntax::Construct';
  requires 'Text::Table::Tiny';
  requires 'URI';
  requires 'constant';
  requires 'feature';
  requires 'lib::projectroot';
  requires 'strict';
  requires 'utf8';
  requires 'warnings';

AUTHOR

Zakariyya Mughal <zmughal@cpan.org>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2022 by Auto-Parallel Technologies, Inc.

This is free software, licensed under:

  The Apache License, Version 2.0, January 2004