NAME

Messaging::Message - abstraction of a message

SYNOPSIS

  use Messaging::Message;

  # constructor + setters
  $msg = Messaging::Message->new();
  $msg->body("hello world");
  $msg->header({ subject => "test" });
  $msg->header_field("message-id", 123);

  # fancy constructor
  $msg = Messaging::Message->new(
      body => "hello world",
      header => {
          "subject"    => "test",
          "message-id" => 123,
      },
  );

  # getters
  if ($msg->body() =~ /something/) {
      ...
  }
  $id = $msg->header_field("message-id");

DESCRIPTION

This module provides an abstraction of a "message" as used in messaging systems; see for instance http://en.wikipedia.org/wiki/Enterprise_messaging_system.

A Python implementation of the same messaging abstractions is available at https://github.com/cern-mig/python-messaging so messaging components can be written in different programming languages.

A message consists of header fields (collectively called "the header of the message") and a body.

Each header field is a key/value pair where the key and the value are text strings. The key is unique within the header so we can use a hash table to represent the header of the message.

The body is either a text string or a binary string. This distinction is needed because text may need to be encoded (for instance using UTF-8) before being stored on disk or sent across the network.

To make things clear:

  • a text string (aka character string) is a sequence of Unicode characters

  • a binary string (aka byte string) is a sequence of bytes

Both the header and the body can be empty.
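
The difference between the two kinds of strings can be illustrated with the core Encode module. This is only a minimal sketch, independent of Messaging::Message; the variable names are chosen for illustration:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# a text string: 4 Unicode characters, the last one being U+00E9
my $text = "caf\x{e9}";

# the corresponding binary string: its UTF-8 encoding, which takes 5 bytes
my $bytes = encode("UTF-8", $text);

printf("text:  %d characters\n", length($text));
printf("bytes: %d bytes\n", length($bytes));

# decoding the bytes gives back the original text string
my $back = decode("UTF-8", $bytes);
print("round trip ok\n") if $back eq $text;
```

The same four characters need five bytes once encoded, which is why a message has to record whether its body is text or binary.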

JSON MAPPING

In order to ease message manipulation (e.g. exchanging between applications, maybe written in different programming languages), we define here a standard mapping between a Messaging::Message object and a JSON object.

A message as defined above naturally maps to a JSON object with the following fields:

header

the message header as a JSON object (with all values being JSON strings)

body

the message body as a JSON string

text

a JSON boolean specifying whether the body is a text string (as opposed to a binary string) or not

encoding

a JSON string describing how the body has been encoded (see below)

All fields are optional and default to empty/false if not present.

Since JSON strings are text strings (they can contain any Unicode character), the message header directly maps to a JSON object. There is no need to use encoding here.

For the message body, this is more complex. A text body can be put as-is in the JSON object but a binary body must be encoded beforehand because JSON does not handle binary strings. Additionally, we want to allow body compression in order to optionally save space. This is where the encoding field comes into play.

The encoding field describes which transformations have been applied to the message body. It is a "+"-separated list of transformations, each of which can be:

base64

Base64 encoding (for binary body or compressed body)

utf8

UTF-8 encoding (only needed for a compressed text body)

lz4 or snappy or zlib

LZ4 or Snappy or Zlib compression (only one can be specified)
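
The way these transformations combine for a compressed text body could be sketched as follows, using only core modules. The helpers encode_text_body() and decode_body() are hypothetical and are not part of Messaging::Message. The order of the names in the encoding string and the use of Compress::Zlib's compress() are assumptions made for this illustration, so the output is not guaranteed to match what the module itself produces:

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use MIME::Base64 qw(encode_base64 decode_base64);
use Compress::Zlib qw(compress uncompress);

# hypothetical helper: turn a text body into a JSON-safe string,
# returning the transformed body and the matching encoding field
sub encode_text_body {
    my($text) = @_;
    my $bytes  = encode("UTF-8", $text);      # utf8: text -> bytes
    my $packed = compress($bytes);            # zlib: bytes -> compressed bytes
    my $ascii  = encode_base64($packed, "");  # base64: bytes -> ASCII, no line breaks
    return($ascii, "utf8+zlib+base64");       # assumed ordering of the names
}

# hypothetical helper: undo the transformations listed in the encoding field,
# in the reverse order of the one used above
sub decode_body {
    my($body, $encoding) = @_;
    my %applied = map { $_ => 1 } split(/\+/, $encoding);
    $body = decode_base64($body)   if $applied{base64};
    $body = uncompress($body)      if $applied{zlib};
    $body = decode("UTF-8", $body) if $applied{utf8};
    return($body);
}

my $text = "caf\x{e9} " x 100;
my($body, $encoding) = encode_text_body($text);
my $decoded = decode_body($body, $encoding);
print("round trip ok\n") if $decoded eq $text;
```

The decoder treats the encoding field as a set of names, so it does not depend on the order in which they are listed.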

Here is for instance the JSON object representing an empty message (i.e. the result of Messaging::Message->new()):

  {}

Here is a more complex example, with a binary body:

  {
    "header":{"subject":"demo","destination":"/topic/test"},
    "body":"YWJj7g==",
    "encoding":"base64"
  }
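
To see what this example contains, the JSON object can be decoded by hand. The sketch below uses the core modules JSON::PP and MIME::Base64 so that it is self-contained; it does not go through Messaging::Message:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);
use MIME::Base64 qw(decode_base64);

my $json = '{"header":{"subject":"demo","destination":"/topic/test"},'
         . '"body":"YWJj7g==","encoding":"base64"}';
my $object = decode_json($json);

# the encoding field only lists base64, so one step recovers the body
my $body = decode_base64($object->{body});

printf("subject: %s\n", $object->{header}{subject});
# prints: body: 4 bytes (616263ee)
printf("body: %d bytes (%s)\n", length($body), unpack("H*", $body));
```

The last byte (0xEE) is not valid UTF-8 on its own, which is why this body has to be handled as a binary string.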

You can use the jsonify() method to convert a Messaging::Message object into a hash reference representing the equivalent JSON object.

Conversely, you can create a new Messaging::Message object from a compatible JSON object (again, a hash reference) with the dejsonify() method.

Using this JSON mapping of messages is very convenient because you can easily put messages in larger JSON data structures. You can for instance store several messages together using a JSON array of these messages.

Here is for instance how you could construct a message whose body contains another message along with error information:

  use JSON qw(to_json);
  # get a message from somewhere...
  $msg1 = ...;
  # jsonify it and put it into a simple structure
  $body = {
      message => $msg1->jsonify(),
      error   => "an error message",
      time    => time(),
  };
  # create a new message with this body
  $msg2 = Messaging::Message->new(body => to_json($body));
  $msg2->header_field("content-type", "message/error");
  $msg2->text(1);

A receiver of such a message can easily decode it:

  use JSON qw(from_json);
  # get a message from somewhere...
  $msg2 = ...;
  # extract the body which is a JSON object
  $body = from_json($msg2->body());
  # extract the inner message
  $msg1 = Messaging::Message->dejsonify($body->{message});

STRINGIFICATION AND SERIALIZATION

In addition to the JSON mapping described above, we also define how to stringify and serialize a message.

A stringified message is the string representing its equivalent JSON object. A stringified message is a text string and can for instance be used in another message. See the stringify() and destringify() methods.

A serialized message is the UTF-8 encoding of its stringified representation. A serialized message is a binary string and can for instance be stored in a file. See the serialize() and deserialize() methods.

For instance, here are the steps needed in order to store a message into a file:

  1. transform the programming language specific abstraction of the message into a JSON object

  2. transform the JSON object into its (text) string representation

  3. transform the JSON text string into a binary string using UTF-8 encoding

"1" is called jsonify, "1 + 2" is called stringify and "1 + 2 + 3" is called serialize.

To sum up:

        Messaging::Message object
                 |  ^
       jsonify() |  | dejsonify()
                 v  |
    JSON compatible hash reference
                 |  ^
     JSON encode |  | JSON decode
                 v  |
             text string
                 |  ^
    UTF-8 encode |  | UTF-8 decode
                 v  |
            binary string
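
The three levels of this diagram could be exercised as follows. This is a minimal sketch that only uses the methods listed in this document; the message contents are made up for the example:

```perl
use strict;
use warnings;
use Messaging::Message;

my $msg = Messaging::Message->new(
    body   => "hello world",
    header => { subject => "test" },
);
$msg->text(1);

my $hash   = $msg->jsonify();    # JSON compatible hash reference
my $string = $msg->stringify();  # text string
my $bytes  = $msg->serialize();  # binary string

# each representation can be turned back into a message object
my $from_hash   = Messaging::Message->dejsonify($hash);
my $from_string = Messaging::Message->destringify($string);
my $from_bytes  = Messaging::Message->deserialize($bytes);

foreach my $copy ($from_hash, $from_string, $from_bytes) {
    printf("%s: %s\n", $copy->header_field("subject"), $copy->body());
}
```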

METHODS

The following methods are available:

new([OPTIONS])

return a new Messaging::Message object (class method)

dejsonify(HASHREF)

return a new Messaging::Message object from a compatible JSON object (class method)

destringify(STRING)

return a new Messaging::Message object from its stringified representation (class method)

deserialize(STRING)

return a new Messaging::Message object from its serialized representation (class method)

jsonify([OPTIONS])

return the JSON object (a hash reference) representing the message

stringify([OPTIONS])

return the text string representation of the message

serialize([OPTIONS])

return the binary string representation of the message

body([STRING])

get/set the body attribute, which is a text or binary string

header([HASHREF])

get/set the header attribute, which is a hash reference (note: the hash reference is used directly, without any deep copy)

header_field(NAME[, VALUE])

get/set the given header field, identified by its name

text([BOOLEAN])

get/set the text attribute, which is a boolean indicating whether the message body is a text string or not; the default is false (meaning a binary body)

size()

get the approximate message size, which is the sum of the sizes of its components: header key/value pairs and body, plus framing

copy()

return a new message which is a copy of the given one, with deep copy of the header and body
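
The difference between header(), which stores the given hash reference directly, and copy(), which performs a deep copy, can be seen in a short sketch like this one (the header values are made up for the example):

```perl
use strict;
use warnings;
use Messaging::Message;

my $header = { subject => "test" };
my $msg = Messaging::Message->new(body => "hello world");
$msg->header($header);

# copy() duplicates the header, so the copy is independent
my $copy = $msg->copy();

# header() stored our hash reference as-is, so this change is
# visible through $msg but not through $copy
$header->{subject} = "changed";

printf("original: %s\n", $msg->header_field("subject"));
printf("copy:     %s\n", $copy->header_field("subject"));
```

If the caller keeps modifying the hash after passing it to header(), taking a copy() first avoids surprises.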

The jsonify(), stringify() and serialize() methods can be given options.

Currently, the only supported option is compression. Its value is either an algorithm name like zlib (meaning: use this algorithm only if the compressed body is indeed smaller) or an algorithm name followed by an exclamation mark to always force compression.

Here is for instance how to serialize a message, with forced compression:

  $bytes = $msg->serialize(compression => "zlib!");
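
The effect of the two forms of the option could be compared with a sketch like the one below. The body is made up and deliberately repetitive so that it compresses well; the exact sizes and the exact content of the encoding field depend on the module version, so none are given here:

```perl
use strict;
use warnings;
use Messaging::Message;

my $msg = Messaging::Message->new(body => "hello world " x 100);
$msg->text(1);

my $plain  = $msg->serialize();
my $maybe  = $msg->serialize(compression => "zlib");   # only if smaller
my $forced = $msg->serialize(compression => "zlib!");  # always

printf("no compression:     %d bytes\n", length($plain));
printf("optional (zlib):    %d bytes\n", length($maybe));
printf("forced (zlib!):     %d bytes\n", length($forced));

# the encoding field records which transformations were applied
my $json = $msg->jsonify(compression => "zlib!");
printf("encoding: %s\n", $json->{encoding});

# whatever was applied, deserialize() restores the original body
my $back = Messaging::Message->deserialize($forced);
print("round trip ok\n") if $back->body() eq $msg->body();
```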

In addition, in order to avoid string copies, the following methods are also available:

body_ref([STRINGREF])
stringify_ref([OPTIONS])
destringify_ref(STRINGREF)
serialize_ref([OPTIONS])
deserialize_ref(STRINGREF)

They work like their counterparts but take or return string references instead of strings, which can be more efficient for large strings. Here is an example:

  # get a copy of the body, which implies an internal string copy
  $body = $msg->body();
  # get a reference to the body, with no string copies
  $body_ref = $msg->body_ref();
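
The other reference-based methods can be combined in the same way. Here is a minimal sketch, assuming only the behavior described above (string references in, string references out):

```perl
use strict;
use warnings;
use Messaging::Message;

my $msg = Messaging::Message->new(body => "hello world");

# serialize and get a reference to the resulting binary string
my $bytes_ref = $msg->serialize_ref();
printf("serialized size: %d bytes\n", length(${$bytes_ref}));

# rebuild a message directly from the string reference
my $back = Messaging::Message->deserialize_ref($bytes_ref);
print("round trip ok\n") if $back->body() eq $msg->body();
```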

SEE ALSO

Compress::Snappy, Compress::LZ4, Compress::Zlib, Encode, JSON.

AUTHOR

Lionel Cons http://cern.ch/lionel.cons

Copyright (C) CERN 2011-2021