The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

JSON::Tokenize - Tokenize JSON

SYNOPSIS

    use JSON::Tokenize ':all';
    
    my $input = '{"tuttie":["fruity", true, 100]}';
    my $token = tokenize_json ($input);
    print_tokens ($token, 0);
    
    sub print_tokens
    {
        my ($token, $depth) = @_;
        while ($token) {
            my $start = tokenize_start ($token);
            my $end = tokenize_end ($token);
            my $type = tokenize_type ($token);
            print "   " x $depth;
            my $value = substr ($input, $start, $end - $start);
            print "'$value' has type '$type'.\n";
            my $child = tokenize_child ($token);
            if ($child) {
                print_tokens ($child, $depth+1);
            }
            my $next = tokenize_next ($token);
            $token = $next;
        }
    }

This outputs

    '{"tuttie":["fruity", true, 100]}' has type 'object'.
       '"tuttie"' has type 'string'.
       ':' has type 'colon'.
       '["fruity", true, 100]' has type 'array'.
          '"fruity"' has type 'string'.
          ',' has type 'comma'.
          'true' has type 'literal'.
          ',' has type 'comma'.
          '100' has type 'number'.

VERSION

This documents version 0.62 of JSON::Tokenize corresponding to git commit d04630086f6c92fea720cba4568faa0cbbdde5a6 released on Sat Jul 16 08:23:13 2022 +0900.

DESCRIPTION

This is a module for tokenizing a JSON string. "Tokenizing" means breaking the string into individual tokens, without creating any Perl structures. It uses the same underlying code as JSON::Parse. Tokenizing can be used for tasks such as picking out or searching through parts of a large JSON structure without storing each part of the entire structure in memory.

This module is an experimental part of JSON::Parse and its interface is likely to change. The tokenizing functions are currently written in a very primitive way.

FUNCTIONS

tokenize_child

    my $child = tokenize_child ($child);

Walk the tree of tokens.

tokenize_end

    my $end = tokenize_end ($token);

Get the end of the token as a byte offset from the start of the string. Note this is a byte offset not a character offset.

tokenize_json

    my $token = tokenize_json ($json);

tokenize_next

    my $next = tokenize_next ($token);

Walk the tree of tokens.

tokenize_start

    my $start = tokenize_start ($token);

Get the start of the token as a byte offset from the start of the string. Note this is a byte offset not a character offset.

tokenize_text

    my $text = tokenize_text ($json, $token);

Given a token $token from this parsing and the JSON in $json, return the text which corresponds to the token. This is a convenience function written in Perl which uses "tokenize_start" and "tokenize_end" and substr to get the string from $json.

tokenize_type

    my $type = tokenize_type ($token);

Get the type of the token as a string. The possible return values are

    "array",
    "initial state",
    "invalid",
    "literal",
    "number",
    "object",
    "string",
    "unicode escape"

AUTHOR

Ben Bullock, <bkb@cpan.org>

COPYRIGHT & LICENCE

This package and associated files are copyright (C) 2016-2022 Ben Bullock.

You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.