Name

SPVM::Regex - Regular Expressions

Description

The Regex class of SPVM has methods for regular expressions.

Google RE2 is used as the regular expression library.

Usage

  use Regex;
  
  # Pattern match
  {
    my $re = Regex->new("ab*c");
    my $string = "zabcz";
    my $match = $re->match($string);
  }

  # Pattern match - UTF-8
  {
    my $re = Regex->new("あ+");
    my $string = "いあああい";
    my $match = $re->match($string);
  }

  # Pattern match - Character class and its negation
  {
    my $re = Regex->new("[A-Z]+[^A-Z]+");
    my $string = "ABCzab";
    my $match = $re->match($string);
  }

  # Pattern match with captures
  {
    my $re = Regex->new("^(\w+) (\w+) (\w+)$");
    my $string = "abc1 abc2 abc3";
    my $match = $re->match($string);
    
    if ($match) {
      my $cap1 = $match->cap1;
      my $cap2 = $match->cap2;
      my $cap3 = $match->cap3;
    }
  }
  
  # Replace
  {
    my $re = Regex->new("abc");
    my $string = "ppzabcz";
    
    # "ppzABCz"
    my $result = $re->replace($string, "ABC");
  }

  # Replace with a callback and capture
  {
    my $re = Regex->new("a(bc)");
    my $string = "ppzabcz";
    
    # "ppzABbcCz"
    my $result = $re->replace($string, method : string ($re : Regex, $match : Regex::Match) {
      return "AB" . $match->cap1 . "C";
    });
  }

  # Replace global
  {
    my $re = Regex->new("abc");
    my $string = "ppzabczabcz";
    
    # "ppzABCzABCz"
    my $result = $re->replace_g($string, "ABC");
  }

  # Replace global with a callback and capture
  {
    my $re = Regex->new("a(bc)");
    my $string = "ppzabczabcz";
    
    # "ppzABCbcPQRSzABCbcPQRSz"
    my $result = $re->replace_g($string, method : string ($re : Regex, $match : Regex::Match) {
      return "ABC" . $match->cap1 . "PQRS";
    });
  }

  # . - single line mode
  {
    my $re = Regex->new("(.+)", "s");
    my $string = "abc\ndef";
    
    my $match = $re->match($string);
    
    unless ($match) {
      return 0;
    }
    
    unless ($match->cap1 eq "abc\ndef") {
      return 0;
    }
  }

Dependent Resources

Regular Expression Syntax

Google RE2 Syntax

Fields

captures

  has captures : ro string[];

The captured strings.

This field is deprecated and will be removed.

match_start

  has match_start : ro int;

The start offset of the matched string.

This field is deprecated and will be removed.

match_length

  has match_length : ro int;

The length of the matched string.

This field is deprecated and will be removed.

replaced_count

  has replaced_count : ro int;

The replaced count.

This field is deprecated and will be removed.

Class Methods

new

  static method new : Regex ($pattern : string, $flags : string = undef)

Creates a new Regex object, compiles the regex pattern $pattern with the flags $flags, and returns the created object.

  my $re = Regex->new("^ab+c");
  my $re = Regex->new("^ab+c", "s");

Instance Methods

match

  method match : Regex::Match ($string : string, $offset : int = 0, $length : int = -1);

The alias for the following match_forward method.

  my $ret = $self->match_forward($string, \$offset, $length);

match_forward

  method match_forward : Regex::Match ($string : string, $offset_ref : int*, $length : int = -1);

Performs pattern matching on the substring of the string $string that starts at the offset $$offset_ref and has the length $length.

The $$offset_ref is updated to the next position.

If the pattern matching is successful, returns a Regex::Match object. Otherwise returns undef.

Exceptions:

The $string must be defined. Otherwise an exception is thrown.

The $offset + the $length must be less than or equal to the length of the $string. Otherwise an exception is thrown.

If the regex is not compiled, an exception is thrown.
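
As a minimal sketch of how the updated offset might be used, the following loop calls match_forward repeatedly to visit each match in a string. The pattern, the input, and the variable names are illustrative only.

  use Regex;
  
  # Visit every word by matching forward from the previous position
  {
    my $re = Regex->new("(\w+)");
    my $string = "abc1 abc2 abc3";
    
    my $offset = 0;
    my $count = 0;
    while (1) {
      # $offset is updated to the next position by each call
      my $match = $re->match_forward($string, \$offset);
      
      # undef means that no further match was found
      unless ($match) {
        last;
      }
      
      my $word = $match->cap1;
      $count++;
    }
  }

Passing the offset by reference is what lets the caller resume the search without tracking match positions by hand.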

replace

  method replace  : string ($string : string, $replace : object of string|Regex::Replacer, $offset : int = 0, $length : int = -1, $options : object[] = undef)

The alias for the following replace_common method.

  my $ret = $self->replace_common($string, $replace, \$offset, $length, $options);

replace_g

  method replace_g  : string ($string : string, $replace : object of string|Regex::Replacer, $offset : int = 0, $length : int = -1, $options : object[] = undef)

The alias for the following replace_common method.

  unless ($options) {
    $options = {};
  }
  $options = Fn->merge_options({global => 1}, $options);
  return $self->replace_common($string, $replace, \$offset, $length, $options);

replace_common

  method replace_common : string ($string : string, $replace : object of string|Regex::Replacer,
    $offset_ref : int*, $length : int = -1, $options : object[] = undef);

Performs a replacement on the substring of the string $string that starts at the offset $$offset_ref and has the length $length, using the replacement string or callback $replace and the options $options.

If the $replace is a Regex::Replacer object, the return value of the callback is used for the replacement.

Options:

  • global

    This option must be a Int object. Otherwise an exception is thrown.

    If the value of the Int object is a true value, the global replacement is performed.

  • info

    This option must be an array of the Regex::ReplaceInfo object. Otherwise an exception is thrown.

    If this option is specified, the first element of the array is set to a Regex::ReplaceInfo object of the replacement result.

Exceptions:

The $string must be defined. Otherwise an exception is thrown.

The $replace must be a string or a Regex::Replacer object. Otherwise an exception is thrown.

The $offset must be greater than or equal to 0. Otherwise an exception is thrown.

The $offset + the $length must be less than or equal to the length of the $string. Otherwise an exception is thrown.

Exceptions of the match_forward method can be thrown.
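
For illustration, the options might be passed as a key-value array, in the same form that replace_g uses. This sketch assumes that a one-element array created with new Regex::ReplaceInfo[1] is an acceptable value for the info option. The accessors of Regex::ReplaceInfo are not used here because they are documented in that class.

  use Regex;
  use Regex::ReplaceInfo;
  
  # Global replacement that also receives a Regex::ReplaceInfo object
  {
    my $re = Regex->new("abc");
    my $string = "ppzabczabcz";
    
    # One-element array whose first element is set by replace_common
    my $info_ref = new Regex::ReplaceInfo[1];
    
    my $offset = 0;
    my $result = $re->replace_common($string, "ABC", \$offset, -1, {global => 1, info => $info_ref});
    
    # The Regex::ReplaceInfo object of the replacement result
    my $info = $info_ref->[0];
  }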

split

  method split : string[] ($string : string, $limit : int = 0);

The same as the split method in the Fn class, but the regular expression is used as the separator.
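
A short sketch of splitting on a pattern rather than a fixed separator string. The pattern and the input are illustrative only.

  use Regex;
  
  # Split on a comma surrounded by optional spaces
  {
    my $re = Regex->new(" *, *");
    my $string = "a , b,c";
    
    # Expected to contain the three strings "a", "b", and "c"
    my $parts = $re->split($string);
  }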

buffer_match

  method buffer_match : Regex::Match ($string_buffer : StringBuffer, $offset : int = 0, $length : int = -1);

The same as "match", but the first argument is a StringBuffer object, and the following exceptions are thrown.

Exceptions:

The $offset + $length must be less than or equal to the length of the $string_buffer. Otherwise an exception is thrown.

buffer_match_forward

  method buffer_match_forward : Regex::Match ($string_buffer : StringBuffer, $offset_ref : int*, $length : int = -1);

The same as "match_forward", but the first argument is a StringBuffer object, and the following exceptions are thrown.

Exceptions:

The $offset + $length must be less than or equal to the length of the $string_buffer. Otherwise an exception is thrown.

buffer_replace

  method buffer_replace  : void ($string_buffer : StringBuffer, $replace : object of string|Regex::Replacer, $offset : int = 0, $length : int = -1, $options : object[] = undef);

The same as "replace", but the first argument is a StringBuffer object, and the return type is void.

The replacement is performed on the string buffer.

buffer_replace_g

  method buffer_replace_g  : void ($string_buffer : StringBuffer, $replace : object of string|Regex::Replacer, $offset : int = 0, $length : int = -1, $options : object[] = undef);

The same as "replace_g", but the first argument is a StringBuffer object, and the return type is void.

The replacement is performed on the string buffer.
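
The in-place behavior can be sketched as follows, reusing the pattern and input from the "Replace global" example in Usage. The sketch assumes the new, push, and to_string methods of the StringBuffer class.

  use Regex;
  use StringBuffer;
  
  # Global replacement performed directly on a string buffer
  {
    my $re = Regex->new("abc");
    
    my $buffer = StringBuffer->new;
    $buffer->push("ppzabczabcz");
    
    # The content of $buffer itself is modified
    $re->buffer_replace_g($buffer, "ABC");
    
    # "ppzABCzABCz"
    my $result = $buffer->to_string;
  }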

buffer_replace_common

  method buffer_replace_common : void ($string_buffer : StringBuffer, $replace : object of string|Regex::Replacer, $offset_ref : int*, $length : int = -1, $options : object[] = undef);

The same as "replace_common", but the first argument is a StringBuffer object, and the return type is void.

The replacement is performed on the string buffer.

cap1

  method cap1 : string ();

The alias for $re->captures->[1].

This method is deprecated and will be removed.

cap2

  method cap2 : string ();

The alias for $re->captures->[2].

This method is deprecated and will be removed.

cap3

  method cap3 : string ();

The alias for $re->captures->[3].

This method is deprecated and will be removed.

cap4

  method cap4 : string ();

The alias for $re->captures->[4].

This method is deprecated and will be removed.

cap5

  method cap5 : string ();

The alias for $re->captures->[5].

This method is deprecated and will be removed.

cap6

  method cap6 : string ();

The alias for $re->captures->[6].

This method is deprecated and will be removed.

cap7

  method cap7 : string ();

The alias for $re->captures->[7].

This method is deprecated and will be removed.

cap8

  method cap8 : string ();

The alias for $re->captures->[8].

This method is deprecated and will be removed.

cap9

  method cap9 : string ();

The alias for $re->captures->[9].

This method is deprecated and will be removed.

cap10

  method cap10 : string ();

The alias for $re->captures->[10].

This method is deprecated and will be removed.

cap11

  method cap11 : string ();

The alias for $re->captures->[11].

This method is deprecated and will be removed.

cap12

  method cap12 : string ();

The alias for $re->captures->[12].

This method is deprecated and will be removed.

cap13

  method cap13 : string ();

The alias for $re->captures->[13].

This method is deprecated and will be removed.

cap14

  method cap14 : string ();

The alias for $re->captures->[14].

This method is deprecated and will be removed.

cap15

  method cap15 : string ();

The alias for $re->captures->[15].

This method is deprecated and will be removed.

cap16

  method cap16 : string ();

The alias for $re->captures->[16].

This method is deprecated and will be removed.

cap17

  method cap17 : string ();

The alias for $re->captures->[17].

This method is deprecated and will be removed.

cap18

  method cap18 : string ();

The alias for $re->captures->[18].

This method is deprecated and will be removed.

cap19

  method cap19 : string ();

The alias for $re->captures->[19].

This method is deprecated and will be removed.

cap20

  method cap20 : string ();

The alias for $re->captures->[20].

This method is deprecated and will be removed.

Repository

SPVM::Regex - Github

Author

Yuki Kimoto

Contributors

Copyright & License

Copyright (c) 2023 Yuki Kimoto

MIT License