NAME

Set::Object - set of objects and strings

SYNOPSIS

  use Set::Object qw(set);

  my $set = set();            # or Set::Object->new()

  $set->insert(@thingies);
  $set->remove(@thingies);

  @items = @$set;             # or $set->members for the unsorted array

  $union = $set1 + $set2;
  $intersection = $set1 * $set2;
  $difference = $set1 - $set2;
  $symmetric_difference = $set1 % $set2;

  print "set1 is a proper subset of set2"
      if $set1 < $set2;

  print "set1 is a subset of set2"
      if $set1 <= $set2;

  # common idiom - iterate over any pure Perl structure
  use Set::Object qw(reftype);
  my @stack = $root;
  my $seen = Set::Object->new(@stack);
  while (my $object = pop @stack) {
      if (reftype($object) eq "HASH") {
          # do something with hash members

          # add the new nodes to the stack
          push @stack, grep { ref $_ && $seen->insert($_) }
              values %$object;
      }
      elsif (reftype($object) eq "ARRAY") {
          # do something with array members

          # add the new nodes to the stack
          push @stack, grep { ref $_ && $seen->insert($_) }
              @$object;

      }
      elsif (reftype($object) =~ /SCALAR|REF/) {
          push @stack, $$object
              if ref $$object && $seen->insert($$object);
      }
  }

DESCRIPTION

This module implements a set of objects, that is, an unordered collection of objects without duplication.

The term objects is applied loosely - for the sake of Set::Object, anything that is a reference is considered an object.

Set::Object 1.09 and later includes support for inserting scalars (including the empty string, but excluding undef) as well as objects. This can be thought of as (and is currently implemented as) a degenerate hash that only has keys and no values. Unlike objects placed into a Set::Object, scalars that are inserted are flattened into strings, and so lose any magic (e.g. tie) or other special bits that they went in with; only strings come out.
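
As a minimal sketch of the flattening described above (the variable names are illustrative only, and a Set::Object release with scalar support is assumed):

  use strict;
  use warnings;
  use Set::Object qw(set);

  my $obj = { name => "widget" };     # a reference, stored as an object
  my $set = set($obj, 42, "foo");     # 42 is stored as the string "42"

  # Objects come back as the same reference that went in.
  my ($ref) = grep { ref $_ } $set->members;
  print "same reference\n" if $ref == $obj;

  # Scalars come back as plain strings.
  my @strings = sort grep { !ref $_ } $set->members;
  print "@strings\n";                 # 42 foo

Because scalars are keyed by their string form, inserting 42 and "42" should leave a single member.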

CONSTRUCTORS

Set::Object->new( [list] )

Return a new Set::Object containing the elements passed in list.

set(@members)

Return a new Set::Object filled with @members. You have to import this function explicitly.

New in Set::Object 1.22: this function is now also called as a method by the various methods that return a new set, such as ->intersection and ->union, and by their overloaded counterparts, to construct the set they return. The default method always returns Set::Object objects, preserving previous behaviour and not second-guessing the nature of your derived Set::Object class.

weak_set()

Return a new Set::Object::Weak filled with the members passed in. You have to import this function explicitly.

INSTANCE METHODS

insert( [list] )

Add items to the Set::Object.

Adding the same object several times is not an error, but any Set::Object will contain at most one occurrence of the same object.

Returns the number of elements that were actually added. As of Set::Object 1.23, undef is never inserted.
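
The return value can be used to tell new members from ones already seen, which is what the traversal idiom in the SYNOPSIS relies on. A small sketch, with illustrative variable names:

  use strict;
  use warnings;
  use Set::Object qw(set);

  my ($x, $y) = ({}, {});
  my $set = set();

  my $added = $set->insert($x, $y, $x);   # $x is passed twice
  print "$added\n";                       # 2: the duplicate is ignored

  $added = $set->insert($y, "label");
  print "$added\n";                       # 1: only "label" is new

  # Assumes Set::Object 1.23 or later, where undef is never inserted.
  my $undef_added = $set->insert(undef);
  print $set->size, "\n";                 # 3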

includes( [list] )

has( [list] )

contains( [list] )

Return true if all the objects in list are members of the Set::Object. list may be empty, in which case true is always returned.

As of Set::Object 1.23, undef will never appear to be present in any set (even if the set contains the empty string). Prior to 1.23, there would have been a run-time warning.

member( [item] )

element( [item] )

Like includes, but takes a single item to check and returns that item if the value is found, rather than just a true value.
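
A short sketch of the difference between includes and member, under the behaviour described above (variable names are illustrative):

  use strict;
  use warnings;
  use Set::Object qw(set);

  my ($x, $y) = ({ id => 1 }, { id => 2 });
  my $set = set($x, "foo");

  print "has both\n"  if $set->includes($x, "foo");
  print "missing y\n" unless $set->includes($x, $y);   # every item must be present
  print "vacuous\n"   if $set->includes();             # an empty list is always true

  # member returns the item itself rather than a bare true value.
  my $found = $set->member($x);
  print "got x back\n" if ref $found && $found == $x;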

members

elements

Return the objects contained in the Set::Object in random (hash) order.

Note that dereferencing a Set::Object as an array (@$set) returns the elements sorted, so the members method is much faster when the order does not matter.
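
A minimal sketch of the two ways of listing members (the example values are arbitrary):

  use strict;
  use warnings;
  use Set::Object qw(set);

  my $set = set(qw(pear apple fig));

  my @any_order = $set->members;   # hash order, no sorting cost
  my @sorted    = @$set;           # sorted, via the array overload

  print scalar(@any_order), "\n";  # 3
  print "@sorted\n";               # apple fig pear

Prefer members inside hot loops, and reserve @$set for cases where a stable order is actually needed.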

size

Return the number of elements in the Set::Object.

remove( [list] )

delete( [list] )

Remove objects from a Set::Object.

Removing the same object more than once, or removing an object absent from the Set::Object is not an error.

Returns the number of elements that were actually removed.

As of Set::Object 1.23, removing undef is safe (but having an undef in the passed-in list does not increase the return value, because it could never be in the set).
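
A brief sketch of the return value of remove, using arbitrary string members:

  use strict;
  use warnings;
  use Set::Object qw(set);

  my $set = set(qw(a b c));

  my $removed = $set->remove("a", "z");   # "z" was never a member
  print "$removed\n";                     # 1
  print $set->size, "\n";                 # 2

  my $again = $set->remove("a");          # already gone, not an error
  print "$again\n";                       # 0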

weaken

Makes all the references in the set "weak" - that is, they do not increase the reference count of the object they point to, just like Scalar::Util's weaken function.

This was introduced with Set::Object 1.16, and uses a brand new type of magic. Use with caution. If you get segfaults when you use weaken, please reduce your problem to a test script before submission.

New: as of Set::Object 1.19, you may use the weak_set function to make weak sets, or Set::Object::Weak->new, or import the set constructor from Set::Object::Weak instead. See Set::Object::Weak for more.

Note to people sub-classing Set::Object: this method re-blesses the invocant to Set::Object::Weak. Override the method weak_pkg in your sub-class to control this behaviour.

is_weak

Returns a true value if this set is a weak set.

strengthen

Turns a weak set back into a normal one.

Note to people sub-classing Set::Object: this method re-blesses the invocant to Set::Object. Override the method strong_pkg in your sub-class to control this behaviour.
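
The round trip between strong and weak sets can be sketched as follows. This assumes Set::Object 1.16 or later; the comment about the member disappearing describes the expected behaviour of weak references rather than verified output:

  use strict;
  use warnings;
  use Set::Object qw(set);

  my $set = set();
  {
      my $obj = { name => "temp" };
      $set->insert($obj);
      $set->weaken;                      # the set no longer keeps $obj alive
      print "weak\n" if $set->is_weak;
      print $set->size, "\n";            # 1 while $obj is still in scope
  }

  # $obj has gone out of scope, so a weak set is expected to have dropped it.
  print $set->size, "\n";

  $set->strengthen;                      # re-blessed back to strong_pkg
  print "strong again\n" unless $set->is_weak;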

invert( [list] )

For each item in list, remove it if it is already in the set, or add it if it is not, so that a change is always made.

Also available as the overloaded operator /, in which case it expects another set (or a single scalar element), and returns a new set that is the original set with all the second set's items inverted.
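
A small sketch of both forms, using arbitrary integer members:

  use strict;
  use warnings;
  use Set::Object qw(set);

  my $set = set(1, 2, 3);

  $set->invert(3, 4);               # 3 was present so it goes; 4 is added
  my @now = sort $set->members;
  print "@now\n";                   # 1 2 4

  my $flipped = $set / set(1, 5);   # new set; $set itself is unchanged
  my @flipped = sort $flipped->members;
  print "@flipped\n";               # 2 4 5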

clear

Empty this Set::Object.

as_string

Return a textual Smalltalk-ish representation of the Set::Object. Also available as overloaded operator "".

equal( set )

Returns a true value if set contains exactly the same members as the invocant.

Also available as overloaded operator == (or eq).

not_equal( set )

Returns a false value if set contains exactly the same members as the invocant.

Also available as overloaded operator != (or ne).

intersection( [list] )

Return a new Set::Object containing the intersection of the Set::Objects passed as arguments.

Also available as overloaded operator *.

union( [list] )

Return a new Set::Object containing the union of the Set::Objects passed as arguments.

Also available as overloaded operator +.

difference ( set )

Return a new Set::Object containing the members of the first (invocant) set with the passed Set::Objects' elements removed.

Also available as overloaded operator -.

unique ( set )

symmetric_difference ( set )

Return a new Set::Object containing the members of all passed sets (including the invocant), with common elements removed. For two sets, this is the complement of their intersection relative to their union.

Also available as overloaded operator %.
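
The four set-algebra operations can be compared side by side. In this sketch, show is a hypothetical helper (not part of Set::Object) that prints members in sorted order:

  use strict;
  use warnings;
  use Set::Object qw(set);

  # Hypothetical helper: members of a set as a sorted, comma-separated string.
  sub show {
      my @members = $_[0]->members;
      return join ",", sort @members;
  }

  my $left  = set(1, 2, 3);
  my $right = set(3, 4);

  print show($left + $right), "\n";   # 1,2,3,4  union
  print show($left * $right), "\n";   # 3        intersection
  print show($left - $right), "\n";   # 1,2      difference
  print show($left % $right), "\n";   # 1,2,4    symmetric difference

  # The method forms return the same results.
  print show($left->union($right)), "\n";   # 1,2,3,4

None of these operations modifies $left or $right; each returns a new set.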

subset( set )

Return true if this Set::Object is a subset of set.

Also available as operator <=.

proper_subset( set )

Return true if this Set::Object is a proper subset of set. Also available as operator <.

superset( set )

Return true if this Set::Object is a superset of set. Also available as operator >=.

proper_superset( set )

Return true if this Set::Object is a proper superset of set. Also available as operator >.
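
A short sketch of the four comparison operators and their method equivalents (set contents are arbitrary):

  use strict;
  use warnings;
  use Set::Object qw(set);

  my $small = set(1, 2);
  my $big   = set(1, 2, 3);

  print "subset\n"          if $small <= $big;   # $small->subset($big)
  print "proper subset\n"   if $small <  $big;   # $small->proper_subset($big)
  print "superset\n"        if $big   >= $small; # $big->superset($small)
  print "proper superset\n" if $big   >  $small; # $big->proper_superset($small)

  # Every set is a subset of itself, but never a proper subset of itself.
  print "subset of itself\n" if $big <= $big;
  print "not proper\n"       unless $big < $big;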

is_null( set )

Returns a true value if this set does not contain any members, that is, if its size is zero.

Set::Scalar compatibility methods

By and large, Set::Object is not and probably never will be feature-compatible with Set::Scalar; however the following functions are provided anyway.

compare( set )

Returns one of:

  "proper intersect"
  "proper subset"
  "proper superset"
  "equal"
  "disjoint"

is_disjoint( set )

Returns a true value if the two sets have no common items.
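
A minimal sketch of the compatibility methods, limited to cases whose result follows directly from the list of return values above:

  use strict;
  use warnings;
  use Set::Object qw(set);

  my $one_two = set(1, 2);

  print $one_two->compare(set(1, 2)), "\n";   # equal
  print $one_two->compare(set(3, 4)), "\n";   # disjoint
  print $one_two->compare(set(2, 3)), "\n";   # proper intersect

  print "no overlap\n" if $one_two->is_disjoint(set(3, 4));
  print "overlap\n"    unless $one_two->is_disjoint(set(2, 3));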

as_string_callback( set )

Allows you to define a custom stringify function. This is only a class method. If you want anything fancier than this, you should sub-class Set::Object.

FUNCTIONS

The following functions are defined by the Set::Object XS code for convenience; they are largely identical to the versions in the Scalar::Util module, but there are a couple that provide functions not catered to by that module.

Please use the versions in Scalar::Util in preference to these functions. In fact, if you use these functions in your production code then you may have to rewrite it some day. They are retained only because they are "mostly harmless".

blessed

Do not use in production code

Returns a true value if the passed reference (RV) is blessed. See also Acme::Holy.

reftype

Do not use in production code

A bit like the perl built-in ref function, but returns the type of reference; i.e. if the reference is blessed then it returns what ref would have returned if it were not blessed. Useful for "seeing through" blessed references.

refaddr

Do not use in production code

Returns the memory address of a scalar. Warning: this is not guaranteed to be unique for scalars created in a program; memory might get re-used!
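
Since the Scalar::Util versions are recommended over these functions, here is a sketch of the equivalent calls using that module. Widget is a hypothetical class name used only for illustration:

  use strict;
  use warnings;
  use Scalar::Util qw(blessed reftype refaddr);

  my $plain = { };
  my $obj   = bless { }, "Widget";   # "Widget" is a made-up class name
  my $alias = $obj;                  # second reference to the same hash

  print "plain hash is not blessed\n" unless defined blessed($plain);
  print blessed($obj), "\n";         # Widget
  print ref($obj), "\n";             # Widget
  print reftype($obj), "\n";         # HASH, seeing through the blessing

  print "same address\n"      if refaddr($obj) == refaddr($alias);
  print "different address\n" if refaddr($obj) != refaddr($plain);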

is_int, is_string, is_double

Do not use in production code

A quick way of checking the three bits on scalars - IOK (is_int), NOK (is_double) and POK (is_string). Note that the exact behaviour of when these bits get set is not defined by the perl API.

These functions return the "p" versions of the macros (SvIOKp, etc); use with caution.

is_overloaded

Do not use in production code

A quick way to check if an object has overload magic on it.

ish_int

Deprecated and will be removed in 2014

This function returns true if the value it is passed looks like it is already a representation of an integer. This is so that you can decide whether the value passed is a hash key or an array index.

is_key

Deprecated and will be removed in 2014

This function returns true if the value it is passed looks more like an index to a collection than a value of a collection. Similar to the looks_like_number internal function, but weird. Avoid.

get_magic

Do not use in production code

Pass it a scalar, and get the magick wand (mg_obj) used by the weak set implementation. The return will be a list of integers which are pointers to the actual ISET structure. Whatever you do, don't change the array :). This is used only by the test suite, and if you find it useful for something then you should probably conjure up a test suite and send it to me, otherwise it could get pulled.

CLASS METHODS

These class methods are probably only interesting to those sub-classing Set::Object.

strong_pkg

When a set that was already weak is strengthened using ->strengthen, it gets re-blessed into this package.

weak_pkg

When a set that was NOT already weak is weakened using ->weaken, it gets re-blessed into this package.

tie_array_pkg

When the object is accessed as an array, tie the array into this package.

tie_hash_pkg

When the object is accessed as a hash, tie the hash into this package.

SERIALIZATION

It is possible to serialize Set::Object objects via Storable and duplicate via dclone; such support was added in release 1.04. As of Set::Object version 1.15, it is possible to freeze scalar items, too.

However, the support for freezing scalar items introduced a backwards incompatibility. Versions earlier than 1.15 will thaw sets frozen using Set::Object 1.15 and later as a set with one item: an array that contains the actual members.

Additionally, version 1.15 had a bug that meant that it would not detect freeze protocol upgrades, instead reverting to pre-1.15 behaviour.

Set::Object 1.16 and above are capable of dealing correctly with all serialized forms, as well as correctly aborting if a "newer" freeze protocol is detected during thaw.
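
A minimal sketch of cloning and round-tripping a set through Storable, assuming Set::Object 1.16 or later and a set containing only scalar members:

  use strict;
  use warnings;
  use Set::Object qw(set);
  use Storable qw(dclone freeze thaw);

  my $set = set(qw(a b c));

  # dclone produces an independent copy with the same members.
  my $copy = dclone($set);
  print "equal copy\n" if $copy == $set;

  $copy->insert("d");                 # does not affect the original
  print $set->size, "\n";             # 3

  # freeze and thaw go through the same serialization hooks.
  my $restored = thaw(freeze($set));
  print $restored->size, "\n";        # 3

If the set holds references, the clone contains deep copies of them, so the clone would no longer compare equal to the original.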

PERFORMANCE

The following benchmark compares Set::Object with using a hash to emulate a set-like collection (this is an old benchmark, but still holds true):

   use Set::Object;

   package Obj;
   sub new { bless { } }

   @els = map { Obj->new() } 1..1000;

   require Benchmark;

   Benchmark::timethese(100, {
      'Control' => sub { },
      'H insert' => sub { my %h = (); @h{@els} = @els; },
      'S insert' => sub { my $s = Set::Object->new(); $s->insert(@els) },
      } );

   %gh = ();
   @gh{@els} = @els;

   $gs = Set::Object->new(@els);
   $el = $els[33];

   Benchmark::timethese(100_000, {
           'H lookup' => sub { exists $gh{33} },
           'S lookup' => sub { $gs->includes($el) }
      } );

On my computer the results are:

   Benchmark: timing 100 iterations of Control, H insert, S insert...
      Control:  0 secs ( 0.01 usr  0.00 sys =  0.01 cpu)
               (warning: too few iterations for a reliable count)
     H insert: 68 secs (67.81 usr  0.00 sys = 67.81 cpu)
     S insert:  9 secs ( 8.81 usr  0.00 sys =  8.81 cpu)
   Benchmark: timing 100000 iterations of H lookup, S lookup...
     H lookup:  7 secs ( 7.14 usr  0.00 sys =  7.14 cpu)
     S lookup:  6 secs ( 5.94 usr  0.00 sys =  5.94 cpu)

This benchmark compares the unsorted members method against the sorted list returned by @$set:

   perl -MBenchmark -mList::Util -mSet::Object -e'
   $set = Set::Object::set (List::Util::shuffle(1..1000));
   Benchmark::timethese(-3, {
      "Slow \@\$set       " => sub { $i++ for @$set; },
      "Fast set->members" => sub { $i++ for $set->members(); },
      });'

    Benchmark: running Fast set->members, Slow @$set        for at least 3 CPU seconds...
    Fast set->members:  4 wallclock secs ( 3.17 usr +  0.00 sys =  3.17 CPU) @ 9104.42/s (n=28861)
    Slow @$set       :  4 wallclock secs ( 3.23 usr +  0.00 sys =  3.23 CPU) @ 1689.16/s (n=5456)

THREAD SAFETY

This module is not thread-safe.

AUTHOR

Original Set::Object module by Jean-Louis Leroy, <jll@skynet.be>

Set::Scalar compatibility, XS debugging, weak references support courtesy of Sam Vilain, <samv@cpan.org>.

New maintainer is Reini Urban <rurban@cpan.org>. Patches against https://github.com/rurban/Set-Object/ please. Tickets at RT https://rt.cpan.org/Public/Dist/Display.html?Name=Set-Object

LICENCE

Copyright (c) 1998-1999, Jean-Louis Leroy. All Rights Reserved. This module is free software. It may be used, redistributed and/or modified under the terms of the Perl Artistic License, either the original, or at your option, any later version.

Portions Copyright (c) 2003 - 2005, Sam Vilain. Same license.

Portions Copyright (c) 2006, 2007, Catalyst IT (NZ) Limited. This module is free software. It may be used, redistributed and/or modified under the terms of the Perl Artistic License.

Portions Copyright (c) 2013, cPanel. Same license. Portions Copyright (c) 2020, Reini Urban. Same license.

SEE ALSO

perl(1), perltie(1), Set::Scalar, overload