NAME

Data::Sofu - Perl extension for Sofu data

SYNOPSIS

        use Data::Sofu;
        %hash=readSofu("file.sofu");
        ...
        writeSofu("file.sofu",\%hash);
        

Or a little more complex:

        use Data::Sofu qw/readSofu writeSofu getSofucomments packSofu unpackSofu/;
        %hash=readSofu("file.sofu");
        $comments=getSofucomments;
        # Open the handle with an explicit encoding layer before writing
        open my $fh,">:encoding(UTF-16LE)","file.sofu";
        writeSofu($fh,\%hash,$comments);
        close $fh;
        $texta=packSofu($arrayref);
        $texth=packSofu($hashref);
        $arrayref=unpackSofu($texta);
        $hashref=unpackSofu($texth);

SYNOPSIS - OO-STYLE

        require Data::Sofu;
        my $sofu=Data::Sofu->new;
        %hash=$sofu->read("file.sofu");
        $comments=$sofu->comments;
        $sofu->write("file.sofu",$hashref);
        # Open the handle with an explicit encoding layer before writing
        open my $fh,">:encoding(UTF-16LE)","file.sofu";
        $sofu->write($fh,$hashref,$comments);
        close $fh;
        $texta=$sofu->pack($arrayref);
        $texth=$sofu->pack($hashref);
        $arrayref=$sofu->unpack($texta);
        $hashref=$sofu->unpack($texth);

DESCRIPTION

This module reads and writes Sofu files of versions 0.1 and 0.2. Visit http://sofu.sf.net for a description of Sofu.

It can also read Sofu files that are not quite well-formed and correct their errors.

Additionally, it can pack hashes and arrays into Sofu strings and unpack them again.

The comments in a Sofu file can be preserved if they are saved with $sofu->comment or getSofucomments, or if loadFile/load is used.

It also provides a compatibility layer for sofud via Data::Sofu::Object and Data::Sofu->loadFile().

Data::Sofu::Binary provides an experimental interface to Binary Sofu (.bsofu) files and streams.

SYNTAX

This module can be called using either object-oriented notation or the functional interface.

Some features are only available when using OO.

FUNCTIONS

getSofucomments()

Gets the comments of the last file read

writeSofu(FILE,DATA,[COMMENTS])

Writes a sofu file with the name FILE.

FILE can be:

A reference to a filehandle with the right encoding set or

a filename or

a reference to a scalar (the data will be written to that scalar)

An existing file of this name will be overwritten.

DATA can be a scalar, a hashref or an arrayref.

The top element of sofu files must be a hash, so any other datatype is converted to {Value=>DATA}.

        @a=(1,2,3);
        writeSofu("Test.sofu",\@a);
        %data=readSofu("Test.sofu");
        @a=@{$data{Value}}; # (1,2,3)
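The wrapping rule can be sketched in plain Perl. The helper wrap_top_level below is hypothetical and is not part of Data::Sofu; it only mimics the rule described above, on the assumption that hash references pass through unchanged and everything else ends up under the key Value.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Data::Sofu: mimics the documented rule
# that a non-hash top-level value is stored under the key "Value".
sub wrap_top_level {
    my ($data) = @_;
    return ref($data) eq 'HASH' ? $data : { Value => $data };
}

my $from_array  = wrap_top_level([1, 2, 3]);
my $from_scalar = wrap_top_level("hello");
my $from_hash   = wrap_top_level({ Foo => "Bar" });

print join(",", @{ $from_array->{Value} }), "\n";   # the array now sits under Value
print $from_scalar->{Value}, "\n";                  # so does a plain scalar
print join(",", sort keys %$from_hash), "\n";       # a hash is left as it is
```

This is why the value has to be fetched from the Value key again after reading the file back.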

COMMENTS is a reference to a hash with comments, like the one returned by comments().

readSofu(FILE)

Reads the sofu file FILE and returns a hash with the data.

FILE can be:

A reference to a filehandle with the right encoding set or

a filename or

a reference to a scalar (Data will be read from a scalar)
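In Perl terms, the three forms of FILE can be told apart with ref(). The helper file_arg_kind below is hypothetical and is not a Data::Sofu function; it is only a sketch of what each form looks like from the caller's side, and says nothing about how the module itself inspects its argument.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT a Data::Sofu function: reports which of the
# three documented FILE forms an argument is.
sub file_arg_kind {
    my ($file) = @_;
    return 'filehandle'       if ref($file) eq 'GLOB';
    return 'scalar reference' if ref($file) eq 'SCALAR';
    return 'filename'         if !ref($file);
    return 'unsupported';
}

my $text = "Foo = \"Bar\"\n";

# Perl can open a scalar reference as an in-memory file.
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @kinds = (
    file_arg_kind($fh),          # lexical filehandle
    file_arg_kind(\*STDIN),      # reference to a bareword filehandle
    file_arg_kind(\$text),       # reference to a scalar holding the data
    file_arg_kind("file.sofu"),  # plain filename
);
print "$_\n" for @kinds;
close $fh;
```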

These functions are not exported by default:

loadSofu(FILE)

Reads a .sofu file and converts it to Sofud compatible objects

FILE can be:

A reference to a filehandle with the right encoding set or

a filename or

a reference to a scalar (Data will be read from a scalar)

Returns a Data::Sofu::Object

getSofu(HASHREF)

Converts a hashref (like returned from readSofu) to Sofud compatible objects.

Returns a Data::Sofu::Object

packSofu(DATA,[COMMENTS])

Packs DATA to a sofu string.

DATA can be a scalar, a hashref or an arrayref.

This differs from a normal write(): the lines are NOT indented and the topmost element is enclosed in brackets. (This does not conform to Sofu 0.2; please use write(\$scalar,$data) instead.)

COMMENTS is a reference to a hash with comments, like the one returned by comments().

packBinarySofu(DATA,[COMMENTS])

Same as packSofu(DATA,[COMMENTS]) but the output is binary.

packSofuBinary(DATA,[COMMENTS])

Same as packSofu(DATA,[COMMENTS]) but the output is binary.

unpackSofu(SOFU STRING)

This function unpacks SOFU STRING and returns a scalar, which can be either a string or a reference to a hash or a reference to an array.

It can read Sofu and SofuML data, but not binary Sofu.

Note that packed data can also be read with readSofu(\$packed):

        my $packed = packSofu($tree,$comments);
        my $tree2 = unpackSofu($packed);
        my $tree3 = readSofu(\$packed); 
        # $tree2 has the same data as $tree3 (and $tree of course)

writeSofuBinary(FILE, DATA, [Comments, [Encoding, [ByteOrder, [SofuMark]]]])

Writes the Data as a binary file.

FILE can be:

A reference to a filehandle with raw encoding set or

a filename or

a reference to a scalar (the data will be written to that scalar)

DATA has to be a reference to a Hash or Data::Sofu::Object

COMMENTS is a reference to a hash with comments, like the one returned by comments().

More information on the other parameters can be found in Data::Sofu::Binary.

To write other data structures, wrap them in a hash:

        writeSofuBinary("1.sofu",{Value=>$data});

writeBinarySofu(FILE, DATA, [Comments, [Encoding, [ByteOrder, [SofuMark]]]])

Same as writeSofuBinary()

writeSofuML(FILE, DATA, [COMMENTS,[HEADER]])

Writes the Data as an XML file (for postprocessing with XSLT or CSS)

FILE can be:

A reference to a filehandle with some encoding set or

a filename or

a reference to a scalar (the data will be written to that scalar)

DATA has to be a reference to a Hash or Data::Sofu::Object

COMMENTS is a reference to a hash with comments, like the one returned by comments(). It is only used when DATA is not a Data::Sofu::Object.

HEADER can be a custom file header (defaults to qq(<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n<!DOCTYPE Sofu SYSTEM "http://sofu.sf.net/Sofu.dtd">\n) ).

Default output (when given a filename) is UTF-8.

packSofuML(DATA, [COMMENTS, [HEADER]])

Returns DATA as an XML string (for postprocessing with XSLT or CSS) without indentation.

DATA has to be a reference to a Hash or Data::Sofu::Object

COMMENTS is a reference to a hash with comments, like the one returned by comments(). It is only used when DATA is not a Data::Sofu::Object.

HEADER can be a custom file header (defaults to qq(<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n<!DOCTYPE Sofu SYSTEM "http://sofu.sf.net/Sofu.dtd">\n) ).

These two calls are not (quite) the same:

        use Data::Sofu qw/packSofuML writeSofuML/;
        $string = packSofuML($tree,$comments); # Will not indent.
        writeSofuML(\$string,$tree,$comments); # Will indent.

CLASS-METHODS

loadFile(FILE)

Reads a .sofu file and converts it to Sofud compatible objects.

FILE can be:

A reference to a filehandle with the right encoding set or

a filename or

a reference to a scalar (Data will be read from a scalar)

Returns a Data::Sofu::Object

        my $tree=Data::Sofu->loadFile("1.sofu");
        print $tree->list("Foo")->value(5);
        $tree->list("Foo")->appendElement(new Data::Sofu::Value(8));
        $tree->write("2.sofu");

METHODS (OO)

new()

Creates a new Data::Sofu object.

setIndent(INDENT)

Sets the indent to INDENT. Default indent is "\t".

setWarnings( 1/0 )

Enables/Disables sofu syntax warnings.

comments()

Gets/sets the comments of the last file read

write(FILE,DATA,[COMMENTS])

Writes a sofu file with the name FILE.

FILE can be:

A reference to a filehandle with the right encoding set or

a filename or

a reference to a scalar (the data will be written to that scalar)

An existing file of this name will be overwritten.

DATA can be a scalar, a hashref or an arrayref.

The top element of sofu files must be a hash, so any other datatype is converted to {Value=>DATA}.

        @a=(1,2,3);
        $sofu->write("Test.sofu",\@a);
        %data=$sofu->read("Test.sofu");
        @a=@{$data{Value}}; # (1,2,3)

COMMENTS is a reference to a hash with comments, like the one returned by comments().

read(FILE)

Reads the sofu file FILE and returns a hash with the data.

FILE can be:

A reference to a filehandle with the right encoding set or

a filename or

a reference to a scalar (Data will be read from a scalar)

pack(DATA,[COMMENTS])

Packs DATA to a sofu string.

DATA can be a scalar, a hashref or an arrayref.

COMMENTS is a reference to a hash with comments, like the one returned by comments().

This differs from a normal write(): the lines are NOT indented and the topmost element is enclosed in brackets. (This does not conform to Sofu 0.2; please use write(\$scalar,$data) instead.)

packBinary(DATA,[COMMENTS])

Same as pack(DATA,[COMMENTS]), but output is binary.

unpack(SOFU STRING)

This function unpacks SOFU STRING and returns a scalar, which can be either a string or a reference to a hash or a reference to an array.

load(FILE)

Reads a .sofu file and converts it to Sofud compatible objects

FILE can be:

A reference to a filehandle with the right encoding set or

a filename or

a reference to a scalar (Data will be read from a scalar)

Returns a Data::Sofu::Object

toObjects(DATA, [COMMENTS])

Builds a Sofu Object Tree from a perl data structure

DATA can be a scalar, a hashref or an arrayref.

COMMENTS is a reference to a hash with comments, like the one returned by comments().

Returns a Data::Sofu::Object

writeBinary(FILE, DATA, [Comments, [Encoding, [ByteOrder, [SofuMark]]]])

Writes the Data as a binary file.

FILE can be:

A reference to a filehandle with raw encoding set or

a filename or

a reference to a scalar (the data will be written to that scalar)

DATA has to be a reference to a Hash or Data::Sofu::Object

COMMENTS is a reference to a hash with comments, like the one returned by comments().

More information on the other parameters can be found in Data::Sofu::Binary.

To write other data structures, wrap them in a hash:

        $sofu->writeBinary("1.sofu",{Value=>$data});

writeML(FILE, DATA, [COMMENTS,[HEADER]])

Writes the Data as an XML file (for postprocessing with XSLT or CSS)

FILE can be:

A reference to a filehandle with some encoding set or

a filename or

a reference to a scalar (the data will be written to that scalar)

DATA has to be a reference to a Hash or Data::Sofu::Object

COMMENTS is a reference to a hash with comments, like the one returned by comments(). It is only used when DATA is not a Data::Sofu::Object.

HEADER can be a custom file header (defaults to qq(<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n<!DOCTYPE Sofu SYSTEM "http://sofu.sf.net/Sofu.dtd">\n) ).

Default output (when given a filename) is UTF-8.

packML(DATA, [COMMENTS, [HEADER]])

Returns DATA as an XML string (for postprocessing with XSLT or CSS) without indentation.

DATA has to be a reference to a Hash or Data::Sofu::Object

COMMENTS is a reference to a hash with comments, like the one returned by comments(). It is only used when DATA is not a Data::Sofu::Object.

HEADER can be a custom file header (defaults to qq(<?xml version="1.0" encoding="UTF-8" standalone="no"?>\n<!DOCTYPE Sofu SYSTEM "http://sofu.sf.net/Sofu.dtd">\n) ).

These two calls are not (quite) the same:

        $string = $sofu->packML($tree,$comments); # Will not indent.
        $sofu->writeML(\$string,$tree,$comments); # Will indent.

INTERNAL METHODS

Sofuescape

Escapes a value for Sofu

Sofukeyescape

Escapes a sofu key

Sofukeyunescape

Inversion of Sofukeyescape().

SofuloadFile

Same as loadSofu().

allWarn

Turns on all warnings

comment

like comments()

commentary

This is used to print the comments into the file

escape

Method that calls Sofuescape()

deescape

When parsing a file this one tries to filter out references and deescape sofu strings.

get

Gets the next char from the input or the buffer. Also takes care of comments.

getSingleValue

Tries to parse a single value or list

getSofuComments

Same as getSofucomments().

iDontKnowWhatIAmDoing()

Turns on warnings.

iKnowWhatIAmDoing()

Turns off warnings.

warn()

Turns on warnings.

noWarn()

Turns off warnings.

keyescape()

Same as Sofukeyescape only as a method.

keyunescape()

Same as Sofukeyunescape only as a method.

noComments()

Discards all commentary from the file while reading.

object([0/1])

Enables/disables the object parser (done by readSofu and loadSofu)

parsList()

Reads a Sofu list from the input buffer.

parsMap()

Reads a Sofu map from the input buffer.

parsValue()

Reads a Sofu value / Sofu 0.1 list from the input buffer.

postprocess()

Corrects references and puts comments into the objects (if load/loadSofu is used)

refe()

Tests if the input is a reference.

storeComment()

Stores a comment into the database while reading a sofu file.

wasbinary()

True when the read file was binary.

writeList()

Used to pack/write a sofu list.

writeMap()

Used to pack/write a sofu map.

CHANGES

Keys are now automatically escaped according to the new sofu specification.

References that are used more than once are now converted to Sofu references.

read, load, readSofu, loadSofu and Data::Sofu::loadFile now detect binary Sofu (and load Data::Sofu::Binary).

read, load, readSofu, loadSofu, Data::Sofu::loadFile, unpackSofu and unpack detect SofuML (and load Data::Sofu::SofuML).

BUGS

Comments written after an object will be rewritten at the top of an object:

        foo = { # Comment1
                Bar = "Baz"
        } # Comment2

becomes:

        foo = { # Comment1
        # Comment2
                Bar = "Baz"
        } 

NOTE on Unicode

Sofu files are normally written in a Unicode format. Data::Sofu tries to guess which format to read (this usually works, thanks to Encode::Guess).

The output, on the other hand, defaults to UTF-16 (UNIX) (like SofuD). If you need another encoding, you have to prepare the filehandle yourself and pass it to the write() functions:

        open my $fh,">:encoding(latin1)","out.sofu";
        writeSofu($fh,$data);
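To see what "preparing the filehandle" changes, the following sketch writes the same text through handles opened with different encoding layers and prints the resulting bytes. It uses only core Perl and an in-memory buffer instead of writeSofu, so it shows the effect of the layers rather than anything specific to Data::Sofu; the helper name bytes_written is made up for this illustration.

```perl
use strict;
use warnings;

# Illustrative helper: writes $text through a handle opened with $layers
# into an in-memory buffer and returns the bytes as a hex string.
sub bytes_written {
    my ($layers, $text) = @_;
    my $buffer = '';
    open my $fh, ">$layers", \$buffer or die "open failed: $!";
    print {$fh} $text;
    close $fh or die "close failed: $!";
    return unpack 'H*', $buffer;
}

my $text = "caf\x{E9}";   # "cafe" with an acute accent on the e

my $latin1  = bytes_written(':raw:encoding(latin1)',   $text);
my $utf8    = bytes_written(':raw:encoding(UTF-8)',    $text);
my $utf16le = bytes_written(':raw:encoding(UTF-16LE)', $text);

print "latin1:   $latin1\n";    # one byte per character
print "UTF-8:    $utf8\n";      # the accented character takes two bytes
print "UTF-16LE: $utf16le\n";   # two bytes per character, low byte first
```

A handle opened with one of these layers on a real file can then be passed to writeSofu as shown above.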

Warning: UTF-32BE is not supported without a BOM (it looks too much like binary).

Notes:

As for encodings under Windows, you should always have :raw as the first layer, but to make the files compatible with Windows programs you will have to use some special tricks:

        open my $fh,">:raw:encoding(UTF-16):crlf:utf8","out.sofu"; #Write Windows UTF-16 files
        open my $fh,">:raw:encoding(UTF-16)","out.sofu"; #Write Unix UTF-16 files
        #Same goes for UTF-32
        
        #UTF-8: Don't use :utf8 or :raw:utf8 alone here,
        #Perl has a different understanding of utf8 and UTF-8 (utf8 allows some errors).
        open my $fh,">:raw:encoding(UTF-8)","out.sofu"; #Unix style UTF-8
        open my $fh,">:raw:encoding(UTF-8):crlf:utf8","out.sofu"; #Windows style UTF-8

        #And right after open():
        print $fh chr(65279); #Print the Byte Order Mark (some programs want it, some programs die on it...)
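The difference between utf8 and UTF-8 mentioned in the comments above can be checked with the core Encode module, which resolves the two names to different encodings. This is a small sketch using only Encode and does not involve Data::Sofu.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# "UTF-8" is the strict encoding, "utf8" is Perl's more permissive variant.
my $strict = find_encoding('UTF-8')->name;
my $lax    = find_encoding('utf8')->name;

print "UTF-8 resolves to: $strict\n";
print "utf8  resolves to: $lax\n";
```

Because the two names map to different encodings, :encoding(UTF-8) is the layer to use when the output has to be valid UTF-8 for other programs.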
        

One last thing:

        open my $out,">:raw:encoding(UTF-16BE):crlf:utf8","out.sofu";
        print $out chr(65279); #Byte Order Mark
        #Now you can write out UTF-16 with BOM in big-endian (even if your machine is little-endian)
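The Byte Order Mark is always the same code point, chr(65279) or U+FEFF; only its byte representation depends on the encoding of the handle. The sketch below uses the core Encode module to show the bytes that end up at the start of the file for three encodings.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $bom = chr(0xFEFF);   # the same code point as chr(65279)

# Encode the single BOM character and show the bytes as hex.
my $utf8_bom    = unpack 'H*', encode('UTF-8',    $bom);
my $utf16be_bom = unpack 'H*', encode('UTF-16BE', $bom);
my $utf16le_bom = unpack 'H*', encode('UTF-16LE', $bom);

print "UTF-8:    $utf8_bom\n";
print "UTF-16BE: $utf16be_bom\n";
print "UTF-16LE: $utf16le_bom\n";
```

A reader that sees fe ff at the start of a file can conclude it is big-endian UTF-16, which is what makes the explicit UTF-16BE layer plus a manually printed BOM work regardless of the machine's own byte order.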

SEE ALSO

perl(1), http://sofu.sf.net

For Sofud compatible Object Notation: Data::Sofu::Object

For Sofu Binary: Data::Sofu::Binary

For SofuML: Data::Sofu::SofuML