Algorithm::Kmeanspp - Perl implementation of K-means++
    use Algorithm::Kmeanspp;

    # input documents
    my %documents = (
        Alex => { 'Pop'     => 10, 'R&B'   => 6, 'Rock'   => 4 },
        Bob  => { 'Jazz'    => 8,  'Reggae' => 9 },
        Dave => { 'Classic' => 4,  'World' => 4 },
        Ted  => { 'Jazz'    => 9,  'Metal' => 2, 'Reggae' => 6 },
        Fred => { 'Hip-hop' => 3,  'Rock'  => 3, 'Pop'    => 3 },
        Sam  => { 'Classic' => 8,  'Rock'  => 1 },
    );

    my $kmp = Algorithm::Kmeanspp->new;
    foreach my $id (keys %documents) {
        $kmp->add_document($id, $documents{$id});
    }

    my $num_cluster = 3;
    my $num_iter    = 20;
    $kmp->do_clustering($num_cluster, $num_iter);

    # show clustering result
    foreach my $cluster (@{ $kmp->clusters }) {
        print join "\t", @{ $cluster };
        print "\n";
    }

    # show cluster centroids
    foreach my $centroid (@{ $kmp->centroids }) {
        print join "\t", map { sprintf "%s:%.4f", $_, $centroid->{$_} }
            keys %{ $centroid };
        print "\n";
    }
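Algorithm::Kmeanspp is a Perl implementation of K-means++, a variant of k-means clustering that chooses its initial cluster centers by distance-weighted random sampling rather than uniformly at random.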
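To make the seeding idea concrete, here is a minimal sketch of K-means++ initialization over sparse vectors stored as hash references. It is not taken from the module's source: the function names sq_dist and choose_seeds are hypothetical, and the use of squared Euclidean distance is an assumption (the module may well use a different measure internally).

```perl
use strict;
use warnings;
use List::Util qw(min sum);

# Squared Euclidean distance between two sparse vectors (hash refs).
# Assumption: the real module may use another distance or similarity.
sub sq_dist {
    my ($u, $v) = @_;
    my %keys = map { $_ => 1 } keys %$u, keys %$v;
    return sum(0, map { (($u->{$_} // 0) - ($v->{$_} // 0)) ** 2 } keys %keys);
}

# Hypothetical helper: pick up to $k seed ids from %$docs, K-means++ style.
sub choose_seeds {
    my ($docs, $k) = @_;
    my @ids   = sort keys %$docs;
    my @seeds = ($ids[ int rand @ids ]);    # first seed: uniform at random

    while (@seeds < $k) {
        # D(x)^2: squared distance from each document to its nearest seed
        my %d2 = map {
            my $id = $_;
            $id => min(map { sq_dist($docs->{$id}, $docs->{$_}) } @seeds);
        } @ids;

        my $total = sum(0, values %d2);
        last if $total == 0;    # every document coincides with some seed

        # Sample one id with probability proportional to D(x)^2.
        my $r = rand($total);
        my $pick;
        for my $id (@ids) {
            next if $d2{$id} == 0;    # already a seed (or identical to one)
            $pick = $id;
            $r -= $d2{$id};
            last if $r < 0;
        }
        push @seeds, $pick;
    }
    return @seeds;
}
```

Documents far from every existing seed are more likely to be chosen next, which tends to spread the initial centers out before the usual k-means iterations begin.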
Algorithm::Kmeanspp->new creates a new instance.
add_document($id, $vector) adds an input document to the Algorithm::Kmeanspp instance. $id is the identifier of the document, and $vector is its feature vector. $vector must be a hash reference: each key identifies a feature, and each value is the degree (weight) of that feature in the document.
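As an illustration of the expected $vector shape, the sketch below builds a feature vector by counting tags. The helper tags_to_vector is hypothetical and not part of the module; raw counts are just one possible choice of feature weight.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each tag occurs and return the
# counts as a hash reference (feature => weight).
sub tags_to_vector {
    my @tags = @_;
    my %vector;
    $vector{$_}++ for @tags;
    return \%vector;
}

my $vector = tags_to_vector(qw(Rock Pop Rock Jazz Rock));
# $vector now holds { Rock => 3, Pop => 1, Jazz => 1 }

# With the module installed, the vector could then be registered with:
#   $kmp->add_document('Alex', $vector);
```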
do_clustering($num_cluster, $num_iter) clusters the input documents. $num_cluster specifies the number of output clusters, and $num_iter specifies the number of clustering iterations.
clusters is the accessor for the clustering result. It returns an array reference in which each item is itself an array reference listing the identifiers of the input documents in one cluster.
    # format of output clusters
    [
        [ document_id1, document_id2, ... ],  # cluster-1
        [ document_id3, document_id4, ... ],  # cluster-2
        ...
    ]
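A common follow-up step is to look up which cluster a given document landed in. The sketch below inverts the nested array structure into a hash. The sample data is made up to match the documented format and is not actual output from the module; %cluster_of is a hypothetical name.

```perl
use strict;
use warnings;

# Made-up sample in the documented format.
# With the module: my $clusters = $kmp->clusters;
my $clusters = [
    [ 'Alex', 'Fred' ],
    [ 'Bob',  'Ted'  ],
    [ 'Dave', 'Sam'  ],
];

# Map each document id to the index of the cluster that contains it.
my %cluster_of;
for my $i (0 .. $#{$clusters}) {
    $cluster_of{$_} = $i for @{ $clusters->[$i] };
}

printf "%s => cluster %d\n", $_, $cluster_of{$_} for sort keys %cluster_of;
```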
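centroids is the accessor for the vectors of the cluster centroids. It returns an array reference with one hash reference per cluster, mapping each feature to its value in that centroid.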
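Centroid vectors are often easier to read when reduced to their strongest features. The sketch below assumes each centroid is a hash reference of feature => value, as the synopsis suggests. The helper top_features is hypothetical, and the numbers are invented for illustration rather than real module output.

```perl
use strict;
use warnings;

# Invented sample data. With the module: my $centroids = $kmp->centroids;
my $centroids = [
    { Pop  => 0.72, Rock   => 0.55, 'R&B' => 0.42 },
    { Jazz => 0.70, Reggae => 0.68, Metal => 0.10 },
];

# Hypothetical helper: return the $n highest-valued feature names of a
# centroid, breaking ties alphabetically.
sub top_features {
    my ($centroid, $n) = @_;
    my @sorted = sort { $centroid->{$b} <=> $centroid->{$a} || $a cmp $b }
        keys %$centroid;
    $n = @sorted if $n > @sorted;
    return @sorted[ 0 .. $n - 1 ];
}

for my $i (0 .. $#{$centroids}) {
    print "cluster $i: ", join(', ', top_features($centroids->[$i], 2)), "\n";
}
```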
Mizuki Fujisawa <fujisawa@bayon.cc>
http://en.wikipedia.org/wiki/K-means%2B%2B
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
To install Algorithm::Kmeanspp, copy and paste the appropriate command into your terminal.
cpanm

    cpanm Algorithm::Kmeanspp

CPAN shell

    perl -MCPAN -e shell
    install Algorithm::Kmeanspp
For more information on module installation, please visit the detailed CPAN module installation guide.