The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

orgsel - Select Org document elements using CSel (CSS-selector-like) syntax

VERSION

This document describes version 0.014 of orgsel (from Perl distribution App-orgsel), released on 2021-07-03.

SYNOPSIS

Examples using todo.org

Example document todo.org (you can get this from the distribution shared file):

 #+TODO: TODO PARTIAL INPROGRESS WAITING PENDING | DONE OLD CANCELLED RETIRED DELEGATED FAILED DUPE
 #+TODO: BUG | NOTBUG FIXED CANTREPRO WONTFIX CANTFIX
 #+TODO: IDEA WISHLIST | CANCELLED REJECTED
 #+TODO: POTENTIAL | CANTUSE
 
 * proj > perl [0/1]
 ** TODO [2021-06-26 Sat] SQLite::KeyValueStore::Simple: add delete function
 ** BUG [2021-06-21 Mon] Perinci::CmdLine: unwanted removal of quotes?
 example:
 
 : ... | csv-map -H --eval '"FOO"'
 
 when $args{eval} is received by the function, if becomes ~FOO~ and not ~"FOO"~.
 in other words, the extra quote is stripped.
 ** BUG [2021-06-21 Mon] Config::IOD: set_value() eats comments
 - log ::
   + [2021-06-22] this is because currently "raw value" is set, not "value".
   + [2021-06-21] entry
 ** WISHLIST [2021-06-18 Fri] Acme::CPANModules: category
 we can already do this with Rinci, using 'category:' tags. we just need the
 figure out how to encode category when we are writing module name in markdown
 description ("<pm:Foo::Bar>").
 ** IDEA [2021-06-17 Thu] [#C] perl module: Software::Catalog::SW::rakudo::moar
 note: we need at least rakudo, perl6, and zef to install to /usr/local/bin.
 * proj > perl > csvutils [0/2]
 ** TODO [2020-05-26 Tue] create a more generic version of csv-grep or csv-map
 allow perl code specified in cli to modify or filter table rows. able to operate
 on homs (hash of maybe-strings) or aoms (array of maybe-strings).
 
 this can be applied for other cli like firefox-mua-delete-containers or
 firefox-mua-modify-containers, example:
 
 to change all colors to 'red' dan all icons to 'fingerprint':
 
 : % firefox-mua-modify-containers -e '$_->{color} = "red"; $_->{icon} = "fingerprint"'; # of course there will be a --dry-run option
 
 to delete all containers containing /test/ in their name:
 
 : % firefox-mua-delete-containers -e '$_->{name} =~ /test/'; # of course there will be a --dry-run option
 
 - log ::
   + [2020-06-03 Wed] done on firefox-mua-modify-containers (can delete items as
     well by the code returning false/undef)
 ** TODO [2020-05-29 Fri] [#C] csvutils: make csv-grep-fields, a more flexible form of csv-select-fields
 ** DONE [2020-05-29 Fri] csvutils: make csv-transpose (like 'td transpose')
 - log ::
   + [2020-08-16 Sun] done

To get a picture of the structure, you can use dump-org-structure, for example dump-org-structure todo.org will give:

 Document:
   Setting: "#+TODO: TODO PARTIAL INPROGRESS WAITING PENDING..."
   Setting: "#+TODO: BUG | NOTBUG FIXED CANTREPRO WONTFIX..."
   Setting: "#+TODO: IDEA WISHLIST | CANCELLED REJECTED\n"
   Setting: "#+TODO: POTENTIAL | CANTUSE\n"
   Text: "\n"
   Headline: l=1 prog=0/1
     (title)
     Text: "proj > perl "
     (children)
     Headline: l=2 todo=TODO "** TODO [2021-06-26 Sat] SQLite::KeyValueStore:..."
       (title)
       Text:
         Timestamp: 2021-06-26T00:00:00F "[2021-06-26 Sat]"
         Text: " SQLite::KeyValueStore::Simple: add delete..."
     Headline: l=2 todo=BUG
       (title)
       Text:
         Timestamp: 2021-06-21T00:00:00F "[2021-06-21 Mon]"
         Text: " Perinci::CmdLine: unwanted removal of quotes?"
       (children)
       Text: "example:\n\n"
       FixedWidthSection: ": ... | csv-map -H --eval '\"FOO\"'\n"
       Text: "\nwhen $args{eval} is received by the function, ..."
       Text: V "~FOO~"
       Text: " and not "
       Text: V "~\"FOO\"~"
       Text: ".\nin other words, the extra quote is stripped.\n"
     Headline: l=2 todo=BUG
       (title)
       Text:
         Timestamp: 2021-06-21T00:00:00F "[2021-06-21 Mon]"
         Text: " Config::IOD: set_value() eats comments"
       (children)
       List: D(-) indent=0
         ListItem: -
           (description term)
           Text: "log"
           (children)
           Text: "\n"
         List: U(+) indent=2
           ListItem: +
             (children)
             Timestamp: 2021-06-22T00:00:00F "[2021-06-22]"
             Text: " this is because currently \"raw value\" is set,..."
           ListItem: +
             (children)
             Timestamp: 2021-06-21T00:00:00F "[2021-06-21]"
             Text: " entry\n"
     Headline: l=2 todo=WISHLIST
       (title)
       Text:
         Timestamp: 2021-06-18T00:00:00F "[2021-06-18 Fri]"
         Text: " Acme::CPANModules: category"
       (children)
       Text: "we can already do this with Rinci, using..."
     Headline: l=2 todo=IDEA prio=C
       (title)
       Text:
         Timestamp: 2021-06-17T00:00:00F "[2021-06-17 Thu]"
         Text: " perl module: Software::Catalog::SW::rakudo::moar"
       (children)
       Text: "note: we need at least rakudo, perl6, and zef..."
   Headline: l=1 prog=0/2
     (title)
     Text: "proj > perl > csvutils "
     (children)
     Headline: l=2 todo=TODO
       (title)
       Text:
         Timestamp: 2020-05-26T00:00:00F "[2020-05-26 Tue]"
         Text: " create a more generic version of csv-grep or..."
       (children)
       Text: "allow perl code specified in cli to modify or..."
       FixedWidthSection: ": % firefox-mua-modify-containers -e '$_->{colo..."
       Text: "\nto delete all containers containing "
       Text: I "/test/"
       Text: " in their name:\n\n"
       FixedWidthSection: ": % firefox-mua-delete-containers -e '$_->{name..."
       Text: "\n"
       List: D(-) indent=0
         ListItem: -
           (description term)
           Text: "log"
           (children)
           Text: "\n"
         List: U(+) indent=2
           ListItem: +
             (children)
             Timestamp: 2020-06-03T00:00:00F "[2020-06-03 Wed]"
             Text: " done on firefox-mua-modify-containers (can..."
     Headline: l=2 todo=TODO prio=C "** TODO [2020-05-29 Fri] [#C] csvutils: make..."
       (title)
       Text:
         Timestamp: 2020-05-29T00:00:00F "[2020-05-29 Fri]"
         Text: " csvutils: make csv-grep-fields, a more..."
     Headline: l=2 todo=DONE
       (title)
       Text:
         Timestamp: 2020-05-29T00:00:00F "[2020-05-29 Fri]"
         Text: " csvutils: make csv-transpose (like 'td..."
       (children)
       List: D(-) indent=0
         ListItem: -
           (description term)
           Text: "log"
           (children)
           Text: "\n"
         List: U(+) indent=2
           ListItem: +
             (children)
             Timestamp: 2020-08-16T00:00:00F "[2020-08-16 Sun]"
             Text: " done\n"

Now for some selecting examples on todo.org:

 # select the second-level headlines where we store the todo items (title-only)
 % orgsel todo.org 'Headline[level=2]' --node-action eval:'print $_->title->as_string,"\n"'
 [2021-06-26 Sat] SQLite::KeyValueStore::Simple: add delete function
 [2021-06-21 Mon] Perinci::CmdLine: unwanted removal of quotes?
 [2021-06-21 Mon] Config::IOD: set_value() eats comments
 [2021-06-18 Fri] Acme::CPANModules: category
 [2021-06-17 Thu] perl module: Software::Catalog::SW::rakudo::moar
 [2020-05-26 Tue] create a more generic version of csv-grep or csv-map
 [2020-05-29 Fri] csvutils: make csv-grep-fields, a more flexible form of csv-select-fields

 # when was a specific todo list last updated? (we look at log entries)
 % orgsel todo.org 'Headline[level=2][title.as_string =~ /set_value/] Timestamp:first'
 [2021-06-22]

 # how many bugs do we have?
 % orgsel todo.org 'Headline[level=2][todo_state="BUG"]' --count

 # how many bugs and other undone todo items?
 % orgsel todo.org 'Headline[level=2][is_todo is true][is_done is false]' --count

 # how many undone todo items for csvutils project?
 % orgsel todo.org 'Headline[level=1][title.as_string =~ /csvutils/] > Headline[level=2][is_todo is true][is_done is false]' --count

 # dump a particular todo item (helps visualize structure to select further)
 % orgsel todo.org 'Headline[level=2][title.as_string =~ /eats comments/]' | dump-org-structure

 # show the update log of the todo item (with the "- log ::"):
 % orgsel todo.org 'Headline[level=2][title.as_string =~ /eats comments/] > List > ListItem[desc_term.text = "log"]:parent'

 # show the update log of the todo item (without the "- log ::"):
 % orgsel todo.org 'Headline[level=2][title.as_string =~ /eats comments/] > List > ListItem[desc_term.text = "log"] + *'

 # select todo items which have been updated at least twice (by looking at
 # number of list items under "log")
 % orgsel todo.org 'Headline[level=2] > List > ListItem[desc_term.text = "log"] + *:has-min-children(2):parent:parent'

Examples using addressbook.org

Example document addressbook.org (you can get this from the distribution shared file):

 #+TODO: TODO | OLD
 
 * business > tax consultants
 * business > veterinarian
 ** budi chandra                                                         :vet:
 - born :: 1981?
 - notes ::
   + moslem
   + has two kids, the older is 10yo @2012
 - address :: jl anggrek no 123, bandung
 - opening hours :: every day 07-21, also on call 24h
 - phone :: 0855 555 1234
 - log ::
   + [2012-07-30 ] dr went to our house to vaccinate bonnie
   + [2010-04-21 ] went to his clinic, bought worm tablets
 ** deni setiawan                                                        :vet:
 - notes ::
   + day-care for dogs etc at his house, but there is concern about ticks
 - address :: jl mawar 456, bandung
 - phone :: 022 5551235
 - log ::
   + [2013-09-18 Rab] entry
 * family tree > dad's side
 ** budi roland                                                     :deceased:
 ** grace shanti
 - notes ::
   + father's first aunt
   + 85 yo @2021
 - log ::
   + [2021-06-26] entry
 * family tree > mom's side
 * zzz > old
 ** OLD rudi sanusi
 - phone/whatsapp :: 0855 555 9739
 - log ::
   + [2021-06-27 Sun] mark as old
 * zzz > unorganized
 ** agus (lisa's husband)
 ** agus putra
 ** susan muliawati
 ** susi
 ** susi (ron's friend)                                                  :vet:
 - log ::
   + [2021-04-03] met her again (the 3rd time i guess?) at ron's birthday party.
     she's married now.
   + [2019-01-02] entry

Now for some selecting examples on todo.org:

 # list contacts which have certain tag ("vet")
 % orgsel addressbook.org 'Headline[level=2][tags has "vet"]' --eval 'say $_->title->as_string'

 # count contacts matching /budi (and list them as well)
 % orgsel addressbook.org 'Headline[level=2][title.as_string =~ /budi/i]'  --eval 'say $_->title->as_string' --count

 # show notes about contact 'deni setiawan'
 % orgsel addressbook.org 'Headline[level=2][title.as_string = "deni setiawan"] > List > ListItem[desc_term.text = "notes"] + *'

DESCRIPTION

This utility allows you to select nodes from Org document on the command-line using CSel selector syntax.

Org is a plain-text document format for keeping notes, maintaining to-do lists, planning projects, authoring documents, and more. The specification and official documentation is at https://orgmode.org.

CSel is a pattern syntax to select various elements from a tree. It is modeled after CSS selector. The specification is at Data::CSel.

In orgsel, Org document is first parsed into a document tree using Org::Parser then selected with the given CSel expression. Types are Org::Element::* classes (without the prefix).

See examples in the Synopsis to get an idea on how to use orgsel.

Some tips:

  • To find out which attributes (methods) available for selecting using attribute selector, see documentation on Org::Element::*

OPTIONS

* marks required options.

Main options

--count

Shortcut for --node-action count.

See --node-action.

--dump

Shortcut for --node-action dump.

See --node-action.

--eval=s@

--eval E is shortcut for --action eval:E.

See --node-action.

Can be specified multiple times.

--expr=s

Can also be specified as the 2nd command-line argument.

--file=s

Default value:

 "-"

Can also be specified as the 1st command-line argument.

--node-action=s@

Specify action(s) to perform on matching nodes.

Default value:

 ["print_as_string"]

Each action can be one of the following:

* `count` will print the number of matching nodes.

* `print_method` will call on or more of the node object's methods and print the result. Example:

    print_method:as_string

* `dump` will show a indented text representation of the node and its descendants. Each line will print information about a single node: its class, followed by the value of one or more attributes. You can specify which attributes to use in a dot-separated syntax, e.g.:

    dump:tag.id.class

  which will result in a node printed like this:

    HTML::Element tag=p id=undef class=undef

By default, if no attributes are specified, `id` is used. If the node class does not support the attribute, or if the value of the attribute is undef, then `undef` is shown.

* `eval` will execute Perl code for each matching node. The Perl code will be called with arguments: `($node)`. For convenience, `$_` is also locally set to the matching node. Example in <prog:htmlsel> you can add this action:

    eval:'print $_->tag'

  which will print the tag name for each matching <pm:HTML::Element> node.

Can be specified multiple times.

--node-actions-json=s

Specify action(s) to perform on matching nodes (JSON-encoded).

See --node-action.

--node-actions-on-descendants=s

Specify how descendants should be actioned upon.

Default value:

 ""

Valid values:

 ["","descendants_depth_first"]

This option sets how node action is performed (See `node_actions` option).

When set to '' (the default), then only matching nodes are actioned upon.

When set to 'descendants_depth_first', then after each matching node is actioned upon by an action, the descendants of the matching node are also actioned, in depth-first order. This option is sometimes necessary e.g. when your node's `as_string()` method shows a node's string representation that does not include its descendants.

--print

Shortcut for --node-action print_as_string.

See --node-action.

--print-method=s@

--print-method M is shortcut for --node-action print_method:M.

See --node-action.

Can be specified multiple times.

--root

Shortcut for --select-action=root.

See --select-action.

--select-action=s

Specify how we should select nodes.

Default value:

 "csel"

Valid values:

 ["csel","root"]

The default is `csel`, which will select nodes from the tree using the CSel expression. Note that the root node itself is not included. For more details on CSel expression, refer to <pm:Data::CSel>.

`root` will return a single node which is the root node.

-R

Shortcut for --node-action-on-descendants=descendants_depth_first.

See --node-actions-on-descendants.

Output options

--format=s

Choose output format, e.g. json, text.

Default value:

 undef
--json

Set output format to json.

--naked-res

When outputing as JSON, strip result envelope.

Default value:

 0

By default, when outputing as JSON, the full enveloped result is returned, e.g.:

    [200,"OK",[1,2,3],{"func.extra"=>4}]

The reason is so you can get the status (1st element), status message (2nd element) as well as result metadata/extra result (4th element) instead of just the result (3rd element). However, sometimes you want just the result, e.g. when you want to pipe the result for more post-processing. In this case you can use `--naked-res` so you just get:

    [1,2,3]
--page-result

Filter output through a pager.

--view-result

View output using a viewer.

Other options

--help, -h, -?

Display help message and exit.

--version, -v

Display program's version and exit.

COMPLETION

This script has shell tab completion capability with support for several shells.

bash

To activate bash completion for this script, put:

 complete -C orgsel orgsel

in your bash startup (e.g. ~/.bashrc). Your next shell session will then recognize tab completion for the command. Or, you can also directly execute the line above in your shell to activate immediately.

It is recommended, however, that you install modules using cpanm-shcompgen which can activate shell completion for scripts immediately.

tcsh

To activate tcsh completion for this script, put:

 complete orgsel 'p/*/`orgsel`/'

in your tcsh startup (e.g. ~/.tcshrc). Your next shell session will then recognize tab completion for the command. Or, you can also directly execute the line above in your shell to activate immediately.

It is also recommended to install shcompgen (see above).

other shells

For fish and zsh, install shcompgen as described above.

HOMEPAGE

Please visit the project's homepage at https://metacpan.org/release/App-orgsel.

SOURCE

Source repository is at https://github.com/perlancar/perl-App-orgsel.

BUGS

Please report any bugs or feature requests on the bugtracker website https://rt.cpan.org/Public/Dist/Display.html?Name=App-orgsel

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

SEE ALSO

If you want to select Org node elements from Perl, use Org::Parser and Data::CSel directly.

More examples of CSel in general in CSel::Examples.

App::OrgUtils for other Org-related utilities.

htmlsel, jsonsel, yamlsel, ddsel, podsel, ppisel apply CSel to other types of documents.

AUTHOR

perlancar <perlancar@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2021, 2020, 2019, 2016 by perlancar@cpan.org.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.