++ed by:
Kevin Ryde
and 1 contributors

# NAME

Math::NumSeq::BalancedBinary -- balanced 1,0 bits

# SYNOPSIS

use Math::NumSeq::BalancedBinary;
my \$seq = Math::NumSeq::BalancedBinary->new;
my (\$i, \$value) = \$seq->next;

# DESCRIPTION

This sequence is integers with 1-bits and 0-bits balanced like opening and closing parentheses.

2, 10, 12, 42, 44, 50, 52, 56, 170, 172, 178, ...
starting i=1

Written in binary, a 1-bit is an opening "(" and a 0-bit is a closing ")".

value
i   in binary   as parens
---  ---------   ----------
1         10    ()
2       1010    () ()
3       1100    (())
4     101010    () () ()
5     101100    () (())
6     110010    (()) ()
7     110100    (() ())
8     111000    ((()))
9   10101010    () () () ()
10   10101100    () () (())

Balanced means the total number of 1s and 0s are the same and when reading from high to low has count(1s) >= count(0s) at all times, which is to say any closing ")" must have a preceding matching open "(".

Because the number of 1s and 0s are equal, the width is always an even 2*w. The number of values with a given width 2*w is the Catalan number per (Math::NumSeq::Catalan). For example 6-bit values w=6/2=3 is C(3) = (2*3)!/(3!*4!) = 5 many such values, being i=4 through i=8 inclusive shown above.

## Binary Trees

The sequence values correspond to binary trees where each node can have a left and/or right child. Such a tree can be encoded by writing

0                         if no node (empty tree)
1,left-tree,right-tree    at a node

The "left-tree" and "right-tree" parts are the left and right legs of the node written out recursively. If those legs are both empty (ie. the node is a leaf) then they're empty trees and are "0" giving node "100". Otherwise the node is 1 followed by various more 1s and 0s. For example,

a (root)
/ \
b   c      =>  11001010 [0]
\         ab  c d   ^-final zero of encoding omitted
d

At "a" write 1 and recurse to write its left then right legs. The left leg is "b" so write 1 and the two legs of "b" are empty so write 0,0. That completes the left side of "a" so resume at the right side of "a" which is 1 for "c" and descend to the left and right of "c". The left of "c" is empty so write 0. The right of "c" is "d" so write 1 and the two empty legs of "d" are 0,0. The very final 0 from that right-most leaf "d" is dropped (shown "[0]" above).

This encoding can also be applied breadth-first by pushing the left and right descents onto a queue of pending work instead of onto a stack by recursing. In both cases there's an extra final 0 which is dropped. This 0 arises because in any binary tree with K nodes there are K+1 empty legs. That would give K many 1-bits and K+1 many 0-bits.

In this encoding the balanced binary condition "count 1s >= count 0s" corresponds to there being at least one unfinished node at any time in the traversal (by whichever node order).

The NumSeq code here acts on values as numbers. Tree encodings like this are probably better handled as a string or list of bits.

## Mountain Ranges

A further usual and attractive interpretation of the opens and closes is as up and down slopes of a mountain range. 1-bit for up, 0-bit for down. For example,

/\
/  \  /\
/    \/  \
/          \/\
----------------
11110001100010

The mountain range must end at its starting level and must remain at or above its starting level at all times.

Numerical order of the values means narrower mountain ranges are before wider ones, and two ranges with equal width are ordered by down-slope preceding up-slope at the first place they differ.

# FUNCTIONS

See "FUNCTIONS" in Math::NumSeq for behaviour common to all sequence classes.

\$seq = Math::NumSeq::BalancedBinary->new ()

Create and return a new sequence object.

## Random Access

\$value = \$seq->ith(\$i)

Return the \$i'th balanced binary number.

\$bool = \$seq->pred(\$value)

Return true if \$value is balanced binary.

\$i = \$seq->value_to_i(\$value)

If \$value is balanced binary then return its index i. If \$value is not balanced binary then return undef.

\$i = \$seq->value_to_i_ceil(\$value)
\$i = \$seq->value_to_i_floor(\$value)

Return the index i of \$value or if \$value is not a balanced binary integer then return the i of the next higher or lower value, respectively.

# FORMULAS

## Next

When stepping to the next value, of same bit length, the number of 1s and 0s does not change. The 1s move to make a numerically higher value. The simplest case is an isolated low 1-bit. It must move up one place. For example,

11100100             isolated low 1-bit
->                   shifts up
11101000

If the low 1 has a 1 above it then that bit must move up and the lower one goes to the low end of the value so as to be the smallest increase. For example

1110011000           pair of bits
->                   one shifts up, other drops to low end
1110100010

In general the lowest run of 1-bits is changed to have the highest of them move up one place and the rest move down to be a ...101010 pattern at the low end. For example a low run of 3 bits

1111100111000000     run of bits
->                   one shifts up, rest drop to low end
1111101000001010
^     ^ ^
up     low end

The final value in a 2*w block has all 1s at the high end. The first of the next bigger block of values is an alternating 1010..10. For example

111000    last 6-bit value, all 1-bits at high end
->
10101010    first 8-bit value

This incrementing is fairly straightforward. Some pseudocode can be found in M.C. Er, "Enumerating Ordered Trees Lexicographically", The Computer Journal, volume 28, number 5, 1985, procedure GenSuc (and Rank and Unrank).

## Ith

As described above there are Catalan(w) many values with 2*w bits. The width of the i'th value can be found by successively subtracting C(1), C(2), etc until reaching a remainder i < C(w). At that point the value is 2*w many bits, being w many "1"s and w many "0"s.

In general after outputting some bits of the value (at the high end) there will be a number z many "0"s and n many "1"s yet to be output. The choice is then to output either 0 or 1 and reduce z or n accordingly.

numvalues(z,n) = number of sequences of z "0"s and n "1"s
with remaining 1s >= remaining 0s at all times

N = numvalues(z-1,n)
= how many combinations starting with zero "0..."

if i < N  then output 0
if i >= N then output 1 and subtract N from i
(which is the "0..." combinations skipped)

numvalues() is the "Catalan table" constructed by

for z=1 upwards
numvalues(z,0) = 1
for n = 1 to z
numvalues(z,n) = numvalues(z-1,n)  # the 0... forms
+ numvalues(z,n-1)  # the 1... forms

In each numvalues(z,n) the numvalues(z,n-1) term is the previous numvalues calculated, so a simple addition loop for the table

for z=1 upwards
t = numvalues(z,0) = 1
for n = 1 to z
t += numvalues(z-1,n)
numvalues(z,n) = t

The last entry numvalues(w,w) in each row is Catalan(w), so that can be used for the initial i subtractions seeking the width w. If building or extending a table each time then stop the table at that point.

Catalan(w) grows as a little less than a power 4^w so the table has a little more than log4(i) many rows.

# OEIS

Entries in Sloane's Online Encyclopedia of Integer Sequences related to this sequence include

A063171     binary
A071152     binary, digits 0,2
A085185     base 4
A080116     predicate
A072643     width
A085192     differences
A080237     number of low 0 bits
A085223     i where single low 0 bit
A002054     number of 01 bit pairs in length 2n
A057520     value/2, so sans low 0 bit
A085183     value sans high 1 and low 0 bits
A085184        in base 4