Mirrored Arrays (Update: Involutions)

Welcome to this useless blog, let’s begin!

As my first post, I’m going to present a nice python puzzle, at least it was a puzzle for me. It gave me even the motivation to start a blog…

Recently I was working on a project and accidentally discovered some strange numbers showing up between my sorting results. First I thought it’s a bug, but it’s more like an unintentional feature ;)
It was a very simple function which took an array of numbers as parameter and returned another one. It is important that the parameter passed to the function must be a range of numbers starting from 1, without skipping. The actual code just displaced the numbers in the array, placing the number from the original array into the new array. The index in the new array was the actual number from the first one, and the value in the second array was the index from the first. Here is an illustration:

As you see it was nothing fancy…. So again, the value of the first array indicates the index in the second array, and the index of the first array becomes the value in the second array. If you place the numbers following that rule, you’ll get exactly the same number. They got the name mirrored arrays because of that properties. Now the tricky part was that it’s not true for all the permutations of the sequence, and this is why I thought first that there is something wrong with my code. For a range of numbers from 1 to 8, there are 40320 permutations, but only 764 of them can be mirrored.

I looked into the function and tried to figure out if it is a new mathematical discovery I made, or just a silly puzzle. Well the second option seemed more realistic… Back to the topic, the question is how can I generate these numbers?

Luckily the toolbox was already open, so the first implementation was finished quickly. It looks like:

from itertools import permutations

def find_mirror_arrays(array_length):
    '''
    First brute-force implementation for finding mirrored arrays
    Generates all permutations of array_length and test every
    one of them separately.
    '''
    array = [i for i in range(1, array_length + 1)]
    array_permutations = permutations(array, array_length)
    found_arrays = []

    for test_array in array_permutations:
        new_array = [0] * array_length

        for arr_index, arr_value in enumerate(test_array):
            new_array[arr_value - 1] = arr_index + 1

        if tuple(new_array) == test_array:
            found_arrays.append(new_array)

    return found_arrays

mirror_arrays = find_mirror_arrays(8)

Now what this snippet does is it generates all the permutations of the required length and then tests every one of them with that simple displacer operation, so after we execute the displacement operation on a sequence we should get the same sequence back. If we get another sequence, of course it means that it’s not a mirrored array.
Now the problem with that implementation is, that it’s a brute force method, primitive enough to cause performance problems above the length of 10 numbers.

The interesting part just started. How to generate those numbers manually to get better performances? Took a little time until I figured out, but finally I got an idea. The simplest way I had in my mind was using a recursive function. Although recursive functions are usually a bit more messy, I don’t like them either, but in theory using them in this case would result with more readable code than doing it by a regular approach.

Here’s the code i got then:

from copy import deepcopy

class MirrorCrafter(object):
    '''
    Second implementation, non-bruteforce method, uses a recursive function
    to generate a tree of solutions and collect all the results.
    '''
    def __init__(self):
        self._mirrored_arrays = []

    def _get_child_branches(self, parent):
        remaining_numbers = list(parent['remaining_numbers'])
        for num in remaining_numbers:

            src_list = list(parent['src_list'])
            free_cell_index = src_list.index(0)
            src_list[free_cell_index] = num
            src_list[num - 1] = free_cell_index + 1

            new_remaining_numbers = list(remaining_numbers)
            try:
                new_remaining_numbers.remove(num)
                new_remaining_numbers.remove(free_cell_index + 1)
            except ValueError:
                pass

            try:
                new_branch
            except NameError:
                new_branch = deepcopy(parent)
                new_branch['child_branches'] = []

            new_branch['src_list'] = src_list
            new_branch['remaining_numbers'] = new_remaining_numbers

            if new_remaining_numbers:
                new_branch['child_branches'].append(self._get_child_branches(new_branch))

        if 0 not in src_list:
            self._mirrored_arrays.append(src_list)

        return new_branch

    def get_mirrors(self, array_length):
        tree_branch = {
                       'src_list': [0] * array_length,
                       'remaining_numbers': range(1, array_length + 1),
                       'child_branches': []
                      }

        self._get_child_branches(tree_branch)

        return self._mirrored_arrays

mc = MirrorCrafter()
mirror_arrays = mc.get_mirrors(8)

This looks much more interesting. It uses a dictionary which has 3 keys. As it calls itself recursively over and over again it creates a tree by appending new branches of solutions. See the following example tree. In a similar way is our tree organized, at least it was the basic idea for it.


Benchmarks were made using a pretty old Athlon64 2800+ @1800MHz. Someone with an i7 could post some screenshots ;) Anyway, take a look at my results:

first implementation on length of 8:

time required to run script: 0.239191
time required to run script: 0.238772
time required to run script: 0.235737

second implementation on length of 8:

time required to run script: 0.550336
time required to run script: 0.577819
time required to run script: 0.551476

first implementation on length of 9:

time required to run script: 2.282437
time required to run script: 2.227491
time required to run script: 2.237375

second implementation on length of 9:

time required to run script: 2.481702
time required to run script: 2.500234
time required to run script: 2.568499

As we can see the first primitive algorithm beats our nextgen implementation. But not for so long, it’s just on smaller lengths, look at this:

first implementation on length of 10:

time required to run script: 23.976178
time required to run script: 24.317904
time required to run script: 24.385197

second implementation on length of 10:

time required to run script: 12.304419
time required to run script: 11.570755
time required to run script: 11.774170

first implementation on length of 11:

time required to run script: 286.148981
time required to run script: 295.728056
time required to run script: 298.418401

second implementation on length of 11:

time required to run script: 59.683516
time required to run script: 59.883970
time required to run script: 59.646269

And the difference exponentially rises… Not bad.

That would be the result of our discovery, the next thing should be to create a third implementation, hopefully even better than the second one. Let’s see… It would be great if a solution could be found with a regular approach using simple loops… After a little time I came up with this:

from copy import deepcopy

def get_mirror_arrays_v3(array_length):
    '''
    Third implementation, non-brute-force method using loops, does not generate
    any trees, the remaining numbers are calculated every time on the fly.
    '''
    available_numbers = range(1, array_length + 1)
    num_seq = [[x] + [0] * (array_length - 1) for x in available_numbers]
    for x in num_seq:
        x[x[0] - 1] = 1

    for count in range(array_length - 2):
        if count != 0:
            num_seq = deepcopy(completed_seq)

        completed_seq = []

        for index, seq in enumerate(num_seq):
            seq = num_seq[index]
            remaining_numbers = [x for x in available_numbers if x not in seq]

            if not remaining_numbers:
                completed_seq.append(seq)

            for next_num in remaining_numbers:
                next_seq = list(seq)
                free_cell_index = next_seq.index(0)
                next_seq[free_cell_index] = next_num
                next_seq[next_num - 1] = free_cell_index + 1
                completed_seq.append(next_seq)

    completed_seq[0][array_length - 1] = array_length
    return completed_seq

mirror_arrays = get_mirror_arrays_v3(8)

Now the code does not generate a tree, nor does it store the remaining numbers, everything is calculated on the fly, only the actual sequences are stored and loaded again in every loop. In theory I would say that it should take much more time to calculate everything again every time, but it looks like it’s still faster than working with dictionaries and data storing / loading constantly.
So the long awaited results for the third implementation are awesome. It outperforms both of the solutions mentioned before, look at them:

third implementation on length of 8:

time required to run script: 0.065746
time required to run script: 0.065963
time required to run script: 0.068096

third implementation on length of 9:

time required to run script: 0.295213
time required to run script: 0.291225
time required to run script: 0.294794

third implementation on length of 10:

time required to run script: 1.305196
time required to run script: 1.315555
time required to run script: 1.323389

third implementation on length of 11:

time required to run script: 5.836298
time required to run script: 5.799165
time required to run script: 5.847414

Looks like the best solution for now… And our job could be finished here, but there is something more we can do. Using the newly created function, it can be easily modified into a recursive function, which guess what, does the job even better! See for yourself:

'''
The fourth implementation actually evolved from the third solution but it uses
a recursive function again, and the remaining numbers are passed as parameter,
so there's no need to calculate them on the fly every time.
'''
def recursive_mirror_generator_v2(source, remaining_numbers):
    if not remaining_numbers: return source

    completed_seq = []

    for next_num in remaining_numbers:
        next_seq = list(source)
        free_cell_index = next_seq.index(0)
        next_seq[free_cell_index] = next_num
        next_seq[next_num - 1] = free_cell_index + 1

        next_remaining_numbers = list(remaining_numbers)
        try:
            next_remaining_numbers.remove(next_num)
            next_remaining_numbers.remove(free_cell_index + 1)
        except ValueError:
            pass

        results = recursive_mirror_generator_v2(next_seq, next_remaining_numbers)
        if isinstance(results[0], int):
            completed_seq.append(results)
        else:
            completed_seq.extend(results)

    return completed_seq

def get_mirrors_recursively_v2(array_length):
    available_numbers = range(1, array_length + 1)
    num_seq = [[x] + [0] * (array_length - 1) for x in available_numbers]

    completed_seq = []
    for x in num_seq:
        x[x[0] - 1] = 1
        remaining_numbers = [y for y in available_numbers if y not in x]
        completed_seq.extend(recursive_mirror_generator_v2(x, remaining_numbers))

    return completed_seq

mirror_arrays = get_mirrors_recursively_v2(14)

WARNING! The following results contains elements of humiliating nature, so readers with weak heart or persons under the age of 18 should be advised to skip this content.

fourth implementation on length of 8:

time required to run script: 0.013698
time required to run script: 0.016129
time required to run script: 0.015955

fourth implementation on length of 9:

time required to run script: 0.049936
time required to run script: 0.052711
time required to run script: 0.061206

fourth implementation on length of 10:

time required to run script: 0.190436
time required to run script: 0.195853
time required to run script: 0.201169

fourth implementation on length of 11:

time required to run script: 0.730234
time required to run script: 0.771284
time required to run script: 0.804766

At this point the funds devoted to the research of this anomaly have been completely depleted, but anyway I’m satisfied with the results. Feel free to post any suggestion, comment or even a better implementation, beat my solutions with pleasure ;)

UPDATE: Thanks for everyone’s help, so now that the actual sequence is identified, I found some very interesting articles about it, along with other algorithms, which outperformed my solutions of course. Special thanks goes to Mikko who created another implementation, which is clearly the winner among all of them.
The first algorithm found is from the combinatorial object server, otherwise known as COS ( http://theory.cs.uvic.ca/inf/perm/Involutions.html ), I just translated it to python, so here it is:

class Involution(object):
    length = 0
    p = list()
    arrays = list()

    def store_array(self):
        self.arrays.append([i + 1 for i in self.p])

    def swap(self, i, j):
        self.p[i], self.p[j] = self.p[j], self.p[i]

    def cycle(self, n):
        if n >= self.length:
            self.store_array()
            return

        self.cycle(n + 1)
        if self.length - n >= 2:
            self.swap(n, n + 1)
            self.cycle(n + 2)

            for i in range(n - 1, -1, -1):
                if self.p[i] == i:
                    self.swap(i, n)
                    self.swap(n, n + 1)
                    self.cycle(n + 2)
                    self.swap(n, n + 1)
                    self.swap(i, n)
                else:
                    self.swap(self.p[i], n)
                    self.swap(i, n + 1)
                    self.cycle(n + 2)
                    self.swap(i, n + 1)
                    self.swap(self.p[i], n)

            self.swap(n, n + 1)

    def main(self, array_length):
        self.length = array_length
        self.p = range(0, self.length)
        self.cycle(0)

        return self.arrays

inv = Involution()
arrays = inv.main(12)

The results are very good:

COS algorithm on length of 8:

time required to run script: 0.009012
time required to run script: 0.009840
time required to run script: 0.009145

COS algorithm on length of 9:

time required to run script: 0.034899
time required to run script: 0.034381
time required to run script: 0.035642

COS algorithm on length of 10:

time required to run script: 0.119194
time required to run script: 0.116180
time required to run script: 0.236970

COS algorithm on length of 11:

time required to run script: 0.709367
time required to run script: 0.465431
time required to run script: 0.467473

And finally Mikko’s algorithms:

def invol(n):
	s1, s2 = ([(1,)], [(1, 2), (2, 1)])
	if n < 2:
		return s1

	for k in range(3, n + 1):
		tmp = [x + (k,) for x in s2]
		for i in range(k - 1):
			tmp.extend(aug(s1, i, k))
		s1, s2 = s2, tmp
	return s2

def aug(lst, i, k):
	for invol in lst:
		tmp = [p + (p > i) for p in invol]
		tmp[i:i] = [k]
		tmp.append(i + 1)
		yield tuple(tmp)

arrays = invol(11)

The results of the first algorithm:

Mikko's first algorithm on length of 8:

time required to run script: 0.010087
time required to run script: 0.003362
time required to run script: 0.010010

Mikko's first algorithm on length of 9:

time required to run script: 0.018520
time required to run script: 0.016312
time required to run script: 0.014148

Mikko's first algorithm on length of 10:

time required to run script: 0.066137
time required to run script: 0.058889
time required to run script: 0.064799

Mikko's first algorithm on length of 11:

time required to run script: 0.238079
time required to run script: 0.249475
time required to run script: 0.237163

Here goes the second algorithm:

def invol2(n):
    s1, s2 = ([[1]], [[1, 2], [2, 1]])
    if n < 2:
        return s1

    for k in range(3, n + 1):
        tmp = [x + [k] for x in s2]
        for x in s1:
            tmp.append(x + [k, k - 1])
            for i in range(k - 2,0 , -1):
                x[x.index(i)] += 1
                tmp.append(x[:i - 1] + [k] + x[i - 1:] + [i])
        s1, s2 = s2, tmp
    return s2

arrays = invol2(11)

The results for the second:

Mikko's second algorithm on length of 8:

time required to run script: 0.005724
time required to run script: 0.010078
time required to run script: 0.004627

Mikko's second algorithm on length of 9:

time required to run script: 0.014572
time required to run script: 0.023883
time required to run script: 0.016792

Mikko's second algorithm on length of 10:

time required to run script: 0.067655
time required to run script: 0.069951
time required to run script: 0.077429

Mikko's second algorithm on length of 11:

time required to run script: 0.247994
time required to run script: 0.319301
time required to run script: 0.262980

And the third algorithm, holding the crown for now:

def invol3(n):
    s1, s2 = ([[1]], [[1, 2], [2, 1]])
    if n < 2:
        return s1

    for k in range(3, n + 1):
        tmp = [x + [k] for x in s2]
        for x in s1:
            tmp.append(x + [k, k - 1])
            for i in range(k - 2, 0, -1):
                a = x[i - 1]
                x[a - 1 - (a > i)] += 1
                w = x[:]
                w.insert(i - 1, k)
                w.append(i)
                tmp.append(w)
        s1, s2 = s2, tmp
    return s2

arrays = invol3(11)

With the dominating results:

Mikko's third algorithm on length of 8:

time required to run script: 0.002960
time required to run script: 0.003011
time required to run script: 0.003201

Mikko's third algorithm on length of 9:

time required to run script: 0.011731
time required to run script: 0.019164
time required to run script: 0.015256

Mikko's third algorithm on length of 10:

time required to run script: 0.048309
time required to run script: 0.053429
time required to run script: 0.054088

Mikko's third algorithm on length of 11:

time required to run script: 0.186751
time required to run script: 0.279246
time required to run script: 0.207985

Once again thanks, it’s a very nice job. If I find something in the meantime, I’ll post another update…


Follow

Get every new post delivered to your Inbox.