Travelling Salesman

Friday, February 01, 2013

This is a challenging problem about a travelling salesman that wants to visit a number of cities efficiently. He needs to sell his product in all the cities so he needs to plot out the shortest path to visit all cities but each city only once. I'm not sure why he can't return to any cities but let's assume his tradecraft is in less agreeable goods.

We'll work through the quirks of the travelling salesman problem. The map can be modelled as a graph where nodes are cities. Also each node is connected to every other node, you can travel in either direction along any edge, and edge lengths are the distances between cities. In other words, you could refer to this graph as complete, undirected, and weighted.

Let's say there are 10 node cities: A to J. The salesman starts at node A.

Finding a "short" path

There's a simple way to find a "short" path. I say "short" but this could also lead us to a very long path in certain cases.

Starting at A, traverse through each node and make note of which path is the shortest.
Take the shortest path (let's say to B) and mark A visited.
Repeat step 1 and 2 until all nodes are visited.

When might this be the shortest path? Imagine the cities laid out in a line like so:

(A) (B) (C) (D) (E) (F) (G) (H) (I) (J)

Using the "short" path algorithm, we can see that we'll just sequentially step left to right and that would certainly find us the shortest path.

When might this be a long path? Try the following:

(J)   (A) (B) (C) (D) (E) (F) (G) (H) (I)

In this case, we would visit J last, but clearly it would be faster to visit J first, then the rest of the nodes.

The problem with this "short" path algorithm is that we are considering too few paths.

Exhaustive search

One way to ensure that we have the shortest path is to simply construct all paths (try all permutations) and pick the shortest. This, however, can be prohibitively expensive as the number of cities increases (see combinatorial explosion). For less busy salesmen though, this is a viable option.

Essentially this search is brute force. It can be implemented as breadth first or depth first. They should take the same time since you'll need to search the entire graph in either case. However, usually it is easier to implement depth first search, so we can do that.

Starting at A, move to next unvisited node, note distance and mark A as visited.
Repeat step 1 until all nodes visited, then you should have some array of all distances travelled. That is your distance path.
Now traverse back up your path until you have the option of visiting a different node. Visit that and dig deeper down until you also have that path complete. (in this case, traversing back up the path is done by finishing the innermost loop and moving on to the next node of the immediate outer loop)
Eventually you will visit every path and document the distances.

We end up with (N-1)! paths which is pretty terrible in terms of computation time (Note this is not N! because we are actually only visiting N-1 nodes since we have defined a start node and we don't return to it). But on the bright side, we know we have the shortest path because we calculated them all!

So what about large N?

Well I wish I could give a solid answer here, but this is a heavily studied academic problem that is beyond my current capabilities. I'll leave you with this Stack Overflow link though, which I plan to study a bit further. I'll be sure to share my findings in an upcoming post, so stay tuned.

Shoot me an email or write a comment if you have any suggested material you think I should peruse. I'd love to learn more about this.

[algorithms] [graph] [shortest path]

Bit manipulation fun

Wednesday, January 30, 2013

So while I was working on interview prep, I got a bit curious about what cool ways bit manipulation might alter the way we might write code. Here's a few examples of short functions you can write using only bit operations.

Simple stuff first

The following trick may be standard fare but seeing as I am not a "classically" trained programmer (no degree in computer science), this guy did tickle my fancy.

When you write out a integer in a binary representation, 5 as 0101 for example, the least significant bit (leftmost bit) represents if the integer is odd or even. Of course, bits further left represent powers of two, hence the term "base two".

0101 is 0*(2^3) + 1*(2^2) + 0*(2^1) + 1*(2^0)
     or 0*8 + 1*4 + 0*2 + 1*1
     or 5

Anyone can write a simple function that tests if a given integer x is even. Usually it will involve checking to see if x mod 2 is 0.

def is_even(x):
    return (x % 2) == 0

But, given that least significant bit of a binary value represents odd or even, we can write that same is_even function using the AND bit operation.

def is_even(x):
    return (x & 1) == 0

For example 6 & 1 is 0.

  0110  <-- 6
& 0001  <-- 1
======
  0000

This ends up being the same amount of typing but it's interesting that we have this second option of writing the is_even function.

Powerful twos

If you work on Project Euler problems frequently, the following shortcut might be beneficial to keep in mind. Here's a function that checks if x is a power of two.

def is_power_of_two(x):
    power_of_two = 2
    while power_of_two < x:
        power_of_two *= 2
    return power_of_two == x

That's pretty compact but it turns out binary is well suited to figure out if a number is a power of two.

def is_power_of_two(x):
    return (x & (x-1)) == 0

Let's see this in action with x as some random binary number:

  010110000  <-- x
- 000000001
===========
  010101111  <-- x-1
& 010110000
===========
  010100000  <-- x & (x-1)

So, (x & (x-1)) just removed the left-most bit that was equal to 1. But then, when might (x & (x-1)) be 0? That would only happen when x has only one 1 bit, which means that x must be a power of 2.

Swapping in place

Python is nice in that you never have to use a temporary variable when you want to do variable swaps. This is, of course, not the same in other languages.

x, y = y, x

Behind the scenes, Python accesses the values of y and x, then rotates them as it unpacks and assigns to x and y. Here's what that looks like when run through the Python disassembler:

import dis

def func():
    a = 1
    b = 2
    a, b = a, b
    a, b = b, a

dis.dis(func)
>
>  3           0 LOAD_CONST               1 (1)
>              3 STORE_FAST               0 (a)
>
>  4           6 LOAD_CONST               2 (2)
>              9 STORE_FAST               1 (b)
>
>  5          12 LOAD_FAST                0 (a)
>             15 LOAD_FAST                1 (b)
>             18 ROT_TWO             
>             19 STORE_FAST               0 (a)
>             22 STORE_FAST               1 (b)
>
>  6          25 LOAD_FAST                1 (b)
>             28 LOAD_FAST                0 (a)
>             31 ROT_TWO             
>             32 STORE_FAST               0 (a)
>             35 STORE_FAST               1 (b)
>             38 LOAD_CONST               0 (None)
>             41 RETURN_VALUE

But, did you know we can actually swap without a temporary variable using XOR too.

x ^= y
y ^= x
x ^= y

To understand what's happening, you should recall that XOR is commutative, x^x is 0, and x^0 is x. Let's break down the XOR Swap Algorithm again.

start   ->  breaking down  ->  commuting      ->  cancelling ->  removing 0's
=============================================================================
x ^= y  ->  x = x^y
y ^= x  ->  y = y^x^y      ->  y = x^y^y      ->  y = x^0    ->  y = x
x ^= y  ->  x = x^y^y^x^y  ->  x = x^x^y^y^y  ->  x = 0^y^0  ->  x = y

That's pretty cool but I'll stick to Python's easy swap. It is nice, however, to see how binary manipulation can be used to our advantage in languages that don't have an in-place swap.

Now for the disclaimer. While the XOR swap is pretty nifty, it doesn't have much practical use. If you actually plan on using it in the real world, hopefully your application fits in one of these buckets. Turns out, optimizing compilers tend to make temporary variable swaps pretty efficient, often faster than this XOR swap.

Have any more bit manipulation tricks you'd like to share? Leave me a comment, I'd love to see more in action.

[bit manipulation] [binary] [tricks] [python] [algorithms]

The case for base cases

Monday, January 28, 2013

You have a complicated problem to solve. How do you start tackling it?

One of the most tried and tested methods of problem solving is nothing more than implementing some noise control. In other words, can you peel back the complexity of the problem and make it simpler?

Tower of Hanoi

I finally sat down and tried solving this problem two days ago. For those of you unfamiliar, the Tower of Hanoi is an old puzzle that has been mostly re-appropriated as a classic example of algorithmic thought. Here's a version of it:

Tower of Hanoi, from Wikipedia

There are three rods and a number of different sized disks that can slide onto any rod. The puzzle starts with the disks in a neat stack, ascending by size, on the first rod. The objective of the puzzle is to move the entire stack to the last rod, obeying the following rules:

Only one disk may be moved at a time.
Each move consists of taking the top-most disk from one of the rods and sliding it onto another rod, on top of the other disks that may already be present on that rod.
No disk may be placed on top of a disk smaller than it.

Seems straightforward enough, but let's solve this for N disks. How should we start?

Let's start with the base cases

N=1

    |         |         |
    |         |         |
   -|-        |         |
========= ========= =========
    A         B         C

1.  -----------------> -|-
Done.

So we solve N=1 with one move of -|-.

N=2

    |         |         |
   -|-        |         |
  --|--       |         |
========= ========= =========
    A         B         C

1.  -------> -|-

    |         |         |
    |         |         |
  --|--      -|-        |
========= ========= =========
    A         B         C

2.  ----------------> --|--
3.            -------> -|-
Done.

Let's take a closer look at these moves:
1. Exactly like solving N=1, except A->B instead of A->C.
2. The second step places the largest disc at the base of C.
3. Also solves N=1, only B->C instead of A->C.
Done.

Let's rewrite our steps like this:
1. Solve N=1, except A->B
2. Move --|-- from A->C
3. Solve N=1, except B->C
Done.

N=3

   -|-        |         |
  --|--       |         |
 ---|---      |         |
========= ========= =========
    A         B         C

1.  -----------------> -|-
2.  ------> --|--
3.           -|- <-------

    |         |         |
    |        -|-        |
 ---|---    --|--       |
========= ========= =========
    A         B         C

4.  ---------------> ---|---
5. -|- <-------
6.            ------> --|--
7.  -----------------> -|-
Done.

Notice how steps 1-3 solve N=2 and places it at B. Rewrite?
1. Solve N=2, except A->B
2. Move ---|--- from A->C
3. Solve N=2, except B->C
Done.

Interesting. Looks like we have a formula.

N=n
1. Solve N=(n-1), except A->B
2. Move largest disc from A->C
3. Solve N=(n-1), except B->C

I was pretty excited when I saw this pattern, as it meant I could code up a solution in very few lines of code. Here it is in Python, using a list as a stack:

def solve_hanoi(n, A, B, C):
    if n == 0:
        return
    solve_hanoi(n-1, A, C, B)
    C.append(A.pop())
    solve_hanoi(n-1, B, A, C)

The trick is to redefine the rods for each recursive n-1 call. What's more is you really only need to think about which rod you're starting and ending at. The odd rod out is just the buffer.

I'd show you the code in action but all the computation is behind the scenes and not very interesting. Luckily a quick search on Youtube yielded a visual solution for N=6. Notice how there is a point of symmetry in the middle as the 5 smaller disks sit at B and the largest disk migrates over to C. Utterly awesome!

[base case] [python] [tower of hanoi] [algorithms] [recursion] [youtube]

Page 1 / 1