The case for base cases

Monday, January 28, 2013

You have a complicated problem to solve. How do you start tackling it?

One of the most tried and tested methods of problem solving is nothing more than implementing some noise control. In other words, can you peel back the complexity of the problem and make it simpler?

Tower of Hanoi

I finally sat down and tried solving this problem two days ago. For those of you unfamiliar, the Tower of Hanoi is an old puzzle that has been mostly re-appropriated as a classic example of algorithmic thought. Here's a version of it:

Tower of Hanoi, from Wikipedia

There are three rods and a number of different sized disks that can slide onto any rod. The puzzle starts with the disks in a neat stack, ascending by size, on the first rod. The objective of the puzzle is to move the entire stack to the last rod, obeying the following rules:

Only one disk may be moved at a time.
Each move consists of taking the top-most disk from one of the rods and sliding it onto another rod, on top of the other disks that may already be present on that rod.
No disk may be placed on top of a disk smaller than it.

Seems straightforward enough, but let's solve this for N disks. How should we start?

Let's start with the base cases

N=1

    |         |         |
    |         |         |
   -|-        |         |
========= ========= =========
    A         B         C

1.  -----------------> -|-
Done.

So we solve N=1 with one move of -|-.

N=2

    |         |         |
   -|-        |         |
  --|--       |         |
========= ========= =========
    A         B         C

1.  -------> -|-

    |         |         |
    |         |         |
  --|--      -|-        |
========= ========= =========
    A         B         C

2.  ----------------> --|--
3.            -------> -|-
Done.

Let's take a closer look at these moves:
1. Exactly like solving N=1, except A->B instead of A->C.
2. The second step places the largest disc at the base of C.
3. Also solves N=1, only B->C instead of A->C.
Done.

Let's rewrite our steps like this:
1. Solve N=1, except A->B
2. Move --|-- from A->C
3. Solve N=1, except B->C
Done.

N=3

   -|-        |         |
  --|--       |         |
 ---|---      |         |
========= ========= =========
    A         B         C

1.  -----------------> -|-
2.  ------> --|--
3.           -|- <-------

    |         |         |
    |        -|-        |
 ---|---    --|--       |
========= ========= =========
    A         B         C

4.  ---------------> ---|---
5. -|- <-------
6.            ------> --|--
7.  -----------------> -|-
Done.

Notice how steps 1-3 solve N=2 and places it at B. Rewrite?
1. Solve N=2, except A->B
2. Move ---|--- from A->C
3. Solve N=2, except B->C
Done.

Interesting. Looks like we have a formula.

N=n
1. Solve N=(n-1), except A->B
2. Move largest disc from A->C
3. Solve N=(n-1), except B->C

I was pretty excited when I saw this pattern, as it meant I could code up a solution in very few lines of code. Here it is in Python, using a list as a stack:

def solve_hanoi(n, A, B, C):
    if n == 0:
        return
    solve_hanoi(n-1, A, C, B)
    C.append(A.pop())
    solve_hanoi(n-1, B, A, C)

The trick is to redefine the rods for each recursive n-1 call. What's more is you really only need to think about which rod you're starting and ending at. The odd rod out is just the buffer.

I'd show you the code in action but all the computation is behind the scenes and not very interesting. Luckily a quick search on Youtube yielded a visual solution for N=6. Notice how there is a point of symmetry in the middle as the 5 smaller disks sit at B and the largest disk migrates over to C. Utterly awesome!

[base case] [python] [tower of hanoi] [algorithms] [recursion] [youtube]

Unexpected random number generation

Wednesday, January 23, 2013

So, last month, I was working on building test applications for Whiskey, my Python WSGI server implementation. I wanted to ensure Whiskey was robust enough to keep going, even if it was serving applications that met critical, program-killing exceptions. To that end, I wrote a random exception throwing application to test with Whiskey. Here's the code, simplified and stripped down to its core.

import os
import random

while True:
    pid = os.fork()
    if pid == 0: # if we are in the child process
        print random.random()
        os._exit(0)

A little background

Whiskey handles concurrent client requests by forking new child processes. Calling os.fork() splits the current process into two, the parent and child. The child process will have a process identifier (pid) of 0. Recall that a forked child process has a copy of the same memory space/environment of its parent.

Result

> 0.844421851525
> 0.844421851525
> 0.844421851525
> 0.844421851525
> 0.844421851525
> 0.844421851525
> 0.844421851525

You might not be surprised since you've already come into this post with some suspicion, but remember that this is the stripped down code. What I actually ran into, was an application that either always failed or never failed. So, what's going on?

Well, every client request leads to a forked child process that handles the request. However, each child process ends up having the same random seed, since it is always forked from the same parent process (or more specifically, a parent process which never changes its random seed). This results in a pesky situation where you unexpectedly have an expected random number.

Solution

We need to reset the random seed for each forked process. One way is to set the seed to the time of the request.

import os
import random
import time                      ##### ADDED THIS LINE ######

while True:
    pid = os.fork()
    if pid == 0:
        random.seed(time.time()) ##### ADDED THIS LINE ######
        print random.random()
        os._exit(0)

Randomness restored.

Update
My friend Geoff mentioned that for real applications, you wouldn't want to use time.time() as your seed. Besides you leaving yourself open to the chance of two requests spawning processes at the same time, any PRNG (pseudo random number generator) is unsuitable for security reasons. Instead using /dev/random (UNIX) or CryptGenRandom (Windows) is preferable, since you'd have true randomness then. For the Python implementation, you'd want to make a call to os.urandom() to access your OS's random file. You can find the documentation on that, here.

I wish I could say I ran into this recently, but I've unfortunately been too busy applying to jobs instead of working on personal projects. I actually talked about this subject for a Hacker School weekly presentation on December 6th, 2012.

[random] [random numbers] [processes] [forking] [python] [whiskey]

Trying out the static blog thing

Tuesday, January 08, 2013

Since I started actively programming a few months ago, I've learned much about what technology I take for granted. In order to help me appreciate the many abstractions that our lives are built upon, I've made it my goal to be more technologically self-sufficent. The first step was to revamp my personal website. Next up was to move my code blog from Tumblr to my own domain, and then rebuild it on top of a static site/blog generator. That way, I would have complete ownership and control over my own posts.

Static what nots?

Static site generators do what they advertise. They take some content you provide, then generate a pre-determined set of static html pages that are web ready. The idea here is that once you finish setting up the styling and format of your site, you can simply focus on writing content. Of course, they have boiler plate templates ready for you too, if you'd rather jump right into writing.

Getting set up

You should certainly consult the Pelican docs for the full guide, but here's my general guide on how to get started.

Pelican is written in Python (major selling point for me), so install it and it's dependencies using virtualenv and pip. Pelican only supports reST out of the box but provides Markdown support upon installation of the Python-Markdown library. I use Markdown for my posts.

Then, you'll want to make a settings.py file somewhere in your project folder. This file will hold the global variables you set for Pelican. The important ones include:

AUTHOR = 'YOUR NAME'
SITENAME = 'YOUR SITE NAME'
SITEURL = 'https://your-site.com'

For the purposes of setting up a blog on Pelican, you will want to set DEFAULT_PAGINATION to be the (number of posts)/page and DEFAULT_ORPHANS as the largest remainder of posts you'll allow on your last page. For my setup, I selected 5 posts/page, and no more than 3 posts on the last page (so up to 8 posts on the last page).

You can browse over all my selected settings here.

Making it yours

Now, you can certainly use the default theme provided by Pelican, or one of their other sample themes, but I prefer to make a theme that suits my own tastes. Reading the Pelican documentation and working through the quickstart guide took up half of my weekend, which means I spent the other half on designing my theme.

You'll need some familiarity with Jinja2 templates. At the minimum, you should read about the conditionals/loop syntax (especially loop attributes) and template inheritance (include and block). I believe Jinja2's templating system is similar to Django's, so you should feel at home if that's your forte.

Pelican is specific about what file structure it wants in a custom theme. The documentation outlines the structure, but doesn't give you adequate information on what attributes you can access from variable objects. If you need specific functionality beyond what objects and object attributes others have done in the sample themes, you'll probably just need to read through the Pelican source or ask some questions in their IRC channel.

I ended up starting with the default theme (called: notmyidea) templates and stripping it of everything I didn't want to use. For instance, I had no need for the category and author functionalities. I don't want to categorize posts and I'll be the only author on this blog, therefore, I did not supply:

categories.html
category.html
author.html
authors.html

Pelican is smart enough to just use it's base templates from the Simple theme if you don't provide all the desired html templates.

You will, however, want to make the basic html frame for your blog and save it as base.html, as all other html templates will inherit from it. Otherwise, use the default theme's templates as a guide for building your custom templates.

Once you have your html templates set, you'll want to work on the CSS, which you will save in static/css folder.

The theme for this blog is here. Please give me a shoutout in the footer if you want to use it ;).

Workflow

So once you have your theme, you're ready to create content! I host my pages on Github Pages, but you choose your own host.

Let's go through how I am publishing this post. First I write it (in vim!) and save it in my content folder, which stores all my posts for this year.

content/2013/trying-out-the-static-blog-thing.md

I like to move out to the root of my Blog folder, then run Pelican to update all my static pages.

$ pelican content/2013/ -s path/to/settings.py

The new static pages will be placed in

output/

but you can always change that path in your settings.py.

Then I simply $ git add . in my output folder, throw on a commit message, then push it to Github.

Final thoughts

The entire set up process takes a bit of time, especially if you're making your own theme, but it's always nice to have ownership over your own content. That's just something you don't get on Tumblr or other browser based blogs. Pelican is an awesome option for anyone looking for a static blog generator and wanting to work in Python, I heartily recommend it.

[static blog] [pelican] [python] [the architect] [jinja2]

On to better things

Sunday, January 06, 2013

Pretty odd to have committed to writing 100 meaningful blog posts this year, and then promptly not post until Jan 6th. But as you can tell, my excuse is that I was spending all weekend building my new blog. Well I'm glad to say, here it is.

I'm now running on the static site generator, Pelican. It's awesome because it is actively developed (version 3.1.1 as of 12/4/12) and it is written in Python, which means I didn't have to default to Jekyll. It's not that I'm averse to Jekyll, I'd just rather work in the language I'm most comfortable in (not Ruby). Anyways, things I'm most excited about:

Writing all my posts in Markdown
Being hosted on Github Pages
Having all my posts saved on my Dropbox
New syntax highlighting color scheme
Actually using my own domain

I'm planning on writing about my experience building this blog in a post later this week. Stay tuned.

(find old posts on my tumblr)

[pelican] [new] [github pages] [jinja2] [python] [tumblr]

<< Page 2 / 2