Red Green Repeat Adventures of a Spec Driven Junkie

Karger Minimum Cut with tests

Why??

I’m discussing another algorithm: Karger’s Minimum Cut. I will take the same approach as with Mergesort: develop the algorithm using tests.

This section of the algorithm course is where I started to get a bit uneasy since I really don’t have a great handle on manipulating graphs in code. On paper, I can easily ‘visualize’ the minimum cut of a (reasonably sized) graph. Ask me to do it in code, well, that’s a bit harder.

This was a great exercise for me to get more familiar with handling graphs in code. Once I learned that there are algorithm graphs, I started to see graphs in apps I most frequently use like: social networks and maps. As this is my second time doing the course, I am doing it with testing in mind.

Minimum Cut & Karger’s algorithm

The minimum cut of a graph is ‘cut’ which separates the graph into two bodies. The number of edges going over that separation is the number of cuts.

Karger’s minimum cut takes two connected verticies in a graph and contracts them until there are only two verticies left. The number of edges between those two verticies are the minimum cuts of a graph.

How

The second time around, I definitely thought about graphs more than about Karger’s minimum cut algorithm. Graph representation can be done either by adjacency lists or adjacency matrix. Neither are a data structure supported directly in Ruby (unlike arrays for sorting.)

Graph representation

I looked for ways to represent the graph, what’s an efficient way to store and track of edges and their connected verticies? I tried a few, like multiple arrays. At the end, I used Ruby’s hash as a single unit to represent the graph. The verticies of a graph are represented by the hash’s keys, and the graph’s edges are stored as a key’s entry as an array.

graph = { vertex1 => [ vertex2, vertex3, ... ],
          vertex2 => [ vertex3 ],
          vertex3 => [ vertex1 ] }

To represent a basic star graph:

graph = { 1 => [2,3],
          2 => [1],
          3 => [1] }

To get edges of vertex 1 are:

graph[1] # => [2,3]

Is vertex 1 connected to vertex 3?

graph[1].include?(3) # => true

Is vertex 2 connected to vertex 3?

graph[2].include?(3) # => false

Cool, now that the graph is stored in a data structure I’m comfortable using, what’s the key part of the algorithm to start testing?

Key part of algorithm

What is the key part of Karger’s minimum cut algorithm? With Mergesort, the key part of the algorithm was merge. With Karger’s minimum cut, the key part is: contract!

Contract is where one connecting vertex is merged with another and all their connecting edges are updated to be connected to the merged vertex.

I will start with the contract method since Karger’s algorithm uses it a lot.

Testing contract

For a certain input graph, I want that graph to be contracted. What does that mean? It means a new vertex is created by putting two verticies together and any other verticies connected to the contracted verticies are now connected to the new vertex. Simple, right?

In code, that means contracting verticies 1 & 2 together of a graph would produce the following input and output:

1
2
3
4
5
6
7
8
input = { 1 => [2,3],
          2 => [1,3],
          3 => [1,2] }

output = { "1-2" => [3,3],
            3 => ["1-2", "1-2"] }

puts contract(input,1,2) == output

If verticies 1 and 3 are contracted in the same graph, the output would be:

1
2
3
output = { "1-3" => [2,2],
            2 => ["1-3", "1-3"] }
puts contract(input,1,3) == output

and if verticies 2 and 3 are contracted:

1
2
3
output = { 1 => ["2-3", "2-3"],
          "2-3" => [1,1] }
puts contract(input,2,3) == output

Unlike testing Mergesort, these test cases have to be written out by hand. Writing these out by hand really made me think through what is my input and what is my output. If I can’t write out a small case by hand, how can I expect my program to handle a huge case in the course homework?!

Now that contract can handle small cases, let’s start with a slightly larger graph and make sure contract is doing what I expect:

1
2
3
4
5
6
7
8
9
10
input = { 1 => [2,4],
          2 => [1,3,4],
          3 => [2,4],
          4 => [1,2,3] }

output = { "1-2" => [4,3,4],
            3 => ["1-2",4],
            4 => ["1-2","1-2",3] }

puts contract(input, 1, 2) == output            

Larger test cases

Now that I’m starting to get bigger and bigger test cases, thinking of these test cases on my own is tricky… where else can I get some test cases?

I found the coursera discussion forum to be a fantastic place for test cases. The problem is that the test cases are written as:

1 2 3
2 1 3
3 1 2

which represents:

1
{ 1 => [2,3], 2 => [1,3], 3 => [1,2] }

I can just import these cases manually, or I can write a method to import them, and I can just copy paste cases from the forum and use them as I need.

Let’s use the above case:

1
2
3
4
5
6
7
input = "1 2 3
2 1 3
3 1 2"

output = { 1 => [2,3], 2 => [1,3], 3 => [1,2] }

puts make_graph(input) == output

Now that I have more and more test cases, let’s start working on the other part of the algorithm.

Randomly select vertex

The rest of the algorithm seems pretty easy: select random vertex and a vertex connected to that vertex and contract. How do I test this…?

This part I wanted to control the randomization process by selecting certain verticies. This calls for more tests! Start by testing when the first vertex are contracted with its first edge.

1
2
3
4
5
6
7
8
9
10
11
12
13
input = { 1 => [2,4],
          2 => [1,3,4],
          3 => [2,4],
          4 => [1,2,3] }

output = { "1-2" => [4,3,4],
            3 => ["1-2",4],
            4 => ["1-2","1-2",3] }

vertex1 = input.keys.first
vertex2 = input[vertex1].first

puts contract(input, vertex1, vertex2) == output

All I am doing here is building out the algorithm towards Karger’s minimum cut algorithm by contracting the first vertex, and a vertex connected to it.

Now, let’s put two contractions together, by contracting the first vertex and its connected vertex of the connected graph (i.e. perform two contractions in a row.)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
input = { 1 => [2,4],
          2 => [1,3,4],
          3 => [2,4],
          4 => [1,2,3] }

output1 = { "1-2" => [4,3,4],
             3 => ["1-2",4],
             4 => ["1-2","1-2",3] }

vertex1 = input.keys.first
vertex2 = input[vertex1].first

puts contract(input, vertex1, vertex2) == output1

output2 = { "1-2-4" => [3],
             3 => ["1-2-4"] }

vertex3 = input.keys.first
vertex4 = input[vertex3].first

puts contract(input, vertex3, vertex4) == output2

Building main contract loop

This is getting a bit messy but this test is starting to be a good candidate for the main contract loop in Karger’s algorithm. Now that these tests pass, I will replace these intermediate items and build it out as a loop.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
input = { 1 => [2,4],
          2 => [1,3,4],
          3 => [2,4],
          4 => [1,2,3] }

output2 = { "1-2-4" => [3],
             3 => ["1-2-4"] }
            
while input.keys.count > 2
  vertex1 = input.keys.first
  vertex2 = input[vertex1].first
  input = contract(input, vertex1, vertex2)
end

puts input == output2

Random vertex selection

This is starting to look like exactly the structure of the code of the algorithm. The only difference is the first vertex is being selected on each run. I can easily fix that by replacing .first with: .sample. Now my code is:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
input = { 1 => [2,4],
          2 => [1,3,4],
          3 => [2,4],
          4 => [1,2,3] }

output2 = { "1-2-4" => [3],
            3 => ["1-2-4"] }

while input.keys.count > 2
  vertex1 = input.keys.sample
  vertex2 = input[vertex1].sample
  input = contract(input, vertex1, vertex2)
end

puts input == output2

This test may start failing now since output2 might match the contracted graph, it may not, since which verticies to be contracted will be selected at random.

Interpreting the output

Now I can start rolling out the output of each by hand because there are only a few, but there’s still more of the algorithm left to work on. Karger’s minimum cut algorithm is about the minimum cut. This graph will always have a cut of 1:

output2 = { "1-2-4" => [3],
             3 => ["1-2-4"] }

Even though there are two verticies, the number of verticies connected are one.

When I have a two cut graph output, it looks like:

{ 3 => ["1-4-2",
        "1-4-2"],
  "1-4-2" => [3, 3]}

and a three cut graph output:

{"12-9-10-7-6-8-11" => ["4-3-1-2-5",
                        "4-3-1-2-5",
                        "4-3-1-2-5"],
 "4-3-1-2-5" => ["12-9-10-7-6-8-11",
                 "12-9-10-7-6-8-11",
                 "12-9-10-7-6-8-11"]}

So, the minimum cut is represented by the number of entries in the array of the verticies. This is a bit confusing, but when I think about it, each entry is a different edge that is connected to the vertex, and those represent the minimum cut.

Making and Testing MinCut method

I start getting bigger graphs with different cuts as test input. The algorithm requires the contraction to happen randomly and only output the minimum. So, my tests now take the shape:

1
2
3
4
5
6
7
two_cut_graph_input = "1 2 4
2 1 3 4
3 2 4
4 1 2 3"

two_cut_graph = process_list(two_cut_graph_input)
puts mincut(two_cut_graph) == 2

It’s a bit of a jump in terms of tests, but the method mincut is basically wrapping the contract engine with some bookkeeping to keep track of the minimum cut of a graph being done several hundred or thousand times (depending on size of the graph and how lucky it is at finding the minimum).

Now that tests are just testing for a number, any issues in not being able to achieve the correct output makes testing very hard. Some tricks I used:

  • outputting the minimum graph with the minimum cut count more variety of tests
  • for different cuts (I only had minimum cuts of two which I made assumptions around)

What

A set of tests and a strategy to implement Karger’s minimum cut algorithm has been discussed.

For me, tests really helped focus on what methods to start with implementing. If I couldn’t figure out its output, I would move onto something which I could tests, in this case, it would be the contraction algorithm, which Karger’s minimum cut algorithm used repeatedly.

Karger’s minimum cut algorithm is a Monte Carlo algorithm, it requires randomization, and ultimately the algorithm’s output is a single number. This made tests simple to write, but any bugs required good old puts debugging.

Ideally, the tests would have solidified everything needed, but I used it to figure out when things didn’t happen as expected, especially when I started introducing random vertex contractions.

From working with Karger’s minimum cut algorithm using tests, I found it to be a great way of working with graphs in code.