Monday, March 25, 2013

SPOJ PRHYME


SPOJ Problem Set (Classical) 2737. Perfect Rhyme

A perfect rhyme is not a crime,
it is something that exceeds time,
a bit of science, a piece of art,
soft as a pillow, sharp as a dart.

I really love this little rhyme.

Basically, the problem is this: you are given a dictionary of words and some query words. For each query word q, you have to find a dictionary word u such that u != q and the common suffix of q and u is as long as possible. In case of a tie, the problem requires the lexicographically smallest such word.

This problem can be solved using STL maps, storing the suffixes along with their sorted ids, but there is a more elegant solution using the Trie data structure, i.e. a prefix tree. As we are interested in maximizing the common suffix length, we store the strings in the Trie in reversed form, so that suffixes become prefixes in this tree. For each dictionary word, we also store its index in sorted order as the termination marker for that word, so that we can find the id of a word easily during a query. On each node we keep two additional pieces of information: the ids of the lexicographically smallest and second smallest strings passing through that node, which we call min1 and min2. Initially both hold an infinite value.

Now, for each query word, we search for the word in reversed form as well. If it is not found in the tree (a path may exist, but the end marker is not present), the task is simple: we return the lexicographically smallest id, min1, of the current node, i.e. the last node we reached while trying to match the query string.

But extra care is needed when the query string is found in the tree, because then we have to look for another candidate, and this is why we keep min2, the second lexicographically smallest index. While walking down, if moving from the current node cur to the next node x would leave only the query word itself (no other word passes through x), we can't follow that path. Instead, we answer from cur: if its min1 refers to the query word itself, we return min2, otherwise we return min1. And if we do end up at the exact matching node, then at least one other word follows the same path without ending there, i.e. at least two different words pass through it, so again we select min1 or min2 from that node depending on the query word.

So, if you know how to code a Trie, it is not really a hard one, but indeed a tricky one.
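
The idea above can be sketched as follows. This is only a minimal sketch, not the code I submitted: the names RhymeTrie, touch, slot and query are made up for illustration, words are assumed to contain only lowercase letters a-z, and instead of deciding on the way down it records the path and scans it from the deepest node upward, which gives the same answer.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <climits>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type: a trie over reversed words with min1/min2 per node.
struct RhymeTrie {
    struct Node {
        std::array<int, 26> next;  // child indices, -1 when absent
        int end = -1;              // sorted id of the word ending here, or -1
        int min1 = INT_MAX;        // smallest sorted id passing through this node
        int min2 = INT_MAX;        // second smallest sorted id passing through
        Node() { next.fill(-1); }
    };

    std::vector<std::string> words;  // dictionary in sorted order, id = position
    std::vector<Node> nodes;         // nodes[0] is the root

    explicit RhymeTrie(std::vector<std::string> dict) : words(std::move(dict)) {
        std::sort(words.begin(), words.end());
        words.erase(std::unique(words.begin(), words.end()), words.end());
        nodes.emplace_back();
        for (int id = 0; id < (int)words.size(); ++id) insert(id);
    }

    // Assumption: only lowercase letters occur.
    static int slot(char ch) {
        assert(ch >= 'a' && ch <= 'z');
        return ch - 'a';
    }

    // Record that word `id` passes through `node`.
    void touch(int node, int id) {
        Node& n = nodes[node];
        if (id < n.min1) { n.min2 = n.min1; n.min1 = id; }
        else if (id != n.min1 && id < n.min2) { n.min2 = id; }
    }

    // Insert the word with sorted id `id` in reversed form.
    void insert(int id) {
        int cur = 0;
        touch(cur, id);
        const std::string& w = words[id];
        for (auto it = w.rbegin(); it != w.rend(); ++it) {
            int c = slot(*it);
            if (nodes[cur].next[c] == -1) {
                int created = (int)nodes.size();
                nodes.emplace_back();
                nodes[cur].next[c] = created;
            }
            cur = nodes[cur].next[c];
            touch(cur, id);
        }
        nodes[cur].end = id;
    }

    // Best rhyme for q, or "" if the dictionary holds no word other than q.
    std::string query(const std::string& q) const {
        std::vector<int> path{0};
        int cur = 0;
        bool full = true;
        for (auto it = q.rbegin(); it != q.rend(); ++it) {
            int nxt = nodes[cur].next[slot(*it)];
            if (nxt == -1) { full = false; break; }
            cur = nxt;
            path.push_back(cur);
        }
        // Sorted id of q itself when q is a dictionary word, otherwise -1.
        int self = full ? nodes[cur].end : -1;
        // The deepest node that still offers a word different from q wins.
        for (auto it = path.rbegin(); it != path.rend(); ++it) {
            const Node& n = nodes[*it];
            int best = (n.min1 != self) ? n.min1 : n.min2;
            if (best != INT_MAX) return words[best];
        }
        return "";
    }
};
```

For example, with the dictionary {perfect, rhyme, crime, time}, querying "crime" gives "time" (common suffix "ime"), while querying "perfect" shares no suffix with anything and falls back to the smallest other word, "crime".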


Sunday, March 24, 2013

SPOJ VONNY


SPOJ Problem Set (Classical) 224. Vonny and her dominos

This is exactly the same problem as DominoesFinding from the 2006 TopCoder Collegiate Challenge. I am not sure whether it can be solved with a bipartite matching algorithm or dynamic programming; probably both would exceed the time limit. But it can be solved by straightforward backtracking with a little bit of pruning. The backtracking idea is pretty simple: keep track of which tiles are used (each tile must be used exactly once), and try filling the grid in row-major fashion. So rather than discussing the solution itself, it is better to discuss why backtracking can be used here. I am not going to write this in my own words as it has already been written, so I will repost the analysis from TopCoder:

DominoesFinding

by soul-net

Backtracking. Yes, that's it. Knowing that a problem is in fact solvable with a backtracking approach is most times a matter of intuition gained with experience. Anyway, in this and some other cases, there can be found more formal estimators that the idea is in fact THE idea.

I'll describe a possible backtracking approach, possibly the easiest to implement, but there are other possibilities. The idea is based on the fact that all squares must be used. For example, if we take the upper-left square of the board, we can see that we must connect it with one of its two neighbors. With this in mind, we can iterate over all squares and, each time we find an unused one, we know that we must match it with one of its two (or one) remaining neighbors -- or both, if we iterate in a column-row or a row-column fashion; when we find an unused square, we know that everybody in its upper-left rectangle is already used.

As we do this, we go marking each used piece and only continue trying if the new piece made by each new matching is "new". In this way, if we finally get all squares to be used, we know also that all pieces are used (because we managed to get no repeats) and then, we add 1 to the counter.

To be sure this approach works perfectly in time, you can conduct a little experiment and run the algorithm over an empty board without the "new piece" pruning. This will show you that there are less than 1.3 million ways to divide the board (1,292,697 actually), so it is perfectly feasible to try every one of them. Of course, the pruning of the "new piece" will reduce the running time dramatically in most cases.

There is also a good theoretical estimator that the approach will work in time, to convince ourselves before programming anything (many programmers think this is a must). There is a total of 56 squares in the board, our algorithm does nothing for half of them (when it finds them already used) and tries 2 or fewer cases for the other half (the ones it finds unused). This means the total number of leaves in the search tree will be bounded by 2^(56/2) = 2^28, which is roughly 268 million. This is pretty big, but considering it is a wide margin upper bound, it can be pretty well used as a "proof" that time limits won't bother.

View original analysis page from TopCoder.
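
The backtracking described in the analysis can be sketched roughly like this. It is only an illustrative sketch, not the TopCoder reference solution: the names Board, TilingCounter and countTilings are made up, the board is assumed to be 7 x 8 with pip values 0 to 6, and the distinctOnly flag is my own addition so that the same code can also run the "no pruning" experiment on an empty board.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

constexpr int ROWS = 7, COLS = 8, PIPS = 7;
using Board = std::array<std::array<int, COLS>, ROWS>;

// Hypothetical helper: row-major backtracking over the cells of the board.
struct TilingCounter {
    Board board;
    bool distinctOnly;                // true: each domino type at most once
    bool usedCell[ROWS][COLS] = {};
    bool usedTile[PIPS][PIPS] = {};   // indexed as [smaller pip][larger pip]

    TilingCounter(const Board& b, bool distinct) : board(b), distinctOnly(distinct) {}

    long long search(int pos) {
        // Skip cells that an earlier domino already covers.
        while (pos < ROWS * COLS && usedCell[pos / COLS][pos % COLS]) ++pos;
        if (pos == ROWS * COLS) return 1;  // every cell is covered
        int r = pos / COLS, c = pos % COLS;
        long long ways = 0;
        // The first free cell can only pair with its right or lower neighbor.
        if (c + 1 < COLS && !usedCell[r][c + 1]) ways += tryPlace(r, c, r, c + 1, pos);
        if (r + 1 < ROWS && !usedCell[r + 1][c]) ways += tryPlace(r, c, r + 1, c, pos);
        return ways;
    }

    long long tryPlace(int r1, int c1, int r2, int c2, int pos) {
        int a = std::min(board[r1][c1], board[r2][c2]);
        int b = std::max(board[r1][c1], board[r2][c2]);
        assert(0 <= a && b < PIPS);
        // The "new piece" pruning: reject a domino type that is already used.
        if (distinctOnly && usedTile[a][b]) return 0;
        bool tileWasUsed = usedTile[a][b];
        usedTile[a][b] = usedCell[r1][c1] = usedCell[r2][c2] = true;
        long long ways = search(pos + 1);
        usedTile[a][b] = tileWasUsed;      // undo the move
        usedCell[r1][c1] = usedCell[r2][c2] = false;
        return ways;
    }
};

long long countTilings(const Board& board, bool distinctOnly) {
    TilingCounter counter(board, distinctOnly);
    return counter.search(0);
}
```

Calling countTilings on a board of all zeros with distinctOnly set to false reproduces the experiment from the analysis and counts every way to divide the board (1,292,697), while distinctOnly set to true counts only the divisions that use each domino exactly once, which is what the problem asks for.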


Practice Week #1



Saturday, March 16, 2013

SPOJ ARRAYSUB


SPOJ Problem Set (Classical) 10582. Subarrays

This was also one of the problems I was asked in my recent interview with Google. It is a simple problem for a segment tree based algorithm. The solution basically requires us to maintain a Range Maximum Query (RMQ) structure, and I implemented this using a segment tree.

Given N items and a window of size K, we have to find the maximum item in each K-sized window. First, we insert the first K items into the RMQ, so the segment tree root now knows the maximum of the first window. Now observe that for the (K+1)th item, the 1st item must be removed from the tree and the (K+1)th item inserted. Both can be done at once by inserting the (K+1)th item at the position of the 1st item, because at that moment the 1st item occupies the oldest position in the window. In the same way, each later item is simply inserted at the currently oldest position of the K-sized window: item (K+1) goes to index 1, (K+2) to 2, (K+3) to 3, ..., (2K) to K, (2K+1) to 1, (2K+2) to 2, and so on. So all we need is to keep inserting the items into the RMQ in a circular fashion, reading the updated range maximum from the root after each insertion.

So, basically the structure of the code is:

for(i = 0; i < k; i++) insert(root, 0, k-1, i, item[i]);
output Tree[root];
for(; i < n; i++) {
    insert(root, 0, k-1, i % k, item[i]);
    output Tree[root];
}
// considering item indexes to be 0 based in code
// insert(node, lo, hi, pos, value) stores value at index pos of the RMQ tree
// covering [lo, hi] and updates the maxima on the path back to the root
So, once you write down the RMQ function insert, you are pretty much done with it. Happy coding!
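
For reference, the whole thing can be sketched like this. It is a minimal sketch rather than my submitted code: MaxTree and windowMaxima are illustrative names, the root is assumed to live at index 1 of the tree array, and input/output handling is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical max segment tree over k slots, root stored at index 1.
struct MaxTree {
    int size;
    std::vector<int> tree;

    explicit MaxTree(int n) : size(n), tree(4 * n, INT_MIN) {}

    // Store value at slot pos inside [lo, hi], then refresh maxima up to the root.
    void insert(int node, int lo, int hi, int pos, int value) {
        if (lo == hi) { tree[node] = value; return; }
        int mid = lo + (hi - lo) / 2;
        if (pos <= mid) insert(2 * node, lo, mid, pos, value);
        else insert(2 * node + 1, mid + 1, hi, pos, value);
        tree[node] = std::max(tree[2 * node], tree[2 * node + 1]);
    }

    void insert(int pos, int value) { insert(1, 0, size - 1, pos, value); }

    int maximum() const { return tree[1]; }
};

// Maximum of every window of k consecutive items, left to right.
std::vector<int> windowMaxima(const std::vector<int>& item, int k) {
    int n = (int)item.size();
    assert(1 <= k && k <= n);
    MaxTree rmq(k);
    std::vector<int> result;
    for (int i = 0; i < n; ++i) {
        rmq.insert(i % k, item[i]);  // overwrite the oldest slot of the window
        if (i >= k - 1) result.push_back(rmq.maximum());
    }
    return result;
}
```

For the items 1 2 3 1 4 5 2 3 6 with k = 3, windowMaxima returns 3 3 4 5 5 5 6. Each item costs one O(log k) update, so the whole run is O(n log k).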