Finding Reversed Words

Following on from coding experiments with using the decorator pattern to reverse strings in Java and with palindromes and string reversal in C#, I decided to search for the words in the English language that are the reverse of other words. The Devil lived, but Black Sabbath gave us Live Evil. To beat Sega takes ages. You’re upping the ante living near Etna.

I had a copy of a dictionary file from Ubuntu on my hard drive. To make life a little simpler, I used some PowerShell magic to remove the many words that end in “‘s” (there are no words that start “s'”) and to make the list all lower case:

PS M:data> Get-Content .\british-english.csv | ?{ $_ -NotMatch "'s$" } | %{ $_.ToLower() } | Sort-Object | Get-Unique | Set-Content british-english-without-apostrophe-s.txt

The simplest algorithm for finding words that are the reverse of other words is to simply go through the whole list and then check whether any of the other words are the reverse of that word. To do this, we need a method to test if two strings are a reversed pair. Pushing the characters of the first string onto a stack and then popping them off as you check them against each character of the second string can achieve this:

public bool AreReversedPair(string firstString, string secondString)
{
    if (firstString.Length != secondString.Length)
    {
        return false;
    }

    var firstStringStack = new Stack();

    foreach (var c in firstString)
    {
        firstStringStack.Push(c);
    }

    foreach (var currentCharFromSecond in secondString)
    {
        var currentCharFromFirst = firstStringStack.Pop();

        if (currentCharFromFirst != currentCharFromSecond)
        {
            return false;
        }
    }

    return true;
}

Next, the code for finding the pairs:

public IEnumerable FindReversedWords(IQueryable allWords)
{
    var reversedWords = new LinkedList();
    var reversedStringChecker = new StackReversedStringChecker();

    foreach (var firstWord in allWords)
    {
        foreach (var secondWord in allWords)
        {
            if (reversedStringChecker.AreReversedPair(firstWord, secondWord))
            {
                reversedWords.AddLast(new ReversedWordPair(firstWord, secondWord));
            }
        }
    }

    return reversedWords;
}

This code is very straightforward, but not fast at all. The time complexity of the algorithm is O(n^2) because for each word, we need to check every other word. For a list with almost 73,000 words, this takes a long time. In spite of this, it does finish. I left the program running, went to dinner and found a list waiting for me when I came back.

reversed-word-pairs.txt

As the list of words is not going to change very often and I only need to run the algorithm once, I could just leave the program as it is. However, it would be nice to reduce the time complexity of the algorithm. The simplest way to do this would be to reverse each word and then check whether the list of all the words contains the reversed word (readers who are familiar with .Net may have noticed that I passed the list of words as an IQueryable rather than an IEnumerable for the list of words). Depending on the data structure, checking whether the reversed string is a member of the set of words should be less than O(n). For example, retrieval from a Trie takes O(m) where m is the length of the string. This would leave us with O(nm) time. As with all optimisations, you’ve got to time the code to see whether you get a boost or not.

My next task is to come up with a meaningful sentence with all words the reverse of the corresponding word at the other end of the sentence, with a palindrome at the centre. There must be something that can be made of Dennis, who sinned in the straw, only to get warts to stun his nuts.

See if you can make a mirror sentence of your own.

The complete code, along with some tests, can be found at:

https://github.com/robert-impey/CodingExperiments/tree/master/C-Sharp/FindReversedWords