Running with Code Like with scissors, only more dangerous

3Apr/140

The technical interview, part 1: Palindrome

Last time, I talked about what I expect of an interview candidate as a technical job interviewer. This time, we’re going to go through an initial question.

Interviewer: Write a function/method/subroutine to determine whether a particular string is a palindrome, i.e., is the same text when reversed as it is forward.

This is a common question for an interview because it allows you to demonstrate basic problem-solving skills and you can also show your interviewer that you know how to dive into the requirements. It requires that you can implement a simple algorithm, have an understanding of how strings work in your particular language, and can work through optimizations of your algorithm. Because the problem itself is so simple, you get a chance to dazzle your interviewer with good programming practices.

Understanding the problem

A palindrome is a string like kayak which you can reverse and end up with the same string. This gives you the naïve implementation:

    IsPalindrome(test)
        let reversed = ReverseString(test)
        return reversed equals test
    end

Most languages will have a baked-in function to reverse a string. So you could do that, but could you do it better?

Interviewer: Tell me about the performance characteristics of that baseline implementation.

You’ve allocated a new string, and depending on that particular implementation, it could have been costly. But you’ve got at least an O(n)* memory usage. String equality testing is going to be character-by-character, so that’s going to be O(n) execution. So, if you’re testing a string that has 2 million characters, the naïve algorithm allocates an additional 2mb string and does (in theory) up to 2 million comparisons.

How can we optimize? Well, let’s start with understanding the basic problem again. Let’s consider kayak:

k a y a k
0 1 2 3 4

Given this string layout, what would we need to do to go faster?

  1. Compare 0 to 4. Match, continue.
  2. Compare 1 to 3. Match, continue.
  3. Only one uncompared character is remaining, therefore return true.

This can be generalized to an algorithm:

  1. Initialize start and end indices to 0 and length - 1 (or 1 and length for 1-based index programming languages).
  2. Compare characters at the start and end indices. If they don’t match, return false (fast exit).
  3. Increment start and decrement end.
  4. If the end index is less than or equal to start, return true. Otherwise, loop back to the step 2.

We now know enough to author a simple method to do this:

function isPalindrome(stringToTest) {
    var start = 0, end = stringToTest.length - 1;
    for (/* already initialized */ ; start < end; start++, end--) {
        if (stringToTest[start] !== stringToTest[end])
            return false;
    }
    // got to the end with no mismatches, implies success
    return true;
}
class Utilities 
{
    public static bool IsPalindrome(string toTest) 
    {
        int start = 0, end = toTest.Length;
        for (/* already initialized */ ; start < end; start++, end--)
        {
            if (toTest[start] != toTest[end])
                return false;
        }
        // got to the end with no mismatches, implies success
        return true;
    }
}
bool isPalindrome(const wchar_t* toTest const) {
    auto len = wcslen(toTest);
    wchar_t* start = const_cast<wchar_t*>(toTest);
    wchar_t* end = start + len - 1;
    for (/* already initialized */ ; start < end; start++, end--) {
        if (*start != *end)
            return false;
    }
    return true;
}

This is a fairly optimal solution.

Interviewer: What about error cases? What kind of error cases exist, and how can you work around them?

In JavaScript, because we don’t do compile-time type validation, it’s possible for types to mismatch. A caller can pass in a number, an object, undefined, no value, null, a Boolean, etc. Each of these has different odd behaviors; for an Array, for example, the behavior will effectively match how things work with strings. But, for most other values, because there isn’t any string coercion, and math is performed directly against the value’s length property (which will generally result in NaN), and therefore that will return true. isPalindrome(true) definitely should return false, but it doesn’t!

There are a couple of ways to guard against this. You can test the input parameter for its type and fail with a TypeError. You could combine this by attaching the method to the String prototype, which would enable the function to be called against any String object as an instance method.

In C#, if you pass in a null reference, the method will fail with a NullReferenceException on the first dereference of toTest.Length. This is acceptable but likely not ideal; instead, it would be preferable to fail with an ArgumentNullException. You can also show nice API design intuition by making it into an extension method.

If you’re authoring in C++, a great question to ask would be the kind of error environment you need to deal with. In COM, for example, you don’t want to use structured exception handling, or at least allow SEH exceptions to cross the ABI boundary. Depending on your environment, you may be able to use SEH, but be prepared to author a COM version of the API. Of course, COM also advises a different type system. COM strings tend to be BSTRs, where a null value is equivalent to a zero-length string. (WinRT uses HSTRINGs, which behave similarly to BSTRs).

String.prototype.isPalindrome = function() { // argument removed, it is "this"
    if (typeof this !== 'string' && !(this instanceof String)) // In case of call, apply, or bind
        throw new TypeError('Can only call against a string.');

    var start = 0, end = this.length - 1;
    for (/* already initialized */ ; start < end; start++, end--) {
        if (this[start] !== this[end])
            return false;
    }
    // got to the end with no mismatches, implies success
    return true;
}
static class StringExtensions
{
    public static bool IsPalindrome(this string toTest) 
    {
        if (toTest == null)
            throw new ArgumentNullException("toTest");

        int start = 0, end = toTest.Length;
        for (/* already initialized */ ; start < end; start++, end--)
        {
            if (toTest[start] != toTest[end])
                return false;
        }
        // got to the end with no mismatches, implies success
        return true;
    }
}
// This sample is using WRL and Windows Runtime (because it's the C++ COM library that I know)
// IDL file:
namespace MyApi 
{
    [uuid(...)]
    [version(...)]
    interface IPalindromeTest 
    {
        HRESULT IsPalindrome([in] HSTRING toTest, [out, retval] boolean* result);
    }
}

// header:
using namespace ABI::MyApi;
class PalindromeTest : public RuntimeClass<IPalindromeTest>
{
    InspectableClass(InterfaceName_MyApi_IPalindromeTest, BaseTrust)

public:
    PalindromeTest();
    virtual ~PalindromeTest();
    HRESULT RuntimeClassInitialize();

    IFACEMETHOD(IsPalindrome)(_In_ HSTRING toTest, _Out_ boolean* result);
}

// implementation:
// Other stuff - RuntimeClassInitialize, etc. - goes here
IFACEMETHODIMP PalindromeTest::IsPalindrome(_In_ HSTRING toTest, _Out_ boolean* result)
{
    IfNullReturnError(result, E_POINTER);
    HRESULT hr;
    size_t len;
    const wchar_t* strVal;

    strVal = WindowsGetStringRawBuffer(toTest, &len);
    wchar_t* start = const_cast<wchar_t*>(strVal);
    wchar_t* end = start + len - 1;
    for (/* already initialized */ ; start < end; start++, end--) {
        if (*start != *end) 
        {
            *result = false;
            return S_OK;
        }
    }
    *result = true;
    return S_OK;
}

Interviewer: This is a pretty good solution. What are some of the pitfalls or problems with it? What are some possible variations that you might do?

This is an open-ended question intended to see what you might consider for your implementation and also to gauge what you know and maybe don’t know. Possible answers include:

This implementation fails to work with Unicode characters above U+FFFF. JavaScript, modulo ES6, doesn’t support characters outside of the UCS-2 range. They can go back and forth to a host which does, but intrinsically, JavaScript didn’t understand the idea of surrogate pairs. C++’s wchar_t and C#’s string don’t directly support surrogate pairs because the underlying character type is 16-bit. In order to fix the code, you’d need to get the “real” length of the strings (factoring in the code points) and then going code-point-by-code-point instead of index-by-index.

I could implement this function recursively instead of iteratively. Generally, this implementation is better as an iterative function than a recursive function. But recognizing that it’s possible, because the algorithm is really about solving progressively simple sub-problems, is a good thing to do. In languages which support tail-call recursion, this can be as well-performing as the iterative implementation.

(function() {
    function isPalindrome(toCheck, startIndex, endIndex) {
        if (endIndex >= startIndex)
            return true;

        if (toCheck[startIndex] !== toCheck[endIndex])
            return false;

        return isPalindrome(toCheck, startIndex + 1, endIndex - 1);
    }

    String.prototype.isPalindrome = function() { 
        if (typeof this !== 'string' && !(this instanceof String)) 
            throw new TypeError('Can only call against a string.');

        return isPalindrome(this, 0, this.length - 1);
    }
})();

Interviewer: Great. How would you validate this implementation?

This gives you a few options for approaches. This problem is simple enough that you could intuit the sets of cases:

  • A null value
  • A zero-length string
  • An odd-length string
  • An even-length string

You might also walk through your code to identify all of the different code paths. Doing so will help you get to the same set of cases.

Summary

If you’ve talked through all of this, then chances are I already think you’re a pretty strong candidate. This isn’t a complete list of things I might ask you to talk about, but I’m hoping it’s taken 15-20 minutes, and we can talk about more next.

Up next: a deck of cards.


* O(n) notation is a mechanism by which we describe the worst-case performance characteristics of a system. For memory, it’s referring to the number of bytes allocated; for CPU utilization, it’s the number of times a particular loop operation must be executed. It’s generally described as the largest factor in the equation; O(n) is linear time, O(n2) is polynomial time, O(log n) is logarithmic time, etc. This is a simplification, but I’d suggest having a strong grasp of this concept.


Comments (0) Trackbacks (1)

Leave a comment

ERROR: si-captcha.php plugin says GD image support not detected in PHP!

Contact your web host and ask them why GD image support is not enabled for PHP.

ERROR: si-captcha.php plugin says imagepng function not detected in PHP!

Contact your web host and ask them why imagepng function is not enabled for PHP.