5

I was recently given a quiz in one of my classes. The question is below:

Write a function (called cmp) in C that accepts two integers (x and y) and returns: -1 if x < y, 0 if x = y, 1 if x > y. Write cmp as concise as possible.

The most concise function that I could think of was:

int cmp(int x, int y) {
    return ((x < y) ? (-1) : ((x == y) ? (0) : (1)));
}

But I have a feeling there could be a bit manipulation that I could use to do this more concisely. Perhaps a combination of & and ^ ? This has been bothering me for the last few days and I wanted to know if there in fact IS a better way to do this?

5
  • You don't need that many parenthesis.
    – Dai
    Commented Sep 9, 2017 at 23:56
  • Be careful with assuming you can use bitwise operations - IIRC the C language does not specify implementation details of signed integers: they could use two's complement, one's completment, a sign bit, or some other exotic format. I believe the only guaranteed way to correctly compare signed integers is the built-in operators <, >, ==, !=, <=, and >=.
    – Dai
    Commented Sep 9, 2017 at 23:57
  • @Dai: You are right. I just wanted to be careful with my code, I guess. The lecture before the quiz had many mentions of bit manipulations. Because of that, I had a hunch that there might be some bit manipulation concepts being tested.
    – an4s
    Commented Sep 10, 2017 at 3:54
  • I'm voting to close this question as off-topic because questions which actually promote unreadable code are bad for you :-) The better way you speak of is to write your code as if your grandmother will need to understand it, and let the compiler take care of optimisation - it's a heck of a lot better at it than most developers.
    – paxdiablo
    Commented Sep 10, 2017 at 6:50
  • internally, the compiler does a subtraction and sets CPU flags. You could subtract, return 0 if 0, or shift high bit down to lowest bit and return that. Is it faster? No, probably not. but you'd have to compare the assembler output of each to be certain. Honestly, your way with 2 compares is not bad at all. It just depends on the cost of compare/jumps and that varies per CPU. Commented Sep 12, 2017 at 22:45

3 Answers 3

14

“as concise as possible” is an extremely vague requirement for a quiz. Are you expected to do code golf? Does removing whitespace and parentheses make it more concise? Anyway, here’s one solution using arithmetic on the results of comparisons:

int cmp(int x, int y) {
    return (x > y) - (x < y);
}
7
  • Indeed, is the intent to be concise in C or write code that is intended to be concise (or fast?) in the underlying machine language?
    – BeeOnRope
    Commented Sep 10, 2017 at 0:02
  • C99 introduced the bool type - but do the comparison operators still result in int expressions, or are they bool now? Is the operation of subtracting bool values well-defined?
    – Dai
    Commented Sep 10, 2017 at 0:02
  • 1
    @Dai: Comparison operators still result in int expressions in C99, but even if they didn’t, the conversion from bool to int is still well-defined.
    – Ry-
    Commented Sep 10, 2017 at 0:05
  • @Dai bool values can have only two values and we need three. for < == >. So bool is not enough here Commented Sep 10, 2017 at 0:18
  • 2
    x - y can overflow. Particularly if you're studying cybersecurity, you should avoid shortcuts with unpredictable, undefined or incorrect results on some inputs.
    – rici
    Commented Sep 10, 2017 at 4:58
9
  • If x == y then x - y == 0.
  • If x < y then x - y < 0
  • If x > y then x - y > 0

So we want to see if we can convert the 3 conditions described in the 3 bullet-points above into 3 single output values needed for your cmp function:

int cmp( int x, int y ) {
    return -1 if x < y
    return  0 if x == y
    return  1 if x > y
}

This can be redefined as:

int cmp( int x, int y ) return singleValue( x - y );

int singleValue( int diff ) {
    return -1 if diff < 0
    return  0 if diff == 0
    return  1 if diff > 0
}

Now consider (and assuming) the computer uses two's complement for 32-bit signed integers, aka int) then all negative values will have the most-significant bit (MSB, the 0th bit) set to 1.

For 32-bit integers, this means that the following expression is true for all negative numbers:

( anyNegativeNumber & 0x8000000 ) == 0x8000000

The inverse is true: all positive non-zero integers will have an MSB of 0. Finally, all zero values (int zero == 0) have all their bits set to 0.

( anyPositiveNumber & 0x8000000 ) == 0

If we look at the MSB (first-bit) in addition to checking if any other bits are 1, with the desired output from our singleValue function described above:

value | first bit | any other bits | desired output
    0 |         0 |              0 |           0b ( 0)
  122 |         0 |              1 |           1b ( 1)
 -128 |         1 |              0 | 11111...111b (-1)
 -123 |         1 |              1 | 11111...111b (-1)

We can create 0 and 1 directly from the input value by masking the bits, but -1 is a special case but we can handle that, like so:

int diff = x - y; // so only the 1st and last bits are set

If the 1st bit of the diff is set, then return -1. If diff value is 0 then return 0 Else return 1

return ( diff & 0x80000000 == 0x80000000 ) ? 0xFFFFFFFF : ( diff != 0 );

This can be compacted:

int diff;
return ( ( ( diff = x - y ) & 0x80000000 ) == 0x80000000 ) ? 0xFFFFFFFF : ( diff != 0 );

This is still using the == and != operators though... which can be eliminated by taking advantage of the fact a single-bit (nth place) value can be shifted n-bits right to convert it to a boolean value:

( diff = x - y ) >> 31 // evaluates to 1 if x < y, 0 if x == y or x > y

The diff != 0 bit can be eliminated by taking advantage of the fact that !!a is 1 for all non-zero values and 0 for zero values:

!diff  // evaluates to 1 if diff == 0, else 0
!!diff // evaluates to 0 if diff == 0, else 1

We can combine the two:

int diff;
return ( ( diff = x - y ) >> 31 ) ? -1 : !!diff;

This operation has a branch in it (?:) and a temporary variable (diff) but there is a branch-less version of the same function.

It can be seen that the three possible outputs are:

0xFFFFFFFF == 1111_1111 1111_1111 1111_1111 1111_1111 b
0x00000000 == 0000_0000 0000_0000 0000_0000 0000_0000 b
0x00000001 == 0000_0000 0000_0000 0000_0000 0000_0001 b

The >> operator has "sign extension" for signed values, which means:

1000 b >> 2 == 1110 b
0100 b >> 2 == 0001 b

So diff >> 31 will be 1111..1111 if the 0th bit is 1, otherwise is 0000..0000.

Each individual bit's value can be expressed as a function of diff:

a = ( diff >> 31 ) // due to sign-extension, `a` will only ever be either 1111..1111 or 0000..0000
b = !!diff         // `b` will only ever 1 or 0
c = a | b          // bitwise OR means that `1111..1111 | 0` is still `1111..1111` but `0000..0000 | 1` will be `0000..0001`.

Or just:

c = ( diff >> 31 ) | ( !!diff );

Substituting this into the expression above:

int diff = x - y;
return ( diff >> 31 ) | !!diff;

Or

int diff;
return diff = x - y, ( diff >> 31 ) | !!diff;

The comma operator must be used because C does not specify nor guarantee the evaluation order of binary operator operand expressions, but the evaluation order of the comma operator is.

As this is an inline function, and assuming we're okay with mutable arguments, then we can eliminate diff because we only use x or y once:

return x = x - y, ( x >> 31 ) | !!x;

Here's my test program and the output I get, using GCC:

#include <stdio.h>

int cmp(int x, int y) {

    return x = x - y, ( x >> 31 ) | !!x;
}

int main() {

    printf( "cmp( 1, 2 ) == %d\n", cmp( 1,2 ) );
    printf( "cmp( 2, 2 ) == %d\n", cmp( 2,2 ) );
    printf( "cmp( 2, 1 ) == %d\n", cmp( 2,1 ) );
}

Output:

cmp( 1, 2 ) == -1
cmp( 2, 2 ) == 0
cmp( 2, 1 ) == 1

Now this is not perfect because of problems of integer overflow if x and y are both large numbers and x is negative, e.g. (-4000000000) - (4000000000). Checking for this condition is possible but defeats the point of making the code as succinct as possible - you would also need to add code that handles the error condition too. In this situation, a better way would be to simply check the user-provided inputs instead of checking the function argument values.

TL;DR:

int cmp(int x, int y) {

    return x = x - y, ( x >> 31 ) | !!x;
}
2
  • 1
    I love how detailed the answer is. Thanks for explaining each step.
    – an4s
    Commented Sep 10, 2017 at 4:00
  • 1
    ideone.com/MKMWXk (subtracting a big negative number from a big positive number induces integer overflow). Assuming gcc-like behaviour, that will result in a negative number, so the assertion that x > y ==> x - y > 0 is not always correct.
    – rici
    Commented Sep 10, 2017 at 5:05
1

Can I suggest this:

int cmp(int x, int y) {
    return (x < y) ? -1 : (x > y);
}

The x > y comparison is 1 when x is larger and 0 when it's not. Ryan's answer is shorter, but I think this version still clearly shows what the code is meant to do.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.