Login

So, we all know the C/C++ signed/unsigned comparison rules where `-1 > 2u == true`, and I have a situation where I want to implement 'correct' comparisons efficiently.

My question is, which is more efficient with considerations to as many architectures as people are familiar with. Obviously Intel and ARM have higher weight.

Given:

int x;
unsigned int y;
if (x < y) {}

Is it better to promote:

x < y => (int64)x < (int64)y

or is it better to perform 2 comparisons, ie:

x < y => (x < 0 || x < y)

The former implies a zero-extend, sign-extend, and one comparison+branch, and latter requires no sign-extend operations, but 2 consecutive cmp+branches.
Traditional wisdom suggests that branches are more expensive than sign extends, which will both pipeline, but there is a stall between the extends and the single comparison in the first case, whereas in the second case I can imagine that some architectures might pipeline the 2 comparisons, but then followed by 2 conditional branches?

Another case exists, where the unsigned value is a smaller type than the signed type, which means it can be done with a single zero-extend to the signed type's length, and then a single comparison... in that case, is it preferable to use the extend+cmp version, or is the 2-comparison method still preferred?

Intel? ARM? Others?
I'm not sure if there's a right answer here, but I'd like to hear peoples take's. Low-level performance is hard to predict these days, especially on Intel and increasingly so on ARM.

Edit:

I should add, there is one obvious resolution, where the types are sized equal to the architecture int width; in that case it is obvious that the 2-comparison solution is preferred, since the promotion itself can't be performed efficiently. Clearly my `int` example meets this condition for 32-bit architectures, and you can transpose the thought experiment to `short` for the exercise applied to 32bit platforms.

Edit 2:

Sorry, I forgot the `u` in `-1 > 2u`! >_<

Edit 3:

I want to amend the situation to assume that the result of the comparison an actual branch, and the result is NOT returned as a boolean. This is how I'd prefer the structure look; although this does raise an interesting point that there is another set of permutations when the result is a bool vs a branch.

int g;
void fun(int x, unsigned in y) { if((long long)x < (long long)y) g = 10; }
void gun(int x, unsigned in y) { if(x < 0 || x < y) g = 10; }

This produces the intended branch typically implied when you encounter an `if` ;)

Well, you've correctly typified the situation: C/C++ have no way of doing a full signed int/unsigned int comparison with a single compare.

I would be surprised if promotion to int64 was faster than doing two comparisons. In my experience, compilers are quite good at realizing that a subexpression like that is pure (has no side effects) and thus that there's no need for a second branch. (You can also explicitly opt-out of short circuiting using bitwise-or: `(x < 0) | (x < y)`.) In contrast, my experience is that compilers tend NOT to do much special-case optimization on integers greater than the native word size, so `(int64)x < (int64)y` is quite likely to actually do a full int comparison.

Bottom line, there's no incantation which is guaranteed to produce the best possible machine code on any processor, but for the most common compilers on the most common processors, I would guess that the two-comparison form would be no slower than the promotion-to-int64 form.

EDIT: Some mucking about on Godbolt confirms that on ARM32, GCC puts way too much machinery into the int64 approach. VC does the same on x86. With x64, though, the int64 approach is actually one instruction shorter (since the promotion and 64-bit comparison are trivial). The pipelining might make the actual performance go either way, though.

[To see links please register here]

With a little template jiggery-pokery I think we can get the optimal result in all scenarios automatically:

#include<iostream>
#include<cassert>

template<class T> auto make_unsigned(T i) -> T { return i; }

auto make_unsigned(int i) -> unsigned int {
assert(i >= 0);
return static_cast<unsigned int>(i);
}

auto make_unsigned(short i) -> unsigned short {
assert(i >= 0);
return static_cast<unsigned short>(i);
}

auto make_unsigned(long long i) -> unsigned long long {
assert(i >= 0);
return static_cast<unsigned long long>(i);
}

template<
class I1,
class I2,
std::enable_if_t<(std::is_signed<I1>::value and std::is_signed<I2>::value)
or (not std::is_signed<I1>::value and not std::is_signed<I2>::value)>* = nullptr
>
bool unsigned_less(I1 i1, I2 i2) {

return i1 < i2;
};

template<
class I1,
class I2,
std::enable_if_t<std::is_signed<I1>::value and not std::is_signed<I2>::value>* = nullptr
>
bool unsigned_less(I1 i1, I2 i2) {

return (i1 < 0) or make_unsigned(i1) < i2;
};

template<
class I1,
class I2,
std::enable_if_t<not std::is_signed<I1>::value and std::is_signed<I2>::value>* = nullptr
>
bool unsigned_less(I1 i1, I2 i2) {

return not (i2 < 0) and i1 < make_unsigned(i2);
};

int main() {

short a = 1;
unsigned int b = 2;

std::cout << unsigned_less(a, b) << std::endl;

using uint = unsigned int;
using ushort = unsigned short;
std::cout << unsigned_less(ushort(1), int(3)) << std::endl;

std::cout << unsigned_less(int(-1), uint(0)) << std::endl;
std::cout << unsigned_less(int(1), uint(0)) << std::endl;

return 0;
}

You have to judge this on case by case basis. There are several reasons why signed types would be used in a program:

1. Because you actually need to have negative numbers in calculations or output.
2. "Sloppy typing", meaning that the programmer just types `int` all over their program without giving it much thought.
3. Accidentally signed. The programmed didn't actually want signed numbers, but ended up with them by accident, either through implicit type promotions or by using integer constants such as `0`, which is of type `int`.

---

In case of 1) then the arithmetic should be carried out with signed arithmetic. You should then convert to the smallest possible type that is needed to contain the maximum expected values.

Suppose for example that a value can have range from `-10000` to `10000`. You will then need to use a 16 bit signed type to represent it. The correct type to use then, platform-independently, is `int_fast16_t`.

The `int_fastn_t` and `uint_fastn_t` types require that the type is at least as large as _n_ but the compiler is allowed to pick a larger type if it gives faster code/better alignment.

---

2) is cured by studying `stdint.h` and by stop being lazy. As a programmer, one always needs to consider size and signedness of _every single variable declared in the program_. This has to be done at the point of declaration. Or if you get some manner of revelation later, go back and change the type.

If you don't consider the types carefully, you will with absolute certainty end up writing numerous, often subtle bugs. This is particularly important in C++, which is more picky about type correctness than C.

When "sloppy typing" is used, the actual intended type is most often unsigned rather than signed. Consider this sloppy typing example:

for(int i=0; i<n; i++)

It makes no sense at all to use signed int here, so why would you? Most likely you are iterating over an array or container and then the correct type to use is `size_t`.

Or alternatively, if you know the maximum size that `n` can hold, for example 100, then you can use the most suitable type for that:

for(uint_fast8_t i=0; i<100; i++)

---

3) is also cured by studying. Notably the various rules for implicit promotions that exist in these languages, such as _the usual arithmetic conversions_ and _integer promotion_.

Given the specific setup you presented:

int x;
unsigned int y;

and your apparent intent to evaluate whether the value of `x` is numerically less than that of `y`, respecting the sign of `x`, I'd be inclined to write it as

if ((x < 0) || (x < y)) {}

that is, your second alternative. It expresses the intent clearly, and it is extensible to wider types, as long as the maximum representable value of `y`'s type is at least as large as the maximum representable value of `x`'s type. Thus, if you're willing to stipulate that the arguments will have that form then you could even write it as -- avert your eyes, C++ adherents -- a macro.

Converting both arguments to a signed, 64-bit integer type is not a portable solution, because there is no guarantee that that would in fact be a *promotion* from either `int` or `unsigned int`. It also is not extensible to wider types.

As for relative performance of your two alternative, I doubt there's much difference, but if it matters to you then you would want to write a careful benchmark. I could imagine the portable alternative requiring one more machine instruction than the other, and I can also imagine it requiring one less. Only if such comparisons dominate the performance of your application would a single instruction make a noticeable difference one way or the other.

Of course, this is specific to the situation you presented. If you want to handle mixed signed / unsigned comparisons in either order, for many different types, as sorted out at compile time, then a template-based wrapper could help you with that (and that would moot the question of using a macro), but I take you to be asking specifically about details of the comparison itself.

Have a look at Andrei Alexandrescus keynote at the recent D conference in Berlin on Design by Introspection.

In it he shows how to design a checked int class at DESIGN time and one of the features he comes up with is exactly this - how to compare signed and unsigned.

Basically you need to perform 2 comparisons

*If (signed_var < 0) then return unsigned_var
else promote / cast signed_var to unsigned_var and then compare*

statin799358

joey457

kayagmizmsjm

joeannswidefxqkx

welling976518

subsulphiduysgginfoy