Updated: 2023-11-10 13:28:34
Is there a safe way to reliably determine whether an integral type T can store a floating-point integer value f?
Yes. The key is to test whether f lies in the range T::MIN - 0.999... to T::MAX + 0.999..., that is, strictly between T::MIN - 1 and T::MAX + 1, using floating-point math that has no rounding issues. Bonus: the rounding mode does not matter.
There are 3 failure paths: too big, too small, and not-a-number.
The code below assumes int/double. I'll leave forming the C++ template to OP.
Forming T::MAX + 1 exactly with floating-point math is easy because INT_MAX is a Mersenne number, that is, one less than a power of 2. (This is not about Mersenne primes.)
The code takes advantage of two facts:
A Mersenne number divided by 2 with integer math is also a Mersenne number.
The conversion of an integer power-of-2 constant to a floating-point type is certain to be exact.
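    #include <limits.h>

    // Exactly INT_MAX + 1: INT_MAX/2 + 1 is a power of 2, so the conversion and doubling are exact
    #define DBL_INT_MAXP1 (2.0*(INT_MAX/2+1))
    // Below needed when -INT_MAX == INT_MIN
    #define DBL_INT_MINM1 (2.0*(INT_MIN/2-1))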
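To make the upper-bound trick concrete, here is a minimal sketch that checks the two facts above at run time. The helper names `below_int_upper_bound` and `check_upper_bound` are hypothetical and not part of the original answer.

```c
#include <assert.h>
#include <limits.h>

/* Exactly INT_MAX + 1 as a double, formed without overflow or rounding. */
#define DBL_INT_MAXP1 (2.0 * (INT_MAX / 2 + 1))

/* Hypothetical helper: 1 when x is strictly below INT_MAX + 1.
   A NaN argument compares false and so yields 0. */
int below_int_upper_bound(double x) {
    return x < DBL_INT_MAXP1;
}

/* Hypothetical self-check of the facts the macro relies on. */
void check_upper_bound(void) {
    int half_p1 = INT_MAX / 2 + 1;

    /* INT_MAX/2 is again a Mersenne number, so INT_MAX/2 + 1 is a power of 2. */
    assert((half_p1 & (half_p1 - 1)) == 0);

    /* A power of 2 converts to double exactly; doubling it stays exact. */
    assert(DBL_INT_MAXP1 / 2.0 == (double)half_p1);

    /* The bound itself is excluded; values well inside are accepted. */
    assert(!below_int_upper_bound(DBL_INT_MAXP1));
    assert(below_int_upper_bound(DBL_INT_MAXP1 / 2.0));
}
```

Because only powers of 2 are ever converted, the sketch does not depend on how many significand bits double has relative to int.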
Forming T::MIN - 1 exactly is harder: its absolute value is usually a power of 2 plus 1, and the relative precision of the integer type and the FP type is not certain. Instead, the code can subtract the exact power of 2 (INT_MIN) from x and compare the difference to -1.
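    // Handle_Underflow(), Handle_Overflow() and Handle_NaN() are placeholders
    // for whatever error handling the caller needs.
    int double_to_int(double x) {
      if (x < DBL_INT_MAXP1) {           // false for NaN, so NaN falls through to the else branches
        #if -INT_MAX == INT_MIN
        // rare non-2's complement machine
        if (x > DBL_INT_MINM1) {
          return (int) x;
        }
        #else
        // INT_MIN is a negated power of 2, so its conversion to double is exact
        if (x - INT_MIN > -1.0) {
          return (int) x;
        }
        #endif
        Handle_Underflow();
      } else if (x > 0) {
        Handle_Overflow();
      } else {
        Handle_NaN();
      }
    }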
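As a runnable variant of the function above, the following sketch replaces the placeholder handlers with a status code. The `conv_status` enum and the names `double_to_int_checked` and `double_to_int_demo` are assumptions made for illustration. The demo values additionally assume that int is narrower than the significand of double (for example, a 32-bit int with a 53-bit significand).

```c
#include <assert.h>
#include <limits.h>
#include <math.h>

#define DBL_INT_MAXP1 (2.0 * (INT_MAX / 2 + 1))
#define DBL_INT_MINM1 (2.0 * (INT_MIN / 2 - 1))

/* Hypothetical status codes standing in for the Handle_...() placeholders. */
typedef enum { CONV_OK, CONV_OVERFLOW, CONV_UNDERFLOW, CONV_NAN } conv_status;

/* Stores the truncated value in *out only when it fits in an int. */
conv_status double_to_int_checked(double x, int *out) {
    if (x < DBL_INT_MAXP1) {
#if -INT_MAX == INT_MIN
        /* rare non-2's complement machine */
        if (x > DBL_INT_MINM1) {
            *out = (int)x;
            return CONV_OK;
        }
#else
        /* INT_MIN converts to double exactly, so compare the offset to -1 */
        if (x - INT_MIN > -1.0) {
            *out = (int)x;
            return CONV_OK;
        }
#endif
        return CONV_UNDERFLOW;
    }
    if (x > 0) {
        return CONV_OVERFLOW;
    }
    return CONV_NAN; /* every comparison with NaN is false */
}

/* Usage sketch covering the success path and the three failure paths. */
void double_to_int_demo(void) {
    int v = 0;
    assert(double_to_int_checked(-3.9, &v) == CONV_OK && v == -3);
    assert(double_to_int_checked(INT_MAX + 0.5, &v) == CONV_OK && v == INT_MAX);
    assert(double_to_int_checked(INT_MAX + 1.0, &v) == CONV_OVERFLOW);
    assert(double_to_int_checked(INT_MIN - 1.0, &v) == CONV_UNDERFLOW);
    assert(double_to_int_checked(NAN, &v) == CONV_NAN);
}
```

Returning a status and writing through a pointer is only one possible design; it keeps every path of the function returning a defined value.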
Regarding floating-point types with a non-binary radix (FLT_RADIX != 2)
With FLT_RADIX = 4, 8, 16, ..., the conversion is exact too. With FLT_RADIX == 10, the code is exact at least up to a 34-bit int, because a double must encode +/-10^10 exactly. So a problem could arise only on, say, a FLT_RADIX == 10 machine with a 64-bit int, which is a low risk. From memory, the last FLT_RADIX == 10 machine in production was over a decade ago.
An integer type is always encoded as 2's complement (most common), ones' complement, or sign-magnitude. INT_MAX is always a power of 2 minus 1. INT_MIN is always a negated power of 2, or 1 more than that. Effectively, the encoding is always base 2.
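The assumptions in the last two sections can be spot-checked on a given platform. The sketch below uses hypothetical function names that do not appear in the original answer; it tests the shape of INT_MAX and INT_MIN and whether FLT_RADIX is a power of 2.

```c
#include <float.h>
#include <limits.h>

/* 1 when INT_MAX is a power of 2 minus 1 (a Mersenne number).
   Checked via INT_MAX/2 + 1 so that no signed overflow can occur. */
int int_max_is_pow2_minus_1(void) {
    int half_p1 = INT_MAX / 2 + 1;
    return (half_p1 & (half_p1 - 1)) == 0      /* half_p1 is a power of 2 */
        && INT_MAX == 2 * (half_p1 - 1) + 1;   /* INT_MAX is odd */
}

/* 1 when INT_MIN is -(INT_MAX + 1) (2's complement) or -INT_MAX
   (ones' complement or sign-magnitude). */
int int_min_is_expected(void) {
    return INT_MIN == -INT_MAX - 1 || INT_MIN == -INT_MAX;
}

/* 1 when the floating-point radix is 2, 4, 8, 16, ..., in which case
   integer powers of 2 within range convert exactly. */
int radix_is_power_of_2(void) {
    return (FLT_RADIX & (FLT_RADIX - 1)) == 0;
}
```

If `radix_is_power_of_2()` returns 0, the exactness argument falls back to the decimal-digit guarantee discussed for FLT_RADIX == 10.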