Functions to speed up multiplication and division in certain cases.
Functions |
| static unsigned long long | umul32x32_64 (unsigned int a, unsigned int b) |
| | Unsigned 32x32 => 64 integer multiplication.
|
| static signed long long | smul32x32_64 (signed int a, signed int b) |
| | Signed 32x32 => 64 integer multiplication.
|
| static unsigned long long | umul64x32_64 (unsigned long long a, unsigned int b) |
| | Unsigned 64x32 => 64 integer multiplication.
|
| static signed long long | smul64x32_64 (signed long long a, signed int b) |
| | Signed 64x32 => 64 integer multiplication.
|
| static long long | mul64x64_64 (long long a, long long b) |
| | 64x64 => 64 integer multiplication (signed or unsigned).
|
| long long | udiv32prep (unsigned div) |
| | Prepares for an unsigned 32-bit division by a constant.
|
| unsigned | udiv32 (long long divisor, unsigned dividend) |
| | Unsigned 32-bit division by a constant.
|
| long long | sdiv32prep (int div) |
| | Prepares for a signed 32-bit division by a constant.
|
| int | sdiv32 (long long divisor, int dividend) |
| | Signed 32-bit division by a constant.
|
Detailed Description
The ARM7TDMI core has instructions that multiply two 32-bit words and produce a 64-bit result. However, those instructions are not available in THUMB mode. These functions provide efficient access to those instructions. If the code is compiled for ARM, gcc inlines these functions and turns them into a few assembly instructions. When compiling for THUMB, these functions will be faster than what you could achieve by type-casting the multiplicands and letting the compiler generate the multiplication for you.
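For reference, the type-casting approach mentioned above looks roughly like this in portable C. This is a minimal sketch with hypothetical names (ref_umul32x32_64, ref_smul32x32_64); it only illustrates the results the library functions are documented to return and says nothing about how they are implemented.

```c
#include <stdint.h>

/* Hypothetical portable stand-in for umul32x32_64(): widen one operand
   first so that the multiplication is carried out in 64 bits. */
static uint64_t ref_umul32x32_64(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}

/* Hypothetical portable stand-in for smul32x32_64(). */
static int64_t ref_smul32x32_64(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}
```

Without the cast, a * b would be evaluated in 32 bits on this target and the upper half of the product would not be available.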
The division routines can be used to avoid doing actual division when the same divisor is used several times. Preparation for the fast division takes about 4 times longer than a division, but then the fast division itself will be about 10 times faster than normal division. If the divisor is a constant or changes infrequently, then you can save a lot of runtime using these routines.
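Taking the rough figures above at face value (preparation costs about four ordinary divisions, one fast division costs about a tenth of an ordinary division), the break-even point can be estimated. Here D stands for the cost of one ordinary division and n for the number of divisions by the same divisor; both symbols are introduced only for this estimate.

```latex
4D + \frac{nD}{10} < nD
\quad\Longrightarrow\quad
n > \frac{40}{9} \approx 4.4
```

Under these approximate figures, the fast routines start to pay off at about five divisions by the same divisor.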
To use these functions you have to include misc/multiply.h.
Function Documentation
| static long long mul64x64_64 (long long a, long long b) | [inline, static] |
64x64 => 64 integer multiplication (signed or unsigned).
- Parameters:
-
| a,b | The 64 bit multiplicands. |
- Returns:
- The lower 64 bits of the 128 bit result.
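One function can serve both signed and unsigned operands because, in two's complement arithmetic, the lower 64 bits of the product are the same either way. A minimal sketch with the hypothetical name ref_mul64x64_64, not the library's implementation:

```c
#include <stdint.h>

/* Hypothetical portable stand-in for mul64x64_64(). The multiplication is
   done on unsigned values, where wrap-around modulo 2^64 is well defined. */
static uint64_t ref_mul64x64_64(uint64_t a, uint64_t b)
{
    return a * b;
}
```

A caller working with signed values would convert the operands to unsigned, multiply, and convert the result back. With gcc that conversion keeps the bit pattern, so the bits for -3 times 5 come back as the bits for -15.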
| int sdiv32 (long long divisor, int dividend) |
Signed 32-bit division by a constant.
This function divides its second argument by a constant and returns the quotient. The divisor constant is determined by the first argument and must have been obtained by a call to the sdiv32prep() function.
- Parameters:
-
| divisor | The 64-bit data structure calculated by the sdiv32prep() function, representing a 32-bit divisor. |
| dividend | The 32-bit number to divide. |
- Returns:
- The 32-bit quotient.
| long long sdiv32prep (int div) |
Prepares for a signed 32-bit division by a constant.
This function is the signed counterpart of the udiv32prep() function; see the explanation there. The difference is that the divisor (the argument of this function) as well as the dividend and quotient (the argument and return value of the sdiv32() function, respectively) are signed 32-bit integers.
- Parameters:
-
| div | The value that you want to divide by. It must not be 0 or 0x80000000. If it is, the returned value will be such that sdiv32() returns 0 regardless of its actual argument. |
- Returns:
- A 64-bit object that can be passed to sdiv32() so that it will divide its argument by div.
| static signed long long smul32x32_64 (signed int a, signed int b) | [inline, static] |
Signed 32x32 => 64 integer multiplication.
- Parameters:
-
| a,b | The 32 bit multiplicands |
- Returns:
- The 64 bit result
| static signed long long smul64x32_64 (signed long long a, signed int b) | [inline, static] |
Signed 64x32 => 64 integer multiplication.
- Parameters:
-
| a | The 64 bit multiplicand |
| b | The 32 bit multiplicand |
- Returns:
- The lower 64 bits of the 96 bit result
| unsigned udiv32 (long long divisor, unsigned dividend) |
Unsigned 32-bit division by a constant.
This function divides its second argument by a constant and returns the quotient. The divisor constant is determined by the first argument and must have been obtained by a call to the udiv32prep() function.
- Parameters:
-
| divisor | The 64-bit data structure calculated by the udiv32prep() function, representing a 32-bit divisor. |
| dividend | The 32-bit number to divide. |
- Returns:
- The 32-bit quotient.
| long long udiv32prep (unsigned div) |
Prepares for an unsigned 32-bit division by a constant.
Division is a costly operation on the ARM7TDMI core because it does not have a hardware divider. However, division by a constant can be done fast by using 64-bit multiplication. There is a cost to calculating the parameters, but then the division itself will be very fast.
If you compile for the ARM and you divide by a constant, gcc actually calculates those values and generates the relevant fast code sequence. On the other hand, if you compile for THUMB it will call the division routine even for constants, because the necessary instructions are not available in THUMB mode.
This routine calculates the necessary values and allows you to use the fast division in THUMB mode. In addition, regardless of whether you compile for ARM or THUMB, if you have to divide by the same number several times, it may be worth using this function, because the initial cost of calculating the parameters will be amortised over several calls to the actual fast division routine.
To use the fast division, first you have to call this function, passing it the divisor. It will then return a 64-bit value that contains the parameters needed for the actual division. You then pass this object and the value you want to divide to the udiv32() function, which will perform the division very fast. For example, let's assume that you have to divide the elements of an array by the same number. You could do it this way:
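void ArrayDivide( unsigned *array, unsigned size, unsigned divisor )
{
    long long params;

    /* Calculate the division parameters once... */
    params = udiv32prep( divisor );

    /* ...then reuse them for every element of the array. */
    for ( ; size-- ; array++ )
        *array = udiv32( params, *array );
}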
The cost of calculating the parameters is approximately equal to four 32-bit divisions. The cost of doing a fast division using the precalculated value is roughly the same as a 64-bit multiply. If you have to divide by the same number at least five times, it is worth using the fast division.
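To illustrate how a division can be replaced by a 64-bit multiplication, here is a minimal sketch of one well-known reciprocal-multiplication scheme. It is not the library's implementation, and the layout of the 64-bit value returned by udiv32prep() is not documented here. The names ref_udiv32prep and ref_udiv32 are hypothetical, and the sketch assumes a divisor in the range 2 to 2^32-1; divisors 0 and 1 would need separate handling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch, not the library's code: precompute
   m = ceil(2^64 / d) for a divisor d in the range 2..2^32-1. */
static uint64_t ref_udiv32prep(uint32_t d)
{
    assert(d >= 2);               /* d == 0 and d == 1 are not handled here */
    return UINT64_MAX / d + 1;
}

/* Returns floor(n / d) as the upper 64 bits of the 96-bit product m * n,
   built from two 32x32 => 64 multiplications. */
static uint32_t ref_udiv32(uint64_t m, uint32_t n)
{
    uint64_t lo = (m & 0xFFFFFFFFu) * n;   /* low half of m times n  */
    uint64_t hi = (m >> 32) * n;           /* high half of m times n */
    return (uint32_t)((hi + (lo >> 32)) >> 32);
}
```

The expensive step, a 64-bit division, happens once in the preparation function. Each later division needs only two long multiplications, an addition and two shifts.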
- Parameters:
-
| div | The value that you want to divide by. It must not be 0. If it is, the returned value will be such that udiv32() returns 0 regardless of its actual argument. |
- Returns:
- A 64-bit object that can be passed to udiv32() so that it will divide its argument by div.
| static unsigned long long umul32x32_64 (unsigned int a, unsigned int b) | [inline, static] |
Unsigned 32x32 => 64 integer multiplication.
- Parameters:
-
| a,b | The 32 bit multiplicands |
- Returns:
- The 64 bit result
| static unsigned long long umul64x32_64 (unsigned long long a, unsigned int b) | [inline, static] |
Unsigned 64x32 => 64 integer multiplication.
- Parameters:
-
| a | The 64 bit multiplicand |
| b | The 32 bit multiplicand |
- Returns:
- The lower 64 bits of the 96 bit result
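The "lower 64 bits of the 96-bit result" can be pictured by splitting the 64-bit operand into two 32-bit halves. The following is a sketch under that assumption, with the hypothetical name ref_umul64x32_64; the library may compute the result differently.

```c
#include <stdint.h>

/* Hypothetical portable stand-in for umul64x32_64(): split a into 32-bit
   halves, multiply each by b, and discard everything above bit 63. */
static uint64_t ref_umul64x32_64(uint64_t a, uint32_t b)
{
    uint64_t lo = (a & 0xFFFFFFFFu) * b;   /* contributes to bits 0..63  */
    uint64_t hi = (a >> 32) * b;           /* contributes to bits 32..95 */
    return lo + (hi << 32);                /* bits 64..95 fall away      */
}
```

Each partial product is a 32x32 => 64 multiplication, which is the operation the long multiply instructions of the ARM7TDMI provide.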