libdfloat defines the following data types:
dfloat16_t: 16-bit decimal float with 8-bit mantissa and 8-bit exponent
dfloat32_t: 32-bit decimal float with 16-bit mantissa and 16-bit exponent
dfloat64_t: 64-bit decimal float with 32-bit mantissa and 32-bit exponent
dfloat128_t: 128-bit decimal float with 64-bit mantissa and 64-bit exponent
The proper way to declare a dfloat is to specify a decimal literal in
string format and convert it to a dfloat using one of the dfloatN_atof()
functions.
In the following function definitions, the capital letters M and N each stand for one of the integers 16, 32, 64, or 128. Any of these sizes can be substituted, since macros generate the corresponding functions for each one.
(Example: dfloatN_add() with N = 64 is dfloat64_add(), with no spaces.)
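As a rough illustration of how a single macro can stamp out a whole family of sized names, here is a minimal sketch using preprocessor token pasting. The TOY_DEFINE macro, the toyN_t types, the toyN_cpy() functions, and the struct layout are all hypothetical and are not taken from the libdfloat source; only the mantissa and exponent widths follow the type list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, NOT libdfloat's real macro or struct layout.
 * N is the total size used in the generated names, BITS is the width of
 * the mantissa and exponent fields. The ## operator pastes tokens, so
 * TOY_DEFINE(64, 32) yields toy64_t and toy64_cpy() with int32_t fields. */
#define TOY_DEFINE(N, BITS)                                       \
    typedef struct {                                              \
        int##BITS##_t mantissa;                                   \
        int##BITS##_t exponent;                                   \
    } toy##N##_t;                                                 \
    void toy##N##_cpy(toy##N##_t *dst, const toy##N##_t *src)     \
    {                                                             \
        assert(dst != NULL && src != NULL);                       \
        dst->mantissa = src->mantissa;                            \
        dst->exponent = src->exponent;                            \
    }

/* One invocation per supported size. */
TOY_DEFINE(16, 8)
TOY_DEFINE(32, 16)
TOY_DEFINE(64, 32)
TOY_DEFINE(128, 64)
```

After these four invocations, toy16_cpy(), toy32_cpy(), toy64_cpy(), and toy128_cpy() all exist, which mirrors how N is substituted in the function names listed in this document.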
libdfloat defines the following groups of functions:
void dfloatN_add( dfloatN_t *dst, dfloatN_t *src )
Add N-bit src and dst operands together and store the result in dst.
void dfloatN_sub( dfloatN_t *dst, dfloatN_t *src )
Subtract N-bit src from N-bit dst and store the result in dst.
void dfloatN_mul( dfloatN_t *dst, dfloatN_t *src )
Multiply N-bit src and dst operands together and store the result in dst.
void dfloatN_div( dfloatN_t *dst, dfloatN_t *src, int precision )
Divide N-bit dst by src and store the result in dst, with up to precision
digits past the decimal point.
int dfloatN_cmp( dfloatN_t *df1, dfloatN_t *df2 )
Compares N-bit operands df1 and df2.
Returns:
1 if df1 > df2
-1 if df1 < df2
0 if df1 == df2
dfloatN_t *dfloatN_atof( char *str )
Reads an N-bit dfloat from input string str.
Input string must be in the following format (BNF):
[-]<digit><digit>*[.<digit><digit>*]
char *dfloatN_ftoa( dfloatN_t *df )
Generates a string representation of N-bit dfloat value df.
void dfloatN_cpy( dfloatN_t *dst, dfloatN_t *src )
Copies the mantissa and exponent from src to dst.
dfloatN_t *dfloatM_castN( dfloatM_t *src )
Takes a dfloat of size M and typecasts it, returning a dfloat of size N.
Cast functions where M == N have not been implemented as the author
saw no need for such functions.
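The input format that dfloatN_atof() expects can be checked before a string is converted. The following is a minimal sketch of such a check in standard C. is_dfloat_literal() is a hypothetical helper and is not part of libdfloat; it only encodes the grammar given for dfloatN_atof() above (optional minus sign, one or more digits, optional decimal point followed by one or more digits).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of libdfloat. Returns true when str
 * matches [-]<digit><digit>*[.<digit><digit>*] with nothing left over. */
bool is_dfloat_literal(const char *str)
{
    if (str == NULL)
        return false;

    /* unsigned char avoids undefined behavior in isdigit() */
    const unsigned char *p = (const unsigned char *)str;

    if (*p == '-')              /* optional leading minus sign */
        p++;

    if (!isdigit(*p))           /* at least one integer digit is required */
        return false;
    while (isdigit(*p))
        p++;

    if (*p == '.') {            /* optional fractional part */
        p++;
        if (!isdigit(*p))       /* a '.' must be followed by a digit */
            return false;
        while (isdigit(*p))
            p++;
    }

    return *p == '\0';          /* reject any trailing characters */
}
```

Under this reading of the grammar, strings such as "1.2", "-42", and "0.05" are accepted, while ".5", "5.", "+1", and "1e5" are rejected.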
Note: Overflow errors may occur if the result of an operation is too large to fit into the designated space. These will not be reported by the compiler and will simply result in incorrect values. It is up to the programmer to account for these overflows.
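One way to account for overflow is to check, before an operation, whether the expected mantissa would fit in the chosen type. The sketch below is a minimal illustration under a stated assumption: that the mantissa is stored as a signed two's-complement integer of the width listed for each type. This document does not confirm that layout, and mantissa_fits() is a hypothetical helper, not a libdfloat function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of libdfloat. ASSUMES the mantissa is a
 * signed two's-complement integer that is `bits` wide (8, 16, 32 or 64).
 * Returns true when `value` is representable in that many bits. */
bool mantissa_fits(int64_t value, int bits)
{
    assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);

    if (bits == 64)
        return true;                            /* every int64_t fits */

    int64_t max = (INT64_C(1) << (bits - 1)) - 1;   /* e.g. 127 for 8 bits */
    int64_t min = -max - 1;                         /* e.g. -128 for 8 bits */
    return value >= min && value <= max;
}
```

For example, multiplying 1.2 by 3.4 multiplies the mantissas 12 and 34, giving 408. Under the assumption above, 408 does not fit in the 8-bit mantissa of a dfloat16_t but does fit in the 16-bit mantissa of a dfloat32_t, so the larger type would be the safer choice.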
As of Version 0.2, libdfloat has versions of some of the above functions that free their source operands, allowing the user to build more complex expressions using these functions without having to worry about lost objects accumulating. These "free versions" are as follows:
dfloatN_t *dfloatN_addf( dfloatN_t *arg1, dfloatN_t *arg2 )
Adds arg1 and arg2 and returns the result.
dfloatN_t *dfloatN_subf( dfloatN_t *arg1, dfloatN_t *arg2 )
Subtracts arg2 from arg1 and returns the result.
dfloatN_t *dfloatN_mulf( dfloatN_t *arg1, dfloatN_t *arg2 )
Multiplies arg1 by arg2 and returns the result.
dfloatN_t *dfloatN_divf( dfloatN_t *arg1, dfloatN_t *arg2, int precision )
Divides arg1 by arg2 with the given precision and returns the result.
int dfloatN_cmpf( dfloatN_t *arg1, dfloatN_t *arg2 )
Like dfloatN_cmp(), but frees the source operands.
char *dfloatN_ftoaf( dfloatN_t *src )
Like dfloatN_ftoa(), but frees the source operand.
dfloatN_t *dfloatM_castNf( dfloatM_t *src )
Like dfloatM_castN(), but frees the source operand.
Note: Because these functions free their operands, they should not be used on variables. They should only be used on immediate operands, where the output of one function is directly supplied as a parameter to another.
Example:
dfloat64_t *sum = dfloat64_addf( dfloat64_atof( "1.2" ), dfloat64_atof( "3.4" ) );
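To make the ownership rule concrete, here is a minimal self-contained sketch of the operand-freeing pattern. It does not use libdfloat: toy_dec_t, toy_new(), and toy_addf() are hypothetical stand-ins, the value = mantissa * 10^exponent layout is an assumption, and the toy addition only handles operands with equal exponents. The point is only to show why the operands must not be touched after the call.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap-allocated dfloat; NOT libdfloat's type.
 * Assumed meaning: value = mantissa * 10^exponent. */
typedef struct {
    long mantissa;
    long exponent;
} toy_dec_t;

/* Allocate a toy value on the heap; the caller owns the result. */
toy_dec_t *toy_new(long mantissa, long exponent)
{
    toy_dec_t *d = malloc(sizeof *d);
    assert(d != NULL);
    d->mantissa = mantissa;
    d->exponent = exponent;
    return d;
}

/* "Free version" of add: consumes both operands and returns a new value.
 * For brevity this toy requires both operands to share one exponent. */
toy_dec_t *toy_addf(toy_dec_t *arg1, toy_dec_t *arg2)
{
    assert(arg1 != NULL && arg2 != NULL && arg1 != arg2);
    assert(arg1->exponent == arg2->exponent);

    toy_dec_t *sum = toy_new(arg1->mantissa + arg2->mantissa, arg1->exponent);

    free(arg1);   /* both operands are invalid after this call */
    free(arg2);
    return sum;
}
```

With these stand-ins, an expression such as toy_addf( toy_addf( toy_new( 12, -1 ), toy_new( 34, -1 ) ), toy_new( 10, -1 ) ) leaves exactly one live allocation, the final result, which the caller frees once. Passing a named variable instead would leave that variable pointing at freed memory.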
Any questions or problems? Feel free to contact me at the following:
Github: github.com/PsychoCod3r
Personal email: acidkicks@protonmail.com
Submit issues at github.com/PsychoCod3r/libdfloat