JOSTALY TECHNOLOGIES
John SETH Thielemann
sthielemann@jostaly.com
+1-223-231-3511
(1745418747 UNIX 112) Wed Apr 23 14:32:27 2025 - (1746022848 UNIX 119) Wed Apr 30 14:20:48 2025
0) NEWS
1) Company News
* Almost seamless Standard C replacement build with an OSS package.
* More ARM MMU work, into page tables.
* Expansion of floating point support across architectures.
2) Floating-Point Logarithms
Attachment@2848/size:78264 sha256: 061223709f3360c01a4bb62117280b00a600b3f44b3f4c9237833bb8b591c114
Floating-point support has proven interesting across user and kernel space and
across the span of architectures. Neither the hardware nor the compilation
level is guaranteed to support floating point. Of recent interest: some
architectures have no FP logarithm instruction. The target front-end
Standard C (libm) calls are:
double log2(double);
float log2f(float);
The x87 has a log2 instruction, fyl2x (SSE has no equivalent). It computes
st(1) * log2(st(0)), so loading 1.0 first yields a plain log2. A quick 'debug' test:
asm volatile(
    "fld1\n\t"    /* load the constant 1.0 */
    "fld %0\n\t"  /* load the raw IEEE 754 FP value 64.015625 (2^6 + 2^-6) */
    "fyl2x\n\t"   /* st(0) = 6.00035217748030102721 (raw 0x4001c002e291d85437f9) */
    "fst %0\n\t"  /* 0x40c002e3 (rounded: 6.00035095) */
The integer portion is simple; the negative exponents are a bit more
interesting and took some time to figure out. A calculator shows 2^6.00035095
= 64.015570533, which is pretty close to the input 64.015625. What is a precise
way to calculate the result?
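The calculator check above can be reproduced with the standard <cmath> functions. This is only a sketch: the helper name `reconstruction_error` is hypothetical and not part of the code base described here. It measures how far 2^y lands from the original input x.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: absolute error of 2^y as a reconstruction of x.
// An error of d in y shows up as roughly x * ln(2) * d in the result.
double reconstruction_error(double x, double y) {
    assert(x > 0.0);
    return std::fabs(std::exp2(y) - x);
}
```

Called as `reconstruction_error(64.015625, 6.00035095)`, it returns a gap of about 5.4e-5, matching the calculator figure. Passing `std::log2(64.015625)` as `y` instead brings the gap down to the rounding noise of a double.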
The Texas Instruments TMS320 DSP Designer's Notebook, "Fast Logarithms on a
Floating-Point Device", outlines an algorithm: log2(X) = EXP_old + log2(mant_old)
(log base two). Our implementation of this can be tuned between low and high
precision, trading speed for accuracy, without rounding.
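A minimal sketch of the decomposition above, using the classic square-and-compare method for the mantissa term. It is not the implementation described in this newsletter: the function name `log2_by_squaring` and the `bits` parameter are assumptions for illustration, and `std::frexp` stands in for direct access to the exponent field.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: log2(x) for finite x > 0, truncated to `bits`
// fraction bits (no rounding step).
double log2_by_squaring(double x, int bits) {
    assert(x > 0.0 && std::isfinite(x));
    int exponent = 0;
    double mant = std::frexp(x, &exponent);  // x = mant * 2^exponent, mant in [0.5, 1)
    mant *= 2.0;                             // renormalize mantissa to [1, 2)
    exponent -= 1;
    double result = exponent;                // integer part: EXP_old
    double weight = 0.5;                     // value of the next fraction bit
    for (int i = 0; i < bits; ++i) {
        mant *= mant;                        // squaring doubles log2(mant)
        if (mant >= 2.0) {                   // carry into the exponent: bit is 1
            result += weight;
            mant *= 0.5;                     // back to [1, 2)
        }
        weight *= 0.5;
    }
    return result;
}
```

With `bits` set to 40, `log2_by_squaring(64.015625, 40)` agrees with the x87 result quoted earlier to better than 1e-9. A smaller `bits` value trades accuracy for fewer multiplications.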
Generate simple tests with just negative exponents first, then various exponents.
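// Fraction bits of the result, filled from the most significant bit down.
integral_t resultFraction = 0;
T last = value;
T next = last * last;
size_t precision = 1;
for (size_t i = 1; i <= B; i++, precision++) {
    // Each squaring doubles the logarithm held in the exponent.
    T temp = next * next;
    // Harvest the exponent as result bits before another squaring would
    // lose it, or once B squarings have been done.
    if (!temp.isNormalized() || i == B || temp.exponent() < next.exponent()) {
        const sintegral_t nextExponent = next.exponent();
        resultFraction |= nextExponent << (integral_bits - precision);
        next.exponent(0);  // reset the exponent and square again
        temp = next * next;
    }
    last = next;
    next = temp;
}
// Fill the remaining low bits from the fraction bits of the final value.
if (precision < TypeBits::bits)
    resultFraction |= (next.fractionBits() << (integral_bits - T::bits_t::Fraction_BitMax)) >> precision;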
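The idea of reading several result bits from the exponent at once can be sketched with standard <cmath> calls. This is an illustration under assumptions, not the code above: `log2_fraction_bits`, `span`, and `total_bits` are hypothetical names, and a plain double replaces the `T` type. After n squarings of a mantissa in [1, 2), the binary exponent of the product equals the next n fraction bits of the logarithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: fraction of log2(m) for m in [1, 2) as 64-bit
// fixed point (bit 63 weighs 1/2), harvested `span` bits at a time.
std::uint64_t log2_fraction_bits(double m, int span, int total_bits) {
    assert(m >= 1.0 && m < 2.0);
    assert(span >= 1 && span <= 8);          // m^(2^8) still fits in a double
    assert(total_bits >= 1 && total_bits <= 64);
    std::uint64_t fraction = 0;
    int done = 0;
    while (done < total_bits) {
        const int n = std::min(span, total_bits - done);
        for (int i = 0; i < n; ++i) m *= m;  // m^(2^n): logarithm scaled by 2^n
        const int e = std::ilogb(m);         // integer part = next n fraction bits
        m = std::scalbn(m, -e);              // reset the exponent: back to [1, 2)
        done += n;
        fraction |= static_cast<std::uint64_t>(e) << (64 - done);
    }
    return fraction;
}
```

The fixed-point result converts to a double with `std::ldexp(static_cast<double>(bits), -64)`. For example, `log2_fraction_bits(1.5, 4, 8) >> 56` gives 0x95, the top eight fraction bits of log2(1.5).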
Fast run: precision bits on span of exponent: 8
bit: 0 Test value: 1.00000012 result: 0.0000001192/3E80007FF2A95000
bit: 1 Test value: 1.00000024 result: 0.0000002385/3E9000FFEAA69800
...
bit: 21 Test value: 1.25000000 result: 0.3215220938/3FD493D1676BB8A7
bit: 22 Test value: 1.50000000 result: 0.5849631070/3FE2B8048CBC905B
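The test values listed above are consistent with setting a single fraction bit in the float 1.0 (bit 0 gives 1 + 2^-23, bit 22 gives 1.5). A sketch of such a generator follows, assuming IEEE 754 binary32 floats; the name `single_bit_test_value` is hypothetical and may not match how the actual tests are built.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the float 1.0 with fraction bit `bit` (0..22) set,
// i.e. 1 + 2^(bit - 23). Assumes float is IEEE 754 binary32.
float single_bit_test_value(unsigned bit) {
    assert(bit < 23);
    const std::uint32_t raw = 0x3f800000u | (std::uint32_t{1} << bit);
    float value;
    std::memcpy(&value, &raw, sizeof value);  // reinterpret the bit pattern
    return value;
}
```

Looping `bit` from 0 to 22 and feeding each value to both the implementation under test and a reference such as `std::log2` gives a per-bit comparison like the listing above.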
60/100 checkboxes completed, 255 files changed, 16052 insertions(+), 2916 deletions(-)
EOF