| π Statistics: Sum of Digits |
| by JVSchmidt |
| General | |
|
The idea of this test is to divide the data into substrings (chains) of length L, build the sum of digits of each substring, and calculate the Chi² value for the distribution of these sums. Here is an example for the digits of π with L=5:

First sequence 14159 -> SUM = 20
Second sequence 26535 -> SUM = 21
Third sequence 89793 -> SUM = 36

For single digits (L=1) the sum equals the digit's value, so every sum of digits from 0 to 9 has a probability of 1/10. Let w(L,S) be the probability that a chain of length L has a sum of digits equal to S. Then

w(1,y) = 1/10 with y = 0,1,...,9

Any further distribution can be calculated recursively, by appending one more digit i to a chain of length L-1:

w(L,y) = 1/10 * sum (for all i=0..9) w(L-1,y-i)

where w(L-1,s) = 0 whenever s < 0 or s > 9*(L-1).

For any L the sum of digits lies between S=0 (all digits 0) and S=9*L (all digits 9). For longer and longer chains the minimum and maximum sums become extremely improbable, because the probability of a chain that consists of one repeated digit falls like 10^-L.

The graph shows the distribution of the sum of digits for different chain lengths:
| |
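The recursion for w(L,S) can be sketched in a few lines of Python. This is only an illustration of the formula, not the program used for the analysis; the function name `digit_sum_distribution` is made up for this example, and exact fractions are used so that the values can be checked by hand.

```python
from fractions import Fraction


def digit_sum_distribution(L):
    """Return the list [w(L,0), w(L,1), ..., w(L,9*L)] as exact fractions."""
    if L < 1:
        raise ValueError("L must be at least 1")
    # Start of the recursion: w(1,y) = 1/10 for y = 0..9
    w = [Fraction(1, 10)] * 10
    for length in range(2, L + 1):
        # A chain of this length has sums from 0 to 9*length
        new = [Fraction(0)] * (9 * length + 1)
        for s, p in enumerate(w):
            # Appending a digit d (probability 1/10) moves sum s to s + d
            for d in range(10):
                new[s + d] += p / 10
        w = new
    return w


if __name__ == "__main__":
    w5 = digit_sum_distribution(5)
    print(len(w5))      # number of different sum values, D = 9*L + 1
    print(w5[0])        # probability that all five digits are 0
    print(sum(w5))      # total probability
```

For L=5 the list has 46 entries, the first entry is 1/100000 (only the chain 00000 has the sum 0), and all entries add up to 1.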
| Results Overview | |
| Digits analyzed: 4.2 * 10^9; Analysis started at digit: 1; Elapsed computer time for one class: 3 min 30 sec - 4 min | |
| Length of chains L | Number of examined chains K = N/L | Number of different sum values D = 9*L+1 | Chi² | Number of statistically relevant classes of sum values | MIN / MAX sum found |
| 2 | 2,100,000,000 | 19 | 25.200 | 19 | 0 / 18 |
| 3 | 1,400,000,000 | 28 | 25.169 | 28 | 0 / 27 |
| 5 | 840,000,000 | 46 | 37.080 | 46 | 0 / 45 |
| 6 | 700,000,000 | 55 | 36.628 | 55 | 0 / 54 |
| 10 | 420,000,000 | 91 | 70.158 | 77 | 1 / 88 |
| 20 | 210,000,000 | 181 | 124.307 | 157 | 24 / 158 |
| 40 | 105,000,000 | 361 | 136.567 | 143 | 81 / 285 |
| 80 | 52,500,000 | 721 | 147.654 | 190 | 219 / 505 |
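The test itself could be sketched as follows. This is a hedged illustration, not the code that produced the table. The function names are hypothetical, and the rule for choosing the "statistically relevant" classes is an assumption: sum values with an expected count below `min_expected` (default 5, a common rule of thumb) are pooled into one class. The rule actually used for the table is not stated and clearly differs, so the class counts of this sketch will not match the table.

```python
from collections import Counter


def digit_sum_probabilities(L):
    """w(L,S) for S = 0..9*L, computed by the recursion over chain length."""
    w = [0.1] * 10
    for length in range(2, L + 1):
        new = [0.0] * (9 * length + 1)
        for s, p in enumerate(w):
            for d in range(10):
                new[s + d] += p / 10
        w = new
    return w


def chi2_digit_sums(digits, L, min_expected=5.0):
    """Return (chi2, number of classes used, min sum found, max sum found).

    digits is a string of decimal digits; leftover digits that do not
    fill a complete chain of length L are ignored.
    """
    K = len(digits) // L  # number of examined chains
    if K == 0:
        raise ValueError("need at least L digits")
    observed = Counter(
        sum(int(c) for c in digits[k * L:(k + 1) * L]) for k in range(K)
    )
    chi2, classes = 0.0, 0
    pooled_obs, pooled_exp = 0, 0.0
    for s, p in enumerate(digit_sum_probabilities(L)):
        expected = K * p
        if expected >= min_expected:
            chi2 += (observed[s] - expected) ** 2 / expected
            classes += 1
        else:
            # Assumption: rare sum values are merged into one pooled class
            pooled_obs += observed[s]
            pooled_exp += expected
    if pooled_exp > 0:
        chi2 += (pooled_obs - pooled_exp) ** 2 / pooled_exp
        classes += 1
    return chi2, classes, min(observed), max(observed)


if __name__ == "__main__":
    # First 15 decimal digits of pi, split into three chains of length 5
    print(chi2_digit_sums("141592653589793", 5))
```

With the 15 digits above and L=5, the minimum and maximum sums found are 20 and 36, in line with the three example sequences. A sample this small says nothing about the Chi² value; a meaningful result needs enough chains that most classes have a sizeable expected count.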