------------------------------------------------------------------------
r529 | rgbatduke | 2011-04-01 13:49:31 -0400 (Fri, 01 Apr 2011) | 117 lines
OK, this is a fairly enormously major brutal checkin. Both dieharder
and libdieharder are ALMOST -Wall -pedantic clean. To get it there I
had to learn several things, such as how to get gcc to ignore "unused
variables" that are conveniently in a shared include file but aren't
really used in all the modules that share it, the fact that the various
flavors of C have varying "maximum string size guaranteed to be
supported" limits (none of which are really relevant to gcc, but it
complains about them anyway), and more. And of course I had to delete
all the cruftish lines of e.g. unused loop variables. I'm not quite
done with cleanup -- I may have gone overboard in a place or two and may
need to put some things back or address things that might affect
function -- but I want to get this all checked in.
There are two build errors left -- one is in dieharder/rdieharder.c (and
hence is yours, Dirk) and the other is in the skein code (and hence is
yours, David). David, I also need you to check a fix I made to the
rng_threefish code -- I finally took the time to figure out the dread
"dereferencing type-punned pointer breaks strict aliasing rules"
warning. I replaced the offending line:
  *((unsigned long int *) state->block) = s;
with
  unsigned long int *blockptr;
  ...
  blockptr = (unsigned long int*) &state->block;
  *blockptr = s;
That is, I read what you were trying to do as "Set the contents of
state->block, cast to an unsigned long int pointer, equal to unsigned
long int pointer s" which might work but gcc -Wall hated it even before
-pedantic and (from what I've read) can have undesired side effects. So
I introduced an actual unsigned long int pointer, put the address of
state->block in it, and then set its contents equal to s. It didn't seem
to break threefish -- I tested the first few returns before and after
the fix with -S 1 and they were the same -- and I'm using threefish
right now in a validate.sh run to make sure that I didn't egregiously
break dieharder with all of the changes.
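For comparison, a common way to write this kind of store with no pointer cast at all is memcpy, which avoids the aliasing question instead of hiding it from the compiler. The sketch below is not the rng_threefish code: the struct layout, the 64-byte block size, and the names demo_state_t, store_seed, and load_seed are assumptions made for illustration. Whether the two-step pointer version above actually satisfies the aliasing rules depends on the declared type of block, which this log does not show.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the generator state; the real layout differs. */
typedef struct {
  unsigned char block[64];
} demo_state_t;

/* Copy the bytes of s into the front of state->block.  memcpy is defined
 * for any object type, so no aliasing rule comes into play. */
static void store_seed(demo_state_t *state, unsigned long int s)
{
  assert(sizeof(s) <= sizeof(state->block));
  memcpy(state->block, &s, sizeof(s));
}

/* Read the value back the same way. */
static unsigned long int load_seed(const demo_state_t *state)
{
  unsigned long int s;
  memcpy(&s, state->block, sizeof(s));
  return s;
}
```

Compilers typically turn a fixed-size memcpy like this into a single store, so the safer form should not cost anything at run time.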
Changes you should be aware of:
* To avoid most of the "too long string" errors I went to -std=c99,
which permits strings a page in size (4095 bytes). That accommodates
the auto-documentation strings in the test headers. There may be
another way of doing this -- in fact there are probably two or three --
but to alter the dh headers at this point would (marginally) break the
API so it will need to wait for v4, I think. Apparently gcc is about to
be dressed up with an __attribute__ that will probably enable extra
large or unlimited data strings without complaints which is sensible
enough since it works on them anyway AFAICT.
* c99 turned off uint translation. I went through a huge block of
code turning uint into the two words unsigned int before getting
irritated enough to look at the headers where I discovered that yeah,
you can turn on the uint -> unsigned int macro with a suitable define.
So I did.
* c99 turned off BSD math macros in math.h, including M_PI. That
seemed really silly, so I turned them back on with a suitable define. I
didn't turn on the long forms (they only really make sense for long
doubles) but we can do that if we ever need PI to 24 places or whatever
it was.
* -Wall -pedantic really hates any sort of data that is included in a
source file where it isn't used. If we were all perfect programmers, I
suppose that we would create enough include files and control where they
were included precisely enough that no source file even included an
include file with a variable it didn't actually use. Alas, I'm not a
perfect programmer and lots of the data structures used only inside
certain tests or by certain generators are shared via libdieharder.h
with program modules that don't use them. Adding
__attribute__((unused)) after the definition but before the = sign
basically tells the compiler "Yes, I know, I planned it that way, now
shut the fuck up" and passes them through -pedantic without complaint.
I suppose that the virtue of the check is that it helps prevent
namespace collision, but of course the compiler checks for that anyway
and general local vs global rules seem like they would handle any
accidents that crop up the right way. If I feel really, really
energetic someday I may go and segregate out the data and either add it
to the sources directly (in a lot of cases that's a good place for it
anyway) or put it in a separate include file per module. OTOH things
like the dh headers are shared because I DO access content from them in
lots of places and want to be able to get to it from anyplace, so there
will always be some ((unused)) attribute variables in the program.
Printing out the test description string for any given test, or looking
up the default tsamples or psamples, for example, is something any sort
of application that uses the libdieharder library might want to do.
* As per current GBT recommendations and Dirk's suggestions, all of
the auto-whatever stuff in autogen.sh is now basically a single
autoreconf call. In fact, it looks like they made autoreconf just
because getting all of the things just right after a major GBT update
is, in fact, the pain in the ass that it has been to me from the
beginning, so this is rather a relief. I did leave the configure call
in the bottom, so running autogen.sh should still take one from a clean
checkout to make-ready, or of course you can enter autoreconf by hand
and run configure by hand as per usual. Hopefully this will all make
Debianheads happy...;-)
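The define-based fixes and the unused attribute from the bullets above can be sketched together as follows. This is illustrative only. The log does not name the defines that were used, so the glibc feature-test macros here are an assumption, and demo_description and point_angle are hypothetical names.

```c
/* Assumption: glibc.  Feature-test macros must be set before any system
 * header is included.  _DEFAULT_SOURCE is the current spelling;
 * _BSD_SOURCE is the one older glibc releases understood. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#ifndef _BSD_SOURCE
#define _BSD_SOURCE
#endif

#include <assert.h>
#include <math.h>       /* M_PI, once the macros above are set */
#include <sys/types.h>  /* uint, likewise */

/* Data shared through a header but not referenced by every file that
 * includes it.  The attribute sits after the declarator, before the '='. */
static const char demo_description[] __attribute__((unused)) =
  "hypothetical test description string";

/* Angle in radians of the i-th of n equally spaced points on a circle. */
static double point_angle(uint i, uint n)
{
  assert(n > 0);
  return 2.0 * M_PI * (double) i / (double) n;
}
```

With the macros in place this should build under a strict ISO mode (for example gcc -std=c99 -Wall -pedantic) without complaints about uint, M_PI, or the unreferenced string.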
Things that I have NOT done yet -- this checkin is basically six hours
of work from 9 to 3 am plus another couple of hours today, so I'm
working as fast as I can as it is -- include debugging the endian
problem in the threefish (or was it AES?) code on e.g. a sparc or
powerpc set to the other endianness and dealing with a few real bug
reports that have come in from users already. I wanted to get the code
clean first as who knows, maybe doing so will help solve the problem?
SO, if you guys could each fix the two remaining problems (or tell me to
play through in spite of the fact that I'm not sure what is being
accomplished and what would break what) then I'll try to move on to the
next step.
rgb
------------------------------------------------------------------------
r523 | rgbatduke | 2011-03-10 11:09:12 -0500 (Thu, 10 Mar 2011) | 4 lines
A last minute oops. I wanted to mark operm5 as good, and mark all of
the monkeys suspect (as they can pretty easily be run to failure for
good generators still).
------------------------------------------------------------------------
r510 | rgbatduke | 2011-01-07 16:19:40 -0500 (Fri, 07 Jan 2011) | 31 lines
This is a WORKING snap and bump to 3.29.6beta. I actually fixed several
things that I broke before in the rng selection process. New features:
rng_kiss -- a damn fine rng. Faster than mt, better than mt except for
period.
rng_XOR -- Select this rng, and a list of others, e.g.
./dieharder -g 207 -g 208 -g 14 -g 6 -g 205 -a
dieharder will then return the output of 208 (kiss), 14 (mt19937_1999),
6 (gfsr4) and 205 (aes) all xor'd together. Period infinite, no LESS
random than the MOST random of the generators alone. The price you pay
is sure, 2, 3, 4 times slower. But this is now the official gold
standard dieharder testing generator, as finding something randomer will
be difficult and of longer period impossible (what is the least common
multiple of 19937, 121, and all the rest? 2 to that power, like that).
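The XOR combination described above can be sketched in a few lines. This is not the rng_XOR source: the demo_rng struct and the two toy generators (a 32-bit xorshift and a textbook LCG) are stand-ins chosen so the example has no dependencies, where dieharder itself would draw from its GSL-wrapped generators.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A generator is a state word plus a step function returning 32 bits. */
typedef struct demo_rng {
  uint32_t state;
  uint32_t (*next)(struct demo_rng *self);
} demo_rng;

/* Toy generator 1: 32-bit xorshift (13, 17, 5).  Seed must be nonzero. */
static uint32_t xorshift32_next(demo_rng *g)
{
  uint32_t x = g->state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  g->state = x;
  return x;
}

/* Toy generator 2: a 32-bit linear congruential generator. */
static uint32_t lcg32_next(demo_rng *g)
{
  g->state = g->state * 1664525u + 1013904223u;
  return g->state;
}

/* Advance every generator once and XOR the outputs together. */
static uint32_t xor_next(demo_rng *gens, size_t n)
{
  uint32_t out = 0;
  size_t i;
  assert(n > 0);
  for (i = 0; i < n; i++)
    out ^= gens[i].next(&gens[i]);
  return out;
}
```

Each call costs one step of every component generator, which is where the "2, 3, 4 times slower" price comes from.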
I'm working on superkiss, a vectorized version of kiss with an insanely
long period, but the double precision part is broken and I don't see
why. The integer part works. I'll figure it out maybe tonight, and
have a few other Marsaglia generators to add.
Then I'll return (finally) to tests, with the gold standard generator
well and truly in place. At least three dieharder (diehard) tests are
broken, and I'd like to fix at least ONE of them before I get bogged
down teaching again.
rgb
------------------------------------------------------------------------
r508 | rgbatduke | 2010-02-19 13:13:56 -0500 (Fri, 19 Feb 2010) | 9 lines
Oops, forgot to update FIRST. This should get me back in sync. You
guys should ignore this; I'm rearranging my whole source tree on my
laptop(s), dieharder included, and am just trying to make sure that the
rebuilt one is clean.
I also haven't completely forgotten the last post/request for interface
room -- I've just been insanely busy and haven't had time to even think
about it for the last few weeks. But I will get back there, I promise.
------------------------------------------------------------------------
r498 | rgbatduke | 2009-10-28 01:48:25 -0400 (Wed, 28 Oct 2009) | 28 lines
Wow, a lot of stuff. This checkin contains a working -Y 2 option for
"test to destruction" where ttd is by default a return pvalue of
0.000001 or less OR getting to 10000 samples alive (both parameters
can be set on the command line with -X tolerance and -Z cutoff). I
actually did it two ways, and will keep the second (better) one and
shortly remove the cruft in std_test.c. In addition I had to update the
help, I updated the output routines in output_rnds so I could dump a
list of formatted floats (to test another rng tester that alas was so
broken it couldn't read any format I tried anyway), I fixed and updated
the man page, I got rid of the old overlap variable (no longer
desirable or necessary, although I have a bit of cruft left behind to
clean up still).
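The stopping rule for "test to destruction" can be sketched as a loop with two exits. This is a minimal sketch, not the std_test.c code: the callback type, the step size, and all names are hypothetical. Only the two default limits (a pvalue of 0.000001 or less, or 10000 samples survived) come from the description above.

```c
#include <assert.h>

/* Hypothetical callback: cumulative pvalue of the test after nsamples. */
typedef double (*pvalue_fn)(unsigned int nsamples, void *ctx);

typedef enum { TTD_FAILED, TTD_SURVIVED } ttd_outcome;

/* Add samples `step` at a time until the pvalue drops to `tolerance`
 * or below (failure) or `cutoff` samples have been reached (survival). */
static ttd_outcome test_to_destruction(pvalue_fn pvalue, void *ctx,
                                       unsigned int step, double tolerance,
                                       unsigned int cutoff,
                                       unsigned int *samples_used)
{
  unsigned int n = 0;
  assert(step > 0 && samples_used != 0);
  while (n < cutoff) {
    /* Clamp the last step so n never passes cutoff or overflows. */
    n = (cutoff - n < step) ? cutoff : n + step;
    if (pvalue(n, ctx) <= tolerance) {
      *samples_used = n;
      return TTD_FAILED;
    }
  }
  *samples_used = n;
  return TTD_SURVIVED;
}

/* Demo callback: looks fine until a threshold, then fails outright. */
static double demo_step_pvalue(unsigned int nsamples, void *ctx)
{
  const unsigned int *threshold = ctx;
  return nsamples >= *threshold ? 0.0 : 0.5;
}
```

Returning the sample count alongside the outcome matters here, since the number of samples needed to reach failure is the quantity of interest.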
As a result of the initial ttd test, I am certain that there is a
problem with diehard_dna, one that causes it to fail aes at 0.000001 in
1500-4000 samples. This is odd, since this test has an "exactly"
computed mean and sigma target. I may try threefish in a second to see
if it fails too, in the same general order. Haven't done the auto-xor
generator thing yet.
I still have to implement -Y 1 (resolve ambiguity mode) where it will
force a test to fail or come back up with more samples, but it should be
straightforward. However, it is almost 2 am and I teach way too early,
so it is off to bed.
rgb
------------------------------------------------------------------------
r497 | rgbatduke | 2009-10-22 09:45:52 -0400 (Thu, 22 Oct 2009) | 20 lines
This is a not-quite-yet-broken checkin of -Xtreme mode changes. Three new
control variables are in place. They are parsed (untested). They are
used in std_test to allocate much larger pvalue vectors (Xcutoff in
size) at test initialization time. I'm JUST READY to hack into the main
std_test execution loop with case switches or other conditionals and
implement at least resolve ambiguity and ttd modes. But as usual, I
have to go in and teach. At the moment, though, DH still builds and
runs -a correctly, so it seems like a good idea to check in a still
working snap in case I break everything and want to start over from
here.
Oh, I also am cleaning up a bit and made the multiply_p variable (-m
option) a double, so you CAN enter -m 0.1 and run only 10 psamples for a
fast version of -a(ll). At this point a lot of debugging is just
ensuring that all the tests run, and it is a PITA to wait 30+ minutes
for a -a(ll) run to get through. So you now CAN test fewer than the
default number of psamples in an -all run, even though most people won't
use the option in actual testing. The usual usage, -m 10 or -m 100,
works fine still.
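The effect of making the multiplier a double can be sketched as a small scaling helper. The function name, the rounding rule, and the floor of one psample are assumptions for illustration, not a description of how dieharder applies multiply_p internally.

```c
#include <assert.h>

/* Scale a test's default psamples count by the -m multiplier. */
static unsigned int scaled_psamples(unsigned int default_psamples,
                                    double multiply_p)
{
  double scaled;
  assert(multiply_p > 0.0);
  scaled = (double) default_psamples * multiply_p;
  if (scaled < 1.0)
    return 1;                           /* assumption: never below one */
  return (unsigned int) (scaled + 0.5); /* round to nearest */
}
```

With a default of 100 psamples, a multiplier of 0.1 gives 10 and a multiplier of 10 gives 1000, matching the two uses described above.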
------------------------------------------------------------------------
r495 | rgbatduke | 2009-10-20 14:25:32 -0400 (Tue, 20 Oct 2009) | 34 lines
I'm checking in a lot of changes down below. -m is implemented and
documented. -k is implemented and documented. The man page is fixed
(post good-kstest and aes/threefish). The endian bug went away when I
refreshed the include files, making me wonder if it wasn't some sort of
strange GBT stuff and not a real problem -- I left in the endian code in
configure.ac but don't use it. I re-fixed diehard_runs.c -- it was
broken post patching but now seems good. I filed some documentation and
bug reports. I fixed a number of pernicious warnings about needing
casts (one still remains in threefish, but it is David's and I don't
know how to fix it). I worked on dieharder.html.in pretty substantially
to get it to match all of the above.
Next, -x (and maybe -X).
(p.s. -- Welcome David.)
(p.p.s. -- I'm still testing -- sigh, forever -- but it looks like all
non-deprecated tests are working OK in this snap, and that the -m
feature works nicely. I documented timings for the k options, and
basically it comes down to the kstest being too slow to do large numbers of
samples without switching over to the asymptotic form of the test at
some point. I mean, going from three minute runs to over three hours
and still counting when I quit for a factor of 10 difference in the
number of samples, really serious nonlinear gains in the amount of
work/time required, and this was still -k 1 with Marsaglia's more modest
speedup, not even the "exact" mode with no speedup at all.
This will quite possibly require some further hacking of the boundaries
for a crossover that is "practical" and not too inaccurate as we gain
experience with our own patience, especially as we implement a -x like
option that just keeps crankin' on the number of samples to hit a
prespecified tolerance for failure.)
------------------------------------------------------------------------
r494 | rgbatduke | 2009-10-19 09:46:39 -0400 (Mon, 19 Oct 2009) | 3 lines
This is all of the Bauer patches. Some are tested, but the testing
continues.
------------------------------------------------------------------------
r493 | rgbatduke | 2009-10-18 10:43:52 -0400 (Sun, 18 Oct 2009) | 108 lines
This is checking in what will be 3.29.4beta.
Primary fixes so far:
Several changes to configure.ac to eliminate all reference to libaes
and to set macros ENDIAN_BIG or ENDIAN_LITTLE to 1 in the configure
stage of the build. I plan to insert a very simple prequel in Brian
Goodman's brg_endian.h header file that handles endian issues cleanly
and skips most of the stuff below for little endian. I do need to
ensure that it builds on i386 as well (when I'm done) as I have a report
that 3beta doesn't build clean on that architecture due to problems in
this header file.
A fix due to Glenn Emelko, [email protected], where I correctly
bumped filecount to type off_t in libdieharder.h but failed to redefine
rtot and rptr accordingly in the rng_file_input.c struct and code. He
was running 18 GB raw files and obviously this overflowed uint variables
with bad results. Oops, and thanks Glenn.
I am trying to get sts_serial.h to run at 24 bits by default, not 16
(I think that this will still take a not unreasonable amount of time).
The problem is that sts_serial doesn't use bits.c calls to parse out the
next 24 bits, it just grabs 16 bits, then the next 16 bits, out of 32 bit
uints. This is fast but not scalable. I have to go in and edit the
code to use bits calls to get the next 24 bits, no foolin, or better yet
use -n ntuple to set the maximum number of bits tested (that's my real
goal, with 24 being the default).
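Pulling an arbitrary number of bits out of a stream of 32-bit words, rather than splitting each word into two 16-bit halves, can be sketched with a small bit buffer. This is not the bits.c interface: the bit_reader struct and next_bits function are hypothetical, and the most-significant-first bit order is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  const uint32_t *words;  /* source of 32-bit random words */
  size_t nwords;          /* number of words available */
  size_t pos;             /* index of the next unread word */
  uint64_t buffer;        /* unread bits live in the low nbits bits */
  unsigned int nbits;     /* number of unread bits in buffer */
} bit_reader;

/* Fetch the next n bits (1 <= n <= 32), most significant first.
 * Returns 0 on success, -1 if the source runs out of words. */
static int next_bits(bit_reader *r, unsigned int n, uint32_t *out)
{
  assert(n >= 1 && n <= 32);
  while (r->nbits < n) {
    if (r->pos >= r->nwords)
      return -1;
    /* nbits <= 31 here, so at most 63 live bits after the refill. */
    r->buffer = (r->buffer << 32) | r->words[r->pos++];
    r->nbits += 32;
  }
  r->nbits -= n;
  *out = (uint32_t) ((r->buffer >> r->nbits) & ((UINT64_C(1) << n) - 1));
  return 0;
}
```

At 24 bits per call, three input words yield four ntuples with nothing wasted, and the same routine would serve any width up to 32 if the width came from an -n style option.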
Plans: David Bauer sent me a fairly extensive patch against 3.29.3beta
that fixes some memory leaks and/or speed issues in bits (?) as well as
fixing some parts of the diehard OP code -- probably fixing Marsaglia's
bugs and not mine, but hard to say. There are bugs in there and I've
already squashed several so it wouldn't surprise me if there are more
(even more of my own:-). I'm going to TRY to implement most of his
fixes if they work well and seem to fix something that makes sense to
me, although I'd feel better (per fix) if I could find a test case that
illustrates the failure. I may have to ask him how he found the bugs so
I can document them in svn somewhere, later. Memory leaks of course are
relatively easy and again, I could easily have created some -- getting
rid of them is definitely called for. David is also looking at the
rgb_bitdist tests (which SHOULD be as sensitive as the OP tests if
cranked up to the correct degree) -- there may be some fixes there
coming.
Finally, I have a few operational changes in mind -- primarily adding or
altering the new interface in a couple of small ways. Kuiper will go
away as an option (but not the code -- I'll leave a macro in place that
can switch it back on in case there is ever any point in reconsidering
the test, if for example I or David or somebody else figures out an
exact CDF for it so it becomes as accurate and perhaps faster than KS,
or it is needed for a specific rng test in the future). -k flags will be
used to control how hard ks works (and hence how fast vs accurate
dieharder is) with a default of pretty fast, pretty accurate and
alternatives of slow but to-convergence-exact and really fast but only
accurate enough for the short version of the -a(ll) run.
I'd also like to introduce two new run modes controlled by flags. One
of them, -m(ultiplier), will allow the user to enter a scale factor to
be applied to the default 100 -psamples used in -a(ll) runs (otherwise
ignored). So if one wants to run all the tests but with 1000 psamples
per test (or 10x whatever the per-test default is) one runs -a -m 10.
This should make it MUCH easier to test to destruction, increase test
resolution, etc.
Second, I want to introduce a flag that runs a test "to failure" --
something I've planned to do for a long time. David has already hacked
in his own version of doing this, and I used to do something very
similar in my numerical simulations. The idea of running in -x(treme)
mode (or whatever I name the flag) would be to start with e.g. 100
pvalues and then add 100 pvalues at a time to the test run until the
final pvalue fails a fairly stringent (user selectable) cutoff.
-d 1 -x 0.000001
would add psamples to the birthdays test until the final pvalue is under
0.000001.
-d 1 -X 0.000001
would do the same thing, but it would run the test to this degree of
failure psamples = 100 times with different rng seeds (if appropriate)
and return something like max, min, mean number of psamples required to
cause a test failure. SOMETHING like this is going to be needed,
because I think it is entirely plausible that some tests have
"poisonous" seeds that have just the right prime modulus to introduce
correlations in their stream, but that NEARLY ALWAYS are started with
seeds that yield good streams.
I'd like to have these last two options work for -a -m runs as well, so
-a -m 10 -X 0.000001
runs all tests until they fail low or high at one part in a million,
1000 times for different seeds per run, returning the average number of
psamples required to reach failure. I'd even like to be able to plot
the distribution of this number so one can pick out e.g. bimodal
distributions (bad seeds!) etc.
At some point being able to do everything that dieharder will want/need
to do is going to require a GUI -- something that can generate scatter
plots, candlesticks, real non-ascii histograms, line graphs, 2d/3d
surfaces. But that's still a ways in the future. -X is going to be
pretty tricky as well, as dieharder isn't equipped to return anything
but a final cumulative "pvalue" in [0,1] for a test. But it is probably
better to do it now in the beta phase where this doesn't really damage
any other future dependent interfaces (e.g R).
------------------------------------------------------------------------
r492 | rgbatduke | 2009-10-12 18:53:13 -0400 (Mon, 12 Oct 2009) | 8 lines
This is mostly to check in the dieharder NSF proposal from last year as
it has a roadmap for future dieharder development, and I'm thinking hard
about adding a few of the many missing generators now that kstest is
reliable. I'm still working on kstest, mind you, but it is mostly on
the details, not on the basic code.
rgb
------------------------------------------------------------------------
r491 | rgbatduke | 2009-10-12 14:55:55 -0400 (Mon, 12 Oct 2009) | 6 lines
This actually works PRECISELY for all count ranges. It is still in
testing -- I've got a bit of work to do to be ready to release this
globally (including letting David test it and see if he agrees) but it
should COMPLETELY FIX dieharder's final kstest (and I'll give it one
last opportunity to fix diehard_sums():-).
------------------------------------------------------------------------
r490 | rgbatduke | 2009-10-12 12:39:51 -0400 (Mon, 12 Oct 2009) | 10 lines
Checking in some key papers (and some stuff getting rid of broken
diehard_sums altogether for now -- leaving in the test but strongly
deprecating it in dieharder). The papers SHOULD permit us to compute
the exact CDF for the one-sided KS test against a uniform distribution
for small N and thereby make the KS test reliable for all sample sizes.
In particular ks_CDF_N.pdf looks like it will do the trick.
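For reference, the quantity whose distribution those papers address can be computed directly. The sketch below evaluates only the one-sided statistic D+ for samples tested against the uniform distribution on [0,1]. It does not implement the exact small-N CDF, and the function names are hypothetical and not taken from kstest.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for doubles, ascending. */
static int compare_doubles(const void *a, const void *b)
{
  double x = *(const double *) a, y = *(const double *) b;
  return (x > y) - (x < y);
}

/* One-sided KS statistic against U(0,1):
 *   D+ = max over i of ( i/n - x_(i) ),  i = 1..n,
 * where x_(i) is the i-th smallest sample.  Sorts x in place. */
static double ks_dplus_uniform(double *x, size_t n)
{
  double dplus = 0.0;
  size_t i;
  assert(n > 0);
  qsort(x, n, sizeof(double), compare_doubles);
  for (i = 0; i < n; i++) {
    double d = (double) (i + 1) / (double) n - x[i];
    if (d > dplus)
      dplus = d;
  }
  return dplus;
}
```

Turning D+ into a pvalue is the step that needs the exact CDF when n is small, which is the part the papers are meant to supply.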
rgb
------------------------------------------------------------------------
r489 | rgbatduke | 2009-10-11 11:05:24 -0400 (Sun, 11 Oct 2009) | 32 lines
FINALLY I got threefish to work. brg_endian.h was broken as shit; it
starts off by pulling something from crypt.h that is obviously broken on
modern linux boxes (at least my Fedora 11 x86_64 box). The remaining
code looks quite general and seems to work, although I have to admit I
absolutely hate crap code like this -- it smacks of aimk, imake, and
other crap tools that detect platform type using some sort of transient
trace from one tool or another that breaks three years later (or rather,
requires yet another conditional). I'll leave this in for now in case
somebody wants to port to sparc or some bigendian platform, but since we
are using threefish only to make random numbers and don't care to ever
decrypt the stream of 0's or whatever it is applied to, I honestly doubt
that it matters. Getting endianness wrong sounds like it is at worst an
extra byte shuffle.
Either way, this will be 3.29.2beta and I'll put it up on the dieharder
website in a few minutes (after a full -a run of threefish passes). I
may add a comment to brg_endian.h indicating my hack, lest people be
tempted to use it as if it weren't modified.
Grrr. I'm REALLY tempted to just strip it to the two line definition
that is all that matters in skein_port.h and screw the whole "automagic"
thing. Robust code is robust code, and there are bound to be
intrinsically portable ways to handle endianness IF it is really
necessary in the first place.
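One intrinsically portable approach is to build multi-byte words from individual bytes with shifts, so the host byte order never enters the picture. This is a sketch of that general technique, not a replacement for the code in skein_port.h. The function names are hypothetical, and little-endian is assumed to be the byte order wanted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 64-bit value from 8 bytes stored least significant first.
 * Works identically on big- and little-endian hosts. */
static uint64_t load_le64(const unsigned char *p)
{
  uint64_t v = 0;
  int i;
  assert(p != NULL);
  for (i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Write a 64-bit value out as 8 bytes, least significant first. */
static void store_le64(unsigned char *p, uint64_t v)
{
  int i;
  assert(p != NULL);
  for (i = 0; i < 8; i++) {
    p[i] = (unsigned char) (v & 0xff);
    v >>= 8;
  }
}
```

No preprocessor detection of the platform is involved, so there is nothing to go stale when a header or toolchain changes.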
At least this finally liberates me to move on and work on kstest and
kuiper again. That's been on hold for a few days, but I'm feeling like
we're getting close to having one or the other work "perfectly" (if I
can find and add the missing O(1/N) correction terms from the
literature).
------------------------------------------------------------------------
r488 | rgbatduke | 2009-10-08 12:50:29 -0400 (Thu, 08 Oct 2009) | 25 lines
This is most of the threefish stuff required, but I'm still having
trouble with the big/smallendian conversions apparently needed by skein
in threefish. One function that is supposed to be defined automagically
is coming out UNdefined in the linker, which is "bad". I may have to
ask David Bauer how he got this to compile. Note that I've added both
bauer and emelko's current round of bug reports and remarks to the Bugs
directory below. David in particular has been really looking hard at
kstests, and with good justification. The kstest is apparently very
poorly defined even in stats texts and the literature. It is apparently
more broken in R than it is in dieharder, and it is still a bit broken
in dieharder.
As is so often the case in dieharder problems, pushing the test suite to
new limits exposes weaknesses in code that has long been taken for
granted because it has never been used for a rigorous analysis of this
sort. But it NEEDS a precise ks or kuiper test, not just a sorta-useful
approximate one, or one cannot rely on its statements of weakness or
failure.
Anyway, this checkin is still broken but is within one #define or so of
working, I think, once I figure out how to do it without violating the
code in the brg_endian.h include that is supposed to automagically
select the right Skein function that is currently undefined.
------------------------------------------------------------------------
r487 | rgbatduke | 2009-10-07 12:19:44 -0400 (Wed, 07 Oct 2009) | 16 lines
David Bauer contributed two cryptographic grade GSL wrapped rngs (one of
which I had been working on myself, but his has no dependencies and it
works already). rng_aes appears to work, very respectably. It has
minimal controls (compared to aespipe) but aespipe is still there if
people want to play with it directly. It isn't too shabby speedwise,
actually, for what should be a world-class rng. I'm going to see if he
(David) cares if I contribute it back to the GSL -- it needs a few
generators like this in its collection. Although as it is GPL the
answer is obviously not, I think.
In a second I'm going to insert rng_3fish as a second one. These are
enormously useful for testing dieharder itself, and as GPL sources will
be useful just being part of the dieharder unless/until they make the
GSL.
------------------------------------------------------------------------
r486 | rgbatduke | 2009-10-06 14:26:39 -0400 (Tue, 06 Oct 2009) | 3 lines
Little fixes, ignore. Added -d 204 to -all properly, fixed its
autodocumentation a bit.
------------------------------------------------------------------------
r485 | rgbatduke | 2009-10-06 14:17:43 -0400 (Tue, 06 Oct 2009) | 35 lines
This records a validation script to use with aespipe to produce a
"standard run" of dieharder in -v3. aespipe with the fixed, trivial
256-bit key in aeskey below, is used to encrypt /dev/zero into a stdin
stream and fed to dieharder -a. The encrypted stream should be as close
to "truly random" as we can currently manage with simple, reasonably
fast tools. The interesting thing is that this stream actually PASSES
ALL OF THE TESTS in dieharder, even the "known bad" tests such as
diehard_operm5. This makes it very, very useful for comparison
purposes. For example, for the first time ever, I feel like I can now
say that mt19937 actually FAILS dieharder (or has weaknesses that are
explicitly exposed by dieharder) when it consistently has tests (even
very specific tests for very specific length ntuples) on which it is
weak or fails or exhibits high bias in its output pvalues.
To be fair, passing all of the tests isn't necessarily a good thing,
since there are over 100 of them including ntuples. One expects 1/100
or 1/200 or thereabouts to be weak for a PERFECT RNG on most runs.
Eyeballing the distribution of final P in the aespipe run reveals that
dieharder still produces a weak high bias in the final distribution of
pvalues, but this is very much in line for the bias revealed by
rgb_kstest_test and is therefore very likely an artifact of using -p 100
as the default for most of the tests in -a(ll).
I'm going to run the validation line:
cat /dev/zero | aespipe -P aeskey | ./dieharder -g 200 -a
with -p 1000 just for grins (which will take the rest of the day, I
expect) and see if it doesn't push the final distribution right back
where it belongs, with less visible bias towards the 0.9-1.0 range and
away from 0.0-0.1 on the bottom.
Still, a perfect PASS for a nearly perfect generator. How cool is that?
------------------------------------------------------------------------
r484 | rgbatduke | 2009-10-06 12:45:36 -0400 (Tue, 06 Oct 2009) | 40 lines
This is a set of changes that:
a) Fix (for the time being) a problem with ltmain.sh, badly. I
suspect that I'll need to add a libtoolize command to the autogen.sh
script in order to prevent drift from local libtools in the long run, or
give in and make it a link to /usr/share/libtool/config/ltmain.sh and
pray that this is portable.
b) Changes the default ks test in dieharder from broken Kuiper or
broken KS to fixed KS. This is a CRITICAL fix and needs to be backported
to 2.28, as with it dieharder will FINALLY give much more nearly correct
pvalues for the relatively small number of pvalue samples in the kstest
at the end of each test. With the old code one needed two or three
orders of magnitude more samples -- at LEAST -p 10000 -- in order for
the final pvalue to be not VISIBLY high biased when applied to perfect
uniform deviates. With the fix -p 100 works "OK" although -p 1000 would
be better and will probably be the default -a(ll) option in 3.x.
The actual fixes are a single line in dieharder/set_globals.c (change
the comment name but not the default number of the ks_test global), a
single line in libdieharder/kstest.c, and switching the order in
libdieharder/std_test.c so that ks_test == 0 runs kstest, not
kuiper_kstest. Fixing the documentation is probably not worth it in
2.28.
I would suggest still holding out on actually making the fixes for a
bit, as I'm actively playing with things and testing out the new code
(in a moment with aespipe as I still haven't finished rng_aes). The
changes are preserved and saved as 3.29.1beta, though, with the addition
of a very useful and useable rgb_kstest_test routine that can be used to
further debug and/or improve the kstest used to generate final test
pronouncements of pass/fail/weak etc. And we still need to decide if it
is time to move on to v3, as a lot of people are using it and it seems
to be stable and usable and has lots of bug fixes and feature
enhancements (including much better future scalability as I add tests
and generators).
rgb
------------------------------------------------------------------------
r483 | rgbatduke | 2009-10-06 09:08:59 -0400 (Tue, 06 Oct 2009) | 7 lines
The new rgb_kstest_test in this version actually works, but it looks
like we have some sort of libtool derived bug in the build. I'm
checking in clean so I can rerun libtoolize, which will hopefully get me
a new ltmain.sh, which will hopefully build a libtool script that
contains the correct ECHO/echo lines and perhaps deals with MODE
correctly.
------------------------------------------------------------------------
r482 | rgbatduke | 2009-10-04 09:44:12 -0400 (Sun, 04 Oct 2009) | 15 lines
Ignore today's checkins, Dirk. I'm adding a new test (rgb_kstest_test)
to test the kstest routines (as well as to MAYBE function as a new test
in the suite, but I doubt that it will be sensitive enough to be any
use). Basically, I plan to fill a vector with tsamples uniform
deviates, run a kstest on them (which tests for uniformity and generates
a pvalue that should itself be a uniform deviate) to fill in the usual
vector of pvalues and run the final kstest on that. A kstest SHOULD
recursively take uniform deviates to a uniform deviate, for a large
enough set of uniform deviates, and I want to find out a) if this is
true; and b) if it is true just what a "large enough set" is. This test
should help me find out both, and if a) is incorrect, to perhaps "fix"
the kstest as this is their THEORETICAL behavior and failure to
accomplish this indicates a bug in the code or a real problem in the
theory...
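The plan described in this entry can be sketched roughly as follows. This is not the code in libdieharder/kstest.c: it is a standalone illustration that assumes the asymptotic Kolmogorov series with Stephens' small-sample correction for the pvalue, and the names ks_pvalue_uniform and recursive_ks are hypothetical.

```python
import math
import random


def ks_pvalue_uniform(samples):
    """KS pvalue for the hypothesis that samples are uniform on [0, 1)."""
    xs = sorted(samples)
    n = len(xs)
    # D is the largest gap between the empirical CDF and the uniform CDF F(x) = x.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    # Assumed approximation: asymptotic Kolmogorov tail, Stephens' correction.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the tail probability is 1 to many digits here
    total = 0.0
    for j in range(1, 101):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


def recursive_ks(rng, tsamples, psamples):
    """Run psamples inner KS tests of tsamples deviates, then KS-test the pvalues."""
    pvalues = [
        ks_pvalue_uniform([rng.random() for _ in range(tsamples)])
        for _ in range(psamples)
    ]
    return ks_pvalue_uniform(pvalues), pvalues


if __name__ == "__main__":
    final_p, _ = recursive_ks(random.Random(), tsamples=1000, psamples=100)
    print("final pvalue:", final_p)
```

Repeating the run for several tsamples and psamples values and looking at how the final pvalue is distributed is one way to probe what a "large enough set" means for a given kstest implementation.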
------------------------------------------------------------------------
r481 | rgbatduke | 2009-10-02 16:32:30 -0400 (Fri, 02 Oct 2009) | 3 lines
This checks in what might be a VERY IMPORTANT fix to kstest, due to
David Bauer. Needs more testing, though, with a world class crypt.
------------------------------------------------------------------------
r480 | rgbatduke | 2009-03-17 08:27:23 -0400 (Tue, 17 Mar 2009) | 2 lines
Checking in so I can leave.
------------------------------------------------------------------------
r479 | rgbatduke | 2009-03-17 00:28:25 -0400 (Tue, 17 Mar 2009) | 3 lines
This is broken as far as the aes generator is concerned, AND I'll
probably need to put libaes into the dieharder packaging.
------------------------------------------------------------------------
r478 | rgbatduke | 2009-01-29 10:57:43 -0500 (Thu, 29 Jan 2009) | 8 lines
Checking in a LOT of changes and additions associated with v3 -- I've
been holding them so as not to screw up the RDH side of things before
everything stabilizes. A lot of the stuff below is documentation
intended to guide future development and additions. Some of it is fixes
(data and otherwise) in diehard tests. Some of it fixes the way
dieharder (the binary, not library) initializes (and adds local tests)
and runs all tests.
------------------------------------------------------------------------
r477 | rgbatduke | 2008-10-08 15:11:30 -0400 (Wed, 08 Oct 2008) | 3 lines
Sending in a minor change to START fixing up parsecl.c to be more
robust.
------------------------------------------------------------------------
r476 | rgbatduke | 2008-09-29 22:22:38 -0400 (Mon, 29 Sep 2008) | 26 lines
This checkin should make Mattias "perfectly happy". It enables:
rgb@lilith|B:1140>./dieharder -a -D default -D -1 -D prefix -D no_whitespace -D show_num -s 1
0|rng_name|num|rands/second|
1|mt19937|13|1.17e+08|
0|test_name|num|ntup|tsamples|psamples|p-value|Assessment|Seed
2|diehard_birthdays|0|0|100|100|0.16302070|PASSED|3542794731
2|diehard_operm5|1|5|1000000|100|0.04115096|PASSED|2304163927
2|diehard_rank_32x32|2|0|40000|100|0.92631752|PASSED|2245496723
2|diehard_rank_6x8|3|0|100000|100|0.86585575|PASSED|3223183182
2|diehard_bitstream|4|0|2097152|100|0.60520232|PASSED|2615297461
2|diehard_opso|5|0|2097152|100|0.05852624|PASSED|1542897414
which is, AFAICT, exactly what he wants. Oh, he wants the full test
name as an output field option instead of the short name, but he might
have to wait on that...
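For anyone consuming this output from a script, a parser might look something like the sketch below. It assumes, from the sample above only, that a leading 0 marks a header row and that any other prefix marks a data row under the most recent header; parse_table is a hypothetical helper, not part of dieharder.

```python
def parse_table(lines):
    """Group prefixed, pipe-delimited rows under the most recent header row."""
    tables = []
    for line in lines:
        fields = line.strip().split("|")
        if fields and fields[-1] == "":
            fields.pop()  # some rows end with a trailing '|'
        if len(fields) < 2:
            continue  # skip blank or non-table lines
        prefix, values = fields[0], fields[1:]
        if prefix == "0":  # assumed: prefix 0 marks a header row
            tables.append({"header": values, "rows": []})
        elif tables:
            header = tables[-1]["header"]
            tables[-1]["rows"].append(dict(zip(header, values)))
    return tables


# Sample rows copied from the output shown in this entry.
SAMPLE = """\
0|rng_name|num|rands/second|
1|mt19937|13|1.17e+08|
0|test_name|num|ntup|tsamples|psamples|p-value|Assessment|Seed
2|diehard_birthdays|0|0|100|100|0.16302070|PASSED|3542794731
2|diehard_operm5|1|5|1000000|100|0.04115096|PASSED|2304163927
"""

if __name__ == "__main__":
    for table in parse_table(SAMPLE.splitlines()):
        for row in table["rows"]:
            print(row)
```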
This also checks in a couple of minor bugfixes reported by Mattias and
Marc Abel. Marc has another feature request I haven't looked at yet.
Both of them are using dieharder quite heavily in the beta version, so
I'm hoping that it is shaking out. I'm also hoping this round of
changes didn't break anything.
Not quite ready for a release, but perhaps getting closer.
------------------------------------------------------------------------
r475 | rgbatduke | 2008-09-22 19:10:01 -0400 (Mon, 22 Sep 2008) | 2 lines
Added small section to man page on output control.
------------------------------------------------------------------------
r474 | rgbatduke | 2008-09-22 07:24:15 -0400 (Mon, 22 Sep 2008) | 12 lines
Small changes to add dieharder-config.in to the Makefile.am and to
get the rpm to autobuild with a split between @VERSION@ and
@DIEHARDER_LT_VERSION@ -- I basically twinned the latter into
@DIEHARDER_LIB_VERSION@. A slight pain, but it means the library can
have a different version (no beta) compared to the program (with beta).
The successful RPM build means that everything is in place, although
there is still cruft in include and probably libdieharder and there may
be NON-cruft that isn't in the repo. But I gotta go and won't find it
now.
------------------------------------------------------------------------
r473 | rgbatduke | 2008-09-22 03:29:33 -0400 (Mon, 22 Sep 2008) | 24 lines
OK, this is PARTIALLY decrufted -- I doubt that it is finished yet, and
I haven't even started to tackle the proper decrufting of the library.
I've cleaned up the dieharder man page, checked all the autodocumenting
features of dieharder, and run a bunch of tests. I've preemptively
fixed around three or four bugs, and finished implementing a couple of
features that were missing on the previous checkin, e.g. the ability to
use any of:
dieharder -d diehard_sums -g 6
dieharder -d 14 -g gfsr4
...
(all tests AND rngs selectable by name OR number). It is 3:20 am, and I
have to get up by 6:40. It is therefore bedtime.
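Selection by name OR number amounts to a two-way lookup, roughly as sketched here. The resolve helper and the small TESTS table are hypothetical; the numbers are taken from sample table output elsewhere in this log and are illustrative only, not a statement of dieharder's numbering in any particular release.

```python
# Illustrative table only; numbering is an assumption for this sketch.
TESTS = {
    0: "diehard_birthdays",
    1: "diehard_operm5",
    2: "diehard_rank_32x32",
}


def resolve(token, table):
    """Return (number, name) for a token given as either a number or a name."""
    if token.isdigit():
        number = int(token)
        if number not in table:
            raise KeyError("no entry numbered %d" % number)
        return number, table[number]
    for number, name in table.items():
        if name == token:
            return number, name
    raise KeyError("no entry named %r" % token)


if __name__ == "__main__":
    print(resolve("1", TESTS))
    print(resolve("diehard_operm5", TESTS))
```

The same lookup would serve for generators if given a table of rng names instead.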
If I haven't forgotten to checkin any files, it should build and run
pretty well. Probably not perfectly, but pretty well. Matthias should
be happy -- if he uses -c ',' and -D prefix, he'll get close to exactly
what he wants. Everything seems to be working as far as I can tell with
limited testing.
Might be a day or three before I can really tackle this again. G'night.
------------------------------------------------------------------------
r472 | rgbatduke | 2008-09-22 00:13:56 -0400 (Mon, 22 Sep 2008) | 92 lines
OK, this is a checkin of dieharder 3.28.0beta. It is NOT fully
decrufted, but seems to mostly nearly hopefully all work. That probably
means that there are only a dozen or two bugs. There are also a few
API features I haven't implemented in the UI yet -- specifically the
reporting of errors (like a rewind of a file in mid-test). So this
ISN'T really a beta -- more like an alpha.
Dirk, please do not start converting this over into Rdh yet. I'm
checking it in for two reasons -- one is that I have to remove a whole
pile of files to decruft and svn won't let me until I check in. Another
is that I NEED to checkin -- it makes me nervous to have this large a
delta not checked in. There are probably a couple of critical sources
I've forgotten to add entirely and I won't know until I check in and
check out and build a fresh clean copy.
MOST of what I've done, from Rdh's point of view, should be invisible
after Rdh is (fairly minimally) hacked one last time. Basically Rdh
should use its own version of set_globals.c (or patch mine, or ifdef
mine). Note that there are a lot fewer variables, and this list may
shrink a bit more.
ntuple's meaning hasn't changed, and you already handle that.
Seed and strategy work together -- the latter is a new variable and
SHOULDN'T affect Rdh, but just in case, here's what it does.
The default strategy for dieharder is to reseed once when a rng is
chosen. In the default output view, the seed is written to the rng
information part of the header. That way if one wishes to reproduce
a test result, one can enter the seed with -S seed.
However, this is actually a PROBLEM if one runs multiple tests from this
one seed. If one runs tests out of order, the results will be
different. This is true for me running all the tests in order via
dieharder -a if I should ever change test order, and is true for Rdh if
one runs first one test, then another in different orders, from the same
single specification of rng and/or Seed.
Also, there may be situations where one wants to run a single test
multiple times, each time from a newly selected seed to (in essence)
determine if some seeds are "bad" for a given rng. dieharder doesn't
yet support that, but I think that in R it would be pretty easy.
Setting strategy to anything nonzero (say, 1) causes dieharder to reseed
the random number generator at the beginning of any test. If -S seed
was NOT specified, it just generates a new random seed, so that one
could run e.g. diehard_birthdays 100 times in a loop and each one would
reseed anew with a new random seed. If -S seed IS specified, it uses
the specified seed. If a file is being used for input (not stdin) it
forces a rewind at the beginning of each test, which is actually not a
bad thing to do as it conserves rands. (I hope; I haven't tested
this latter feature much yet, but it should work. :-)
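The seeding rules described above can be modelled in a few lines. This is a sketch of the stated behaviour only, not dieharder's implementation: the Harness class is hypothetical, Python's random module stands in for the rng under test, and the file rewind case is left out.

```python
import random


class Harness:
    """Toy model of the seed/strategy rules (hypothetical, not dieharder code)."""

    def __init__(self, strategy=0, seed=None):
        self.strategy = strategy  # corresponds to -s in the text
        self.seed = seed          # corresponds to -S in the text; None if unset
        self.rng = random.Random()
        self.seeds_used = []
        self._reseed()            # default: seed once when the rng is chosen

    def _reseed(self):
        # Use the fixed seed if one was given, otherwise draw a fresh random one.
        if self.seed is not None:
            s = self.seed
        else:
            s = random.SystemRandom().getrandbits(32)
        self.rng.seed(s)
        self.seeds_used.append(s)

    def run_test(self, test):
        if self.strategy != 0:
            self._reseed()        # nonzero strategy: reseed before every test
        return test(self.rng)


if __name__ == "__main__":
    def draw(rng):
        return [rng.random() for _ in range(3)]

    fixed = Harness(strategy=1, seed=1)
    print(fixed.run_test(draw) == fixed.run_test(draw))
```

With a nonzero strategy and a fixed seed, each test sees the same stream no matter what ran before it, which is what makes results independent of test order.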
SO, Rdh will probably just leave strategy = 0 alone and either set the
global value of Seed for one-time initialization with a fixed seed or
not, accepting a one-time random seed. But you CAN support strategy if
you ever think you need to.
(From my point of view its primary purpose is to make the creation of a
validation run trivial -- if I run
dieharder -a -S 1 -s 1
I generate a validation table. If I run
dieharder -a -S 1 -s 1 -D test_name -D pvalues
I generate a very sparse validation table (basically, just test name and
pvalue). You can throw a -c ' ' in there if you want white space
separation or -c ',' if you prefer comma separated values, etc. You
have nearly complete control over dieharder's output at this point, see:
dieharder -F
for a listing of output control flags. dieharder -l and dieharder -h
and dieharder -g -1 all work as before, but I've completely flattened
and rationalized test-space so it works just like rng-space (and I mean
JUST like it -- very similar setup).)
I completely changed (seriously streamlined and cleaned up) the test
call procedure, so a SINGLE run_test() routine does pretty much all the
work, a SINGLE output() routine does all of THAT work, and so that all
the dieharder CLI-specific stuff is done in parsecl(), and then only if
you enter specific commands, or in dieharder.c (main()). I tried to
label things that are CLI specific there as well.
If you want to grab a copy of this and build it and play with it, feel
free. As soon as this checkin is complete, though, I'm going to start
decrufting and checking to be sure I have all the required modules
actually in the repo.
------------------------------------------------------------------------
r453 | rgb | 2008-09-10 07:16:17 -0400 (Wed, 10 Sep 2008) | 14 lines
Dearie me. This checkin actually works, although I still haven't
implemented the output.c patch needed by Dirk or fixed the missing .h
file in include/dieharder. Still, I >>HAVE<< turned all the tests into
type int's (still no returns) and stripped dieharder.h and split off a
globals.h file that I don't think I'm going to need, actually, although
it was useful while stripping dieharder.h as a reservoir of codelets
that I needed to put back.
Anyway, it is entirely possible that Dirk will read these words as I
ALSO have dieharder.googlecode.com set up with him on it, and while this
checkin is still local (about to be svnsync'd up, not directly checked
in) VERY SOON NOW I may try checking out from the google repo, which
will of course check BACK into the google repo thereafter.
------------------------------------------------------------------------
r452 | rgb | 2008-09-09 18:38:14 -0400 (Tue, 09 Sep 2008) | 4 lines
This checks in a code fragment that reseeds the rng at the beginning of
each run_whatever segment. This fragment "guarantees" that every test
run uses a fixed Seed if it is set.
------------------------------------------------------------------------
r451 | rgb | 2008-09-08 01:18:25 -0400 (Mon, 08 Sep 2008) | 2 lines
This now works. 2.28.1 indeed I dub thee.
------------------------------------------------------------------------
r450 | rgb | 2008-09-07 23:53:50 -0400 (Sun, 07 Sep 2008) | 3 lines
This seems to fix the output of sts_serial so it is consistent. I do
have a few small bugs to clean up to get a "perfect" display.
------------------------------------------------------------------------
r449 | rgb | 2008-09-07 13:58:56 -0400 (Sun, 07 Sep 2008) | 2 lines
So this is 2.28.1, for the moment and sake of argument.
------------------------------------------------------------------------
r448 | rgb | 2008-09-07 13:58:00 -0400 (Sun, 07 Sep 2008) | 2 lines
This is ready to get some sort of rev boost.
------------------------------------------------------------------------
r447 | rgb | 2008-09-07 10:00:25 -0400 (Sun, 07 Sep 2008) | 6 lines
This adds a "new" test -- the rgb_lagged_sums test, which is the
user test but wrapped up to run on a whole sequence of values, the
way I need to make sts_serial run very shortly. It is sufficient
to CLEARLY SHOW that mt19937 is actually a weak generator -- it
is "too uniform".
------------------------------------------------------------------------
r444 | rgb | 2008-09-06 19:50:52 -0400 (Sat, 06 Sep 2008) | 2 lines
This SHOULD be everything. Table output mode should now work for -a.
------------------------------------------------------------------------
r443 | rgb | 2008-09-06 18:43:39 -0400 (Sat, 06 Sep 2008) | 3 lines
This fixes four more tests. I can now run a LOT of the way through -a
before reverting to the old style output.
------------------------------------------------------------------------
r442 | rgb | 2008-09-06 14:37:53 -0400 (Sat, 06 Sep 2008) | 8 lines
This is coming along nicely. I have pretty much everything set up for
table vs report output and am streamlining the run_whatever routines to
the point where only a tiny bit of test-specific initiation
differentiates them. If I could move it into the test itself, I could
pretty much completely simplify the dieharder CLI code to a single
generic test shell call plus a SMALL set of specialized calls for e.g.
benchmarks or non-standard tests that don't return pvalues per se.
------------------------------------------------------------------------
r441 | rgb | 2008-09-05 15:08:47 -0400 (Fri, 05 Sep 2008) | 7 lines
This actually now works to display tables. Time to hack hack hack
and make ALL the tests work with a table. Right now most of them will
ignore it. I really should combine report and table in "results" and
put a SINGLE CALL to results in run_whatever.c. Results must then
"do it all". Also, I think I'm going to run the benchmarker on ALL
non-filesystem tests to get a timing, IF the table/timing flag is set.
------------------------------------------------------------------------
r440 | rgb | 2008-09-05 08:39:13 -0400 (Fri, 05 Sep 2008) | 3 lines
This applies a small patch from Dirk that just cleans up a few issues in
dieharder.h.
------------------------------------------------------------------------
r439 | rgb | 2008-09-05 07:20:58 -0400 (Fri, 05 Sep 2008) | 3 lines
OK, we'll try ONE LAST change -- commenting out the fclose -- before
sending this to Dirk.
------------------------------------------------------------------------
r438 | rgb | 2008-09-05 07:16:01 -0400 (Fri, 05 Sep 2008) | 15 lines
This is a 2.27.14 checkin. I'll shoot it off to Dirk who is waiting on
it. I noted, however, that my NEW permutations test FAILS (or should
fail) mt19937! It produces too GOOD a spread of permutations,
consistently! This is fascinating information. It means that perhaps
operm5 is NOT broken; maybe getting permutations to work out
multinomially is the most difficult test one can put an rng to.
Something to look at later.
This checkin should fix dieharder and libdieharder so that one can run
multiple tests (including invocation of the same test or different rngs)
from a single dieharder call. This cannot happen in dieharder -- don't
worry. But it can and will happen in Rdh and in gdieharder, so it is an
important fix nonetheless.
------------------------------------------------------------------------
r437 | rgb | 2008-09-04 17:07:26 -0400 (Thu, 04 Sep 2008) | 20 lines
This actually checks in so that we can pop the snapshot number in just a
minute -- a major bugfix relative to Rdh. The two problems addressed
herein are:
a) In order to be able to run many tests, one after another, on many
rngs, one after another, I have to be more careful than I have been
about allocating and freeing test resources on the one hand, rngs (which
must be freed) on the other hand, and resetting the static bit buffers
used in bits.c on the third hand.
b) I also needed to set up startup with a split into two sections, one
that runs only one time, period, and one that runs every time a new test
is created, executed, and destroyed (with everything reset at the end to
run a new test).
The good thing beyond Rdh in these changes is that several of them are
equally necessary in a GUI version e.g. gdieharder. So nothing being
done here is wasted...
------------------------------------------------------------------------
r436 | rgb | 2008-08-19 12:44:28 -0400 (Tue, 19 Aug 2008) | 4 lines
This really, really should be IT! I dub thee 2.27.12 as I've got to get
on with things. This version works for BOTH x86_64 and i386. It should
build into debian packages. It should rpmbuild --rebuild.
------------------------------------------------------------------------
r435 | rgb | 2008-08-19 12:34:56 -0400 (Tue, 19 Aug 2008) | 5 lines
This is a nearly final (for now) checkin for 2.27.12. It builds
decently. I probably need to put back config.sub as it is one of the
many things that should get automagically rebuilt -- if I have a
placeholder for it already present.
------------------------------------------------------------------------
r434 | rgb | 2008-08-19 12:16:18 -0400 (Tue, 19 Aug 2008) | 4 lines
We'll give this a try -- this defines __auto_build_post at the top of
the specfile and SEEMS to prevent the check-buildroot crash in rpm
building without my particular .rpmmacros.
------------------------------------------------------------------------
r433 | rgb | 2008-08-19 11:50:28 -0400 (Tue, 19 Aug 2008) | 2 lines
With any luck at all, this will be ready to fly.
------------------------------------------------------------------------
r432 | rgb | 2008-08-19 11:42:31 -0400 (Tue, 19 Aug 2008) | 2 lines
We're working our way back to not losing everything we just did, dammit.
------------------------------------------------------------------------
r431 | rgb | 2008-08-19 11:34:04 -0400 (Tue, 19 Aug 2008) | 2 lines
Try try again...
------------------------------------------------------------------------
r430 | rgb | 2008-08-19 11:31:38 -0400 (Tue, 19 Aug 2008) | 3 lines
Sending this towards lucifer. We really need to make sure everything is
in subversion!
------------------------------------------------------------------------
r429 | rgb | 2008-08-19 07:40:00 -0400 (Tue, 19 Aug 2008) | 4 lines