; Generating decision trees to efficiently find a median of a sequence ; ; Given a sequence of n comparable elements, its median is the m-th ; minimum (the minimum of rank m) of the sequence, where the absolute ; minimum has rank 0 and m = floor((n-1)/2). ; ; We are given a list of values, and we have to _generate_ code (an ; AST) that returns the median of the sequence. To be more precise, ; we are to generate a decision tree. This approach needs no extra ; storage, has no loops, and particularly lends itself to an ; implementation as a combinational circuit. However, writing such ; decision trees is quite challenging even for 5 inputs, especially if ; we strive for optimality in terms of the maximum or average depths ; of the tree. ; ; The values themselves are not known until the run time. So we have ; to generate code that can handle an arbitrary combination of n ; values. The code has to be efficient in terms of the max and average ; number of comparsions. The minimal possible decision depth for a ; sequence of n elements is n-1. This is the number of comparisons ; needed to verify that the input sequence is sorted. Clearly an ; algorithm that achieves such lower bound will do poorly in terms of ; the maximal depth: The decision tree will always have n! leaves; ; therefore, shortening the minimal depth will lengthen the maximal ; one. The only sensible optimality criterion is the average or the ; max number of comparisons. ; It seems that our algorithm is nearly optimal. The algorithm ; generates code that requires at most 3 comparisons to find the ; median of a sequence of 3 terms, exactly 4 comparisons for a ; sequence of 4 elements, and no more than 7 comparisons to find the ; median of a sequence of 5 terms. ; ; Our algorithm: we split the sequence in two halves, "sort" both ; halves using the merge sort, and merge the halves to the extent to ; find the floor((n-1)/2)-th minimum. By "sort" above we mean the AST ; that will sort the sequence as run time. ; ; At the last stage, we need to merge two sorted sequences of m and m ; (or m+1 and m) items and find their m-th minimum. We can do that ; rather efficiently. To discover the general algorithm, we consider a ; few particular cases first: ; ; Merging sequences [a1 a2] and [b1 b2]. ; There are 6 alternatives ; a1 a2 b1 b2 ; a1 b1 a2 b2 ; b1 a1 a2 b2 ; a1 b1 b2 a2 ; b1 a1 b2 a2 ; b1 b2 a1 a2 ; To find the 1st min (zero-based) of the result of merging [a1 a2] [b1 b2] ; we use the general merge approach, which gives the optimum (i.e., two) ; average and the max number of comparisons: ; (if (less? a1 b1) ; a1 is the 0th min, drop ; (if (less? a2 b1) a2 b1) ; (if (less? a1 b2) a1 b2)) ; Find the 2nd min (zero-based) of the result of merging [a1 a2] [b1 b2] ; The general method gives ; (if (less? a1 b1) ; ; 1st min of [a2] [b1 b2] ; (if (less? a2 b1) b1 ; (if (less? a2 b2) a2 b2)) ; ; symmetric ; which is 16/6 average compasions and 3 max comparions ; In contrast, a more optimal method ; (if (less? a2 b1) b1 ; ; otherwise, b1 is always either the 0th or the 1st min ; ; find the 1st min of [a1 a2] [b2] ; (if (less? a1 b2) ; (if (less? a2 b2) a2 b2) ; a1)) ; gives 3 max comparisons and 15/6 average comparisons ; ; However, we can do better: examinination of the third column of the table ; with the all possible results of the merger of [a1 a2] and [b1 b2] ; shows that a2 and b2 are the most common values. We can start comparing them ; first. We arrive at the following: ; ; (if (less? a2 b2) ; (if (less? a2 b1) b1 a2) ; (if (less? b2 a1) a1 b2)) ; which gives the max and average of only two comparisons! ; ; Merging sequences [a1 a2 a3] and [b1 b2]. ; There are 10 alternatives (=5!/2!/3!) ; ; b1 b2 a1 a2 a3 ; b1 a1 b2 a2 a3 ; b1 a1 a2 b2 a3 ; b1 a1 a2 a3 b2 ; a1 b1 b2 a2 a3 ; a1 b1 a2 b2 a3 ; a1 b1 a2 a3 b2 ; a1 a2 b1 b2 a3 ; a1 a2 b1 a3 b2 ; a1 a2 a3 b1 b2 ; ; Finding the 2nd min ; The standard method gives ; (if (less? a1 b1) 6 cases ; ... ; first min of [a2 a3] [b1 b2]: 2 min each ; ; first min of [a1 a2 a3] [b2] ; (if (less? a1 b2) ; (if (less? a2 b2) a2 b2) ; 3 cases, 3 comp ; a1) ; 1 case, 2 comp ; max 3 comparisons, average (18+2+9) = 29/10 comparisons ; ; A better approach is ; (if (less? a2 b1) ; 3 cases ; ; 0th min of [a3] [b1 b2] ; (if (less? a3 b1) a3 b1) ; 3*2=6 comp ; ; b1 is never the 2nd min then ; ; 1st min of [a1 a2 a3] and [b2] ; (if (less? a1 b2) ; (if (less? a2 b2) a2 b2) ; 6 cases, 3 comp each ; a1)) ; 1 case, 2 comp ; total 6*3 + 3*2 + 1*2, which is 26/10 comparisons on average ; ; But we can do even better, recognizing that the most common values ; in the third column of the table are a2 and b2. So we start ; by comparing them first: ; ; (if (less? a2 b2) ; 7 cases ; (if (less? a2 b1) ; 3 cases ; (if (less? b1 a3) b1 a3) ; 3 cases, 3 comp ; a2) ; 4 cases, 2 comp ; ; if a2>=b2, a2 and a3 can't be medians and could be cut off ; (if (less? a1 b2) b2 a1) ; 3 cases, 2 comp ; ) ; which gives us 3*3 + 4*2 + 3*2 = 23/10 average comparisons ; merging [a1 a2 a3] and [b1 b2 b3]. There are 6!/3!/3!=20 alternatives ; b1 b2 b3 a1 a2 a3 ; b1 b2 a1 b3 a2 a3 ; b1 b2 a1 a2 b3 a3 ; b1 b2 a1 a2 a3 b3 ; b1 a1 b2 b3 a2 a3 ; b1 a1 b2 a2 b3 a3 ; b1 a1 b2 a2 a3 b3 ; b1 a1 a2 b2 b3 a3 ; b1 a1 a2 b2 a3 b3 ; b1 a1 a2 a3 b2 b3 ; a1 b1 b2 b3 a2 a3 ; a1 b1 b2 a2 b3 a3 ; a1 b1 b2 a2 a3 b3 ; a1 b1 a2 b2 b3 a3 ; a1 b1 a2 b2 a3 b3 ; a1 b1 a2 a3 b2 b3 ; a1 a2 b1 b2 b3 a3 ; a1 a2 b1 b2 a3 b3 ; a1 a2 b1 a3 b2 b3 ; a1 a2 a3 b1 b2 b3 ; Again, the best way to find the 0th and the 1st min is to use the standard ; method. ; For the 2nd min: ; Looking at the 3d column: the most common value is a2 and b2, ; the least common are b3 and a3. Therefore, we should compare a2 and b2 first ; (if (less? a2 b2) ; 10 cases, only a2, b1 and a3 can be median ; (if (less? a2 b1) ; 4 cases ; (if (less? b1 a3) b1 a3) ; 4 cases, 3 comp ; a2) ; 6 cases, 2 comp ; ; mirror image ; (if (less? b2 a1) ; (if (less? b3 a1) b3 a1) ; b2)) ; total 4*3+6*2+4*3+6*2=48/20 on average ; ; The third min (aka 2nd max) is symmetric ; (if (less? a2 b2) ; 10 cases ; (if (less? b2 a3) b2 ; 6 cases ; (if (less? b1 a3) a3 b1)) ; (if (less? a2 b3) a2 ; (if (less? a1 b3) b3 a1))) ; To see the general case more clearly, let us try merging of [a1 a2 ; a3 a4] and [b1 b2 b3] There are 7!/4!/3!=35 cases ; b1 b2 b3 a1 a2 a3 a4 ; b1 b2 a1 b3 a2 a3 a4 ; b1 b2 a1 a2 b3 a3 a4 ; b1 b2 a1 a2 a3 a4 b3 ; b1 b2 a1 a2 a3 b3 a4 ; b1 a1 b2 b3 a2 a3 a4 ; b1 a1 b2 a2 b3 a3 a4 ; b1 a1 b2 a2 a3 a4 b3 ; b1 a1 b2 a2 a3 b3 a4 ; b1 a1 a2 b2 b3 a3 a4 ; b1 a1 a2 b2 a3 a4 b3 ; b1 a1 a2 b2 a3 b3 a4 ; b1 a1 a2 a3 a4 b2 b3 ; b1 a1 a2 a3 b2 a4 b3 ; b1 a1 a2 a3 b2 b3 a4 ; a1 b1 b2 b3 a2 a3 a4 ; a1 b1 b2 a2 b3 a3 a4 ; a1 b1 b2 a2 a3 a4 b3 ; a1 b1 b2 a2 a3 b3 a4 ; a1 b1 a2 b2 b3 a3 a4 ; a1 b1 a2 b2 a3 a4 b3 ; a1 b1 a2 b2 a3 b3 a4 ; a1 b1 a2 a3 a4 b2 b3 ; a1 b1 a2 a3 b2 a4 b3 ; a1 b1 a2 a3 b2 b3 a4 ; a1 a2 b1 b2 b3 a3 a4 ; a1 a2 b1 b2 a3 a4 b3 ; a1 a2 b1 b2 a3 b3 a4 ; a1 a2 b1 a3 a4 b2 b3 ; a1 a2 b1 a3 b2 a4 b3 ; a1 a2 b1 a3 b2 b3 a4 ; a1 a2 a3 a4 b1 b2 b3 ; a1 a2 a3 b1 a4 b2 b3 ; a1 a2 a3 b1 b2 a4 b3 ; a1 a2 a3 b1 b2 b3 a4 ; we see that as far as 0th min is concerned, the sequence could ; be truncated to [a1] [b1] ; For the 1st minimum, we need only [a1 a2] [b1 b2] ; For the 2nd min, we need only [a1 a2 a3] [b1 b2 b3] ; ; For the third min, we need to compare a3 and b2. ; (if (less? a3 b2) ; 13 cases. Only in these cases a3 can be a median ; ; 1st median of [a3 a4] and [b1] ; ; otherwise, a3 and a4 can't be median, nor can be b1 ; ; 2nd median of [a1 a2] [b2 b3] ; ; ; So, we can generalize: ; to find the n-th min of two sequences [x0 ...] and [y0 ...] ; we truncate them to [x0 ... xn] and [y0 ... yn] (if the longer ; sequence is shorter than n+1, we need to switch to the ranked ; max). There are two cases, for the even and odd rank of the desired ; minimum. ; (2k-1)-th min (zero based) of [x0 x1 ... xk ...] [y0 y1 ... ykm1 yk ...]: ; Compare xk and ykm1 ; xk < ykm1: (2k-1-k)-th min of [xk ...] and [y0 ... ykm2] ; else: (2k-k)-th min of [x0 ... xkm1] and [ykm1 yk ...] ; ; (2k)-th min of [x0 x1 ... xkm1 xk ...] [y0 y1 ... ykm1 yk ...] ; Compare xk and yk ; xk < yk: (2k-k)-th min of [xk ...] [y0 ... ykm1] ; else: (2k-k)-th min of [x0 ... xkm1] [yk...] ; ; Special cases: ; The 0th min of [x0 ...] and [y0 ...], is the min of x0 and y0 ; To find the 1st min of [x0 x1 ...] and [y0 y1 ...], we compare ; x0 and y0. If x0 < y0, we take min(x1,y0), otherwise, min(x0,y1) ; ; ; $Id: median-filt.scm,v 3.0 2003/05/19 19:12:59 oleg Exp oleg $ (define rule-median `( ; compute the length of a sequence ( (M length (seq-empty)) => (zero) ) ( (M length (seq _ _r)) => (succ (M length _r))) ; divide the numeral by two (compute floor(num/2)) ( (M half _x) => (M half-aux _x (zero)) ) ( (M half-aux _x _x) => _x) ( (M half-aux (succ _x) _x) => _x) ( (M half-aux (succ (succ _x)) _y) => (M half-aux (succ _x) (succ _y))) ; Pick the num-th element of a non-empty sequence ; (indexing is zero-based) ( (M seq-ref (seq _x _r) (zero)) => _x) ( (M seq-ref (seq _ _r) (succ _y)) => (M seq-ref _r _y)) ; (M take n seq) ; return the prefix of seq of the size exactly n ; If n is (zero), return (seq-empty) ( (M take (zero) _) => (seq-empty) ) ( (M take (succ _n1) (seq _h _t)) => (seq _h (M take _n1 _t)) ) ; (M drop n seq) ; remove the prefix of size n and return the rest ; If n is (zero), return seq as it is ( (M drop (zero) _s) => _s) ( (M drop (succ _n1) (seq _ _t)) => (M drop _n1 _t)) ; Classify the number as even or odd ( (M even-odd (zero)) => even ) ( (M even-odd (succ _x)) => (M odd-even _x) ) ( (M odd-even (zero)) => odd ) ( (M odd-even (succ _x)) => (M even-odd _x) ) ; reverse the sequence ( (M reverse _s) => (M reverse-aux _s (seq-empty)) ) ( (M reverse-aux (seq-empty) _t) => _t ) ( (M reverse-aux (seq _x _r) _t) => (M reverse-aux _r (seq _x _t)) ) ; split a sequence in two: make a term (pair s1 s2) ; where lengths of s1 and s2 differ by no more than 1 ; To be more precise, either s1 and s2 are of the ; same length, or s1 is longer than s2 by one. ; s1 mainains the tail of s (in the same order) ; s2 has the opposite order ( (M split (seq-empty)) => (pair (seq-empty) (seq-empty)) ) ( (M split (seq _x (seq-empty))) => (pair (seq _x (seq-empty)) (seq-empty)) ) ( (M split _s) => (M split-aux _s (seq-empty) (M half (M length _s))) ) ( (M split-aux _s _t (zero)) => (pair _s _t) ) ( (M split-aux (seq _x _r) _t (succ _n)) => (M split-aux _r (seq _x _t) _n) ) ; The main median finder ; sequence-of-terms -> AST ; Implements the algorithm given above. ( (M median (seq _x (seq-empty))) => _x) ( (M median _s) => (M median-aux (M split _s) (M length _s)) ) ; Note that the last argument ; to (M merge ...) is a continuation. ; Thus the rule below acts as a ; "function call". ; _nfull is the length of the orig ; sequence. ( (M median-aux (pair _s1 _s2) _nfull) => (M merge (M encap _s1) (merge-cont1 (M encap _s2) _nfull)) ) ; We come here after the first half ; of the original sequence is sorted. ; We invoke (M merge ...) again to sort ; the second half. Now we pass ; the contuation merge-cont2, which ; contains the already sorted ; half _s1 ( (M median-aux1 (merge-res _s1 (merge-cont1 _s2 _nfull))) => (M merge _s2 (merge-cont2 _s1 _nfull))) ; We come here when both halves ; have been sorted. ( (M median-aux1 (merge-res _s2 (merge-cont2 _s1 (succ _nfull1)))) => (M merge-median _s1 _s2 (M half _nfull1)) ) ; Given two ordered sequences, s1 and s2 ; (M merge-median s1 s2 m) ; finds their m-th minimum ; Initially m = floor((n-1)/2) ; where n = length(s1) + length(s2) ; This m-th minimum is the median, because ; s1 and s2 are sorted halves of the orig. sequence ( (M merge-median (seq-empty) (seq _x _) _) => _x) ( (M merge-median (seq _x _) (seq-empty) _) => _x) ( (M merge-median (seq _x _) (seq _y _) (zero)) => (if (less? _x _y) _x _y)) ; Special case m = 1 ( (M merge-median (seq _x _xr) (seq _y _yr) (succ (zero))) => (if (less? _x _y) (M merge-median _xr (seq _y _yr) (zero)) (M merge-median _yr (seq _x _xr) (zero)))) ; Special case m = 2 and two two-element sequences ; (which is equiv to the 1st max) ( (M merge-median (seq _a1 (seq _a2 (seq-empty))) (seq _b1 (seq _b2 (seq-empty))) (succ (succ (zero)))) => (if (less? _a2 _b2) (if (less? _a2 _b1) _b1 _a2) (if (less? _b2 _a1) _a1 _b2)) ) ; General case. Apply the algorithm above ; First figure out the longer sequence and check ; that we have enought elements. ; We should switch to the ranked max if not enough ; elems. But for now, we just report an error ; (that is, provide no matching rule) ( (M merge-median _seq1 _seq2 _m) => (M merge-median-check _seq1 _seq2 _m _seq1 _seq2 _m)) ( (M merge-median-check (seq-empty) (seq _ _) (zero) _seq1 _seq2 _m) => (M merge-median-gen _seq2 _seq1 _m (M half (succ _m)) (M even-odd _m)) ) ; _seq2 is longer ( (M merge-median-check (seq _ _) (seq-empty) (zero) _seq1 _seq2 _m) => (M merge-median-gen _seq1 _seq2 _m (M half (succ _m)) (M even-odd _m)) ) ; _seq1 is longer ( (M merge-median-check (seq _ _) (seq _ _) (zero) _seq1 _seq2 _m) => (M merge-median-gen _seq1 _seq2 _m (M half (succ _m)) (M even-odd _m)) ) ; both sequences are long enough ( (M merge-median-check (seq _ _s1) (seq _ _s2) (succ _m1) _seq1 _seq2 _m) => (M merge-median-check _s1 _s2 _m1 _seq1 _seq2 _m) ) ; The generator for the general case: (2k-1)-th min ( (M merge-median-gen _seq1 _seq2 _m (succ _km1) odd) => (if (less? (M seq-ref _seq1 (succ _km1)) (M seq-ref _seq2 _km1)) (M merge-median (M drop (succ _km1) _seq1) (M take _km1 _seq2) _km1) (M merge-median (M take (succ _km1) _seq1) (M drop _km1 _seq2) (succ _km1))) ) ; The generator for the general case: 2k-th min ( (M merge-median-gen _seq1 _seq2 _m _k even) => (if (less? (M seq-ref _seq1 _k) (M seq-ref _seq2 _k)) (M merge-median (M drop _k _seq1) (M take _k _seq2) _k) (M merge-median (M take _k _seq1) (M drop _k _seq2) _k)) ) ; Convert [a b c ...] into ; [[a] [b] [c] ...] ; as a preparation for merge sort ( (M encap (seq-empty)) => (seq-empty) ) ( (M encap (seq _x _r)) => (seq (seq _x (seq-empty)) (M encap _r))) ; Generate code to merge-sort s. ; When merge completes, exit to ; (M median-aux1 (merge-res s c)). ( (M merge _s _c) => (M merge-c _s (M length _s) _c) ) ; (M merge-c s n c) ; Originally on entrance, ; s is a sequence [[a] [b] [c]...] ; c is the contunuation ; Generate code to merge-sort s. ; When merge completes, exit to ; (M median-aux1 (merge-res s c)). ; On the recursive input, s will be a sequence ; [[a b] [c d] ...] ; a sequence of ordered subsequences of length 2^i ; where i is the pass number ( (M merge-c (seq _s (seq-empty)) (succ (zero)) _c) => (M median-aux1 (merge-res _s _c)) ) ( (M merge-c (seq _s1 (seq _s2 _st)) (succ (succ _n)) _c) => (M merge-pass (seq _s1 (seq _s2 _st)) (seq-empty) (succ (succ _n)) _c) ) ; The i-th pass of merge sort is complete. ; Initiate the i+1-st pass ( (M merge-pass (seq _s (seq-empty)) _target _n _c) => (M merge-c (seq _s _target) _n _c)) ( (M merge-pass (seq-empty) _target _n _c) => (M merge-c _target _n _c)) ; given _s1 and _s2: sorted subseq of size ; 2^i, merge them and move the result ; to the _target ( (M merge-pass (seq _s1 (seq _s2 _r)) _target _n _c) => (M merge-two _s1 _s2 (seq-empty) _r _target _n _c) ) ; Merge two ordered sequences ; The last four arguments is the continuation ; to (M merge ...) ( (M merge-two (seq _x _s1) (seq _y _s2) _t _ur _ut _n _c) => (if (less? _x _y) (M merge-two _s1 (seq _y _s2) (seq _x _t) _ur _ut _n _c) (M merge-two (seq _x _s1) _s2 (seq _y _t) _ur _ut _n _c)) ) ( (M merge-two (seq-empty) (seq _y _s2) _t _ur _ut _n _c ) => (M merge-two (seq-empty) _s2 (seq _y _t) _ur _ut _n _c) ) ( (M merge-two (seq _x _s1) (seq-empty) _t _ur _ut _n _c) => (M merge-two _s1 (seq-empty) (seq _x _t) _ur _ut _n _c) ) ( (M merge-two (seq-empty) (seq-empty) (seq _th _tt) _ur _ut (succ (succ _n)) _c) => (M merge-pass _ur (seq (M reverse (seq _th _tt)) _ut) (succ _n) _c) ) )) ; Regression tests for auxiliary functions (run-test (assert (equal? '(zero) (reduce `(M half ,(make-numeral 0)) rule-median))) (assert (equal? '(zero) (reduce `(M half ,(make-numeral 1)) rule-median))) (assert (equal? '(succ (zero)) (reduce `(M half ,(make-numeral 2)) rule-median))) (assert (equal? '(succ (zero)) (reduce `(M half ,(make-numeral 3)) rule-median))) (assert (equal? '(succ (succ (succ (zero)))) (reduce `(M half ,(make-numeral 6)) rule-median))) (assert (equal? '(succ (succ (succ (zero)))) (reduce `(M half ,(make-numeral 7)) rule-median))) (assert (equal? '(seq c (seq b (seq a (seq-empty)))) (reduce '(M reverse (seq a (seq b (seq c (seq-empty))))) rule-median))) (assert (equal? '(pair (seq x2 (seq-empty)) (seq x1 (seq-empty))) (reduce '(M split (seq x1 (seq x2 (seq-empty)))) rule-median))) (assert (equal? '(pair (seq x3 (seq x4 (seq x5 (seq-empty)))) (seq x2 (seq x1 (seq-empty)))) (reduce '(M split (seq x1 (seq x2 (seq x3 (seq x4 (seq x5 (seq-empty))))))) rule-median))) (assert (equal? '(if (less? x2 x1) x2 x1) (reduce '(M median (seq x1 (seq x2 (seq-empty)))) rule-median))) (assert (equal? '(if (less? x2 x3) (if (less? x2 x1) (if (less? x3 x1) x3 x1) x2) (if (less? x3 x1) (if (less? x2 x1) x2 x1) x3)) (reduce '(M median (seq x1 (seq x2 (seq x3 (seq-empty))))) rule-median))) (assert (equal? '(seq-empty) (reduce `(M take (zero) (seq-empty)) rule-median))) (assert (equal? '(seq-empty) (reduce `(M take (zero) (seq e1 (seq-empty))) rule-median))) (assert (equal? '(seq e1 (seq-empty)) (reduce `(M take (succ (zero)) (seq e1 (seq-empty))) rule-median))) (assert (equal? '(seq e1 (seq-empty)) (reduce `(M take (succ (zero)) (seq e1 (seq e2 (seq-empty)))) rule-median))) (assert (equal? '(seq e1 (seq e2 (seq-empty))) (reduce `(M take (succ (succ (zero))) (seq e1 (seq e2 (seq-empty)))) rule-median))) (assert (equal? '(seq-empty) (reduce `(M drop (zero) (seq-empty)) rule-median))) (assert (equal? '(seq e1 (seq-empty)) (reduce `(M drop (zero) (seq e1 (seq-empty))) rule-median))) (assert (equal? '(seq-empty) (reduce `(M drop (succ (zero)) (seq e1 (seq-empty))) rule-median))) (assert (equal? '(seq e2 (seq-empty)) (reduce `(M drop (succ (zero)) (seq e1 (seq e2 (seq-empty)))) rule-median))) (assert (equal? '(seq-empty) (reduce `(M drop (succ (succ (zero))) (seq e1 (seq e2 (seq-empty)))) rule-median))) ) ; Ascertaining the complexity of our code for medial filtering: ; counting the min depth of the tree (min number of comparisons to find ; the median), the max number of comparisons, and the total number of ; comparsion statements in the code. ; We're counting (less? x y) nodes (define rule-median-complexity `( ( (M min-depth (if (less? _ _) _b1 _b2)) => (succ (M min (M min-depth _b1) (M min-depth _b2)))) ( (M min-depth _) => (zero)) ( (M max-depth (if (less? _ _) _b1 _b2)) => (succ (M max (M max-depth _b1) (M max-depth _b2)))) ( (M max-depth _) => (zero)) ( (M total (if (less? _ _) _b1 _b2)) => (succ (M add (M total _b1) (M total _b2)))) ( (M total _) => (zero)) ,@rule-arithm )) (run-test (let ((m2 (reduce '(M median (seq x1 (seq x2 (seq-empty)))) rule-median))) (cout nl "median of 2: " nl) (pp m2) (assert (equal? 1 (eval-numeral (reduce `(M min-depth ,m2) rule-median-complexity)))) (assert (equal? 1 (eval-numeral (reduce `(M max-depth ,m2) rule-median-complexity)))) (assert (equal? 1 (eval-numeral (reduce `(M total ,m2) rule-median-complexity)))) ) (let ((m3 (reduce '(M median (seq x1 (seq x2 (seq x3 (seq-empty))))) rule-median))) (cout nl "median of 3: " nl) (pp m3) (assert (equal? 2 (eval-numeral (reduce `(M min-depth ,m3) rule-median-complexity)))) (assert (equal? 3 (eval-numeral (reduce `(M max-depth ,m3) rule-median-complexity)))) (assert (equal? 5 (eval-numeral (reduce `(M total ,m3) rule-median-complexity)))) ) (let ((m4 (reduce '(M median (seq x1 (seq x2 (seq x3 (seq x4 (seq-empty)))))) rule-median))) (cout nl "median of 4: " nl) (pp m4) (assert (equal? 4 (eval-numeral (reduce `(M min-depth ,m4) rule-median-complexity)))) (assert (equal? 4 (eval-numeral (reduce `(M max-depth ,m4) rule-median-complexity)))) (assert (equal? 15 (eval-numeral (reduce `(M total ,m4) rule-median-complexity)))) ) (let ((m5 (reduce '(M median (seq x1 (seq x2 (seq x3 (seq x4 (seq x5 (seq-empty))))))) rule-median))) (cout nl "median of 5: " nl) (pp m5) (assert (equal? 5 (eval-numeral (reduce `(M min-depth ,m5) rule-median-complexity)))) (assert (equal? 7 (eval-numeral (reduce `(M max-depth ,m5) rule-median-complexity)))) (assert (equal? 59 ; before was 83 (eval-numeral (reduce `(M total ,m5) rule-median-complexity)))) ) (let ((m6 (reduce '(M median (seq x1 (seq x2 (seq x3 (seq x4 (seq x5 (seq x6 (seq-empty)))))))) rule-median))) (cout nl "median of 6: " nl) (pp m6) (assert (equal? 6 ; without optimization: 7 (eval-numeral (reduce `(M min-depth ,m6) rule-median-complexity)))) (assert (equal? 9 ; without optimization: 10 (eval-numeral (reduce `(M max-depth ,m6) rule-median-complexity)))) (assert (equal? 215 (eval-numeral (reduce `(M total ,m6) rule-median-complexity)))) ) ; (assert (equal? 8; without optimization: 9 ; (eval-numeral ; (reduce `(M min-depth ; ,(reduce ; '(M median (seq x1 (seq x2 (seq x3 (seq x4 (seq x5 ; (seq x6 (seq x7 (seq-empty))))))))) ; rule-median)) ; rule-median-complexity)))) ; (assert (equal? 11 ; without optimization: 12 ; (eval-numeral ; (reduce `(M max-depth ; ,(reduce ; '(M median (seq x1 (seq x2 (seq x3 (seq x4 (seq x5 ; (seq x6 (seq x7 (seq-empty))))))))) ; rule-median)) ; rule-median-complexity)))) ) ; To figure out how many comparisons does it take to find the median, ; let's analyze the merge-sort. ; Base case: ; ; (reduce '(M merge (seq (seq x1 (seq-empty)) (seq (seq x2 (seq-empty)) ; (seq-empty))) ; (seq-empty) dummy-cont) ; (append ; '( ((M merge (seq _s (seq-empty)) (seq-empty) _c) => ; (done _s _c)) ; ) ; rule-median)) ; ==> (if (less? x1 x2) ; (done (seq x1 (seq x2 (seq-empty))) dummy-cont) ; (done (seq x2 (seq x1 (seq-empty))) dummy-cont)) ; ; merge-max-depth(2) = 1 ; merge-min-depth(2) = 1 ; where merge-max-depth and merge-min-depth are the max and the min ; number of comparison operations. ; Suppose we merge-sort the sequence of 2^(n+1) elements. We merge ; 2^(n+1) sequences of 1 element, then 2^n sequences of 2 elements, ; eventually we merge two sequences of 2^n elements. ; It's obvious that ; merge-max-depth(2^(n+1)) = merge-two-max(2^n,2^n) + 2*merge-max-depth(2^n) ; merge-min-depth(2^(n+1)) = merge-two-min(2^n,2^n) + 2*merge-min-depth(2^n) ; where merge-two-max(N,M) is the max (corr. min) depth of merging ; two sequences of N and M elements. It's a symmetric function. ; From the definition of merge-two, it follows ; merge-two-max(N,M) = 1 + max(merge-two-max(N-1,M),merge-two-max(N,M-1)) ; merge-two-min(N,M) = 1 + min(merge-two-max(N-1,M),merge-two-max(N,M-1)) ; merge-two-max(1,N) = 1 merge-two-max(N,1) = 1 ; merge-two-min(1,N) = 1 merge-two-min(N,1) ; It follows: ; merge-two-min(N,N) = 1 + merge-two-min(N-1,N) = 1 + merge-two-min(N-1,N-1) ; This recursion relationship yields ; merge-two-min(N,N) = N ; Similarly, ; merge-two-max(N,M) = 1 + merge-two-max(N-1,N) = 2 + merge-two-max(N-1,N-1) ; = 2*(N-1) + 1 ; Thus ; merge-max-depth(2^(n+1)) = 2*(2^n-1) + 1 + 2*merge-max-depth(2^n) ; = 2*2^n - 1 + 2*merge-max-depth(2^n) ; = 2*2^n -1 +2*( 2^n - 1 + 2*merge-max-depth(2^n/2)) ; = 2*2^n + 2*2^n - 3 + 4*merge-max-depth(2^n/2) ; = k*2*2^n - (2^k-1) + 2^k*merge-max-depth(2^(n+1-k)) ; = n*2^(n+1) - 2^n + 1 + 2^n = n*2^(n+1)+1 ; Similarly, ; merge-min-depth(2^(n+1)) = 2^n + 2*merge-max-depth(2^n) = (n+1)*2^n ; ; From the definition of merge-median, ; merge-median-max(N,N) = 1+N, N> 1 ; merge-median-min(N,N) = N ; ; Thus the total complexity: ; max-depth: 1+N/2 + 2*( (logN-2)*N/2+1 ) = 3 - 3N/2 + N*logN ; min-depth: N/2 + 2*( N/4*(logN-1) ) = N/2*logN ; which conforms the experiemnt data above.