knownCase pass
Stephen Weeks
MLton@sourcelight.com
Tue, 15 Jan 2002 11:12:43 -0800
> One interesting aspect of knownCase, which I hadn't realized, is that it
> often has the effect of unrolling list traversals by one iteration, moving
> the nil/cons check to the end of the loop, rather than the beginning.
That is a good thing, and has been on my todo list for a while, to try
to cut the number of branches in most loops down from two (one for the
test, one to jump back to the header) to one (to do the test that
jumps back to the header. Unfortunately, knownCase doesn't lead to
the result that I'd hoped, at least for a simple fold loop like
----------------------------------------
val _ = List.foldl (op +) 0 [1, 2, 3, 4]
----------------------------------------
For this, the relevant part of the resulting assembly is
----------------------------------------
L_9:
movl (0+(8*1))(%ebp),%esi
addl (0+(12*1))(%ebp),%esi
jo L_22
L_14:
movl (0+(4*1))(%ebp),%edi
testl $0x3,%edi
jnz L_23
L_15:
movl %esi,(0+(12*1))(%ebp)
movl (0+(0*1))(%edi),%esi
movl %esi,(0+(8*1))(%ebp)
movl (0+(4*1))(%edi),%edi
movl %edi,(0+(4*1))(%ebp)
jmp L_9
----------------------------------------
So, there is still the null test (jnz L_23) and the jump to the loop
header (jmp L_9). What I'd like to see is something like
----------------------------------------
L_15:
movl %esi,(0+(12*1))(%ebp)
movl (0+(0*1))(%edi),%esi
movl %esi,(0+(8*1))(%ebp)
movl (0+(4*1))(%edi),%edi
movl %edi,(0+(4*1))(%ebp)
L_9:
movl (0+(8*1))(%ebp),%esi
addl (0+(12*1))(%ebp),%esi
jo L_22
L_14:
movl (0+(4*1))(%ebp),%edi
testl $0x3,%edi
jz L_15
L_23:
...
----------------------------------------
Now, L_15 is the header, and there is only one loop related test, the
jz at the end of the loop. L_9 still needs to exist because that's
how the loop is entered in the first place (after the unrolling due to
knownCase).
It looks to me like knownCase did the right thing. I don't think any
code duplication is required to get this right. It looks to me like a
layout problem in the codegen.
One other thing that I noticed about knownCase is in compiling the
last function.
----------------------------------------
val rec last =
fn [] => 0
| [x] => x
| _ :: l => last l
val _ = 1 + last [2, 3, 4, 5, 6, 7]
----------------------------------------
The resulting SSA, for which I have included the dot below, has a
knownCase-style optimization missing. I would like to see the default
branch for L_13 jump directly back to L_13, instead of going through
L_11 and L_12. Would another round of knownCase get it? Could it be
gotten in the first round?
--------------------------------------------------------------------------------
%!PS-Adobe-2.0
%%Creator: dot version 1.7.4 (Thu Mar 15 15:16:04 CET 2001)
%%For: (sweeks) Stephen Weeks
%%Title: main_0 control-flow graph
%%Pages: (atend)
%%BoundingBox: 36 36 344 1085
%%EndComments
%%BeginProlog
save
/DotDict 200 dict def
DotDict begin
%%BeginResource: procset
/coord-font-family /Times-Roman def
/default-font-family /Times-Roman def
/coordfont coord-font-family findfont 8 scalefont def
/InvScaleFactor 1.0 def
/set_scale {
dup 1 exch div /InvScaleFactor exch def
dup scale
} bind def
% styles
/solid { } bind def
/dashed { [9 InvScaleFactor mul dup ] 0 setdash } bind def
/dotted { [1 InvScaleFactor mul 6 InvScaleFactor mul] 0 setdash } bind def
/invis {/fill {newpath} def /stroke {newpath} def /show {pop newpath} def} bind def
/bold { 2 setlinewidth } bind def
/filled { } bind def
/unfilled { } bind def
/rounded { } bind def
/diagonals { } bind def
% hooks for setting color
/nodecolor { sethsbcolor } bind def
/edgecolor { sethsbcolor } bind def
/graphcolor { sethsbcolor } bind def
/nopcolor {pop pop pop} bind def
/beginpage { % i j npages
/npages exch def
/j exch def
/i exch def
/str 10 string def
npages 1 gt {
gsave
coordfont setfont
0 0 moveto
(\() show i str cvs show (,) show j str cvs show (\)) show
grestore
} if
} bind def
/set_font {
findfont exch
scalefont setfont
} def
% draw aligned label in bounding box aligned to current point
% alignfactor tells what fraction to place on the left.
% -.5 is centered.
/alignedtext { % text labelwidth fontsz alignfactor
/alignfactor exch def
/fontsz exch def
/width exch def
/text exch def
gsave
% even if node or edge is dashed, don't paint text with dashes
[] 0 setdash
currentpoint newpath moveto
text stringwidth pop
alignfactor mul fontsz -.3 mul rmoveto
text show
grestore
} def
/boxprim { % xcorner ycorner xsize ysize
4 2 roll
moveto
2 copy
exch 0 rlineto
0 exch rlineto
pop neg 0 rlineto
closepath
} bind def
/ellipse_path {
/ry exch def
/rx exch def
/y exch def
/x exch def
matrix currentmatrix
newpath
x y translate
rx ry scale
0 0 1 0 360 arc
setmatrix
} bind def
/endpage { showpage } bind def
/layercolorseq
[ % layer color sequence - darkest to lightest
[0 0 0]
[.2 .8 .8]
[.4 .8 .8]
[.6 .8 .8]
[.8 .8 .8]
]
def
/setlayer {/maxlayer exch def /curlayer exch def
layercolorseq curlayer get
aload pop sethsbcolor
/nodecolor {nopcolor} def
/edgecolor {nopcolor} def
/graphcolor {nopcolor} def
} bind def
/onlayer { curlayer ne {invis} if } def
/onlayers {
/myupper exch def
/mylower exch def
curlayer mylower lt
curlayer myupper gt
or
{invis} if
} def
/curlayer 0 def
%%EndProlog
%%BeginSetup
14 default-font-family set_font
1 setmiterlimit
% /arrowlength 10 def
% /arrowwidth 5 def
% make sure pdfmark is harmless for PS-interpreters other than Distiller
/pdfmark where {pop} {userdict /pdfmark /cleartomark load put} ifelse
% make '<<' and '>>' safe on PS Level 1 devices
/languagelevel where {pop languagelevel}{1} ifelse
2 lt {
userdict (<<) cvn ([) cvn load put
userdict (>>) cvn ([) cvn load put
} if
%%EndResource
%%EndSetup
%%Page: 1 1
%%PageBoundingBox: 36 36 344 1085
%%PageOrientation: Portrait
gsave
35 35 309 1050 boxprim clip newpath
36 36 translate
0 0 1 beginpage
0 0 translate 0 rotate
[ /CropBox [36 36 344 1085] /PAGES pdfmark
0.000 0.000 0.000 graphcolor
14.00 /Times-Roman set_font
gsave 10 dict begin
153 9 moveto (main_0 control-flow graph) 150 14.00 -0.50 alignedtext
end grestore
% n0
gsave 10 dict begin
newpath 113 1024 moveto
15 1024 lineto
15 988 lineto
113 988 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
15 1015 moveto (L_0 \(\)) 35 14.00 0.00 alignedtext
15 999 moveto (L_12 \(::_0\(...\)\)) 83 14.00 0.00 alignedtext
end grestore
end grestore
% n11
gsave 10 dict begin
newpath 102 936 moveto
26 936 lineto
26 900 lineto
102 900 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
26 927 moveto (L_12 \(x_5\)) 63 14.00 0.00 alignedtext
26 911 moveto (case x_5) 48 14.00 0.00 alignedtext
end grestore
end grestore
% n0 -> n11
gsave 10 dict begin
solid
newpath 64 988 moveto
64 976 64 959 64 945 curveto
stroke
newpath 62 946 moveto
64 936 lineto
67 946 lineto
closepath
fill
gsave 10 dict begin
64 963 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n1
gsave 10 dict begin
newpath 210 60 moveto
150 60 lineto
150 24 lineto
210 24 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
150 51 moveto (L_1 \(\)) 35 14.00 0.00 alignedtext
150 35 moveto (return \(\)) 46 14.00 0.00 alignedtext
end grestore
end grestore
% n2
gsave 10 dict begin
newpath 259 160 moveto
101 160 lineto
101 112 lineto
259 112 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
101 153 moveto (L_2 \(\)) 35 14.00 0.00 alignedtext
101 137 moveto (Stdio_print \("Toplevel ...\)) 145 14.00 0.00 alignedtext
101 121 moveto (MLton_halt \(1\)) 86 14.00 0.00 alignedtext
end grestore
end grestore
% n2 -> n1
gsave 10 dict begin
solid
newpath 180 112 moveto
180 99 180 83 180 69 curveto
stroke
newpath 178 70 moveto
180 60 lineto
183 70 lineto
closepath
fill
gsave 10 dict begin
180 87 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n3
gsave 10 dict begin
newpath 238 260 moveto
122 260 lineto
122 212 lineto
238 212 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
122 253 moveto (L_3 \(\)) 35 14.00 0.00 alignedtext
122 237 moveto (Stdio_print \("\\n"\)) 102 14.00 0.00 alignedtext
122 221 moveto (MLton_halt \(1\)) 86 14.00 0.00 alignedtext
end grestore
end grestore
% n3 -> n2
gsave 10 dict begin
solid
newpath 180 212 moveto
180 199 180 184 180 170 curveto
stroke
newpath 178 170 moveto
180 160 lineto
183 170 lineto
closepath
fill
gsave 10 dict begin
180 187 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n4
gsave 10 dict begin
newpath 163 368 moveto
9 368 lineto
9 320 lineto
163 320 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
9 361 moveto (L_5 \(\)) 35 14.00 0.00 alignedtext
9 345 moveto (Stdio_print \("Overflow"\)) 140 14.00 0.00 alignedtext
9 329 moveto (L_3 \(\)) 35 14.00 0.00 alignedtext
end grestore
end grestore
% n4 -> n3
gsave 10 dict begin
solid
newpath 107 320 moveto
121 304 139 284 153 267 curveto
stroke
newpath 151 266 moveto
159 260 lineto
154 269 lineto
closepath
fill
gsave 10 dict begin
143 287 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n5
gsave 10 dict begin
newpath 307 376 moveto
181 376 lineto
181 312 lineto
307 312 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
181 369 moveto (L_4 \(x_0\)) 56 14.00 0.00 alignedtext
181 353 moveto (Stdio_print \("Fail "\)) 112 14.00 0.00 alignedtext
181 337 moveto (Stdio_print \(x_0\)) 96 14.00 0.00 alignedtext
181 321 moveto (L_3 \(\)) 35 14.00 0.00 alignedtext
end grestore
end grestore
% n5 -> n3
gsave 10 dict begin
solid
newpath 225 312 moveto
217 298 207 282 199 268 curveto
stroke
newpath 197 270 moveto
194 260 lineto
201 267 lineto
closepath
fill
gsave 10 dict begin
214 287 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n6
gsave 10 dict begin
newpath 233 476 moveto
69 476 lineto
69 428 lineto
233 428 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
69 469 moveto (L_6 \(x_1\)) 56 14.00 0.00 alignedtext
69 453 moveto (Stdio_print \("unhandled...\)) 150 14.00 0.00 alignedtext
69 437 moveto (case x_1) 48 14.00 0.00 alignedtext
end grestore
end grestore
% n6 -> n4
gsave 10 dict begin
solid
newpath 136 428 moveto
127 413 115 393 105 376 curveto
stroke
newpath 103 378 moveto
100 368 lineto
107 375 lineto
closepath
fill
gsave 10 dict begin
159 403 moveto (Overflow) 53 14.00 -0.50 alignedtext
end grestore
end grestore
% n6 -> n5
gsave 10 dict begin
solid
newpath 181 428 moveto
188 422 194 415 199 410 curveto
204 403 211 394 217 385 curveto
stroke
newpath 215 383 moveto
223 376 lineto
220 386 lineto
closepath
fill
gsave 10 dict begin
235 403 moveto (Fail_0) 35 14.00 -0.50 alignedtext
end grestore
end grestore
% n7
gsave 10 dict begin
newpath 253 564 moveto
143 564 lineto
143 528 lineto
253 528 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
143 555 moveto (L_8 \(\)) 35 14.00 0.00 alignedtext
143 539 moveto (L_6 \(Fail_0\(......\)) 96 14.00 0.00 alignedtext
end grestore
end grestore
% n7 -> n6
gsave 10 dict begin
solid
newpath 189 528 moveto
182 516 174 499 167 485 curveto
stroke
newpath 165 486 moveto
163 476 lineto
169 484 lineto
closepath
fill
gsave 10 dict begin
180 503 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n8
gsave 10 dict begin
newpath 164 618 moveto
62 618 lineto
62 582 lineto
164 582 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
62 609 moveto (L_7 \(\)) 35 14.00 0.00 alignedtext
62 593 moveto (L_6 \(Overflow\)) 88 14.00 0.00 alignedtext
end grestore
end grestore
% n8 -> n6
gsave 10 dict begin
solid
newpath 118 582 moveto
124 558 135 516 142 486 curveto
stroke
newpath 140 485 moveto
145 476 lineto
145 486 lineto
closepath
fill
gsave 10 dict begin
131 547 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n9
gsave 10 dict begin
newpath 260 672 moveto
160 672 lineto
160 636 lineto
260 636 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
160 663 moveto (L_9 \(x_2\)) 56 14.00 0.00 alignedtext
160 647 moveto (MLton_halt \(0\)) 86 14.00 0.00 alignedtext
end grestore
end grestore
% n9 -> n7
gsave 10 dict begin
solid
newpath 208 636 moveto
206 619 203 593 201 573 curveto
stroke
newpath 199 574 moveto
200 564 lineto
204 574 lineto
closepath
fill
gsave 10 dict begin
206 601 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n10
gsave 10 dict begin
newpath 81 760 moveto
5 760 lineto
5 724 lineto
81 724 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
5 751 moveto (L_11 \(\)) 42 14.00 0.00 alignedtext
5 735 moveto (L_12 \(x_4\)) 63 14.00 0.00 alignedtext
end grestore
end grestore
% n10 -> n11
gsave 10 dict begin
solid
newpath 49 894 moveto
39 879 28 861 24 848 curveto
15 825 27 784 36 760 curveto
stroke
newpath 50 890 moveto
52 900 lineto
45 892 lineto
closepath
fill
gsave 10 dict begin
24 831 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n13
gsave 10 dict begin
newpath 92 848 moveto
36 848 lineto
36 812 lineto
92 812 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
36 839 moveto (L_14 \(\)) 42 14.00 0.00 alignedtext
36 823 moveto (bug) 21 14.00 0.00 alignedtext
end grestore
end grestore
% n11 -> n13
gsave 10 dict begin
solid
newpath 64 900 moveto
64 888 64 871 64 857 curveto
stroke
newpath 62 858 moveto
64 848 lineto
67 858 lineto
closepath
fill
gsave 10 dict begin
90 875 moveto (default) 38 14.00 -0.50 alignedtext
end grestore
end grestore
% n14
gsave 10 dict begin
newpath 212 848 moveto
110 848 lineto
110 812 lineto
212 812 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
110 839 moveto (L_13 \(x_4,x_3\)) 87 14.00 0.00 alignedtext
110 823 moveto (case x_4) 48 14.00 0.00 alignedtext
end grestore
end grestore
% n11 -> n14
gsave 10 dict begin
solid
newpath 97 900 moveto
107 894 117 888 124 882 curveto
130 876 137 866 144 857 curveto
stroke
newpath 142 855 moveto
150 848 lineto
147 858 lineto
closepath
fill
gsave 10 dict begin
154 875 moveto (::_0) 21 14.00 -0.50 alignedtext
end grestore
end grestore
% n12
gsave 10 dict begin
newpath 189 760 moveto
133 760 lineto
133 724 lineto
189 724 lineto
closepath
stroke
gsave 10 dict begin
0.000 0.000 0.000 nodecolor
133 751 moveto (L_10 \(\)) 42 14.00 0.00 alignedtext
133 735 moveto (1 + x_3) 42 14.00 0.00 alignedtext
end grestore
end grestore
% n12 -> n8
gsave 10 dict begin
dashed
newpath 133 725 moveto
123 719 113 712 107 706 curveto
91 691 97 653 105 626 curveto
stroke
newpath 102 627 moveto
107 618 lineto
107 628 lineto
closepath
fill
gsave 10 dict begin
141 699 moveto (Overflow) 53 14.00 -0.50 alignedtext
end grestore
end grestore
% n12 -> n9
gsave 10 dict begin
solid
newpath 171 724 moveto
178 712 188 694 196 680 curveto
stroke
newpath 193 680 moveto
200 672 lineto
198 682 lineto
closepath
fill
gsave 10 dict begin
189 699 moveto () 0 14.00 -0.50 alignedtext
end grestore
end grestore
% n14 -> n10
gsave 10 dict begin
solid
newpath 129 812 moveto
119 806 108 799 101 794 curveto
92 788 80 777 69 767 curveto
stroke
newpath 67 769 moveto
62 760 lineto
71 765 lineto
closepath
fill
gsave 10 dict begin
127 787 moveto (default) 38 14.00 -0.50 alignedtext
end grestore
end grestore
% n14 -> n12
gsave 10 dict begin
solid
newpath 161 812 moveto
161 800 161 783 161 769 curveto
stroke
newpath 159 770 moveto
161 760 lineto
164 770 lineto
closepath
fill
gsave 10 dict begin
182 787 moveto (nil_0) 28 14.00 -0.50 alignedtext
end grestore
end grestore
endpage
grestore
%%PageTrailer
%%EndPage: 1
%%Trailer
%%Pages: 1
end
restore
%%EOF