Lazy-Batched Stream Processing
Raphael Collet
raph at info.ucl.ac.be
Tue Jul 26 11:26:15 CEST 2005
Denys Duchier wrote:
> The problem here is that we may have a "runaway" stream generator, i.e. where
> the stream maybe generated faster than it can be consumed. In the old regime,
> we would fix this using "laziness":
>
> fun lazy {OldLazyMap IN F}
> case IN
> of nil then nil
> [] H|T then {F H}|{OldLazyMap IN T}
> end
> end
>
> The problem here is that this is very expensive: it costs 1 closure, 1 thread
> creation and 2 context switches for every element. In the new regime, we would
> fix this using the wonderful WaitNeeded primitive:
>
> proc {NewLazyMap IN F OUT}
> proc {Loop IN OUT}
> {WaitNeeded OUT}
> case IN
> of nil then OUT=nil
> [] H|T then OUT2 in
> OUT = {F H}|OUT2
> {Loop T OUT}
> end
> end
> in
> thread {Loop IN OUT} end
> end
FYI, I have done a few comparisons between the two latter versions when
I introduced the WaitNeeded primitive. There is an improvement, but the
speedup factor is less than 2.
> The problem here is that it still costs 2 context switches for every element.
> The solution that I have been playing with is "amortized cost through batch
> generation": instead of generating just 1 element every time we switch context,
> we generate several elements:
>
> proc {LazyBatchedMap BY IN F OUT}
> proc {Loop IN OUT OK}
> if OK > 0 then
> case IN
> of nil then OUT=nil
> [] H|T then OUT2 in
> OUT = {F H}|OUT2
> {Loop T OUT2 OK-1}
> end
> else
> {WaitNeeded OUT}
> {Loop IN OUT BY}
> end
> end
> in
> thread {Loop IN OUT 0} end
> end
Have you done some benchmark comparisons between the two latter
versions? I am curious to know the actual overhead of this double
context switch...
Cheers,
raph
More information about the mozart-hackers
mailing list