Lazy-Batched Stream Processing

Raphael Collet raph at info.ucl.ac.be
Tue Jul 26 11:26:15 CEST 2005


Denys Duchier wrote:
> The problem here is that we may have a "runaway" stream generator, i.e. where
> the stream maybe generated faster than it can be consumed.  In the old regime,
> we would fix this using "laziness":
> 
> fun lazy {OldLazyMap IN F}
>    case IN
>    of nil then nil
>    [] H|T then {F H}|{OldLazyMap IN T}
>    end
> end
> 
> The problem here is that this is very expensive: it costs 1 closure, 1 thread
> creation and 2 context switches for every element.  In the new regime, we would
> fix this using the wonderful WaitNeeded primitive:
> 
> proc {NewLazyMap IN F OUT}
>    proc {Loop IN OUT}
>       {WaitNeeded OUT}
>       case IN
>       of nil then OUT=nil
>       [] H|T then OUT2 in
>          OUT = {F H}|OUT2
>          {Loop T OUT}
>       end
>    end
> in
>    thread {Loop IN OUT} end
> end

FYI, I have done a few comparisons between the two latter versions when 
I introduced the WaitNeeded primitive.  There is an improvement, but the 
speedup factor is less than 2.

> The problem here is that it still costs 2 context switches for every element.
> The solution that I have been playing with is "amortized cost through batch
> generation": instead of generating just 1 element every time we switch context,
> we generate several elements:
> 
> proc {LazyBatchedMap BY IN F OUT}
>    proc {Loop IN OUT OK}
>       if OK > 0 then
>          case IN
>          of nil then OUT=nil
>          [] H|T then OUT2 in
>             OUT = {F H}|OUT2
>             {Loop T OUT2 OK-1}
>          end
>       else
>          {WaitNeeded OUT}
>          {Loop IN OUT BY}
>       end
>    end
> in
>    thread {Loop IN OUT 0} end
> end

Have you done some benchmark comparisons between the two latter 
versions?  I am curious to know the actual overhead of this double 
context switch...

Cheers,
raph



More information about the mozart-hackers mailing list