Improving the open connection procedure (suggestions)

Erik Klintskog erik at sics.se
Wed Jan 21 13:59:35 CET 2004


Donatien Grolaux wrote:

> erik at sics.se wrote:
>
> >So, a proper solution to the problem would be to declare the target process
> >PermFailed.
> >
> As a generalisation, wouldn't it be a good idea to add the possibility
> to inject a permFail into a distributed entity at the Oz level? Otherwise
> Oz applications would just continuously try to connect to failed
> sites, which is a waste of resources and is very bad for continuously
> running servers, for example.
>

I agree. That can easily be provided. However, the semantics of such an
operation on any entity except a port are unclear. Declaring an entity
permfail will result in the process the entity was created at being
considered crashed. This will affect other entities originating from
that process.
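
As a rough illustration of that side effect, here is a small Python model.
It is only a sketch: the names Process, Entity and inject_perm_fail are
hypothetical and are not part of Mozart or the DSS. It assumes that an
entity's failure state is simply the failure state of its home process.

```python
from dataclasses import dataclass


@dataclass
class Process:
    """Hypothetical stand-in for a remote Mozart process (site)."""
    name: str
    perm_failed: bool = False


@dataclass
class Entity:
    """Hypothetical distributed entity that remembers its home process."""
    name: str
    home: Process

    @property
    def perm_failed(self) -> bool:
        # Assumption: an entity is failed exactly when its home process is.
        return self.home.perm_failed


def inject_perm_fail(entity: Entity) -> None:
    # Failing one entity declares its whole home process crashed.
    entity.home.perm_failed = True


site = Process("siteA")
port = Entity("port", site)
cell = Entity("cell", site)

inject_perm_fail(port)
print(cell.perm_failed)  # the cell shares the port's home process
```

Injecting the failure through the port also marks the cell as failed, which
is why the semantics for entities other than ports need care.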

The timeout between unsuccessful connection attempts is increased according
to the values of some properties (I don't recall them right now). You can
define the maximum timeout allowed. In practice this means that the
resources used for a connection that is impossible to establish will
eventually be negligible.
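
The effect can be sketched in Python. This is an assumption-laden example:
the actual growth rule and property names are not given in this message, so
the doubling factor and the names initial, factor and max_delay below are
made up for illustration.

```python
def retry_delays(initial, factor, max_delay, attempts):
    """Return the wait before each retry, growing by `factor` up to `max_delay`."""
    delays = []
    delay = initial
    for _ in range(attempts):
        # Never wait longer than the configured maximum.
        delays.append(min(delay, max_delay))
        delay *= factor
    return delays


# Hypothetical values: start at 1 s, double each time, cap at 30 s.
delays = retry_delays(initial=1.0, factor=2.0, max_delay=30.0, attempts=8)
print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0, 30.0]
```

Once the cap is reached, an unreachable site costs one connection attempt
per max_delay seconds, which is the sense in which the cost becomes small.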


>
> >The OpenTimeout is a safety mechanism to abort connection procedures that
> >hang for any reason. This is to minimize the utilization of fd resources.
> >Actually, it is a tool to protect the DL from buggy connection procedures.
> >I assume that you know that the user can supply the DL with customized
> >connection procedures.
> >
> Could you document that feature? The documentation states one can do it,
> but not how. The source is also complicated to understand and not
> very helpful in how to implement a connection procedure.
>

It is actually not that complicated. Have a look at ConnectionProcedure.oz in
the source.

>
> >>The way it is designed today, the connection opening procedure gives up
> >>without considering the TCP status. That is, the connection procedure is
> >>aborted after a default OpenTimeout=3s. In the case of connections that
> >>involve NAT or wireless peer-to-peer, opening a connection may take more
> >>than 3s. As a result, the connection procedure gives up and retries
> >>again without ever succeeding. Of course, the programmer can set
> >>OpenTimeout greater than 3s and establish the connection. However, the
> >>useless reconnecting problem
> >>remains in case of connecting to inaccessible machines.
> >>
> >>Q2: When establishing a connection, why not just rely on the transport
> >>protocol timeouts?
> >>
> >>
> >
> >See above. Just raise the time out.
> >
> Raising the timeout is a way of avoiding the problem, not a way to solve
> it !

No, I don't agree.

The problem the timeout solves is to protect the DSS from connection procedures
that do not terminate. It solves this perfectly well.

Raising the timeout solves your problem of slow connection establishments.
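
Both points can be shown with a small Python sketch. It is not how the DL or
the DSS implements OpenTimeout; run_with_open_timeout and slow_connect are
hypothetical names, and the worker thread is merely abandoned on timeout
rather than killed.

```python
import concurrent.futures
import time


def run_with_open_timeout(connect, timeout_s):
    """Run connect() and give up (returning None) after timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(connect)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The caller is released even though connect() has not finished.
        return None
    finally:
        # Do not block here waiting for the abandoned worker.
        pool.shutdown(wait=False)


def slow_connect():
    # Stand-in for a slow connection setup (for example through NAT).
    time.sleep(0.3)
    return "connected"


result_short = run_with_open_timeout(slow_connect, 0.1)  # times out
result_long = run_with_open_timeout(slow_connect, 1.0)   # has enough time
print(result_short, result_long)
```

With the short timeout the caller is protected from the slow procedure; with
the raised timeout the same procedure completes.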

> I believe the writer of an Oz application shouldn't have to decide
> a value by himself. And also the DL shouldn't try to protect itself in
> such a way from a buggy connection procedure: if a developer decides
> to change the default one, it's his responsibility to write one that is
> not buggy. Instead, it would be a very nice feature if two computers can
> open a tcp connection => Mozart processes can communicate together.
> Unfortunately, it's not the case for now.

It is, if you raise the timeout to a higher value.

>
> >>Q3: Wouldn't it be better to support multihoming?
> >>
> >>
> >>
> >
> >Isn't that obvious: Yes. The problem, however, is that the IP number is
> >closely associated with the identity of a Mozart process. I don't think
> >this is easy to achieve at DL level.
> >
> What about the DSS then ? Will there be this restriction again ?

Nope. I think you should be familiar with the possibilities of the DSS, since
Valentin wrote a paper together with me on the new component-based messaging
layer of the DSS. Please read the paper if you're interested.

>
> In the meantime would it be possible to change the way Mozart processes
> connect together to remove the need to connect through a particular IP
> only (ie if a computer has network interfaces 192.168.0.13 and
> 130.104.8.9 and the ticket was created using 130.104.8.9 then an
> incoming connection from 192.168.0.13 is still accepted) ?

Everything is possible, it is just a matter of resources. I don't have the
resources to restructure the connection mechanisms of Mozart. All my time
goes into the DSS and the schedule is quite tight, a lot of things have to be
done this year.  However, I think that Valentin could do the job.

/Erik






