Subject: Re: Question about asyn and offline device
From: Sonya Hoobler <[email protected]>
To: Torsten Bögershausen <[email protected]>
Cc: [email protected]
Date: Tue, 16 Sep 2014 10:33:16 -0700 (PDT)
Hi Torsten,

Thanks for your response, and sorry for my delayed reply; I'm catching up after being away.

Just curious: Why are they offline?
Is this intended, is it a problem with the network, or do the devices themselves have problems?

They can be offline because of temporary outages or being added to the monitoring soft ioc in advance of coming online.


When the soft IOC is exiting, why does it try to connect to a device ?

(Or is it at the start of the IOC ?)
(Or do you start the IOC and stop it shortly after the start?)

The soft ioc does not try to connect to the device as part of exit. The soft ioc is repeatedly trying to connect to the device and so it is usually in the middle of connect() when a user tries to exit.


Do we know why it takes 3 minutes?
On a Debian Linux
"time telnet 10.0.0.1" reports 1 minute, 3 seconds, and the Scientific Linux 6.5 system reports the same
What does
"time telnet 10.0.0.2" give on your system ?

On our system, time telnet 10.0.0.2 returns 3m9.048s.
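
One possible explanation for the two figures, offered as an assumption and not confirmed anywhere in this thread: Linux retransmits an unanswered SYN with exponential backoff, so a blocking connect() to a silent host waits for the sum of the doubling timeouts. Assuming 5 SYN retries (the long-standing tcp_syn_retries default) and an initial retransmission timeout of 3 s on older kernels versus 1 s on newer ones, the totals come to 189 s (3m9s) and 63 s (1m3s), which matches both measurements. The function names below are hypothetical and only illustrate the arithmetic.

```c
#include <stdio.h>

/* Total seconds a blocking connect() would wait before giving up, ASSUMING
 * the kernel sends one initial SYN plus `retries` retransmissions and
 * doubles the retransmission timeout after each attempt. */
unsigned int syn_timeout_seconds(unsigned int initial_rto, unsigned int retries)
{
    unsigned int total = 0;
    unsigned int rto = initial_rto;
    unsigned int i;

    for (i = 0; i <= retries; i++) {
        total += rto;   /* wait out the current timeout */
        rto *= 2;       /* exponential backoff */
    }
    return total;
}

/* Print the two cases discussed above (both parameter sets are assumptions). */
void print_syn_timeouts(void)
{
    printf("initial RTO 3 s, 5 retries: %u s\n", syn_timeout_seconds(3, 5));
    printf("initial RTO 1 s, 5 retries: %u s\n", syn_timeout_seconds(1, 5));
}
```

If this is the cause, the value in effect can be read from /proc/sys/net/ipv4/tcp_syn_retries on the soft IOC host.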


What would happen if we do like this:
(diff is taken from latest, so the line numbers may be different)


diff --git a/asyn/drvAsynSerial/drvAsynIPPort.c b/asyn/drvAsynSerial/drvAsynIPPort.c
index 07b96d3..2b2cfe6 100644
--- a/asyn/drvAsynSerial/drvAsynIPPort.c
+++ b/asyn/drvAsynSerial/drvAsynIPPort.c
@@ -236,6 +236,7 @@ cleanup (void *arg)
     ttyController_t *tty = (ttyController_t *)arg;
     if (!tty) return;
+    epicsSocketDestroy(tty->fd);
     status=pasynManager->lockPort(tty->pasynUser);
     if(status!=asynSuccess)
         asynPrint(tty->pasynUser, ASYN_TRACE_ERROR, "%s: cleanup locking error\n", tty->portName);
@@ -243,7 +244,6 @@ cleanup (void *arg)
     if (tty->fd != INVALID_SOCKET) {
         asynPrint(tty->pasynUser, ASYN_TRACE_FLOW, "%s: shutdown socket\n", tty->portName);
         tty->flags |= FLAG_SHUTDOWN; /* prevent reconnect */
-        epicsSocketDestroy(tty->fd);
         tty->fd = INVALID_SOCKET;
         /* If this delay is not present then the sockets are not always really closed cleanly */
         epicsThreadSleep(CLOSE_SOCKET_DELAY);

This would not work because while the device is not connected, there is no active socket and so no socket to destroy. You can see this in the connectIt() routine.

Sonya






On Thu, 11 Sep 2014, Torsten Bögershausen wrote:



On 10/09/14 00:28, Sonya Hoobler wrote:
Hi Mark,

At LCLS we have VME crates (and other devices) that we monitor and control with soft iocs using asyn + streamdevice.
There are many crates and it is not uncommon for one or more to be offline.
Just curious: Why are they offline?
Is this intended, is it a problem with the network, or do the devices themselves have problems?


We have noticed that if there is an offline device, it can take minutes for the monitoring soft ioc to exit. This delay is the time it takes for the connect() networking routine to try to connect to the offline device.

When the soft IOC is exiting, why does it try to connect to a device ?

(Or is it at the start of the IOC ?)
(Or do you start the IOC and stop it shortly after the start?)

the socket is blocking and connect() takes about 3 minutes on our system.
Do we know why it takes 3 minutes?
On a Debian Linux
"time telnet 10.0.0.1" reports 1 minute, 3 seconds, and the Scientific Linux 6.5 system reports the same
What does
"time telnet 10.0.0.2" give on your system ?


(More detail: the delay occurs while drvAsynIPPort cleanup() waits for the synchronousLock mutex.

The mutex is held by asynManager portThread() or connectAttempt() while drvAsynIPPort connectIt() calls connect().)


I am curious what your thoughts are on this and whether you think it is worthwhile to pursue a change to reduce the shutdown time under these conditions.

One option could be to modify connectIt() to temporarily set the socket to non-blocking during connect() and use select() to enforce a specified timeout.

Then the socket could be set back to blocking. One potential downside of this could be introducing a new timeout which might not suit all systems.

A little more detail about our system: the soft iocs run on Linux RHEL5 32-bit servers,
Isn't it that cleanup() wants to call epicsSocketDestroy(), which should result in a simple close()?

But it is blocked by lockPort(), so we need to wait for connect() to fail.

What would happen if we do like this:
(diff is taken from latest, so the line numbers may be different)


diff --git a/asyn/drvAsynSerial/drvAsynIPPort.c b/asyn/drvAsynSerial/drvAsynIPPort.c
index 07b96d3..2b2cfe6 100644
--- a/asyn/drvAsynSerial/drvAsynIPPort.c
+++ b/asyn/drvAsynSerial/drvAsynIPPort.c
@@ -236,6 +236,7 @@ cleanup (void *arg)
     ttyController_t *tty = (ttyController_t *)arg;
     if (!tty) return;
+    epicsSocketDestroy(tty->fd);
     status=pasynManager->lockPort(tty->pasynUser);
     if(status!=asynSuccess)
         asynPrint(tty->pasynUser, ASYN_TRACE_ERROR, "%s: cleanup locking error\n", tty->portName);
@@ -243,7 +244,6 @@ cleanup (void *arg)
     if (tty->fd != INVALID_SOCKET) {
         asynPrint(tty->pasynUser, ASYN_TRACE_FLOW, "%s: shutdown socket\n", tty->portName);
         tty->flags |= FLAG_SHUTDOWN; /* prevent reconnect */
-        epicsSocketDestroy(tty->fd);
         tty->fd = INVALID_SOCKET;
         /* If this delay is not present then the sockets are not always really closed cleanly */
         epicsThreadSleep(CLOSE_SOCKET_DELAY);


soon to be RHEL6 64-bit. We are using base R3-14-12, asyn4-21, streamdevice-R2-5, TCP/IP telnet-style connections, e.g.:

drvAsynIPPortConfigure ("crat-test-bd01","crat-test-bd01:23",0,0,0)

Thanks,
   Sonya


Sonya Hoobler
SLAC National Accelerator Laboratory
[email protected]

