EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: RE: Asyn disconnects, reconnects to serial device server
From: Mark Rivers <[email protected]>
To: "'[email protected]'" <[email protected]>
Cc: "[email protected]" <[email protected]>
Date: Fri, 26 Sep 2014 21:11:09 +0000

It really does seem like there was some sort of interruption to the Moxa: a network switch glitch, a power glitch, etc.

I can't explain why one device disconnected and the other did not.  But I believe this was below the EPICS level, at the system socket level.

If it happens repeatedly we can try some more diagnostics.
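
One possible diagnostic, sketched here in Python, is a small watcher that repeatedly attempts a TCP connection to the terminal server and prints a timestamped line whenever the result changes. The function names `probe` and `watch` are hypothetical, and nothing here is part of asyn or EPICS. Probing the port that the IOC already holds may itself be refused or may disturb the IOC, depending on how the Moxa is configured, so a port that no IOC uses is probably the safer target.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=2.0):
    """Attempt one TCP connection and return a short status string."""
    try:
        # The connection is closed again as soon as the 'with' block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The host answered, but nothing accepted the connection.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


def watch(host, port, interval=1.0, count=None):
    """Probe every 'interval' seconds; print a line only when the status changes.

    Runs forever when count is None, otherwise stops after 'count' probes.
    """
    last = None
    done = 0
    while count is None or done < count:
        status = probe(host, port)
        if status != last:
            stamp = datetime.now().isoformat(timespec="milliseconds")
            print(f"{stamp} {host}:{port} {status}")
            last = status
        done += 1
        time.sleep(interval)
```

Calling `watch(host, port)` with the Moxa's address from a second machine on the same network would show whether "refused" or "timeout" periods line up with the asyn messages. A "refused" result suggests the Moxa was reachable but not accepting connections on that port, while "timeout" points more toward the network path.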

Mark


-----Original Message-----
From: [email protected] [mailto:[email protected]] 
Sent: Friday, September 26, 2014 3:48 PM
To: Mark Rivers
Cc: [email protected]
Subject: RE: Asyn disconnects, reconnects to serial device server



On Fri, 26 Sep 2014, Mark Rivers wrote:

> - Was EPICS communicating with other devices on this same Moxa terminal server during the time that you had problems?

There are two other serial devices on the same Moxa.

> If so, did they keep communicating normally?

From the second device, iocsh reported one message at about the same time:

  2014/09/24 15:40:22.174 ctrl2 Ctlr2:Poll: No reply from device within 5000 ms

But there were no messages about the port to this device disconnecting.
There were no iocsh messages about the third device, but the time between
its commands is long enough that no messages may have been attempted
during the brief disconnection.  We only had asynTrace enabled for the
one device, reported in the first message.

> - What was the time between the first communication failure and when communications were restored?

About 1 minute and 30 seconds.
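
As a check on that figure, the interval can be computed from two timestamps quoted later in this message. Which two events mark the start and end of the outage is an assumption: this sketch uses the iocsh "connection closed in read" message as the start and the first asynTrace write after the reconnect as the end.

```python
from datetime import datetime

# Timestamp layout used by the asyn and iocsh messages in this thread.
FMT = "%Y/%m/%d %H:%M:%S.%f"


def elapsed_seconds(start, end):
    """Return the number of seconds between two timestamp strings."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


first_failure = "2014/09/24 15:39:48.034"  # iocsh: connection closed in read
first_success = "2014/09/24 15:41:19.259"  # asynTrace: first write after reconnect

outage = elapsed_seconds(first_failure, first_success)
print(f"{outage:.3f} s")  # prints "91.225 s"
```

Measuring instead from the unanswered write at 15:39:45.586 gives a couple of seconds more, so either choice agrees with roughly a minute and a half.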

> - I am confused about the 2 types of log messages you showed: ....
> Are all of these messages coming from the same IOC, but some are from the iocLogServer and some are directly copied from the iocsh output?

Yes. The iocLogServer writing the asynTrace output runs on a different host from the IOC where asyn is running and the iocsh output is written.
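
To make the two kinds of lines easier to tell apart, here is a rough Python sketch that splits a line from the log file into the prefix added on the log server side and the message produced on the IOC. The layout is inferred only from the excerpts quoted below. Reading the first field as the client name and port is an assumption, and `split_log_line` is a hypothetical helper, not part of any EPICS tool.

```python
import re
from datetime import datetime

# Assumed prefix layout: "<client>:<port> <Www Mmm dd hh:mm:ss yyyy> <message>"
LINE = re.compile(
    r"^(?P<client>\S+:\d+) "
    r"(?P<server_time>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}) "
    r"(?P<message>.*)$"
)
# Timestamp that asyn itself puts at the start of a trace message.
IOC_TIME = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def split_log_line(line):
    """Return (client, server_time, ioc_time, message), or None.

    None means the line has no log-server prefix, for example a line
    copied directly from iocsh output. ioc_time is None when the message
    carries no timestamp of its own (such as a line holding only the
    bytes that were written or read).
    """
    m = LINE.match(line)
    if m is None:
        return None
    server_time = datetime.strptime(m["server_time"], "%a %b %d %H:%M:%S %Y")
    message = m["message"]
    t = IOC_TIME.match(message)
    ioc_time = None
    if t:
        ioc_time = datetime.strptime(t.group(1), "%Y/%m/%d %H:%M:%S.%f")
    return m["client"], server_time, ioc_time, message
```

Comparing `server_time` with `ioc_time` for the same line shows how long a message took to reach the log file, assuming the clocks on the two hosts agree, which helps when lining up the log file against the iocsh output.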

> ...
> ________________________________________
> From: [email protected] [[email protected]] on behalf of [email protected] [[email protected]]
> Sent: Friday, September 26, 2014 1:38 PM
> To: [email protected]
> Subject: Asyn disconnects, reconnects to serial device server
>
> We have a soft IOC that we use to control several serial devices, via
> StreamDevice on Asyn communicating over Ethernet to TCP/IP ports on a
> Moxa serial device server.
>
> Usually this works, but recently a serial device did not respond to
> several commands from the IOC.  We saw several messages indicating
> that Asyn disconnected, then reconnected to its port on the server.
> Messages logged by asynTrace and other details appear below.
>
> Does anyone have an explanation for this?  We did not notice any
> unusual events on the network, other than this.
>
> Jon Jacky
>
> --------------------------
>
> This is asyn-4-17 running on EPICS 3.14.12 on Debian Linux with kernel
> 3.2.0-3-686-pae running on an i686.  Code etc. appears below.
>
> Here are excerpts from asynTrace logs.  A normal command/response looks
> like this. The IOC named process-ioc sends an escape character (033
> octal, 1B hex) and the controller responds with a banner:
>
> process-ioc:47630 Wed Sep 24 13:57:12 2014 2014/09/24 13:57:08.476 ctlr wrote
> process-ioc:47630 Wed Sep 24 13:57:12 2014 \033\r
> process-ioc:47630 Wed Sep 24 13:57:12 2014 2014/09/24 13:57:09.480 ctlr asynOctetBase interrupt
> process-ioc:47630 Wed Sep 24 13:57:12 2014 2014/09/24 13:57:09.480 ctlr read
> process-ioc:47630 Wed Sep 24 13:57:12 2014 \"CTLR VER 1.2\"\n\r
> process-ioc:47630 Wed Sep 24 13:57:12 2014 2014/09/24 13:57:09.588 ctlr asynOctetBase interrupt
> process-ioc:47630 Wed Sep 24 13:57:12 2014 2014/09/24 13:57:09.588 ctlr read
> process-ioc:47630 Wed Sep 24 13:57:12 2014 $\n\r
>
> Later the IOC sent the escape character but received no reply.  Instead, we logged this:
>
> process-ioc:47630 Wed Sep 24 15:39:47 2014 2014/09/24 15:39:45.586 ctlr wrote
> process-ioc:47630 Wed Sep 24 15:39:47 2014 \033\r
> process-ioc:47630 Wed Sep 24 15:39:57 2014 2014/09/24 15:39:51.839 Can't connect to 192.168.0.129:4003: Connection refused ctlr -1 autoConnect could not connect
> process-ioc:47630 Wed Sep 24 15:40:12 2014 2014/09/24 15:40:07.228 Can't connect to 192.168.0.129:4003: Connection refused ctlr -1 autoConnect could not connect
> process-ioc:47630 Wed Sep 24 15:40:17 2014 2014/09/24 15:40:13.228 Can't connect to 192.168.0.129:4003: Connection refused ctlr -1 autoConnect could not connect
> process-ioc:47630 Wed Sep 24 15:40:37 2014 2014/09/24 15:40:35.882 Can't connect to 192.168.0.129:4003: Connection refused ctlr -1 autoConnect could not connect
> process-ioc:47630 Wed Sep 24 15:41:12 2014 2014/09/24 15:41:10.316 Can't connect to 192.168.0.129:4003: Connection refused ctlr -1 autoConnect could not connect
>
> The IOC continued to send escapes and then it apparently reconnected;
> we logged this:
>
> process-ioc:47630 Wed Sep 24 15:41:22 2014 2014/09/24 15:41:19.259 ctlr wrote
> process-ioc:47630 Wed Sep 24 15:41:22 2014 \033\r
> ... etc., same as normal interaction above ...
>
> Meanwhile, on the IOC, iocsh printed this:
>
> 2014/09/24 15:39:48.034 ctlr Ctlr:Reset: connection closed in read
> 2014/09/24 15:39:48.034 ctlr Ctlr:Reset: I/O error after reading 0 bytes: ""
> 2014/09/24 15:39:48.034 ctlr Ctlr:Reset: Protocol aborted
> 2014/09/24 15:40:06.828 ctlr Ctlr:Reset: pasynCommon->connect() failed: Can't connect to 192.168.0.129:4003: Connection refused
> 2014/09/24 15:40:06.828 ctlr Ctlr:Reset: Protocol aborted
> 2014/09/24 15:40:33.093 ctlr Ctlr:Reset: pasynCommon->connect() failed: Can't connect to 192.168.0.129:4003: Connection refused
> 2014/09/24 15:40:33.093 ctlr Ctlr:Reset: Protocol aborted
>
> Here is the line in st.cmd that configures asyn for this controller:
>
> drvAsynIPPortConfigure ("ctlr", "192.168.0.129:4003",0,0)
>
> Here are the records in the database.  Processing Ctlr:Reset invokes
> the reset protocol in ctlr.proto.
>
> record(asyn, "Ctlr:Asyn"){
>  field(PORT, "ctlr") # must match port in drvAsynIPPortConfigure
>  # let ADDR, OMAX, IMAX fields default to 0
> }
>
> record(stringin, "Ctlr:Reset") {
>  field(SCAN, "Passive")
>  field(DTYP, "stream")
>  field(INP, "@ctlr.proto reset ctlr") # out ESC CR, in to stringin VAL
>  field(PINI, "1") # force reset at startup
>  field(FLNK, "Ctlr:ResetCalc") # set Ctlr intlk if reset fails
> }
>
> Here is the reset protocol in ctlr.proto etc., which sends the reset
> command to the controller and collects the response.
>
> OutTerminator = CR;
> InTerminator = LF CR;
> ExtraInput = Error;
>
> ReplyTimeout = 5000;
> ReadTimeout  =  5000;
>
> # 'in' is the entire expected banner string
> # so any other string will raise @mismatch exception
> reset {
>  out ESC;     # just ESC is the reset command
>  in '"CTLR VER 1.2"';   # banner then LF CR
>  in '$';  # $ then LF CR completion sequence
>  exec "dbpf Process:Phase.VAL 0"; # Phase PV indicates Reset was processed
> }
>
> Here are the commands to create the asyn log shown above:
>
> asynSetTraceIOMask("ctlr",0,0x2)
> asynSetTraceFile("ctlr",0)
>
> The log messages are sent from the IOC to an iocLogServer running on a
> different Linux host, which writes them to a file.


