EPICS Home

Experimental Physics and Industrial Control System


 
1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  <20172018  2019  2020  2021  2022  2023  2024  Index 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  <20172018  2019  2020  2021  2022  2023  2024 
<== Date ==> <== Thread ==>

Subject: Re: ioc crashing while using asyn & modbus
From: Gary Croke <[email protected]>
To: Mark Rivers <[email protected]>
Cc: Eric Norum <[email protected]>, [email protected]
Date: Fri, 14 Jul 2017 09:00:47 -0400 (EDT)
Hi Mark,

Another dump from last night listed below with values of those pointers.  Two seem to be null.  This configuration was with 10 second polling, I/O Intr scanning, still R2-8.  This ioc still had our system database running.  I'll try removing that.  Then try R2-9.


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb6bffb70 (LWP 499)]
0x001522d0 in writeRead (pasynUser=0x80ef4ac, write_buffer=0x80efc14 "\002\003\004K", write_buffer_len=6, 
    read_buffer=0x80efe6c "\003\070C\215fkC\215\177oC\216\024²Côö.Cõ\215+Cõw§@©ÚH@h\tq@\227ü\202", read_buffer_len=38, timeout=2, nbytesOut=0xb6bff1bc, nbytesIn=0xb6bff1b8, 
    eomReason=0xb6bff1b4) at ../../asyn/interfaces/asynOctetSyncIO.c:252
252	    status = pioPvt->pasynOctet->flush(pioPvt->octetPvt,pasynUser);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.i686 libgcc-4.4.7-17.el6.i686 libstdc++-4.4.7-17.el6.i686 ncurses-libs-5.7-4.20090207.el6.i686 readline-6.0-4.el6.i66

(gdb) bt
#0  0x001522d0 in writeRead (pasynUser=0x80ef4ac, write_buffer=0x80efc14 "\002\003\004K", write_buffer_len=6, 
    read_buffer=0x80efe6c "\003\070C\215fkC\215\177oC\216\024²Côö.Cõ\215+Cõw§@©ÚH@h\tq@\227ü\202", read_buffer_len=38, timeout=2, nbytesOut=0xb6bff1bc, nbytesIn=0xb6bff1b8, 
    eomReason=0xb6bff1b4) at ../../asyn/interfaces/asynOctetSyncIO.c:252
#1  0x001a83b0 in doModbusIO (pPlc=<value optimized out>, slave=38, function=3, start=1099, data=0x80edc90, len=18) at ../drvModbusAsyn.c:1730
#2  0x001aa6da in readPoller (pPlc=0x80efad0) at ../drvModbusAsyn.c:1322
#3  0x0030c7ea in start_routine (arg=0x80f1610) at ../../../src/libCom/osi/os/posix/osdThread.c:385
#4  0x0032ebc9 in start_thread () from /lib/libpthread.so.0
#5  0x00491dee in clone () from /lib/libc.so.6

(gdb) print pioPvt
$1 = (ioPvt *) 0x80edcb8

(gdb) print pioPvt->pasynOctet
$3 = (asynOctet *) 0x0

(gdb) print pioPvt->octetPvt
$4 = (void *) 0x0

(gdb) print pasynUser
$5 = (asynUser *) 0x80ef4ac




----- Original Message -----
From: "Mark Rivers" <[email protected]>
To: "gcroke" <[email protected]>
Cc: [email protected]
Sent: Thursday, July 13, 2017 2:04:25 PM
Subject: RE: ioc crashing while using asyn & modbus

Gary,

> Maybe not surprisingly, crash occurred more frequently.  I'll revert to that to see if I can get to the bottom of this more quickly.

I would actually be surprised if changing from SCAN=10 second to SCAN=I/O Intr increased the frequency of the crash.  The crash you showed was happening in the low-level communication layer.  The communication with the device is not affected by the SCAN value, because it is done in the poller thread which is independent of SCAN.

Mark


-----Original Message-----
From: Gary Croke [mailto:[email protected]] 
Sent: Thursday, July 13, 2017 9:14 AM
To: Mark Rivers
Cc: [email protected]
Subject: Re: ioc crashing while using asyn & modbus

Mark,

Thanks, I wasn't sure how to interpret the polling delay.  This application doesn't require a high update rate, once done with this testing I'll modify it accordingly.

I had been using I/O Intr scanning previously (with the same polling time).  Maybe not surprisingly, crash occurred more frequently.  I'll revert to that to see if I can get to the bottom of this more quickly.




----- Original Message -----
From: "Mark Rivers" <[email protected]>
To: "gcroke" <[email protected]>, "Torsten Bögershausen" <[email protected]>
Cc: [email protected], "Eric Norum" <[email protected]>
Sent: Thursday, July 13, 2017 9:30:58 AM
Subject: RE: ioc crashing while using asyn & modbus

Hi Gary,

I don't think it is a problem with the read buffer size.  The size of the Modbus packet returned is well defined and the proper size buffer is being allocated.

I looked at the bug fix in R2-9 and I don't think it will help either.  The symptom there was a crash when an bounds violation happened when corrupted size data was returned by the modbus server.  It's still worth trying, but I am doubtful it will fix the problem. 

I have a comment on your configuration which is unrelated to the crash.  You are doing the following:

drvModbusAsynConfigure("SA1-SB_1000", "pxg27", 1, 3, 999, 28, 8, 100, "iq150")

record(ai,"$(meter):rcvWh") {
    field(SCAN,"10 second")
    field(DTYP,"asynInt32")
    field(FLNK,"$(meter):rcvkWh")
    field(INP,"@asyn($(meter)_1100, 0)")
    field(EGU,"Wh")
}

In your drvModbusAsynConfigure command you are configuring the poll time to be 100 ms, so 10 Hz polling.  This is how fast the driver will be reading from the hardware.  However, you have configured your record to have 10 second scan period.  The record is just reading the result of the most recent poll every 10 seconds.  This is not how it would normally be done.  Normally the record will have SCAN=I/O Intr so it will process whenever the reading changes.  If you want it to process every 10 seconds you could change the drvModbusAsynConfigure to have 10000 ms poll time and use SCAN=I/O Intr.  This will slow down the reading of the hardware.

drvModbusAsynConfigure("SA1-SB_1000", "pxg27", 1, 3, 999, 28, 8, 10000, "iq150")

Mark

________________________________________
From: Gary Croke [[email protected]]
Sent: Thursday, July 13, 2017 5:45 AM
To: Torsten Bögershausen
Cc: Mark Rivers; [email protected]; Eric Norum
Subject: Re: ioc crashing while using asyn & modbus

Hi Torsten,

Thanks for the feedback.  Eric also wondered if there was some other code corrupting the stack.  The only other thing running on the ioc is our system database.  I'll remove that to see if it helps.  I will also try increasing the read buffer size if that doesn't work.  Maybe it's as Mark pointed out and the problem goes away with Modbus 2.9, but would be nice to identify the issue here.

Thanks again,

Gary



----- Original Message -----
From: "Torsten Bögershausen" <[email protected]>
To: "gcroke" <[email protected]>, "Mark Rivers" <[email protected]>
Cc: [email protected]
Sent: Thursday, July 13, 2017 2:15:04 AM
Subject: Re: ioc crashing while using asyn & modbus

Just speculating:
Is the read buffer long enough ?

I count here 10 bytes,
"\003\070C\216ñÅC\217\221L"
  1   2   34   5678   9   10

Does it help to increase the buffer size ?

(The dump seems to indicate that the crash occurred after a
  thread-switch, so it may be that the stack was corrupted by
some other thread)


On 12/07/17 22:00, Gary Croke wrote:
> Hi Mark,
>
> Thanks for quick response.  Here's the output from gdb at the fault, and the output of the bt command:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0xb1d45b70 (LWP 16755)]
> 0x001562d0 in writeRead (pasynUser=0x817839c, write_buffer=0x8178b04 "\004\003\f-", write_buffer_len=6, read_buffer=0x8178d5c "\003\070C\216ñÅC\217\221L", read_buffer_len=10, timeout=2,
>      nbytesOut=0xb1d451bc, nbytesIn=0xb1d451b8, eomReason=0xb1d451b4) at ../../asyn/interfaces/asynOctetSyncIO.c:252
> 252       status = pioPvt->pasynOctet->flush(pioPvt->octetPvt,pasynUser);
> Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.i686 libgcc-4.4.7-17.el6.i686 libstdc++-4.4.7-17.el6.i686 ncurses-libs-5.7-4.20090207.el6.i686 readline-6.0-4.el6.i686
> (gdb) bt
> #0  0x001562d0 in writeRead (pasynUser=0x817839c, write_buffer=0x8178b04 "\004\003\f-", write_buffer_len=6, read_buffer=0x8178d5c "\003\070C\216ñÅC\217\221L", read_buffer_len=10,
>      timeout=2, nbytesOut=0xb1d451bc, nbytesIn=0xb1d451b8, eomReason=0xb1d451b4) at ../../asyn/interfaces/asynOctetSyncIO.c:252
> #1  0x001ac3b0 in doModbusIO (pPlc=<value optimized out>, slave=10, function=3, start=3117, data=0x8176b78, len=4) at ../drvModbusAsyn.c:1730
> #2  0x001ae6da in readPoller (pPlc=0x81789c0) at ../drvModbusAsyn.c:1322
> #3  0x0030c7ea in start_routine (arg=0x817a340) at ../../../src/libCom/osi/os/posix/osdThread.c:385
> #4  0x0032ebc9 in start_thread () from /lib/libpthread.so.0
> #5  0x00491dee in clone () from /lib/libc.so.6
> (gdb)
>
>
>
>
>
> ----- Original Message -----
> From: "Mark Rivers" <[email protected]>
> To: "gcroke" <[email protected]>
> Cc: [email protected]
> Sent: Wednesday, July 12, 2017 10:44:32 AM
> Subject: Re: ioc crashing while using asyn & modbus
>
> Hi Gary,
>
> Can you run gdb on the core dump files and send the output of the "back trace" command?
>
> Mark
>
>
> Sent from my iPhone
>
>> On Jul 12, 2017, at 7:35 AM, Gary Croke <[email protected]> wrote:
>>
>> Hi All,
>>
>> I am using Asyn and Mark's modbus driver to talk a number of utility meters (primarily Eaton electricity meters).  My soft ioc is frequently crashing with a seg fault.  Connectivity of the meters is patchy in some cases and I cannot guarantee their availability at all times, resulting in a lot of communication errors.  In the instances I am having trouble, an intermediate gateway device hosts the modbus server and supports multiple downstream meters.  I can mitigate the situation by dedicating a soft ioc per IP address (gateway), but I'd like to identify the source of the problem if possible.  I have other iocs using the same software to talk to other models of meter without any issues.
>>
>> I've tried to detail some of my configuration and the errors I'm seeing below.  Thanks for any feedback.
>>
>> Gary
>>
>>
>> Versions:
>> =========
>> EPICS:  3.14.12.3
>> ASYN:   4-31
>> MODBUS: 2-8
>>
>>
>> Asyn & modbus driver initialization:
>> ====================================
>> drvAsynIPPortConfigure("pxg27", "129.57.15.27:502", 0, 1, 1)
>>
>> modbusInterposeConfig("pxg27", 0, 5000)
>>
>> drvModbusAsynConfigure("SA1-SB_1000", "pxg27", 1, 3, 999, 28, 8, 100, "iq150")
>> drvModbusAsynConfigure("SA1-SB_1100", "pxg27", 1, 3, 1099, 18, 6, 100, "iq150")
>> drvModbusAsynConfigure("SA1-SB_3118", "pxg27", 1, 3, 3117, 4, 8, 100, "iq150")
>>
>> drvModbusAsynConfigure("SA2-SB_1000", "pxg27", 2, 3, 999, 28, 8, 100, "iq150")
>> drvModbusAsynConfigure("SA2-SB_1100", "pxg27", 2, 3, 1099, 18, 6, 100, "iq150")
>> drvModbusAsynConfigure("SA2-SB_3118", "pxg27", 2, 3, 3117, 4, 8, 100, "iq150")
>>
>> ... several other asyn ports, etc. ...
>>
>> dbLoadRecords("powerMetersV/db/iq150.db", "meter=SA1-SB")
>> dbLoadRecords("powerMetersV/db/iq150.db", "meter=SA2-SB")
>>
>>
>> Example Record:
>> ===============
>> record(ai,"$(meter):rcvWh") {
>>     field(SCAN,"10 second")
>>     field(DTYP,"asynInt32")
>>     field(FLNK,"$(meter):rcvkWh")
>>     field(INP,"@asyn($(meter)_1100, 0)")
>>     field(EGU,"Wh")
>> }
>>
>>
>> Error trace #1:
>> ===============
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/asynDriver/asynManager.c:1807] [Compton_1000Read,0x8679f20,50] pxg12 queueUnlockPort
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/asynDriver/asynManager.c:1838] [Compton_1000Read,0x8679f20,50] pxg12 asynManager::queueUnlockPort waiting for event
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/asynDriver/asynManager.c:860] [pxg12,0x8660100,90] asynManager::portThread port=pxg12 callback
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/asynDriver/asynManager.c:914] [pxg12,0x8660100,90] pxg12 asynManager::queueLockPortCallback signaling begin event
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/asynDriver/asynManager.c:919] [pxg12,0x8660100,90] pxg12 asynManager::queueLockPortCallback waiting for mutex from queueUnlockPort
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/asynDriver/asynManager.c:1841] [Compton_1000Read,0x8679f20,50] pxg12 queueUnlockPort unlock mutex 0xb4f006e0 complete.
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/asynDriver/asynManager.c:1786] [HNSS_Moeller_1000Read,0x8673ee8,50] pxg12 asynManager::queueLockPort got event from callback
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/drvAsynSerial/drvAsynIPPort.c:795] [HNSS_Moeller_1000Read,0x8673ee8,50] 129.57.15.12:502 flush
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/drvAsynSerial/drvAsynIPPort.c:556] [HNSS_Moeller_1000Read,0x8673ee8,50] 129.57.15.12:502 write.
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/drvAsynSerial/drvAsynIPPort.c:558] [HNSS_Moeller_1000Read,0x8673ee8,50] 129.57.15.12:502 write 12
>>
>> \000\001\000\000\000\006\002\003\003ç\000\034
>>
>> 00 01 00 00 00 06 02 03 03 e7 00 1c
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/drvAsynSerial/drvAsynIPPort.c:654] [HNSS_Moeller_1000Read,0x8673ee8,50] wrote 12 to 129.57.15.12:502, return asynSuccess.
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/interfaces/asynOctetSyncIO.c:262] [HNSS_Moeller_1000Read,0x8673ee8,50] asynOctetSyncIO wrote:
>> ç
>> \002\003\003ç\000\034
>>
>> 02 03 03 e7 00 1c
>> 2017/07/05 15:58:06.942 [pxg12,-1,0] [../../asyn/drvAsynSerial/drvAsynIPPort.c:674] [HNSS_Moeller_1000Read,0x8673ee8,50] 129.57.15.12:502 read.
>> 2017/07/05 15:58:06.942 drvModbusAsyn::doModbusIO port Compton_1000 error calling writeRead, error=129.57.15.12:502 timeout: Resource temporarily unavailable, nwrite=6/6, nread=0
>> 2017/07/05 15:58:06.942 Compton:aAmps devAsynFloat64 process read error
>> 2017/07/05 15:58:06.942 Compton:abVolt devAsynFloat64 process read error
>> 2017/07/05 15:58:06.942 Compton:anVolt devAsynFloat64 process read error
>> 2017/07/05 15:58:06.942 Compton:bAmps devAsynFloat64 process read error
>> 2017/07/05 15:58:06.942 Compton:bcVolt devAsynFloat64 process read error
>> 2017/07/05 15:58:06.942 Compton:bnVolt devAsynFloat64 process read error
>> 2017/07/05 15:58:06.943 Compton:cAmps devAsynFloat64 process read error
>> 2017/07/05 15:58:06.943 Compton:caVolt devAsynFloat64 process read error
>> 2017/07/05 15:58:06.943 Compton:cnVolt devAsynFloat64 process read error
>> 2017/07/05 15:58:06.943 Compton:frequency devAsynFloat64 process read error
>> 2017/07/05 15:58:06.943 Compton:totVA devAsynFloat64 process read error
>> 2017/07/05 15:58:06.943 Compton:totVAR devAsynFloat64 process read error
>> 2017/07/05 15:58:06.943 Compton:totW devAsynFloat64 process read error
>> 2017/07/05 15:58:07.042 drvModbusAsyn::doModbusIO port Compton_1000 has I/O error
>> 2017/07/05 15:58:07.042 [pxg12,-1,0] [../../asyn/asynDriver/asynManager.c:1518] [Compton_1000Read,0x8679f20,50] pxg12 addr -1 queueRequest priority 3 not lockHolder
>> Segmentation fault (core dumped)
>>
>>
>> Error trace #2:
>> ===============
>> 2017/07/11 10:03:48.585 pxg13 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:48.664 pxg12 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:48.664 drvModbusAsyn::doModbusIO port C_UPH5_3118 error calling writeRead, error=129.57.15.12:502 read error: Connection reset by peer, nwrite=6/6, nread=0
>> 2017/07/11 10:03:48.664 C_UPH5:maxDemPosVAR devAsynFloat64 process read error
>> 2017/07/11 10:03:48.664 C_UPH5:maxDemPosW devAsynFloat64 process read error
>> 2017/07/11 10:03:48.686 pxg30 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:48.756 pxg11 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:48.764 drvModbusAsyn::doModbusIO port C_UPH5_3118 has I/O error
>> 2017/07/11 10:03:48.780 pxg13 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:48.838 pxg11 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> ...
>> 2017/07/11 10:03:52.104 pxg13 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:52.104 pxg27 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:52.104 drvModbusAsyn::doModbusIO port SA1-SB_1100 error calling writeRead, error=129.57.15.27:502 timeout: Resource temporarily unavailable, nwrite=6/6, nread=0
>> 2017/07/11 10:03:52.104 SA1-SB:dlvWh devAsynInt32 process read error
>> 2017/07/11 10:03:52.104 SA1-SB:negVARh devAsynInt32 process read error
>> 2017/07/11 10:03:52.104 SA1-SB:netVARh devAsynInt32 process read error
>> 2017/07/11 10:03:52.104 SA1-SB:netWh devAsynInt32 process read error
>> 2017/07/11 10:03:52.104 SA1-SB:posVARh devAsynInt32 process read error
>> 2017/07/11 10:03:52.104 SA1-SB:rcvWh devAsynInt32 process read error
>> 2017/07/11 10:03:52.104 SA1-SB:totVARh devAsynInt32 process read error
>> 2017/07/11 10:03:52.104 SA1-SB:totVAh devAsynInt32 process read error
>> 2017/07/11 10:03:52.104 SA1-SB:totWh devAsynInt32 process read error
>> 2017/07/11 10:03:52.145 pxg30 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:52.153 pxg13 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:52.189 pxg11 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:52.204 drvModbusAsyn::doModbusIO port SA1-SB_1100 has I/O error
>> 2017/07/11 10:03:52.225 pxg30 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:52.278 pxg11 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> ...
>> 2017/07/11 10:03:53.463 pxg13 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> 2017/07/11 10:03:53.467 pxg27 asynManager::queueLockPortCallback got mutex from queueUnlockPort, signaling end event
>> Segmentation fault (core dumped)
>>
>>
>> Also seen once or twice:
>> ========================
>> A call to 'assert(status == epicsMutexLockOK)'
>>     by thread 'A/MP/HAD_9018Read' failed in ../../asyn/asynDriver/asynManager.c line 2342.
>> EPICS Release EPICS R3.14.12.3 $Date: Mon 2012-12-17 14:11:47 -0600$.
>> Local time is 2017-06-29 14:43:11.601380355 EDT
>> Please E-mail this message to the author or to [email protected]
>> Calling epicsThreadSuspendSelf()

Replies:
RE: ioc crashing while using asyn & modbus Mark Rivers
References:
ioc crashing while using asyn & modbus Gary Croke
Re: ioc crashing while using asyn & modbus Mark Rivers
Re: ioc crashing while using asyn & modbus Gary Croke
Re: ioc crashing while using asyn & modbus Torsten Bögershausen
Re: ioc crashing while using asyn & modbus Gary Croke
RE: ioc crashing while using asyn & modbus Mark Rivers
Re: ioc crashing while using asyn & modbus Gary Croke
RE: ioc crashing while using asyn & modbus Mark Rivers

Navigate by Date:
Prev: EPICS Qt release 3.5.1 Andrew C. Starritt
Next: RE: ioc crashing while using asyn & modbus Mark Rivers
Index: 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  <20172018  2019  2020  2021  2022  2023  2024 
Navigate by Thread:
Prev: RE: ioc crashing while using asyn & modbus Mark Rivers
Next: RE: ioc crashing while using asyn & modbus Mark Rivers
Index: 1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016  <20172018  2019  2020  2021  2022  2023  2024