Experimental Physics and Industrial Control System
Chestnut, Ronald P. wrote:
Ok, I'll bite --- why 11?
Ron,
Three people have asked me this question so far, so I guess it's worth
posting the answer.
There's an event loop that runs about 20 times per second. Within the
body of that loop is another loop that goes through the list of all the
displays. There's a counter that is intended to cause a check to be done
for each display once per ~10 passes through the outer loop. The bug was
to update and reset that counter inside the inner (per-display) loop
rather than in the outer (per time) loop. So the intention was to check
each display about twice per second. The result was actually to check
every 11th window (with the list wrapping around at the end) on every
pass through the outer loop. The eleven comes from casually checking for
the counter to be ">10" rather than ">=10", which is not in itself a
problem.
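For anyone who wants to see the effect without reading the EDM source, here is a minimal Python sketch of the two loop structures described above. It is not the actual EDM code (which is C++). The function names, the `CHECK_INTERVAL` constant, and the use of a set to record which displays got checked are all made up for illustration.

```python
CHECK_INTERVAL = 10  # threshold from the post; tested with ">", so it fires on every 11th tick


def buggy_passes(n_displays, n_passes):
    """Counter is bumped and reset inside the per-display loop (the bug)."""
    checked = set()
    counter = 0
    for _ in range(n_passes):              # outer loop: roughly 20 passes per second
        for display in range(n_displays):  # inner loop: every open display
            counter += 1
            if counter > CHECK_INTERVAL:
                checked.add(display)       # stand-in for the iconified-state check
                counter = 0
    return checked


def fixed_passes(n_displays, n_passes):
    """Counter is bumped and reset once per outer pass (the intent)."""
    checked = set()
    counter = 0
    for _ in range(n_passes):
        counter += 1
        do_check = counter > CHECK_INTERVAL
        if do_check:
            counter = 0
        for display in range(n_displays):
            if do_check:
                checked.add(display)
    return checked
```

With 11 displays, `buggy_passes(11, 200)` only ever reaches the display at index 10 (the 11th), while `buggy_passes(10, 200)` eventually reaches all ten. `fixed_passes` reaches every display regardless of how many are open.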
Since the display list is a closed loop and 11 is prime, you eventually
check every display unless there are exactly 11, or 22, or 33, etc. In
those cases you only check the 11th (and 22nd, and 33rd, etc.).
Ironically, if it had been 10 rather than 11 we might have found it
sooner, since then you could see it with 2 displays, or 5 displays, or
10 displays. 12 would have been even easier to notice. But then I
wouldn't have had nearly as much fun. :)
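The arithmetic behind the 10, 11, and 12 comparison can be sketched with a gcd. This is my own illustration, not anything in EDM: stepping around a closed list of n displays in strides of a fixed size reaches n / gcd(n, stride) distinct displays, so some displays are skipped whenever n shares a factor with the stride. The function names below are hypothetical.

```python
from math import gcd


def displays_reached(n_displays, stride):
    """Distinct displays visited when checking every `stride`-th entry of a closed list."""
    return n_displays // gcd(n_displays, stride)


def counts_with_skipped_displays(stride, max_displays):
    """Display counts from 1 to max_displays for which at least one display is never checked."""
    return [n for n in range(1, max_displays + 1)
            if displays_reached(n, stride) < n]
```

For up to 12 open displays, a stride of 11 skips displays only when there are exactly 11 of them. A stride of 10 would skip some at 2, 4, 5, 6, 8, 10, and 12 displays, and a stride of 12 at 2, 3, 4, 6, 8, 9, 10, and 12. Note that this counts partial failures too (for example, 4 displays with a stride of 10 reaches only 2 of them), which is a slightly broader set than the exact divisors mentioned above.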
BTW, John has posted version 1-11-1zp which is free of this problem with
no special environment settings. Thanks, John!
--Brian
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Brian Bevins
Sent: Wednesday, March 11, 2009 9:01 AM
To: EPICS Techtalk
Subject: EDM Freeze Problem Resolved
Back in January I described a problem we have been having with EDM ever since we moved from HP-UX to RedHat Enterprise Linux in 2006. It was serious enough to prevent us from completing our Linux deployments in some areas. With help from John this has now been resolved.
The problem was that operators would sometimes find at random times that their EDM displays were showing old data that didn't match displays on other workstations. In fact, the EDM screens were completely frozen with no graphical updates. Opening or closing a display would always resolve the problem, but for a long time we couldn't reproduce it deliberately.
To help us diagnose the problem, John added an option in EDM 1-11-1zm that allows you to set an environment variable EDMIGNOREICONIC that causes EDM to ignore the iconified state of windows. That is, with the variable set, all widgets get updated even if the window manager reports that the window is iconified. This made our frozen screen problem go away and pointed me to the source of the problem. There is a bug in the code that checks windows for their iconified state.
Using EDM 1-11-1zm or later with EDMIGNOREICONIC avoids the problem. I have sent John a patch for his review that corrects the bug and eliminates the need for setting EDMIGNOREICONIC.
To see the bug in action (or lack thereof) you need to open a number of EDM displays that is a multiple of 11. Make sure that at least one display is on a different desktop workspace (which is equivalent to being iconified under many window managers). Once the 11th display is open, switch workspaces and you will find that any displays that were on other workspaces are now frozen. Opening or closing a display will unfreeze all the displays (since the total is no longer a multiple of 11). Note that when counting the open displays, embedded displays count separately. So, for example, a single EDM window that contains 3 embedded displays will count as 4 displays total. We have observed this on RHEL 3, 4, and 5 using GNOME or KDE. There are also reports of it on Scientific Linux and Debian. We did not see the problem using CDE on HP-UX, probably due to different window/workspace handling.
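If you are trying to reproduce this, the counting rule is easy to get wrong, so here is a small Python sketch of it. It is only a bookkeeping aid under the assumptions stated in this message (each window counts once, each embedded display counts once more, and the freeze needs a total that is a multiple of 11). The function names and the list-of-embedded-counts input are hypothetical and have nothing to do with EDM's internals.

```python
STRIDE = 11  # the freeze needs the total display count to be a multiple of this


def total_displays(embedded_per_window):
    """Each top-level window counts once, plus once for each display embedded in it."""
    return sum(1 + embedded for embedded in embedded_per_window)


def freeze_expected(embedded_per_window):
    """True when the open-display total is a nonzero multiple of 11."""
    total = total_displays(embedded_per_window)
    return total > 0 and total % STRIDE == 0
```

For example, `total_displays([3])` is 4, matching the case above, and `freeze_expected([3, 3, 0, 0, 0])` is true because two windows with 3 embedded displays each plus three plain windows add up to 11.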
My hearty thanks to John for helping us track this down and to all of the operators here who diligently reported the problem with enough detail to finally nail it.
--Brian
--
Brian S. Bevins, PE
Computer Scientist / Mechanical Engineer
Thomas Jefferson National Accelerator Facility
"Nothing in all the world is more dangerous than
sincere ignorance and conscientious stupidity."
--Martin Luther King Jr.
ANJ, 31 Jan 2014 |