EPICS Controls Argonne National Laboratory

Experimental Physics and
Industrial Control System


Subject: Help with streamDevice parsing HTML
From: Rod Nussbaumer <[email protected]>
To: [email protected]
Date: Wed, 04 Mar 2009 11:45:18 -0800
Greetings, EPICS users.

I am trying to use streamDevice with regular expression support to parse input from an instrument containing an embedded web server. So far, I've successfully built and run the streamApp sample IOC that gets built with the streamDevice distribution. I am able to parse a single HTML element (TITLE, with minor mods to the example protocol file) from a static web page served by an Apache web server. So, I think I've built everything properly (streamDevice 2.4, asyn 4-9, EPICS 3.14.10, Linux=Scientific Linux 4.x).

My difficulty is that the instrument produces a page containing three strings that I wish to extract into three EPICS records. I'm testing with string-in records for now, but eventually I would prefer to scan to analog-in records. I am able to extract the first element from the HTML, although I do get error messages from streamDevice:

"Input "HTTP/1.0 200 OK<0d><0a>Con..." does not match format %.1/^MeasurePM*([0-9.]+)/"

However, I am unsure about how to get multiple fields scanned from the HTML using a single HTTP GET. I would really like to avoid doing one whole HTTP fetch per record, since the instrument is fairly primitive, and can probably become overwhelmed by excessive traffic, and it may be the case that readings on a single fetch are synchronized in time. From my interpretation of the documentation, the scanner should stop, leaving unread text in the pipeline once it has made a successful match. However, I don't understand how to continue scanning the remainder of the HTML page to extract the remaining data.
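
The behaviour described above (match one field, then keep scanning the unread remainder of the same response) can be sketched outside the IOC. This is only an illustration in Python's `re` module, not streamDevice syntax. The page text and the three patterns are taken from later in this message; the field names `ppm`, `flow` and `scale` and the helper `scan_fields` are made up for the example.

```python
import re

# Page body as quoted later in this message (HTTP headers omitted).
PAGE = (
    '<HTML><HEAD><meta http-equiv="refresh" content="1"></HEAD>'
    '<BODY bgcolor="#BEF06E">\n'
    '<FONT color="black" size="6"><B><PRE>\n'
    '              &lt;&lt;RUN MODE>>        F4:RET\n'
    'Measure(PPM): 2.3\n'
    'Sample flow(CC): 50.0   Scale:0-10   PPM\n'
    'System status:OK                        </PRE></FONT></B></BODY></HTML>\n'
)

# Hypothetical field names, listed in the order they appear on the page.
# The literal parentheses in the page text are escaped in each pattern.
FIELDS = [
    ("ppm", r"Measure\(PPM\):\s*([0-9.]+)"),
    ("flow", r"Sample flow\(CC\):\s*([0-9.]+)"),
    ("scale", r"Scale:\s*(.+)\n"),
]


def scan_fields(text):
    """Extract each field in order, resuming where the previous match ended."""
    values = {}
    pos = 0
    for name, pattern in FIELDS:
        match = re.compile(pattern).search(text, pos)
        if match is None:
            raise ValueError(f"no match for {name}")
        values[name] = match.group(1)
        pos = match.end()  # the unread remainder starts here
    return values


print(scan_fields(PAGE))
```

All three values come out of one copy of the response, which is the effect wanted from a single HTTP GET. Note that the `Scale` pattern as written captures everything up to the end of the line, including the trailing `PPM`.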

Below I attach the streamDevice protocol file I'm presently using (the latest flavor, which gets me closest to working), the relevant database, and the entire HTML that the instrument returns. Much of this is based on the examples from streamDevice.

#=========================<protocol file>=============================
# regular expression example
# extract the title from a web page

outterminator = NL;
interminator = "</HTML>";  # terminators can have arbitrary length

# Web servers close the connection after sending a page.
# Thus, we can't use autoconnect (see drvAsynIPPortConfigure)
# Handle connection manually in protocol.

readTITLE {
    extraInput=ignore;
    interminator = "</HTML>";  # terminators can have arbitrary length
    connect 1000;	        # connect to server, 1 second timeout
    out "GET http://isacwserv.triumf.ca";     # HTTP request
    in "%.1/<TITLE>(.+)<\/TITLE>/";  # get string in <TITLE></TITLE>
    disconnect;
}

readPPM {
    extraInput=ignore;
    interminator = "</HTML>";  # terminators can have arbitrary length
    connect 1000;              # connect to server, 1 second timeout
    out "GET http://142.90.132.71/lcd.cgi";     # HTTP request
    in "%.1/^Measure\(PPM\):\s*([0-9.]+)/"; # get string in Measure field
}

readFLOW {
    extraInput=ignore;
    in "%.1/^Sample flow\(CC\):\\s*([0-9.]+)/"; # get string in Flow field
}

readSCALE {
    extraInput=ignore;
    in "%.1/Scale:\s*(.+)\n/";       # get string in Scale field
    disconnect;                      # server closes, so do we.
}

#==================< EPICS db file >===========================
record (stringin, "ISAC:WSERV:TITLE")
{
    field (DTYP, "stream")
    field (INP,  "@regexp.proto readTITLE isacwserv")
}

record (stringin, "ISAC2:N2ANAL1:PPM")
{
    field (DTYP, "stream")
    field (FLNK, "ISAC2:N2ANAL1:FLOW")
    field (INP,  "@regexp.proto readPPM n2anal1")
}

record (stringin, "ISAC2:N2ANAL1:FLOW")
{
    field (DTYP, "stream")
    field (FLNK, "ISAC2:N2ANAL1:SCALE")
    field (INP,  "@regexp.proto readFLOW n2anal1")
}

record (stringin, "ISAC2:N2ANAL1:SCALE")
{
    field (DTYP, "stream")
    field (INP,  "@regexp.proto readSCALE n2anal1")
}


#==================< Instrument web page >=======================
# (additional line breaks inserted by e-mail; hopefully this gets
#  displayed as HTML text, and not as a web page)
#
<HTML><HEAD><meta http-equiv="refresh" content="1"></HEAD><BODY bgcolor="#BEF06E"><!-- Affiche le contenu de l'écran -->
<FONT color="black" size="6"><B><PRE>
              &lt;&lt;RUN MODE>>        F4:RET
Measure(PPM): 2.3
Sample flow(CC): 50.0   Scale:0-10   PPM
System status:OK                        </PRE></FONT></B></BODY></HTML>


===============================================================


I am trying to extract the values for 'Measure(PPM)', 'Sample flow(CC)', and 'Scale:'. When the 'PPM', 'FLOW', and 'SCALE' records process, streamDevice reports:


2009/03/04 10:49:59.678 n2anal1 ISAC2:N2ANAL1:PPM: Input "HTTP/1.0 200 OK<0d><0a>Con..." does not match format %.1/^MeasurePM*([0-9.]+)/
2009/03/04 10:49:59.680 n2anal1 ISAC2:N2ANAL1:FLOW: asynError in read: 142.90.132.71:80 connection closed
2009/03/04 10:49:59.680 n2anal1 ISAC2:N2ANAL1:FLOW: I/O error after reading 0 bytes: ""
2009/03/04 10:49:59.680 n2anal1 ISAC2:N2ANAL1:FLOW: Protocol aborted
2009/03/04 10:50:00.677 timerQueue ISAC2:N2ANAL1:SCALE: I/O error after reading 0 bytes: ""
2009/03/04 10:50:00.677 timerQueue ISAC2:N2ANAL1:SCALE: Protocol aborted

If I add more backslashes to the regex in front of the parentheses that appear in the HTML text, the pattern shown in the error messages more closely matches the actual HTML text. However, I can't seem to find a 'right' combination of backslashes that makes both streamDevice and PCRE happy.
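
To separate the two possible causes (lost backslashes versus the `^` anchor), the pattern can be tried against a response that starts with the status line seen in the log. This is a sketch in Python's `re` module, which treats these particular constructs the same way PCRE does by default; it says nothing about how streamDevice itself processes the string. The response is abbreviated, the remaining headers are omitted, and `first_group` is a helper invented for the example.

```python
import re

# Abbreviated response: the status line from the error log, a blank line,
# then the relevant lines of the page. Other headers are omitted.
RESPONSE = (
    "HTTP/1.0 200 OK\r\n"
    "\r\n"
    "<PRE>\n"
    "Measure(PPM): 2.3\n"
    "Sample flow(CC): 50.0   Scale:0-10   PPM\n"
)

# Pattern as written in the protocol file.
ESCAPED = r"^Measure\(PPM\):\s*([0-9.]+)"
# Same pattern with every backslash removed, as if an earlier
# parsing layer had consumed them (an assumption for illustration).
STRIPPED = ESCAPED.replace("\\", "")


def first_group(pattern, text, flags=0):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, text, flags)
    return match.group(1) if match else None


# Without multiline mode, ^ matches only at the very start of the input,
# which here is "HTTP/1.0 200 OK", so the pattern cannot match.
print(first_group(ESCAPED, RESPONSE))

# In multiline mode ^ also matches after each newline.
print(first_group(ESCAPED, RESPONSE, re.MULTILINE))

# With the backslashes gone, (PPM) becomes a capture group instead of
# literal parentheses, so "Measure(PPM)" in the page no longer matches.
print(first_group(STRIPPED, RESPONSE, re.MULTILINE))
```

In this sketch the escaped pattern fails only because of the leading `^`, while the stripped pattern fails regardless of the anchor, so both issues may be in play at once.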

Thanks for any insight.

Rod Nussbaumer
ISAC Controls, TRIUMF
Vancouver, Canada.


