[Xrays@aps.anl.gov] 2006 XSD Scientific Software Workshop User Survey
Sergey Stepanov
sstepanov at anl.gov
Thu Jun 22 01:56:31 CDT 2006
> In order to prepare for this workshop we would like your input on
> what you see as the needs and opportunities for scientific
> software development at the APS and in the X-ray community, as
> well as information that would support making a funding proposal
> for such resources.
The need for scientific software can be demonstrated by:
-- the statistics of the X-Ray Server (http://sergey.gmca.aps.anl.gov)
that I have been running at ANL since 1997. Although the Server
provides scientific software for only a small subset of the diverse
X-ray studies carried out at the APS (namely high-resolution
diffraction and scattering in materials science), it has received
close to 130,000 calculation requests from about 5,000 researchers,
including about 1,500 regular users who submitted ten or more jobs.
-- the existence of scientific software projects at similar facilities,
for example:
(a) the 9-member Scientific Software Group at the ESRF
(www.esrf.fr/UsersAndScience/Experiments/TBS/SciSoft/Members/).
The group is the author of widely used software packages such as
XOP and FIT2D.
(b) the DANSE (http://wiki.cacr.caltech.edu/danse/) scientific
software project at the Spallation Neutron Source, for which ORNL
requested a $15M five-year grant and has already received some of
that money.
(c) the popular software project at the LBL Center for X-ray Optics
(CXRO), see: http://www.cxro.lbl.gov/optical_constants/
(d) the CCP4 (Collaborative Computational Project Number 4) in
Protein Crystallography at Daresbury Laboratory in the UK
(http://www.ccp4.ac.uk).
-- the existence of many other scattered resources for scientific
computing, e.g. NIST and LLNL databases, SHADOW at the University of
Wisconsin, BioSAXS software at EMBL Hamburg, and the attempts of the
International Union of Crystallography to systematize them; see the
very long lists maintained by the IUCr at
http://www.iucr.org/sincris-top/logiciel/
and
http://www.iucr.org/iucr-top/data/
> In particular:
>
> 1. What are the limitations of current tools for
> x-ray data reduction, analysis, modeling, and simulation?
> 2. What additional tools are needed?
>
> 3. How can the existing tools be improved?
>
Here I would suggest drawing a rough distinction between projects that
are mostly software and those where the physical model is the dominant
part of the development.
The first group would comprise software tools and databases that are
based on well-established algorithms and models. Examples include
scientific visualization and analysis software like GRACE or FIT2D,
databases of X-ray scattering factors, many macromolecular
crystallography packages, etc. The limitations in this group are:
sometimes a lack of good interfaces, installation difficulties on
different computer platforms, poor documentation, and, in some cases,
the need for remote access. Improvements in this area would mostly
require software engineering effort.
To the second group I would assign modelling and data reduction
software based on recent or ongoing research projects. Making such
software available to the APS community is very challenging (see below
about the difficulties), but there is no doubt that building such a
pipeline between the most recent theoretical research and the
experimental community at the APS would be a great way to improve the
productivity of experiments at the APS.
Questions 1 to 3 in the survey mostly apply to the first group, for
which they are certainly important. For the second group those
questions cannot be answered because the "tools" are not known yet --
they may be just emerging or may appear only tomorrow, and no one can
list what has not been discovered.
The most important question for the second group is how to work out a
framework for quickly interfacing newly emerging scientific software
tools. Some attempts of that kind have been made within the DANSE
project. Namely, they suggested wrapping pieces of data analysis
software written by different researchers in Python scripts and in
that way linking them together, even across different computer
systems. Thus, the original data analysis code would not have to be
rewritten from its original language (e.g. Fortran or C) or ported
from its original operating system (e.g. Unix, Windows, or Mac). This
sounds attractive, but the procedure seems to presume that the
scientific algorithm is something fully settled and the DANSE team
only needs to replace the I/O interface. However, most data analysis
software is based on physical models and approximations that have
their limitations, and those limitations are not always clear until
one starts getting weird results. This is a critical distinction from
software implementing pure mathematical algorithms (e.g. in
crystallography). So, a programmer who modifies the original code
without knowing the model will not be able to figure out what is
wrong: he may see that the calculated reflection coefficient is e.g.
25, but that would not tell him anything because the "formula" was
programmed correctly!
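To make the point above concrete, here is a minimal Python sketch of a
wrapper that checks the output of a wrapped routine against a physical
bound. Everything in it is hypothetical and is not taken from DANSE or
from any existing package: toy_legacy_reflectivity stands in for a
compiled Fortran or C routine, the critical angle value is invented,
and the check assumes the quantity is an intensity reflectivity of a
passive sample, which must lie between 0 and 1. The toy formula is the
large-angle asymptotic form R ~ (theta_c / (2 theta))^4, which is
programmed correctly but is simply not valid near or below the
critical angle.

```python
from typing import Callable, List, Sequence

# Hypothetical critical angle in mrad, chosen only for illustration.
CRITICAL_ANGLE_MRAD = 3.0


class PhysicalRangeError(ValueError):
    """Raised when a wrapped routine returns a physically impossible value."""


def toy_legacy_reflectivity(angle_mrad: float) -> float:
    """Stand-in for a legacy compiled routine (hypothetical).

    Implements the large-angle asymptotic reflectivity. The formula is
    coded correctly, but the approximation breaks down at small angles.
    """
    return (CRITICAL_ANGLE_MRAD / (2.0 * angle_mrad)) ** 4


def checked_reflectivity(
    legacy_calc: Callable[[float], float],
    angles_mrad: Sequence[float],
) -> List[float]:
    """Call the wrapped routine and reject values outside [0, 1].

    The bound is an assumption about the physical quantity; only someone
    who knows the model can say which checks are the right ones.
    """
    results = []
    for angle in angles_mrad:
        value = legacy_calc(angle)
        if not 0.0 <= value <= 1.0:
            raise PhysicalRangeError(
                f"reflectivity {value:g} at {angle:g} mrad is outside [0, 1]; "
                "the model is probably being used outside its validity range"
            )
        results.append(value)
    return results


if __name__ == "__main__":
    print(checked_reflectivity(toy_legacy_reflectivity, [10.0, 20.0]))
    try:
        checked_reflectivity(toy_legacy_reflectivity, [0.5])
    except PhysicalRangeError as err:
        print("rejected:", err)
```

A wrapper of this kind can flag an impossible number, but it cannot
say why the model failed or what to use instead; that still requires
the person who understands the underlying physics.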
Therefore, I would suggest creating a framework with closer
involvement of the original developers in providing a common interface
to their software. The Scientific Software group that should be
created at the APS needs to consist of both software engineers and
X-ray theorists. The latter are needed in order to understand the
models. That group could develop some scientific software on its own,
but it should also work closely with each scientific software provider
on an individual basis, helping him/her to adapt his/her software to a
common interface that needs to be defined within the framework. The
ultimate goal should be to preserve the link between the original
developer and the modified code, so that he/she would still have it
under complete control and be able to monitor usage and bugs and
refine the code himself/herself. My long-term experience gives me
strong evidence that preserving these links is vitally important for
making a scientific software project efficient.
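One possible shape for such a common interface is sketched below in
Python. It is only an illustration under stated assumptions: the names
ScientificTool, ToolRegistry, and BraggAngleTool, the maintainer
address, and the parameter keys are all invented here, and no such
interface has been defined at the APS. Each provider implements one
small class around his/her own code, and the registry records usage
and failures per tool together with the maintainer's contact, so that
the original developer can still monitor how the code is used. The
example tool uses Bragg's law, sin(theta) = lambda / (2 d).

```python
import math
from abc import ABC, abstractmethod
from collections import Counter
from typing import Dict, List, Tuple


class ScientificTool(ABC):
    """Hypothetical common interface that each provider would implement."""

    name: str
    maintainer: str  # contact of the original developer

    @abstractmethod
    def run(self, params: Dict[str, float]) -> Dict[str, float]:
        """Run the calculation and return named results."""


class BraggAngleTool(ScientificTool):
    """Example provider code: Bragg angle from wavelength and d-spacing."""

    name = "bragg_angle"
    maintainer = "original.author@example.org"  # placeholder address

    def run(self, params: Dict[str, float]) -> Dict[str, float]:
        ratio = params["wavelength"] / (2.0 * params["d_spacing"])
        if not 0.0 < ratio <= 1.0:
            raise ValueError("no Bragg reflection: wavelength exceeds 2d")
        return {"theta_deg": math.degrees(math.asin(ratio))}


class ToolRegistry:
    """Dispatches requests and keeps per-tool usage and failure records."""

    def __init__(self) -> None:
        self._tools: Dict[str, ScientificTool] = {}
        self.usage: Counter = Counter()
        self.failures: List[Tuple[str, str, str]] = []

    def register(self, tool: ScientificTool) -> None:
        self._tools[tool.name] = tool

    def run(self, name: str, params: Dict[str, float]) -> Dict[str, float]:
        tool = self._tools[name]
        self.usage[name] += 1
        try:
            return tool.run(params)
        except Exception as exc:
            # Record who should hear about the problem, then re-raise.
            self.failures.append((name, tool.maintainer, repr(exc)))
            raise


if __name__ == "__main__":
    registry = ToolRegistry()
    registry.register(BraggAngleTool())
    print(registry.run("bragg_angle", {"wavelength": 1.0, "d_spacing": 1.0}))
    print(dict(registry.usage))
```

The design choice here is that the framework owns only the dispatch
and the bookkeeping, while the calculation itself stays in a class
that the original developer writes and maintains.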
Thanks,
Sergey Stepanov
--
Sergey Stepanov, Ph.D.
Staff Scientist, GM/CA Collaborative Access Team,
Advanced Photon Source, Argonne National Laboratory
Email: sstepanov at anl.gov Voice: +1(630)252-0664
http://sergey.gmca.aps.anl.gov Fax: +1(630)252-0667