ECCE Version 1.4.2 Release Notes
Note: It is not possible to use previous versions of ECCE after the release of version 1.4.2. All user databases were upgraded to a new format compatible with an upgrade of the database server software. See the "What's new" section for more information.
The intent of this page is to provide information specific to version 1.4.2 of ECCE, released June 11, 1999. Except as mentioned herein, release notes from previous versions of ECCE still apply, so please do not treat this as standalone documentation.
Release Notes for Previous Versions
- Version V1.4.1 Release Notes - January 15, 1999
- Version V1.4 Release Notes - October 8, 1998
- Version V1.3 Release Notes - June 19, 1998
- Version V1.2 Release Notes - March 26, 1998
What's new?
- All user databases were upgraded to support the latest release of the ObjectStore database server.
- SGI IRIX 6.5 and Sun Solaris 2.6 are now supported.
- The oc_server root daemon on compute resources has been eliminated.
- Job monitoring communication now takes place via a remote shell connection.
- Job monitoring now relies on the queue or process status of the job rather than solely the output file.
- Zero footprint installs on compute resources
- Login environment file setup on compute resources is no longer necessary.
- The NERSC Cray, mcurie.nersc.gov, is now supported using telnet to launch and monitor jobs.
- ECCE calculation states have changed.
- The Calculation Manager now has an option for terminating submitted or running jobs.
- Import capability for Gaussian 9X and NWChem jobs improved.
- Improved input file generation for Gaussian 9X frozen core calculations.
- There has been a manyfold speed-up in importing calculations.
- Job monitoring performance has been improved.
- The transparency options in the Builder and Calculation Viewer have been changed.
- A Depth Cueing menu toggle has been added to the Calculation Viewer under the "Display" menu.
- Calculation Editor main window tweaked slightly.
- Automated database inspections have been added to catch problems earlier.
Known bugs
- Project databases currently do not work between the Sun and SGI versions of ECCE V1.4.2.
- DFT functionals not found on imports
- Changes/bugs in new versions of third party widget sets.
- Calculation Manager Scroll Bars
- Globus is not supported for ECCE V1.4.2
What's fixed?
- MO's included in imports of Gaussian 9X calculations
- Occupation numbers on Gaussian 9X calculations are now correct for linear systems.
- Miscellaneous problems associated with importing xyz files have been fixed.
- Frozen core orbital calculation in Calculation Editor is fixed.
- Jobs submitted to batch queues no longer have a job monitoring timeout.
- A bug that prevented the deletion of personal structure library databases has been fixed.
- A personal project database can now be deleted even if it is currently in use.
- Setting a password on the Launcher main window now works.
- File selection dialogs no longer auto-resize.
- The installation and configuration scripts have had several fixes/improvements.
What's new?
- All user databases were upgraded to support the latest release of the ObjectStore database server. The upgrade changes the internal format of databases and they are no longer compatible with previous releases of the ObjectStore server and thus previous releases of ECCE. This upgrade was needed to support new operating systems, meet Y2K readiness criteria, and be up to date with the currently supported version of ObjectStore. All users will be running V1.4.2 of ECCE beginning June 11, 1999. No action is necessary to use this new version since the database upgrades were done during the down time for upgrading the database server. If you have problems starting ECCE V1.4.2, please check your .mycshrc to ensure you are sourcing /msrc/share/ecce/scripts/runtime_setup. If you are still having difficulty, or if you find problems with accessing or using your project databases within the Calculation Manager, contact ecce-support@emsl.pnl.gov.
- SGI IRIX 6.5 and Solaris 2.6 are now supported. To accomplish this all third party products used by ECCE for the database, user interface, and visualization were upgraded to the most recent release. For SGI this meant switching to the new "n32" object file format for these products and ECCE. Unfortunately lack of support for older versions of IRIX by our database, ObjectStore, has led to ECCE dropping support for IRIX 6.2, 6.3, and 6.4. All were extensively tested with the hope that they would work but each exhibited some degree of problems traced back to the database. Support of the new operating systems along with documented testing makes ECCE Year 2000 ready.
- The oc_server root daemon on compute resources has been eliminated. The replacement for oc_server, eccejobmonitor, is only run when monitoring a running job. This lessens administration on compute resources by not having to start oc_server in the machine boot process and more importantly removes a perceived security loophole that made system administrators reluctant to allow ECCE jobs to run on their machines.
- Job monitoring communication now takes place via a remote shell connection. The compute server connection using either ssh, rsh, or telnet can now be used to send back data from the running job. In the case of ssh, the preferred remote shell when available, all data is encrypted rather than being sent over as clear text. The use of a remote shell rather than the previous socket based design is the main contributor to eliminating the need for a root daemon which was required for user authentication purposes. A variant of the previous socket implementation where all properties parsed from the running job output file are sent over a socket instead of the remote shell connection is still supported although user authentication is still done via a remote shell connection. Performance benchmarking of high data volume communication over a remote shell connection vs. a socket indicated that there was no discernable difference to dictate that one type should be used over the other even with encrypted data sent via ssh.
- Job monitoring now relies on the queue or process status of the job rather than solely the output file. This allows much more robust job monitoring because many "out of the ordinary" events may happen as a job is run that aren't reflected in the output file. During the entire execution of the job its status is monitored in addition to monitoring the output file for properties. If the job suddenly disappears it is correctly reported back to the machine where the ECCE client applications are being run so the run state can be set to "incomplete". This alleviates the frustration from previous releases of ECCE where jobs would remain in a submitted or running state forever even though a manual check of the job on the compute resource indicates it no longer exists. For queued machines it also allows a more reliable means of recognizing that the job has gone from submitted to running.
- Installs are no longer required on compute resources. Thus ECCE has no footprint on machines where jobs will be run but ECCE client GUIs will not. Everything needed to run and monitor jobs for ECCE is shipped to the compute resource as a job is launched including the eccejobmonitor script itself. Upon completion of the job these files are removed and only the files associated with the run itself are left. The minimal extra time needed to transfer these files during the launch process is negligible compared with the benefit of removing all administration duties from these machines which often have tight security access restrictions.
- Login environment file setup on compute resources is no longer necessary. Previously there was a C shell setup script that had to be sourced from either .cshrc or .mycshrc for any account where ECCE jobs were run. The lines to source this file can be removed. All information needed to run jobs on behalf of ECCE is now gathered by the machine registration process performed by the configsvrs application.
- The NERSC Cray, mcurie.nersc.gov, is now supported using telnet to launch and monitor jobs. For ECCE V1.4.1 launches to mcurie were possible only using Globus. Even then they were disabled shortly after a conference last year where there were demonstrations of ECCE working with Globus. The system administrators would not allow the oc_server root daemon to be run on the Cray. They are also reluctant to run the root daemon process that Globus requires. With ECCE V1.4.2 being able to launch jobs without a root daemon process to monitor them and using the ubiquitous remote shell telnet, we now support mcurie completely within ECCE. Given some machine registration, any compute resource accessible by telnet, ssh, or rsh that runs either LoadLeveler (with or without the Maui filter) or NQE/NQS (or no queue manager for a regular UNIX workstation) with Perl available can be used by ECCE for running jobs. ECCE places no more requirements on the compute resource than a user logging in interactively, submitting a job, and running a script to grab output as it becomes available.
- ECCE calculation states have changed. The "unconverged" option is no longer supported. This is primarily due to computational codes inconsistently reporting this information themselves. Instead, all calculations that fail either due to system failures, non-convergence, or application failures are reported as incomplete. In a future release, there will be a distinction between system and application failures. In addition, the killed state is now supported and is represented by a coffin icon. The skinny lopsided look for the diamond run states is also gone.
- The Calculation Manager now has an option for terminating submitted or running jobs. Select the "Terminate Calculation" menu item under the "Run Mgmt" menu and the selected calculation will be killed. When a job is terminated, either through ECCE or otherwise, the state will change to the killed state. It may take up to 30 seconds for the state to change to killed. This is necessary to verify that the job was successfully terminated on the compute resource.
Terminating a calculation always pops up a confirmation dialog so that an inadvertent selection can be cancelled. The order of items in the Run Management menu has been changed so the more common, less destructive items are listed first followed by the more drastic items with separators between them.
- The import capability for both the Gaussian and NWChem electronic structure codes has been extensively revamped. Imports for both codes should now correctly identify all theories and runtypes supported by ECCE, and in most cases files can be imported for viewing using the Calculation Viewer.
The parsing of Gaussian 9X output files has been improved so that all theories are correctly identified, including all semi-empirical theories. If the import still fails, it can probably be made to succeed by editing a copy of the output file so that the route card is similar to the route card generated by ECCE. The geometry import capability has also been improved so that more imports should succeed.
The import capability for output from NWChem was also improved. It should now be able to identify CCSD(T) theories and it will now find a viable geometry even if the input geometry is set up using the symmetry card in the geometry block.
- Some minor improvements have also been made in the input file generation for Gaussian 9X codes. The input decks for Gaussian codes previously used the RW keyword in the route card to specify that a calculation using frozen core orbitals was to be performed, even if the default number of orbitals was being used. The current release will use the keyword FC if the default number of core orbitals is being used. If a non-default value is used, the RW keyword is used, along with the specific number of orbitals that should be frozen.
- There has been a manyfold speed-up in importing calculations. There are two major contributing factors. First, property parsing is no longer done by inefficiently building up strings (sometimes many megabytes in size) in memory but rather by directly parsing into files. Second, the separate "system" call from ECCE to parse each property (hundreds or possibly thousands of calls) has been replaced by a much more efficient communication channel.
- Job monitoring performance has been improved. A new communication protocol between eccejobmonitor and eccejobstore (the client side application receiving properties and parsing them into the database) based on sending the size of data packets accounts for most of the speed up. Previously the client side monitoring application (was oc_client) read data sent over the socket byte by byte until a delimiter was read indicating the end of a property. The read has now been optimized to process the entire property in a single chunk (or as fast as it can be sent across by the compute server). Client side monitoring by eccejobstore also benefits from the same improvements as importing calculations by removing the separate "system" calls to parse each individual property. These changes are good for a performance increase on the order of 30% for data intensive calculations. Calculations which spend most of the time grinding numbers with little output will benefit less. These changes also reduce the CPU load of eccejobstore vs. the old oc_client.
- The transparency options in the Builder and Calculation Viewer have been changed. They now provide a true transparency and are turned on and off via a menu toggle. With the new transparency, colors of overlapping objects are blended. Because this can create some confusion when looking at complex molecular orbital displays, the old transparency style is used when viewing MOs.
- A Depth Cueing menu toggle has been added to the Calculation Viewer under the "Display" menu. Depth Cueing uses an atmospheric fog effect that causes atoms to fade into the background based on their distance from the camera. The current implementation uses linear fog.
- Calculation Editor main window tweaked slightly. The middle panel with the code, theory, and runtype menus has been reorganized to reduce the overall size of the main window and better imply the dependence of available runtypes on both the code and theory. Additionally it is possible to have more than one simultaneous Final Edit session which is useful for comparing changes in detail dialog field settings in input files. The Calculation Editor can now be quit with open Final Edit sessions.
- Automated database inspections have been added to catch problems earlier. We are attempting to discover and repair inconsistencies in user databases before they cause loss of data or runtime failures.
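The job monitoring performance item above describes replacing a byte-by-byte read up to a delimiter with a protocol based on sending the size of each data packet. The following is a minimal Python sketch of that general technique (length-prefixed framing), not the actual eccejobmonitor/eccejobstore code. The 4-byte big-endian header and the names pack_property, read_exact, and read_properties are assumptions made for illustration only.

```python
import io
import struct

# Hypothetical header layout: a 4-byte big-endian unsigned payload size.
HEADER = struct.Struct("!I")


def pack_property(payload: bytes) -> bytes:
    """Prefix a property payload with its size in bytes."""
    return HEADER.pack(len(payload)) + payload


def read_exact(stream, count: int) -> bytes:
    """Read exactly `count` bytes, looping because one read may return fewer."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed in the middle of a packet")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_properties(stream):
    """Yield each property payload from a stream of size-prefixed packets."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of stream between packets
        if len(header) < HEADER.size:
            header += read_exact(stream, HEADER.size - len(header))
        (size,) = HEADER.unpack(header)
        # The size is known up front, so the whole property is requested at
        # once instead of scanning byte by byte for an end delimiter.
        yield read_exact(stream, size)


if __name__ == "__main__":
    wire = pack_property(b"energy=-76.0267") + pack_property(b"line one\nline two")
    for prop in read_properties(io.BytesIO(wire)):
        print(prop)
```

Because the receiver knows the payload size before reading it, the payload may contain any bytes, including whatever character a delimiter-based protocol would have reserved.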
Known bugs
- Project databases currently do not work between the Sun and SGI versions of ECCE V1.4.2. Due to a serious defect in how the ObjectStore database software manages certain types of data on the SGI platform, it is not possible to use the same project database from both a Sun and an SGI. If you only run ECCE from your desktop workstation or any single workstation, regardless of whether it is a Sun or SGI, this will not impact you. However, if you switch back and forth between Suns and SGIs then this is a significant restriction. Project databases can only be used from the platform they were initially created on. During the database upgrade process the primary platform of each user was determined so there would be no difficulty in using project databases initially. It is possible to use ECCE V1.4.2 on both the Sun and SGI by having different project databases for each platform. The result of attempting to use a database on a platform other than which it was created is that applications such as the Calculation Manager, Calculation Editor, and Job Launcher will routinely crash with cryptic error messages to the console written by the ObjectStore database. We consider this issue critical and are currently working with ObjectStore to get this problem resolved for their next release.
- The imports for both Gaussian 9X and NWChem do not pick up the correlation and exchange functionals for DFT calculations.
- There are some bugs in the new versions of the third party widget sets ECCE uses. We have reported these problems to the vendors but do not anticipate near term solutions. Work arounds have been created where possible. Problems and work arounds are described below.
- The user interface panels in Calculation Viewer appear truncated.
To correct this, manually adjust the size of the left pane: grab the small green square near the bottom of the vertical line that separates the left pane from the viewer area on the right and drag it a pixel or two to the right or left.
- The user interface panels in the Calculation Viewer are sized incorrectly when switching between calculations.
The only way to correct this is to quit the Calculation Viewer and restart it. The problem typically shows up when you switch between calculations with different numbers of atoms.
- It has been found in the Calculation Manager with databases containing over 100 calculations that the right hand scroll bar may not allow viewing of the last two or three projects in the database. To work around this bug, click and hold with the left mouse button on the teal "panner" box in the lower left corner of the Calculation Manager and move it down until the bottom projects are visible.
- Globus is not supported for ECCE V1.4.2. Due to the complete
rearchitecture of job monitoring it was no longer possible to allow Globus
to be used to launch jobs and other remote communication. The Globus
architecture only allows individual remote commands to be issued rather than
establishing a communication channel for any number of commands as is the
case with standard remote shells including ssh, rsh, and telnet. The
Globus team has more recently seen the need for a persistent connection
to compute resources and is implementing this approach. The next release
of ECCE will support this new version of Globus. This will hopefully allow
sharing of the same remote communications code used for ssh, rsh, and telnet
instead of all the "special case" code that was not required in ECCE V1.4.1.
Because Globus will be added back, it was not taken out of current machine registration information. If Globus is selected as the remote shell when configuring a machine or launching a job, a warning message will be printed to the feedback area at the bottom of the window and another remote shell must be selected.
What's fixed?
- Imports of Gaussian 9X files should also now pick up the molecular orbitals. This apparently was not happening in the past, although no one appears to have spotted the problem until recently.
- The occupation numbers for calculations using Gaussian 9X were not always being correctly assigned to orbitals for linear systems. This was caused by problems with the Gaussian 9X output, which was dropping the occupation number information for orbitals with Delta symmetry. The problem has now been fixed and ECCE should list the correct occupation numbers for all calculations.
- Miscellaneous problems associated with importing xyz files have been fixed. The previous implementation relied on babel. The new implementation no longer uses babel. Problems with parsing certain atomic symbols should no longer occur. In addition, the use of atomic numbers and/or symbols is now supported.
- Frozen core orbital calculation in the Calculation Editor is fixed. Previously, dragging and dropping a calculation into the Calculation Editor would reset the number of frozen core orbitals back to the default even when the user had overridden the default. There is now a button to manually reset the number of orbitals back to the default. Additionally a change to the code, theory, or runtype will reset the number of frozen core orbitals to the default.
- Jobs submitted to batch queues no longer have a job monitoring timeout. Since the eccejobmonitor script now monitors the queue status as well as the output file there is no reason to associate a maximum time allowed for the job to go from submitted to running. As long as the job continues in an "idle" or "wait" state on the queue it will remain in the submitted state in ECCE. Previously there was a three day timeout period where any job remaining idle for that period would be set to "incomplete" ("failed" in V1.4.1) and job monitoring would not be done even though it is perfectly likely the job would eventually run.
- A bug that prevented the deletion of personal structure library databases has been fixed.
- A personal project database can now be deleted even if it is currently in use.
- Setting a password on the Launcher main window now works. Previously only the password specified in the "Configure Machine" dialog window would be used regardless of the value in the main window password field. It is now possible to override the password (or more likely, the username and the password) for a single launch without changing the configured password.
- File selection dialogs no longer auto-resize. Previous file selection dialogs such as for importing calculations and importing/exporting chemical systems would reset to a fixed size whenever a new file filter was applied. Now if they are manually resized they will remain at the selected size until dismissed.
- The installation and configuration scripts have had several fixes/improvements. The configuration scripts have undergone improvements to the data entry interface, along with some file modification fixes. ECCE packaging, distribution, and machine registration have been fixed and refined for offsite use.
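The xyz import item above notes that atoms may now be given by atomic number and/or symbol. As an illustration only, here is a minimal Python sketch of reading the common xyz layout (an atom count line, a comment line, then one atom per line) while accepting either form. It is not ECCE's importer. The function names and the deliberately truncated SYMBOLS table are hypothetical.

```python
# Hypothetical lookup table covering only the first ten elements;
# a real importer would cover the full periodic table.
SYMBOLS = ["H", "He", "Li", "Be", "B", "C", "N", "O", "F", "Ne"]


def normalize_element(token: str) -> str:
    """Return an element symbol for a token that is an atomic number or a symbol."""
    if token.isdigit():
        number = int(token)
        if not 1 <= number <= len(SYMBOLS):
            raise ValueError(f"atomic number outside this sketch's table: {token}")
        return SYMBOLS[number - 1]
    symbol = token.capitalize()  # accept "h", "H", "HE", "he", ...
    if symbol not in SYMBOLS:
        raise ValueError(f"unknown element symbol: {token}")
    return symbol


def parse_xyz(text: str):
    """Parse xyz text into (comment, [(symbol, x, y, z), ...])."""
    lines = text.splitlines()
    if len(lines) < 2:
        raise ValueError("xyz text needs an atom count line and a comment line")
    count = int(lines[0].strip())
    comment = lines[1].strip()
    atom_lines = lines[2:2 + count]
    if len(atom_lines) != count:
        raise ValueError("fewer atom lines than the declared atom count")
    atoms = []
    for line in atom_lines:
        fields = line.split()
        if len(fields) < 4:
            raise ValueError(f"malformed atom line: {line!r}")
        element, x, y, z = fields[:4]
        atoms.append((normalize_element(element), float(x), float(y), float(z)))
    return comment, atoms


if __name__ == "__main__":
    sample = "3\nwater\n8 0.0 0.0 0.0\nH 0.757 0.586 0.0\nh -0.757 0.586 0.0\n"
    print(parse_xyz(sample))
```

Normalizing every token to a symbol at read time means the rest of the import path only has to handle one representation.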