Final

Software User's Manual

 

for the

OSD CALS IWSDB PROJECT

An MVP Joint Venture

 

December 2, 1994

 

Submitted by

ManTech International Corporation
Technology Applications Operations Center
1313 Locust Avenue
Fairmont, West Virginia

 

In support of

Contract DAAB10-89-D-0503

And in compliance with

CDRL Sequence Number A009

SOW Numbers 3.1.1, 3.1.2

 



 

______________________ ______________________
Robert S. Kidwell          Jack G. Richman
Technical Director         ManTech International Corp.
OSD CALS                   OSD CALS Project Manager

 

 

TABLE OF CONTENTS

   

 

LIST OF FIGURES

1.0  SCOPE

2.0  REPOSITORY INFRASTRUCTURE

2.1  Multiple Platforms for Browsers

2.2  Multiple Platforms for Servers

2.3  Fast, Interactive, and Highly Intuitive Information Navigation

2.4  Communication through Current Standards

2.5  Design Provisions for the Transfer of a Variety of Data Formats

3.0  OVERVIEW OF THE WORLD WIDE WEB

4.0  HTML AND URLs

5.0  WORLD WIDE WEB AND OTHER INFORMATION DISCOVERY SERVICES

5.1  Netscape

5.2  NCSA Mosaic

5.3  Cello

5.4  WinWeb and MacWeb

5.5  Lynx

5.6  Recommended Browsers

6.0  IMPLEMENTATION

6.1  Common Gateway Interface (CGI)

6.2  Multimedia Oriented Repository Environment (MORE)

6.3  CALS Test and Validation Tool Repository

APPENDIX A:  MULTIMEDIA ORIENTED REPOSITORY ENVIRONMENT (MORE)

APPENDIX B:  NCSA MOSAIC

APPENDIX C:  NETSCAPE

 

 

LIST OF FIGURES

      

 

Figure 1.0-1  Multi-Platform Access to CALS Test and Validation Tool Repository

Figure 6.3-1  OSD CALS Repository

Figure 6.3-2  CALS Programs

Figure 6.3-3  CALS Services

Figure 6.3-4  CALS Standards

Figure 6.3-5  OSD CALS IWSDB Program

Figure 6.3-6  Reuse Programs

Figure 6.3-7  Testing Tools

 

 

1.0  SCOPE

      

 

After surveying information discovery resources, we have decided on the World Wide Web (WWW) concept as the foundation for the repository infrastructure, because information is distributed at different geographical locations such as the National Institute of Standards and Technology (NIST), Continuous Acquisition and Life-Cycle Support (CALS) Test Network (CTN), CALS Shared Resource Centers (CSRC), etc. Also, such an implementation would provide the user virtual access to data. The hypertext and hypermedia interface would link all the information sources, and the entire system would be a virtual web of CALS test and validation tools information. A major factor leading to the selection of the WWW is the extent of hardware and software support available for the WWW and the relative simplicity of linking to such a vast web of information.

It is essential to note those existing programs that are intended to serve the needs of the CALS user community. To this end, our plan espouses an open and virtual repository that will provide seamless access (wherever possible) to existing and emerging services within the CALS domain. Users on multiple platforms can thus be provided with a comprehensive set of services without incurring unnecessary costs from duplication. Figure 1.0-1 depicts a basic scenario for the virtual CALS test and validation tools reuse repository using multiple platforms with the WWW.

 


Figure 1.0-1  Multi-Platform Access to CALS Test and Validation Tool Repository

 

 

2.0  REPOSITORY INFRASTRUCTURE

      

 

The WWW has been chosen as the front end of the CALS test and validation tools repository for many reasons. Many companies and organizations have set up WWW servers that allow them to distribute hypermedia documents across the Internet and local networks. The essential reasons for choosing the WWW are:

Multiple Platforms for Browsers.

Multiple Platforms for Servers.

Fast, Interactive, and Highly Intuitive Information Navigation.

Communication through Current Standards.

Design Provisions for the Transfer of a Variety of Data Formats.

 

2.1  Multiple Platforms for Browsers

The WWW is designed to convey information in a device independent manner. Hypermedia documents can be shown on many different types of computers. In other words, documents on the WWW are shown to users in a way that is best suited to their display. Documents with graphics and formatting may be shown in full color on an X-terminal, while only the basic formatting and text will be shown on a text-only screen. Section 5.0 contains a comprehensive list of the browsers and the host platforms currently available.

 

2.2  Multiple Platforms for Servers

WWW server software is also available for different types of hardware platforms. This makes it easier to establish a WWW server without the need for specific hardware requirements.

 

2.3  Fast, Interactive, and Highly Intuitive Information Navigation

Documents on the WWW contain hot-spots in the form of hyperlinks. Forms and menus allow users to search quickly through archives and documents. Multimedia presentations, on-line forms, and database interfaces can be created so that users can interact and control information rather than just view it. Such a facility is well-suited to the test and validation repository where information transfer and modification need to be performed.

 

2.4  Communication through Current Standards

Most WWW clients and servers have been designed to communicate using Transmission Control Protocol/Internet Protocol (TCP/IP). Many web clients also speak the Gopher protocol, File Transfer Protocol (FTP), and Network News Transfer Protocol (NNTP) so that hypermedia browsers can also act as Gopher, FTP, and USENET client applications.

 

2.5  Design Provisions for the Transfer of a Variety of Data Formats

Audio, video, extended character sets, and interactive hot-spot graphics are some of the fundamental types of data that can be transferred using the WWW. There are effectively no barriers to the types of data a server can send, as long as the client or browser has an application associated with the file data type to read it. This important feature can be exploited for file data types that are not yet standardized: as formats become pertinent, clients can install the corresponding software to open the files. This has implications in the domain of CALS test and validation tools. Data files in Standard Generalized Markup Language (SGML), Computer Graphics Metafile (CGM), raster, and Initial Graphics Exchange Specification (IGES) formats can be transferred across the Internet with the file format intact. The browser then passes each file to the respective application.
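The hand-off described above, in which a browser passes a received file to whatever application is registered for its data type, can be sketched as a simple lookup. This is only an illustration and is not taken from any particular browser: the table HELPER_APPLICATIONS, the viewer names in it, and the file extensions chosen for the CALS formats are all assumptions.

```python
import os

# Hypothetical helper-application table: file extension -> viewer program.
# The viewer names are placeholders, not real products.
HELPER_APPLICATIONS = {
    ".sgm": "sgml_viewer",
    ".cgm": "cgm_viewer",
    ".igs": "iges_viewer",
    ".ras": "raster_viewer",
}


def choose_helper(filename, helpers=HELPER_APPLICATIONS):
    """Return the helper registered for the file's type, or None if unknown."""
    extension = os.path.splitext(filename)[1].lower()
    return helpers.get(extension)


def register_helper(extension, program, helpers=HELPER_APPLICATIONS):
    """Associate a new data type with a viewer once the format becomes relevant."""
    helpers[extension.lower()] = program
```

A file of an unregistered type yields None from choose_helper until register_helper adds an entry for it, which mirrors the point that new formats need only a client-side association and no change to the server.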

Our team has successfully installed WWW servers on a variety of platforms at the ManTech Services Corporation (MSC) office complex in Fairmont, West Virginia. The server software was installed and checked on UNIX, Linux, and Windows NT host platforms. The server software was obtained by anonymous FTP from the National Center for Supercomputing Applications (NCSA). Mosaic WWW browsers were also installed on our computers. Preliminary trial front end applications for a CALS test and validation tools repository were developed in the Hypertext Markup Language (HTML). The servers were checked by having users connect to the HyperText Transfer Protocol (HTTP) WWW server from external nodes on the Internet. Further development is underway.

 

 

3.0  OVERVIEW OF THE WORLD WIDE WEB

      

 

The WWW is officially described as a "wide area hypermedia information retrieval initiative aimed at giving global access to a large universe of documents." The WWW provides users on computer networks a consistent means of accessing a variety of media in a simplified manner.

The WWW relies mainly on hypertext to interact with users. Hypertext is basically the same as regular text because it can be stored, read, searched, or edited. However, an important exception is that hypertext contains connections within the text to other documents. These connections are known as hyperlinks that create pathways to other documents and links. Hyperlinks, therefore, facilitate an intricate and complex web of connections between documents. Hypermedia is hypertext with a difference:  hypermedia documents contain links not only to other pieces of text, but also to other forms of media like sounds, images, and movies. In other words, hypermedia simply combines hypertext and multimedia.

The WWW is used on the Internet. The WWW refers to a whole body of information (an abstract space of knowledge), while the Internet refers to the physical side of the global network (the giant mass of cables and computers). The WWW uses the Internet to transmit hypermedia between computer users internationally. People and organizations are responsible for the documents they author and make publicly available to millions of users around the world on the Internet.

The WWW project was initiated in March 1989, by a collective of European high energy physics researchers who proposed the project to be used as a means of transporting research documents and ideas effectively throughout their international organization. The initial project proposal outlined a simple system of using networked hypertext to transmit documents. Since then, hundreds of people around the globe have participated in contributing their time to the writing of WWW software. The project reached global proportions and is now one of the premier forms of information exchange in the world. It is extremely popular because of its simplicity of use and excellent user interface capacity. The WWW offers a very user-friendly interface to the traditionally hard to master resources of the Internet. It is probably this ease of use as well as the popularity of many graphical interfaces to the WWW that caused the explosion of the WWW traffic in 1993. The potential of using networked hypertext and multimedia has prompted many users to create and explore countless innovative applications on the Internet.

The WWW software is designed around a distributed client-server architecture. The WWW client (also called a WWW Browser) is a program that can send requests for a document to any WWW server. A WWW server is a program that, upon receipt of a request, sends the document back to the requesting client. Using a distributed architecture means that a client program may be running on a completely different and separate machine from the server. The task of document storage is left to the server and that of retrieval is local to the client. Therefore, each program can concentrate on its individual duties and work independently of each other. Because servers operate only when documents are requested, they put a minimal amount of workload on the computers they run on for each transaction. The WWW is composed of thousands of these virtual transactions taking place per hour throughout the world, creating a web of information flow.
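The request and response exchange described above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the server software used on this project: the DOCUMENTS table, the DocumentHandler class, and the fetch and run_demo functions are hypothetical names, and both programs run on one machine here only to keep the example self-contained.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical document store: request path -> HTML text.
DOCUMENTS = {
    "/index.html": "<html><body><h1>Test Tool Repository</h1></body></html>",
}


class DocumentHandler(BaseHTTPRequestHandler):
    """Server side: look up the requested document and send it back."""

    def do_GET(self):
        body = DOCUMENTS.get(self.path)
        if body is None:
            self.send_error(404, "Document not found")
            return
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch(host, port, path):
    """Client side: request one document and return (status, text)."""
    connection = HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        connection.close()


def run_demo():
    """Start a server on a free local port, then act as its client."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), DocumentHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address
        found = fetch(host, port, "/index.html")
        missing = fetch(host, port, "/missing.html")
        return found, missing
    finally:
        server.shutdown()
        server.server_close()
```

The server does work only when a request arrives, and the client needs to know nothing about how the documents are stored, which is the division of duties the paragraph above describes.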

The language that WWW clients and servers use for communications is the HyperText Transfer Protocol (HTTP). The standard language used for creating and recognizing hypermedia documents is HTML, which is loosely related to SGML. Start and end tags (markup) that precede and follow each logical portion of a document are added to the standard text to represent different attributes of the enclosed text. The elements and tags of this SGML subset are defined by the HTML Document Type Definition (DTD). WWW documents are written in HTML. HTML documents are nothing more than standard American Standard Code for Information Interchange (ASCII) files with formatting codes that contain information about layout (text styles, document styles, paragraphs, lists) and, of course, hyperlinks. A new version of HTML called HTML+ will be ready by the end of 1994 and will support more interactive forms, hot-spots in images, more versatile layout and formatting options, and formatted tables. Related to HTML is HyTime (Hypermedia/Time-based Document Structuring Language), a proposed standard language for representing the structure of multimedia, hypertext, hypermedia, and time- and space-based documents. HyTime is also based on SGML, as information content is packaged using standard markup. HyTime was created primarily for the digital information publishing industry. It can be used to integrate the information management of a multifaceted enterprise and to facilitate concurrent and collaborative engineering.

The WWW uses Uniform Resource Locators (URLs) to represent hypermedia links and links to network services within HTML documents. It is possible to represent any file or service on the Internet with a URL. Most WWW browsers allow the user to specify a URL and connect to that document or service. When a hyperlink is selected within a document, the user is actually sending a request to open a URL. In this way, hyperlinks can be made not only to other texts and media but also to other network services. WWW browsers are typically not just WWW clients but also full-featured FTP, Gopher, and Telnet clients. HTML+ will allow the use of URLs to perform e-mail functions automatically.
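How a single URL names both a network service and a location on that service can be sketched with Python's standard urllib.parse module. The URLs below are invented examples on the reserved example.org domain, and describe_url is a hypothetical helper written for this illustration.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the service to contact, the host, and the document path."""
    parts = urlsplit(url)
    return {"service": parts.scheme, "host": parts.hostname, "path": parts.path}


# Invented example URLs, one per service a browser might be asked to open.
EXAMPLE_URLS = [
    "http://www.example.org/cals/standards.html",
    "ftp://ftp.example.org/pub/tools/readme.txt",
    "gopher://gopher.example.org/11/library",
    "telnet://host.example.org",
]

if __name__ == "__main__":
    for url in EXAMPLE_URLS:
        print(describe_url(url))
```

The service part is what lets one browser act as an HTTP, FTP, Gopher, or Telnet client depending on the link that was selected.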

The WWW exists virtually. There is no single standard way of viewing it or navigating around it. Many software interfaces to the WWW exist that have similar functions and work in the same way, no matter what computer or type of display is used. As mentioned earlier, a WWW browser is used to navigate a user through the WWW. The browser is a software program with a graphical or textual interface that runs on a computer such as a Macintosh, an X Window System workstation, or an IBM-compatible running Windows. WWW clients or browsers are available for a variety of platforms and environments.

The explosive growth of the WWW brings close to realization the concept of everyone having access to any document, sound recording, or video image on a computer screen. The underlying principle of the WWW is a simple concept:  the combination of HTML and the URL. An HTML file can be created with any standard word processor by adding special HTML pointers, markers, and style tags. The encoded information is used by the user's WWW software to interpret the layout and style and to make internal and external hyperlinks. The URL addresses enable the HTML to link to any available resource around the Internet.
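The way WWW software reads the tags in an HTML file to find its hyperlinks can be sketched with Python's standard html.parser module. The sample page and its URLs are invented for illustration, and LinkCollector and extract_links are hypothetical names. Real browsers do considerably more than this.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the target URL of every anchor tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Anchor tags carry the hyperlink target in their href attribute.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the hyperlink targets found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Invented sample page: one link to another document, one to a sound file.
SAMPLE_PAGE = """<html><body>
<h1>CALS Standards</h1>
<p>See the <a href="http://www.example.org/sgml.html">SGML overview</a>
and a <a href="sounds/welcome.au">recorded welcome</a>.</p>
</body></html>"""
```

Calling extract_links(SAMPLE_PAGE) returns the two link targets in the order they appear, one pointing at text and one at another medium.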

 

 

4.0  HTML AND URLs

      

 

HTML is a markup language related to SGML. SGML grew out of a decade of work addressing the need for capturing the logical elements of documents as opposed to the processing functions to be performed on those elements. SGML is essentially an extensible document description language, based on a notation for embedding tags into the body of a document's text. SGML is defined by International Organization for Standardization (ISO) standard 8879. The markup structure permitted for each class of documents is defined by an SGML DTD. It is impractical to design a single DTD that meets the needs of all possible users. Instead, the markup is tailored to the needs of a specific community.

In HTML, the need to support a wide range of display types and to keep browser software as simple as possible limits the complexity that can be handled. Text is tagged with marks that pass information to a software reader. The readers use the tag information to format the text and other media for the viewer. As mentioned earlier, HTML is soon to be superseded by HTML+, which contains additional features and enhancements. HTML+ has grown out of several years of experience with the HTML document format of the WWW community. Browser writers are experimenting with extensions to HTML, and it is appropriate to draw these ideas together into a revised document format. The new format is designed to allow a gradual rollover from HTML, adding features like tables, captioned figures, and fill-out forms for querying remote databases or mailing questionnaires.

 

 

5.0  WORLD WIDE WEB AND OTHER INFORMATION DISCOVERY SERVICES

      

 

The WWW is intuitive to use and exists on many platforms, from simple text-based browsers to full graphical implementations. (See Table 5.0-1.) It is an excellent way of cruising the information space offered by Internet resources. The WWW is currently expanding at an estimated four times the rate of the Internet explosion in the late 1980s.

The WWW is an attempt to unify the enormous amount of information available from the global networks, with a very easy-to-use front end interface and a server protocol. It may be thought of as a single tool to combine individual network tools, dispensing with the need to run these tools independently. Some of the more widely used tools are Anonymous FTP, Archie, Wide Area Information Server (WAIS), and Gopher.

 

Table 5.0-1  World Wide Web Browsers

Text Only Browsers                                      Graphical Interface Browsers
------------------------------------------------------  ----------------------------------------------
Dumb terminal, any UNIX platform                        Sun 4/Sun OS 4.1.x
Text only, Sun OS, IBM AIX, DEC Ultrix, VAX Multinet    Silicon Graphics IRIX 4.x
Macintosh text-only, Mac SE's Sys 7.x                   VMS
Perl                                                    Linux
emacs                                                   DEC MIPS Ultrix, DEC Alpha AXP, OSF/1
                                                        IBM RS/6000, AIX 3.2
                                                        HP 9000/700, HP/UX 9.x
                                                        NeXT, NeXTStep 3.0
                                                        Commodore Amiga, AmigaOS 3.0
                                                        IBM compatibles, 386 and above with Windows 3.1
                                                        Macintosh computers System 7.x
                                                        Power Macintosh

 

5.1  Netscape

Netscape (TM) network navigator is a WWW browser free to users via the Internet. Netscape was released by Netscape Communications Corporation on October 12, 1994. It is essentially a more robust, optimized version of NCSA Mosaic by the original authors. Netscape is presently the most stable browser we have evaluated and it delivers security features such as encryption and server authentication. When paired with Netsite Commerce Server, Netscape lets users take advantage of such commercial services as on-line publications, financial services and interactive purchasing. The Final Test Tool Repository demonstration will utilize Netscape navigator as its WWW browser.

Available for all popular desktop environments, Netscape is a powerful commercial navigator for the Internet, offering point-and-click network navigation. It is optimized to run smoothly over 14.4 kilobit/second modems as well as higher bandwidth lines. Netscape claims to deliver performance of up to ten times that of other network browsers. Netscape provides a common feature set and graphical user interface across computers running the Microsoft Windows, Macintosh, or X Window System operating environments.

 

5.2  NCSA Mosaic

NCSA Mosaic is an Internet information browser and a WWW client. Mosaic was developed at the National Center for Supercomputing Applications at the University of Illinois, Urbana-Champaign.

With Mosaic, the user may notice highlighted text or phrases. Each highlighted phrase (in color or underlined) is a hyperlink to another document or information resource somewhere on the Internet. If the user single clicks on any highlighted phrase, the link will be established. NCSA Mosaic comes in three flavors:

NCSA Mosaic for the X Window System,

NCSA Mosaic for the Apple Macintosh, and

Mosaic for Microsoft Windows.

 

5.3  Cello

Cello is a multipurpose Internet browser that permits you to access information from many sources in many formats. Technically, it is a WWW client application. This means that Cello can access data from WWW, Gopher, FTP, and CSO/ph/qi servers, as well as from X.500 directory servers, WAIS servers, HYTELNET, TechInfo, and others through external gateways. Cello and the WWW-HTML hypertext markup standard can build local hypertext systems on Local Area Networks (LANs), on single machines, and so on. Cello also permits the postprocessing of any file with an association in the Windows File Manager. For example, if the user downloads an uncompressed Microsoft Word file from an FTP site, and if the appropriate association exists in File Manager, Cello will run MS-Word on it. This same capability is used to view graphics and listen to sound files from the Net.

 

5.4  WinWeb and MacWeb

WinWeb and MacWeb are full-featured WWW browsers for the Microsoft Windows and Macintosh operating systems, respectively. WinWeb/MacWeb, a vehicle used for exploring Internet resources all over the world, was developed at the Microelectronics and Computer Technology Corporation (MCC) by the Enterprise Integration Network Group.

 

5.5  Lynx

Lynx grew out of efforts to build a campus-wide information system (CWIS) at The University of Kansas. Lynx clients provide a user-friendly hypertext interface for users on a variety of platforms, and allow information providers to publish information located on any platform that can run a Gopher, HTTP, FTP, WAIS, or NNTP server. Providers retain complete control over their information but may receive comments and suggestions from users running Lynx on client systems. Lynx was written by Lou Montulli at The University of Kansas.

 

5.6  Recommended Browsers

The following recommendations are based on our use of the various browsers for our research as well as such factors as stability, user friendliness, availability, platforms supported, etc. Internet browsers, as a class of software, are very dynamic with new versions, point releases, and entirely new products being added on a regular basis. These recommendations, therefore, are for this point in time only and specifically for use on the CALS Test Tool Repository.

Netscape (aka Mozilla)

Released by Netscape Communications Corporation on October 12, 1994, there are free versions available for Unix (SGI, AIX, SunOS, HP, Solaris, AXP), Macintosh, and Microsoft Windows. It is the most stable browser currently available.

Most Recent Version: 0.9 Beta.

Known Bugs:

Text entry fields sometimes are not selectable; try resizing the window.

A small bug in user authentication sometimes causes it to retry loading over and over. If you see it flashing "Contacting Host" in the status field, stop and try getting a different page.

TABbing through fields sometimes does not work.

No control over font or background color selection.

NCSA Mosaic

Available for many different platforms:

X Windows

SGI, SunOS, DEC, Linux, Solaris, HP, AIX, and many more versions available. The source code is available, so if you have Motif libraries you can compile it for any platform. Most Recent Version:  2.5.2beta

Known Bugs:

mailto:  URLs unsupported.

Form fields in the SGI version are half as wide as they should be.

MacMosaic

Powermac and 68K versions available.

Most Recent Version:  2.0.0a8

Known Bugs:

The forms facility has many bugs; using " & " in a form does not work. Any posts you submit may be altered on final submission.

Pull-down menus sometimes do not work.

Crashes while trying to stop downloading large documents.

mailto:  URLs unsupported.

Automatic caching of pages, which is a problem on our site where some pages are changing by the minute.

Ampersands are not handled correctly in MacMosaic's forms. Please do not use ampersands in any forms if you are running MacMosaic, especially in your password.

Lynx

Works for any non-graphical terminal, but works best on something with curses support (like vt100).

Most Recent Version:  2.3BETA

Known Bugs:

TEXTAREA fields represented by a single line.

Screen width means problems in some areas, notably Threads.

Spyglass Mosaic

Versions for all three platforms available.

Most Recent Version:  1.0

Known Bugs:  NONE

Emacs-W3

Full Web browser written in Emacs Lisp (elisp); runs on all platforms that Emacs runs on.

Most Recent Version:  2.1

Known Bugs:  NONE

Spry's AIR_Mosaic

MS Windows

Most Recent Version:  3.07

Known Bugs:  NONE

DISrecommended Browsers

Chimera

Requires reauthentication with every page.

Delphi-HTML

No support for user authentication and forms.

MacWeb/WinWeb

No user authentication support.

Netcom's "NetCruiser"

Requires reauthentication with every page.

WinMosaic

Frequently, the first field in a form is lost. This renders WinMosaic all but unusable on HotWired.

AmigaMosaic

Does not support forms.

OmniWeb

Requires reauthentication with every page.

 

 

6.0  IMPLEMENTATION

      

 

The implementation of the CALS test and validation tools repository uses Common Gateway Interface (CGI) routines running in conjunction with a database and the WWW interface. The Multimedia Oriented Repository Environment initiative of the Repository Based Software Engineering project is one such application that uses CGI routines with the Oracle Database. Details of these terms and definitions are outlined below.

 

6.1  Common Gateway Interface (CGI)

CGI is an interface for running external programs, or gateways, under an information server. For implementation of the CALS Test Tool Repository, the information server is an HTTP server, and a set of CGI executables is used to provide access to a relational database containing meta-data. The gateway handles information requests and generates the appropriate meta-data document dynamically. The ability to generate documents on the fly and convert the results to HTML in real time is referred to as gatewaying.

Gateway programs, or scripts, are executable programs that can be run independently (although this would probably not be done). They are written as external programs to allow them to run under various information servers independently. Gateways conforming to the CGI specification can be written in any language, producing an executable file. Examples include C programs, PERL scripts, Bourne shell scripts, and C shell scripts among many others. This allows a wide variety of applications to be accessed seamlessly through Mosaic or other WWW browsers while maintaining the same look and feel of the user interface.
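A gateway of the kind described above can be sketched in a few lines. The sketch is written in Python purely for illustration (the text names C, PERL, and shell scripts as examples) and is not one of the repository's actual gateways. The form field name "keyword" and the function build_response are assumptions. What the sketch relies on from CGI itself is that the server passes the query part of the URL in the QUERY_STRING environment variable and that the program writes a Content-Type header, a blank line, and then the document to standard output.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(query_string):
    """Return the full CGI response: header line, blank line, HTML body."""
    fields = parse_qs(query_string)
    # "keyword" is a hypothetical form field name chosen for this sketch.
    keyword = fields.get("keyword", [""])[0]
    body = (
        "<html><head><title>Search Results</title></head><body>\n"
        "<h1>Search Results</h1>\n"
        "<p>You searched for: {}</p>\n"
        "</body></html>\n"
    ).format(html.escape(keyword))  # escape user input before echoing it
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # A CGI-capable server sets QUERY_STRING before running the program.
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

Because the program only reads environment variables and writes to standard output, it can also be run by hand from a shell, which matches the observation that gateways are independent executables.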

 

6.2  Multimedia Oriented Repository Environment (MORE)

The CALS Test Tool Repository utilizes the Multimedia Oriented Repository Environment and an Oracle Relational Database Management System (RDBMS) to provide powerful database capabilities to the Integrated Weapon System Database (IWSDB) end-user through a Mosaic/WWW user interface.

MORE was developed by the Repository Based Software Engineering (RBSE) research and development (R&D) group as part of NASA's work in software engineering and reuse. MORE was designed as a set of application programs (CGI executables) that operate in conjunction with a standard HTTP server to provide access to an Oracle database containing meta-data (relevant information, keywords, abstracts, location, etc.). The entire MORE interface (client browsing and search capabilities, repository definition, data entry, and other administrative functions) is provided through stock Web clients. MORE provides separate hierarchies of meta-classes and collections, and supports controlled access to proprietary collections through the definition of user groups. With the single exception of the system front page, the entire user interface is delivered as dynamically generated HTML. The final version of the Repository Users Manual will contain details on using MORE, the actual repository structure, etc. Refer to Appendix A for more information on how to use the MORE System to integrate a structured database into the Web.
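The general pattern described above, in which a page is built at request time from meta-data held in a relational database, can be sketched as follows. This is not MORE's code or schema. Python's built-in sqlite3 module stands in for the Oracle database so that the sketch is self-contained, and the table name, column names, and both function names are assumptions. The collection names and descriptions are those shown in the figures of Section 6.3.

```python
import html
import sqlite3


def create_sample_database():
    """Build an in-memory stand-in for the meta-data database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE collections (name TEXT PRIMARY KEY, description TEXT, parent TEXT)"
    )
    db.executemany(
        "INSERT INTO collections VALUES (?, ?, ?)",
        [
            ("OSD CALS Repository",
             "Repository of information and services relevant to the "
             "Continuous Acquisition and Life Support Initiative",
             "Main Collection"),
            ("CALS Programs", "Collection of CALS related programs.",
             "OSD CALS Repository"),
            ("CALS Services",
             "Collection of services geared towards the needs of the CALS community.",
             "OSD CALS Repository"),
            ("CALS Standards",
             "Collection of the Military Specification pertinent to the CALS initiative.",
             "OSD CALS Repository"),
        ],
    )
    return db


def render_collection(db, name):
    """Query the meta-data for one collection and return an HTML page for it."""
    row = db.execute(
        "SELECT description, parent FROM collections WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return "<html><body><p>Unknown collection.</p></body></html>"
    description, parent = row
    children = [
        child for (child,) in db.execute(
            "SELECT name FROM collections WHERE parent = ? ORDER BY name", (name,)
        )
    ]
    lines = [
        "<html><body>",
        "<p>Current Collection: {}</p>".format(html.escape(name)),
        "<p>Description: {}</p>".format(html.escape(description)),
        "<p>Parent Collection: {}</p>".format(html.escape(parent)),
    ]
    if children:
        lines.append("<p>Sub-Collections:</p>")
        lines.append("<ul>")
        lines.extend("<li>{}</li>".format(html.escape(c)) for c in children)
        lines.append("</ul>")
    else:
        lines.append("<p>Sub-Collections: None</p>")
    lines.append("</body></html>")
    return "\n".join(lines)
```

Because the page is assembled from the current database contents on every request, adding a row to the table changes what the next user sees without anyone editing an HTML file.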

 

6.3  CALS Test and Validation Tool Repository

This portion of the Software User's Manual will outline the assets that are currently available in the Test Tool Repository and will illustrate various pages as they might appear to a user searching the Repository for test tool information.

In reality, this document and others like it are rapidly becoming artifacts of the paper-based environment from which CALS is trying to migrate. The Test Tool Repository is virtual in nature; that is, it exists in digital or electronic format on our local server and on other computers to which it is linked over the Internet. Because the user accesses meta-data or information about the assets and not the assets themselves, he is assured, in a properly administered system, of receiving the latest released version of whatever documentation or tool he chooses to download. By using a relational database containing the meta-data with hyperlinks to the actual files, one can see changes to tools, documents, or manuals immediately upon their release to production.

Therefore, manuals of this type, which purport to act as manuals for virtual systems, digital libraries or any on-line repositories, are almost guaranteed to be obsolete in their paper form as soon as they are published. The actual up-to-date "user's manual" will always exist in digital format like the virtual repository it is describing. In fact, the entire philosophy of documents is undergoing dramatic change as we redefine how we access information in electronic format. With hyperlinks, multimedia (hypermedia), search engines, and other digital services, information is available in forms that make traditional paper manuals unnecessary.

The CALS Test and Validation Repository, as currently structured, accesses meta-data contained in a relational database describing the following collections:

1.  CALS Programs

2.  CALS Services

3.  CALS Standards

4.  OSD CALS IWSDB Program

5.  Reuse Programs

6.  CALS Testing Tools

The following illustrations show typical screen pages the user would see as he navigates through the Repository. Figure 6.3-1 shows the "Sub-Collection" items listed under the OSD CALS Repository. Underlined text within the screen pages represents hypertext links to other documents, home pages, or information servers on the Internet.

 


MORE Browser

Current Collection:  OSD CALS Repository

Description:  Repository of information and services relevant to the Continuous Acquisition and Life Support Initiative

Parent Collection:  Main Collection

Sub-Collections:

CALS Programs

CALS Services

CALS Standards

OSD CALS IWSDB Program

Reuse Programs

Testing Tools

Figure 6.3-1  OSD CALS Repository

 

It is important to note that all screens past this point are dynamically generated, i.e., the HTML that makes up the subsequent screens is generated in real time after accessing the relational database. Figures 6.3-2 through 6.3-7 illustrate the primary sub-collections and their current assets.

 

Current Collection:  CALS Programs

Description:  Collection of CALS related programs.

Parent Collection:  OSD CALS Repository

Sub-Collections:  None

Related Collections:  None

Assets:

DoD Information Systems Technology Insertion (asset)

DoD WWW Servers (asset)

HQ AFC4A WWW Server (asset)

Navy On Line ISMAP (asset)

OSD WWW Server (asset)

Department Of Commerce Information Services via WWW (asset)

Figure 6.3-2  CALS Programs

 

The CALS Programs section of the repository links the user to a wide variety of government servers that have a vested interest in some aspect of the CALS Initiative. These "Homepages" include ones from the Department of Defense (DoD), Department of Commerce, the Air Force, Army, Navy, as well as some related commercial sites.

 

Current Collection:  CALS Services

Description:  Collection of services geared towards the needs of the CALS community.

Parent Collection:  OSD CALS Repository

Sub-Collections:  None

Related Collections:  None

Assets:

Computer Systems Laboratory Overview (asset)

FedWorld Beta Home Page (asset)

LLNL Home Page (asset)

NIST WWW - Home Page (asset)

Figure 6.3-3  CALS Services

 

CALS Services includes, among others, links to FedWorld with its comprehensive list of services and information and the Computer Systems Laboratory (CSL) at NIST, which works with governments, industry, and academia to improve the efficiency and delivery of government services. CSL develops standards, guidelines, and test methods; validates products for conformance to standards; conducts research; and provides technical advice and assistance. The CSL is heavily involved with the National Information Infrastructure, Electronic Commerce and Enterprise Integration including CALS efforts to apply information technology in manufacturing, health care, and government services.

 

Current Collection:  CALS Standards

Description:  Collection of the Military Specification pertinent to the CALS initiative.

Parent Collection:  OSD CALS Repository

Sub-Collections:  None

Related Collections:  None

Assets:

A brief description of each of the CALS standards (asset)

Appendix 1 to MIL-R-28002B (ASCII) (asset)

Appendix 1 to MIL-R-28002B Raster Graphics Standard (asset)

Document Center Home Page (asset)

Draft of SGML-Handbook (asset)

IGES Standard MIL-D-28000A in MS Word form (asset)

MIL-D-1840B in WordPerfect datafile (asset)

MIL-D-28003A - Digital Representation for ........ CGM (asset)

MIL-M-28001B Markup requirements .... SGML (asset)

MIL-R-28002B Raster Graphics CALS Standard (ASCII) (asset)

MIL-R-28002B Raster Graphics Standard (WordPerfect) (asset)

MIL-STD-1840A (asset)

Military Standard MIL-D-1840B (asset)

Overview of CALS Standards (asset)

Figure 6.3-4 CALS Standards

 

The section on CALS Standards includes descriptions, overviews and draft documents as well as the latest electronic copies of the CALS Core Standards, i.e., MIL-STD-1840B and the MIL-28000 Series of data translation standards.

 

Current Collection:  OSD CALS IWSDB Program

Description:  Details of work in progress under the CALS OSD Integrated Weapon Systems DataBase Program.

Parent Collection:  OSD CALS Repository

Sub-Collections:

Task_1

Related Collections:  None

Assets:

Functional Management Plan (MS Word DOC Format) (asset)

Implementation Report (MS Word DOC Format) (asset)

Initial Assessment Report (MS Word DOC Format) (asset)

IWSDB Strategy Paper (MS Word DOC Format) (asset)

ManTech Services Corporation Home (asset)

Figure 6.3-5  OSD CALS IWSDB Program

 

The OSD CALS IWSDB Program part of the repository contains the deliverable documents from this project. Task 1 documents are in a separate sub-collection, while additional assets allow access to the Functional Management Plan, the IWSDB Strategy Paper, and the other deliverables.

 

Current Collection:  Reuse Programs

Description:  A collection of popular Software reuse Programs

Parent Collection:  OSD CALS Repository

Sub-Collections:  None

Related Collections:

Reusable Software Libraries (RSLs)

Assets:

Annual Workshops on Software reuse (asset)

ARPA STARS Program (asset)

ASSET: Asset Source for Software Engineering Technology (asset)

CARDS Home Page (asset)

WWW Virtual Library - Software Engineering (asset)

Figure 6.3-6 Reuse Programs

 

Reuse programs and domain-specific software reuse were part of our research and planning in designing the CALS Test and Validation Repository. This section lets users access current data on a wide range of Reuse Programs and Virtual Libraries.

 

Current Collection:  Testing Tools

Description:  Collection of tools and information related to test and validation of CALS Data.

Parent Collection:  OSD CALS Repository

Sub-Collections:

1840B

CGM

IGES

News Groups

Raster

SGML

Related Collections:  None

Assets:  None

Figure 6.3-7  Testing Tools

 

The Testing Tools portion of the Repository is a central archive for all available (public domain) software related to testing the CALS Core Standards. Included in this section are validated data files and the testing tools for 1840B, CGM, IGES, Raster, and SGML. The hyperlinks to these assets allow users to download the appropriate tool or data file at their convenience.

The final portion of this report depicts a hierarchical listing of all the remaining assets in the CALS Test and Validation Tool Repository. If placed into production, this Repository will be constantly updated as new assets are added, as modifications are made to existing assets, and as new services become available.

LIST OF OSD CALS REPOSITORY ASSETS:

CALS INFORMATION

PHONELST.TXT

CALS PROGRAMS

CALS SERVICES

CALS STANDARDS

1840B.ASC

1840B.WP5

28000A.DOC

28001B.WPC

28002B.ASC

28002BA1.ASC

28002BA1.WPC

28002BA1`.WPF

28003A.DOC

CALS-STD.TXT

OVERVIEW.TXT

SGMLHDBK.DOC

IWSDB PROGRAM

FMANPLAN.DOC

IMPLWM.DOC

INIASSM.DOC

IWSDB.DOC

TASK 1

CRITERIA.DOC

FREPO.DOC

REPOS.DOC

STTREPT.DOC

REUSE PROGRAMS

TESTING TOOLS

1840B

TOOLS

MCTOOL

ANSIUTIL.EXE

ANSIUTIL.MAK

CALLTAPE.OBJ

LALLTAPE.OBJ

MALLTAPE.OBJ

MARKUP.TXT

MCTOOL.EXE

SALLTAPE.OBJ

SRCSYS.DEF

TAPETOOL

APPENDB.WPD

APPENDC.WPD

APPENDD.WPD

APPENDE.WPD

CHAP1-2.WPD

CHAP3-5.WPD

CHAP6B.WPD

CHAP7-A.WPD

SRCSYS.DEF

TAPETOOL.EXE

TTMANUAL.ZIP

CGM

DATA-SET

CTN-01I.CGM

CTN-01I.CLR

CTN-01I.LIS

CTN-01I.VAL

CTN-01R.CGM

CTN-01R.CLR

CTN-021A.CGM

D001C001.CGM

D001C002.CGM

D001C003.CGM

D001C004.CGM

D001C005.CGM

D001C006.CGM

D001C007.CGM

D001C008.CGM

D001C009.CGM

TOOLS

METACHECK

EVALUATION-REPORT

VALIDCGM

ABC.CGM

ABC.LIS

ABC.VAL

CTN-01I.CGM

CTN-01I.CLR

CTN-01I.LIS

CTN-01R.CGM

CTN-01R.CLR

CTN-02IA.CGM

GBACK

GSCRIPT

README.CGM

TABLE

VALIDCGM

VCGM.LBL

VCGM.TBL

VSCRIP

IGES

DATA SET

CLASS1-1.IGS

CLASS1-2.IGS

CLASS1.ZIP

CLASS2-1.IGS

CLASS2-2.IGS

CLASS2.ZIP

NEWS GROUP

RASTER

DATA SET

D005R003

D005R004

D005R005

D005R006

D005R007

D005R008

D005R009

D005R010

D005RR11

FIGI.GP4

HOOKL.GP4

IGESMAP.GP4

SGMLMAP.GP4

TP[G1-GP4

TOOLS

DECOMPG4

DECOMP.C

DECOMPG4.DOC

DECOMPG4.DSK

DECOMPG4.EXE

DECOMPG4.PRJ

DECOMPRE.C

DECOMPRE.H

READ.C

TREE.C

WRITE.C

VALIDG4

K500TST2.CAL

K7000TST.CAL

KSCOTTL2.CAL

MCVALG4.EXE

SC715301.CAL

VALIDG4.EXE

VALIDG4.MAK

SGML

YEARSGML.DOC

DATA SET

AMBIG.SGM

BASICDOC.SGM

CLEAN

COMMENT.SGM

CONREF.SGM

DEFAULT.SGM

ECKKIHARDT.GML

ENDLESS.SGM

ERREXIT.SGM

TOOLS

ARCSGML

ARCVM2.EXE

SGML.MSG

VM2.EXE

VM2HELP.DOC

INCONTEX

IC.ZIP

SGML

CONCRETE.SYN

LICENSE.DOC

PARSE.BAT

PARSEI.BAT

PARSEI.BAT

README.DOC

SGMLS.DOC

SGMLS.EXE

SGMLS.ZIP

SOURCE

ACTION.H

ADL.H

AMBIG.C

APPL.H

CHANGE01.XFX

CMS.CFG

CONFIG02.XFX

CONTEXT.C

CONTEXT.H

DOS.CFG

DOSPROC.C

EBCDIC.C

EBCDIC.H

ENTGEN.C

ENTITY.H

ERROR.H

ETYPE.H

 

 

APPENDIX A:  MULTIMEDIA ORIENTED REPOSITORY ENVIRONMENT (MORE)

      

 

MORE - Multimedia Oriented Repository Environment

Version 1.0

As increasing quantities of information are made part of the World Wide Web, it will be increasingly difficult for Web administrators to provide effective access to that information. Support for meta-information concerning web-accessible artifacts will be necessary, particularly when there are large numbers of such artifacts. The Repository Based Software Engineering (RBSE) project, funded by NASA, addresses just such a scenario, a public repository of thousands of software engineering information and application source artifacts with over one thousand remote users.

The RBSE research and development group, seeking to free the project from a monolithic architecture based upon X-Windows, has developed a new repository - MORE, the Multimedia Oriented Repository Environment. MORE was designed as a set of application programs (more specifically a set of CGI executables) that operate in conjunction with a stock httpd server to provide access to a relational database of meta-data. The entire MORE interface, client browsing and search, repository definition, data entry and other administrative functions, is provided through stock Web clients. (We currently use X-Mosaic, WinMosaic and Lynx for most of our interaction.) MORE provides separate hierarchies of meta-classes and collections and will support controlled access to proprietary collections through the definition of user groups. With the single exception of the system front page, the entire user interface is accomplished as dynamically generated HTML.

MORE Documentation

Overview.

Information for End Users.

General.

Browsing.

Searching.

Other MORE Documentation.

Other General Documentation.

Information for Librarians.

General.

Classes.

Collections.

Groups.

Users.

Librarians.

The Admin Page.

Enumerations.

Synonyms.

Field Types.

Assets.

Archives.

Other General Documentation.

Information for Installers.

Dependencies.

Installation Instructions.

After You Have Compiled.

Bug Reports and Comments.

Glossary.

MORE Overview

Version 1.0

The following paper gives an overview of the MORE system, which is a meta-data based repository structure using the World Wide Web, the Common Gateway Interface, and Mosaic or a Mosaic-like browser as its sole user interface. The CALS Test Tool Reuse Repository uses the MORE system as its core building block. Around MORE, we constructed the basic framework for the repository and embedded hyperlinks to existing services and digital documents. This was consistent with our baseline philosophy of not duplicating services and products that already existed and of putting into practice the lessons learned in the "reuse" portion of our study.

The Preliminary Test Tool Repository, which was demonstrated at the CALS Open House on October 4-6, 1994, in Fairmont, WV, used MOREplus, the commercial version of MORE, which is supported by MountainNet, Inc., a small West Virginia technology company. Since that time we have acquired the public domain version of MORE from NASA and are familiarizing our personnel with its features. Any future expansion of or enhancement to the Repository can be handled in-house.

Following the overview is a copy of the digital MORE manual. It was felt that a fuller understanding of the MORE system would enable the reader to better appreciate the capabilities and potential of the Test Tool Repository. Although the current implementation of the Repository accesses an Oracle database, we are by no means restricted to any particular vendor. MountainNet is presently developing a Sybase interface to MORE and has scoped the effort at two man-weeks. The OSD CALS IWSDB Project team is evaluating public domain relational database programs as potential alternatives to the commercial packages.

Integrating Structured Databases Into the Web

The MORE System

David Eichmann†    Terry McGregor‡    Dann Danley‡

†University of Houston - Clear Lake

2700 Bay Area Boulevard, Houston, Texas 77058 U.S.A.

eichmann@rbse.jsc.nasa.gov

‡I-NET, Inc.

1020 Bay Area Boulevard, Houston, Texas 77058 U.S.A.

{tmcgrego,ddanley}@rbse.jsc.nasa.gov


Abstract

Administering large quantities of information will be an increasing problem as the World Wide Web grows in size and popularity. The MORE system is a meta-data based repository employing Mosaic and the Web as its sole user interface. We describe here our design and implementation experience in migrating a repository system onto the Web. A demonstration instance of MORE is accessible at "http://rbse.jsc.nasa.gov:81/DEMO/". This paper was presented at the First International Conference on the World Wide Web, Geneva, Switzerland, May 25-29, 1994. The paper is also available in A4 PostScript (http://rbse.jsc.nasa.gov/eichmann/www94/MORE/MORE_A4.ps) and 8.5"x11" PostScript (http://rbse.jsc.nasa.gov/eichmann/www94/MORE/MORE.ps).

Introduction

As increasing quantities of information are made part of the World Wide Web, it will be increasingly difficult for Web administrators to provide effective access to that information. Support for meta-information concerning web-accessible artifacts will be necessary, particularly when there are large numbers of such artifacts. The Repository Based Software Engineering (RBSE) project addresses just such a scenario, a public repository of thousands of software engineering information and application source artifacts with over one thousand remote users.

The RBSE research and development group, seeking to free the project from a monolithic architecture based upon X-Windows, has developed a new repository:  MORE, the Multimedia Oriented Repository Environment. MORE was designed as a set of application programs, more specifically, a set of CGI executables that operate in conjunction with a stock "httpd" server to provide access to a relational database of meta-data. The entire MORE interface (client browsing and search, repository definition, data entry, and other administrative functions) is provided through stock Web clients. (We currently use X-Mosaic, WinMosaic, and Lynx for most of our interaction.) MORE provides separate hierarchies of meta-classes and collections and support for controlled access to proprietary collections through the definition of user groups. With the single exception of the system front page, the entire user interface is accomplished as dynamically generated Hypertext Markup Language (HTML).

The Repository Based Software Engineering Project

The Repository Based Software Engineering (RBSE) project is a research and development program whose mission is to provide a technology transfer mechanism to improve NASA's software capability. RBSE is sponsored by the NASA Technology Utilization Division and is administered by NASA's Johnson Space Center and the Research Institute for Computing and Information Systems (RICIS), a part of the University of Houston - Clear Lake. The purpose of RBSE is the support and adoption of software reuse through repository-based software engineering in targeted sectors of industry, government, and academia. The project consists of two principal activities:  research into repositories and related issues, and operation of a public facility. The RBSE research and development group is active in a number of areas:

Repository technology.

Internet discovery.

Collaboration packaging.

Reengineering process modeling.

ION - the Intermediate Object Notation.

Interface slicing.

Reusability metrics.

Automated classification of assets.

The RBSE project contracts with MountainNet, Inc., a small technology firm, to operate the AdaNET repository as its public facility. The AdaNET Repository contains a comprehensive collection of information about all aspects of software engineering and the software development life cycle, as well as an extensive collection of public domain software with related documentation provided to support software development efforts. In addition to software and related documentation contained in the reuse library, AdaNET contains information related to conferences, tools and environments, publications, and references. The AdaNET Repository is available to the general public; however, user registration is required.

Our Previous System

RBSE's previous repository mechanism, known as ASV3, consisted of a monolithic X-Windows application that interacted with a relational database, as shown in Figure A-1. The data model provided for an inheritance hierarchy of asset meta-classes with asset attributes chosen by librarians as needed. A separate hierarchy of collections allowed for the clustering of heterogeneous sets of assets, and support for related collections. Search mechanisms included boolean expressions, pattern templates, and relevance-feedback natural language. Browsing of both class and collection hierarchies was supported. The resulting client interface proved to be quite powerful and adaptable to a broad variety of user expertise.

The RBSE public facility currently has more than a thousand subscribers spread broadly across industry, academia, and government. Access is gained by logging in to the repository host and executing the repository software. Both ASCII and X-Windows interfaces are provided; previous versions of the system were entirely ASCII-based and were used predominantly over dial-up connections, but more than 80% of current user sessions arrive over direct Internet connections, largely employing the X-Windows interface.

Performance in our local area network is quite adequate, but geography has created problems for remote user interaction with ASV3. Many clients are quite distant from the server, and the responsiveness of ASV3 suffers from network delays.

Furthermore, while the representation and search mechanisms in the current system are rich, the system has definite limitations that we sought to overcome:


Figure A-1:  Previous RBSE Repository Architecture


Users must assess deposits with little support beyond group classification and text display of the actual source code (and perhaps a user manual if the author went to the effort of creating one).

Some assets have over 100 compilation units, and providing a sense of asset system structure was limited by the nature of the interface. Addition of new interfaces promised a massive retesting of system stability.

Most error reports against ASV3 trace back to the intricacies of X-Windows, not to the particular semantics of the application or to database interaction.

The system used a single table, made up of fifty attributes, each an eighty-character string, into which all asset metadata was placed. These limits were hard-coded into the system.

All users, regardless of their local environment, were presented with the same repository model; the schema definition did not support the definition of access limitations. Users either were not allowed to access the system or could access all production artifacts.

The monolithic architecture required substantial installation effort, and was dependent upon a single DBMS (Oracle). Interaction with the database was spread throughout the application. Installing multiple instances of the repository on the same server required substantial replication of system resources.

The MORE Architecture

Building MORE involved a complete redesign of the repository to support the following goals:

An adaptable group definition mechanism for managing access to proprietary sub-collections of assets.

Optimization in the storage of metadata concerning assets; each class definition now has a corresponding relation, with each class attribute directly mapping to a relation attribute of the same name and type.

The World Wide Web, and Mosaic in particular, as the sole user interface (including all administrative access).

Visualization mechanisms using HTTP ISMAP protocols.

Integration of other Internet resources into our user interface through the use of URLs in repository data.

The resulting architecture, shown in Figure A-2, supports repository semantics without the overhead of a low-level X session across the Internet. The redesign has substantially reduced and simplified the overall system structure as well.


Figure A-2:  The MORE Architecture

The Mosaic/World Wide Web architecture has altered our perception of user interaction. Measuring user activity previously involved session connect hours. Now, users connect to the system through sequences of distinct transactions, forced by the stateless interaction with the server. This changes not only how we report user activity; it also means accepting that, with URL references as our primary semantic threading of the system, we do not know whether a user actually follows a URL unless it points back at us. URLs followed from a client result in direct interaction with that other server (which, of course, also frees us from having to play intermediary). This heterogeneity supports access to WAIS, WWW, Gopher, and other servers without our having to concern ourselves with anything other than the composition of a correct anchor. This promises to have great impact on the current efforts in standardization of repository interoperation.

Users can access the system at arbitrary entry points by storing URLs through their client software. The repository system is now modular, with execution split between the client (e.g., Mosaic or Lynx) and the server referenced by the current URL. With the exception of the home page, all HTML presented to the client software is generated dynamically by the glue routines. The modular architecture also easily supports a variety of interface models for customization of the appearance of the system, because a typical glue routine execution thread is a couple of pages of code. We have created a compilation dependency graph browser as an example of the type of visual interfaces that we plan to incorporate.
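As an illustration of the kind of markup a glue routine produces, the following Python sketch builds a collection page from data that would normally be fetched from the metadata database. It is not taken from the MORE source, which is written in C; the function name, the page layout, and the /cgi-bin/browse URLs are assumptions made for this example.

```python
from html import escape

def render_collection_page(title, description, subcollections, assets):
    """Return the HTML for one collection page.

    subcollections and assets are lists of (label, url) pairs that stand in
    for rows a real glue routine would read from the metadata database.
    """
    def link_list(items):
        if not items:
            return "<p>None</p>"
        rows = [
            '<li><a href="{}">{}</a></li>'.format(escape(url, quote=True), escape(label))
            for label, url in items
        ]
        return "<ul>\n" + "\n".join(rows) + "\n</ul>"

    return "\n".join([
        "<html><head><title>{}</title></head><body>".format(escape(title)),
        "<h1>Current Collection: {}</h1>".format(escape(title)),
        "<p>Description: {}</p>".format(escape(description)),
        "<h2>Sub-Collections</h2>",
        link_list(subcollections),
        "<h2>Assets</h2>",
        link_list(assets),
        "</body></html>",
    ])

# Hypothetical data modeled on the Testing Tools collection in Figure 6.3-7.
page = render_collection_page(
    "Testing Tools",
    "Collection of tools and information related to test and validation of CALS Data.",
    [("CGM", "/cgi-bin/browse?collection=CGM"),
     ("SGML", "/cgi-bin/browse?collection=SGML")],
    [],
)
```

Because every page is produced by a short routine of this kind, changing the appearance of the system means changing only the routine that emits the markup.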

MORE is now much more adaptable to multiple platforms. Mosaic currently runs on PCs, Macs and a number of UNIX platforms, and the only requirement for Web client programs is that they support HTML for client access (HTML+ for the search mechanisms) and HTML+ for librarian access. The HTML+ requirement is more specifically support for forms. HTTP daemons are available for numerous platforms. The only requirement here is that the daemon support execution of programs and user authentication (this avoids prompting for authentication with every repository interaction during a transaction sequence).

 

Table A-1:  A Comparison of System Sizes

System   Subsystem     Source Lines of Code
ASV3     application   38,468
         library       10,315
         other         13,578
         total         62,361
MORE     application   12,640
         library       16,184
         other          1,264
         total         30,088

 

We also have a modular DBMS interface as a secondary effect of the redesign. We are using Oracle v.6 for MORE 1.0 and plan Oracle v.7 and Postgres support in MORE 1.1; other SQL-compliant engines are easily added through the separation provided by the database interface layer. Reporting mechanisms are now discrete from the repository itself. The default information provided by the http log files is sufficient for the majority of our needs and has freed us from the need to track user activity within the repository itself, as was necessary in ASV3.
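Because reporting relies on the server log, user activity can be summarized by a small script that runs outside the repository. The Python sketch below assumes the log is written in the Common Log Format used by NCSA httpd and counts requests per authenticated user. The host names, user names, and sample lines are invented for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident authuser [date] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<when>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)$'
)

def requests_per_user(lines):
    """Count logged requests for each authenticated user name."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # A user field of "-" means the request was not authenticated.
        if match and match.group("user") != "-":
            counts[match.group("user")] += 1
    return counts

# Invented sample entries in Common Log Format.
SAMPLE_LOG = [
    'ws1.example.org - alice [02/Dec/1994:10:15:00 -0500] "GET /cgi-bin/browse?collection=CGM HTTP/1.0" 200 2048',
    'ws2.example.org - bob [02/Dec/1994:10:16:12 -0500] "GET /cgi-bin/search HTTP/1.0" 200 512',
    'ws1.example.org - alice [02/Dec/1994:10:17:40 -0500] "GET /cgi-bin/view?asset_id=42 HTTP/1.0" 200 1024',
    'ws3.example.org - - [02/Dec/1994:10:18:03 -0500] "GET /index.html HTTP/1.0" 200 300',
]

activity = requests_per_user(SAMPLE_LOG)
```

The same pattern extends to counting requests per collection or per day by grouping on a different field of the match.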

MORE is substantially smaller than ASV3, as shown in Table A-1. The dramatic reduction in the size of the application code is due primarily to the replacement of complex X-Windows code with printf C calls to mark up data into HTML, and in small part to migration of database specific code into the library. The remaining growth in the library is due to the additional database activity to support group definition and manipulation.
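The figures in Table A-1 can be checked with a few lines of arithmetic. The Python sketch below recomputes the totals and the relative change in each subsystem from the published line counts; the variable and function names are ours.

```python
# Source lines of code per subsystem, copied from Table A-1.
ASV3 = {"application": 38468, "library": 10315, "other": 13578}
MORE = {"application": 12640, "library": 16184, "other": 1264}

def percent_change(old, new):
    """Relative change from old to new, in percent (negative means smaller)."""
    return (new - old) / old * 100.0

asv3_total = sum(ASV3.values())
more_total = sum(MORE.values())

# Change for each subsystem and for the system as a whole.
changes = {name: percent_change(ASV3[name], MORE[name]) for name in ASV3}
changes["total"] = percent_change(asv3_total, more_total)

for name, change in changes.items():
    print("{:12s} {:+.1f}%".format(name, change))
```

By this calculation MORE is roughly 52 percent smaller than ASV3 overall: the application code shrank by about 67 percent while the library grew by about 57 percent.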

Metadata and Classification in MORE

MORE is a meta-data based repository; the information stored in its underlying database is not the artifact itself, but rather information concerning the artifact, which is stored using other mechanisms (the file system, another database, or another software package such as a configuration management or CASE tool). MORE supports two distinct representation mechanisms:  a class definition hierarchy, allowing homogeneous organization of information, and a collection hierarchy, allowing a mix of homogeneous and heterogeneous information.

The class definition hierarchy is single inheritance, with a base class that is customizable at installation time through the database definition scripts (no software changes are required). The semantics of the system require that the base class contain at least an asset id, title, keywords, and abstract, with additional fields added as required. Further definition of the class hierarchy is then carried out completely through the librarian interface, with the database interface generating the calls to the DBMS to dynamically create and destroy classes and their corresponding relations as necessary.
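A minimal sketch of how a single-inheritance class definition might be turned into a relation is shown below in Python. SQLite stands in for Oracle so the example is self-contained. The class and field names are hypothetical, and the choice to repeat inherited fields in each class's table is an assumption; the paper does not spell out that detail. Keywords and abstracts are shown as plain columns here, whereas MORE stores repeating fields in separate list item tables.

```python
import sqlite3

# Minimum fields the base class must carry, per the description above.
BASE_FIELDS = [("asset_id", "INTEGER PRIMARY KEY"), ("title", "TEXT"),
               ("keywords", "TEXT"), ("abstract", "TEXT")]

class AssetClass:
    """One node in a single-inheritance class definition hierarchy."""

    def __init__(self, name, own_fields=(), parent=None):
        self.name = name
        self.own_fields = list(own_fields)  # fields declared beyond the parent
        self.parent = parent

    def all_fields(self):
        inherited = self.parent.all_fields() if self.parent else []
        return inherited + self.own_fields

    def create_table_sql(self):
        # Names are trusted here; a real librarian interface would validate them.
        columns = ", ".join("{} {}".format(n, t) for n, t in self.all_fields())
        return "CREATE TABLE {} ({})".format(self.name, columns)

base = AssetClass("asset", BASE_FIELDS)
document = AssetClass("document", [("format", "TEXT"), ("url", "TEXT")], parent=base)

conn = sqlite3.connect(":memory:")
conn.execute(base.create_table_sql())
conn.execute(document.create_table_sql())
conn.execute(
    "INSERT INTO document (title, format, url) VALUES (?, ?, ?)",
    ("IWSDB Strategy Paper", "MS Word", "IWSDB.DOC"),
)
columns = [row[1] for row in conn.execute("PRAGMA table_info(document)")]
stored = conn.execute("SELECT asset_id, title FROM document").fetchone()
```

Dropping a class would issue the matching DROP TABLE statement, which is how classes can be created and destroyed at run time without software changes.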

The collection hierarchy supports the aggregation of assets without respect to their defining class. Any given collection can contain a set of assets drawn from any number of classes, as well as sets of subcollections and related collections. Any asset will always be a member of at least one collection in the hierarchy, but can be a member of as many collections as is appropriate, at any level in the hierarchy. Furthermore, each collection can have associated with it one or more groups which are authorized to access the assets and subcollections making up the collection. Groups in turn are made up of sets of users and other groups, all defined through the librarian interface. Users not transitively a member of a designated group for a given collection will never see the collection (or its contents) through any of the browser or search mechanisms.
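The transitive membership rule can be sketched as a recursive lookup. The Python example below is an illustration only: the group, user, and collection names are invented, and treating a collection with no associated groups as visible to everyone is an assumption of this sketch.

```python
def is_member(user, group, members, _seen=None):
    """True if user belongs to group directly or through nested groups."""
    seen = _seen if _seen is not None else set()
    if group in seen:  # guard against cyclic group definitions
        return False
    seen.add(group)
    for member in members.get(group, ()):
        if member == user:
            return True
        # A member that is itself a group is searched recursively.
        if member in members and is_member(user, member, members, seen):
            return True
    return False

def visible_collections(user, collection_groups, members):
    """Names of the collections the user is allowed to see, sorted."""
    return sorted(
        name for name, groups in collection_groups.items()
        if not groups or any(is_member(user, g, members) for g in groups)
    )

# Hypothetical groups: "contractors" is nested inside "iwsdb-team".
members = {
    "iwsdb-team": ["alice", "contractors"],
    "contractors": ["bob"],
    "librarians": ["carol"],
}
collection_groups = {
    "Testing Tools": [],            # assumed public: no groups attached
    "Task_1": ["iwsdb-team"],
    "Admin Notes": ["librarians"],
}

bob_sees = visible_collections("bob", collection_groups, members)
dave_sees = visible_collections("dave", collection_groups, members)
```

Here bob reaches Task_1 only through the nested contractors group, while dave, who belongs to no group, sees only the collection without access restrictions.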

The related collection mechanism is unary; a given collection can refer to another collection without the referenced collection being required to reciprocate. This was a conscious design choice, as we wish to support work groups referencing more public collections without revealing the contents (or existence) of their own collections to the organization or public at large.

Assets, as mentioned earlier, are characterized by their metadata, which normally includes an address, composed of a hypertext anchor (a URL and a label) that provides a clickable path to the asset. A special case involves assets that are composites, made up of a number of distinct artifacts. We organize these into directories in a server file space (usually the same server as MORE is using), one asset per directory and one artifact per file. The URL is then a path from the server root directory to the asset's directory. This results in a list of files marked up as links.
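For a composite asset, the list of links can be produced by reading the asset's directory. The Python sketch below shows one way to do this; the function name and the /assets/raster/validg4 URL prefix are assumptions for illustration, and in practice the httpd server's own directory listing may serve the same purpose.

```python
import os
import tempfile
from html import escape
from urllib.parse import quote

def list_artifacts(asset_dir, base_url):
    """Mark up each file in an asset directory as a hypertext link."""
    entries = sorted(
        name for name in os.listdir(asset_dir)
        if os.path.isfile(os.path.join(asset_dir, name))
    )
    items = [
        '<li><a href="{}/{}">{}</a></li>'.format(
            base_url.rstrip("/"), quote(name), escape(name))
        for name in entries
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

# Build a throwaway directory holding one asset with two artifacts.
with tempfile.TemporaryDirectory() as asset_dir:
    for name in ("VALIDG4.MAK", "VALIDG4.EXE"):
        open(os.path.join(asset_dir, name), "w").close()
    listing = list_artifacts(asset_dir, "/assets/raster/validg4")
```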

The Database Schema

Figure A-3 shows the schema for MORE. As mentioned earlier, much of the structure defined for asset metadata is not static, but rather generated dynamically by librarians through the MORE interface. The two clouds in the E-R diagram denote where this occurs: the asset tables and the associated list item tables that hold repeating field data (such as keyword lists and abstract text lines). Each asset class has a tuple in the asset class table, tuples in the class fields table corresponding to each field that the class declares beyond its parent in the inheritance hierarchy, and a table that stores the tuples containing the asset metadata. Class fields of type enumeration also have entries in the enumeration values table corresponding to the values making up the enumeration. Enumerations are defined through the existence of one or more pairs of enumerations and values in the enumeration values table, allowing multiple class fields to be defined that share a common enumeration type.

The collection, group, and user mechanisms described in the previous section are represented through corresponding tables, with the usual implementation of many-to-many relationships as pairs of database keys. Note that both users and groups can be members of groups, allowing for shorthand inclusion and removal of collection access for entire communities of users.

Librarians are presumed to be users, with access privileges defined as bit masks. Privileges operate at a very fine grain, supporting, for instance, specific allowance or disallowance of asset creation, modification, and deletion. The administrative interface program matches the librarian's identity against the defined list of capabilities and presents hypertext links to only those administrative functions that their mask permits.
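The bit-mask approach can be illustrated with a short Python sketch. The privilege names, bit assignments, and link labels below are hypothetical; the paper does not list the actual masks used by MORE.

```python
from enum import IntFlag

class Privilege(IntFlag):
    """Hypothetical privilege bits; one bit per administrative capability."""
    CREATE_ASSET = 1
    MODIFY_ASSET = 2
    DELETE_ASSET = 4
    DEFINE_CLASS = 8
    MANAGE_GROUPS = 16

# Each administrative link is shown only if its bit is set in the mask.
ADMIN_LINKS = [
    (Privilege.CREATE_ASSET, "Add an asset"),
    (Privilege.MODIFY_ASSET, "Modify an asset"),
    (Privilege.DELETE_ASSET, "Delete an asset"),
    (Privilege.DEFINE_CLASS, "Define a class"),
    (Privilege.MANAGE_GROUPS, "Manage groups"),
]

def admin_menu(mask):
    """Return the link labels that a librarian's privilege mask permits."""
    granted = Privilege(mask)
    return [label for needed, label in ADMIN_LINKS
            if (needed & granted) == needed]

# A librarian who may create and modify assets but not delete them.
editor_menu = admin_menu(Privilege.CREATE_ASSET | Privilege.MODIFY_ASSET)
```

Storing the mask as a single integer column keeps the user table compact, and adding a new capability only requires assigning the next unused bit.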

 


Figure A-3:  The MORE Schema

 

The two decoupled tables, unique words and synonyms, are used for natural language search against the metadata. When an asset is added to a collection, the text of its various fields is merged into a list of unique words, mapped against a stop list to remove common words and then added to the unique words table. (Note that we currently include terms that appear only in the metadata, and not in the asset itself.) We plan a separate mechanism to support full-text indexing of assets. When a natural language search is run, the query executes against this table and results, ordered by relevance, are presented as hyperlinks back to the asset's metadata. The natural language search page contains a check box for indication that the unique words should be augmented with any synonyms that appear for them in the synonyms table.
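The indexing and search steps described above can be sketched in Python as follows. This is a simplified illustration, not the MORE implementation: the stop list, the sample metadata, and the synonym table are invented, and ranking by the number of matching query terms is an assumed stand-in for the relevance ordering, which the paper does not define.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "an", "and", "for", "of", "the", "to"}  # illustrative stop list

def unique_words(text):
    """Lower-cased words of the text with stop-list words removed."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS}

def build_index(assets):
    """Map each unique word to the ids of assets whose metadata contains it."""
    index = {}
    for asset_id, fields in assets.items():
        for word in unique_words(" ".join(fields)):
            index.setdefault(word, set()).add(asset_id)
    return index

def search(query, index, synonyms=None, use_synonyms=False):
    """Return asset ids ordered by how many query terms they match."""
    terms = unique_words(query)
    if use_synonyms and synonyms:
        for term in list(terms):
            terms |= set(synonyms.get(term, ()))
    scores = Counter()
    for term in terms:
        for asset_id in index.get(term, ()):
            scores[asset_id] += 1
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [asset_id for asset_id, _ in ranked]

# Invented metadata: each asset is a list of field texts.
assets = {
    "validg4": ["VALIDG4", "raster validation tool", "Validates Group 4 raster files"],
    "decompg4": ["DECOMPG4", "raster decompression", "Decompresses Group 4 raster images"],
    "sgmls": ["SGMLS", "sgml parser", "Parses and validates SGML documents"],
}
synonyms = {"picture": ["raster", "images"]}

index = build_index(assets)
plain = search("raster validation", index)
expanded = search("picture", index, synonyms, use_synonyms=True)
```

Without the synonym option, a query for "picture" finds nothing in this sample, because the word never appears in the metadata; with it, the query is widened to "raster" and "images" before the lookup.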

Lessons Learned in Reengineering a Repository for the Web

The architecture of the ASV3 system was split between the X/Motif interface, the semantics layer, and the database interface layer. The X/Motif interface and semantics layers of the design composed the ASV3 user interface. The database layer provided an Application Programming Interface (API) to the ASV3 repository. The two interfaces utilized the same API and were dependent on state information for each interaction with the user. The design of the database API was driven in large part by the data requirements of the X/Motif interface.

We became aware of the World Wide Web and the NCSA Mosaic viewer in mid 1993. Over the next few months, we discussed the feasibility of using Mosaic as a display mechanism for the new RBSE library system. It was decided in November of 1993 to develop a proof of concept program which would interface with the ASV3 repository through the database API and to attempt to display the information via Mosaic. The learning curve for developing script programs for the generation of dynamic HTML pages was short. The first program generated an HTML page that contained a form with a scrolled list of collections retrieved from the repository through the database API, proving it could be done at least at the user level. One week later, a fully functional interface to the ASV3 repository had been constructed which allowed browsing, searching, and viewing of assets in the repository. We then held a lessons-learned session and concluded that:

We needed to be able to pass parameters between the dynamically generated HTML forms and the programs written to access the database. The database API needed key field information that could not be stored anywhere else but the form action anchor.

Changing from a state environment to the stateless environment of HTML would require a new architectural design for the system. The system would change from one monolithic program which required users to maintain a sustained session on the server to a series of short disjunct sessions lasting only long enough to generate the next HTML page. Developing complex interface code would no longer be necessary.

Asset files were no longer required to be on the server, but could be anywhere on the Internet that was accessible through a hypertext link.

Support for download and viewing capabilities would be handled by WWW client applications rather than by the repository software.

There would no longer be a single entry point into the system. The user would have the capability to drive the system to information of interest and save the URL allowing entry back into the system at the same point later.
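The first and last lessons above, carrying key fields in the anchor and letting a saved URL serve as an entry point, can be illustrated with a short Python sketch. The script path /cgi-bin/view_asset and the parameter names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def action_url(script, **keys):
    """Encode the database key fields into the URL of the next request."""
    return "{}?{}".format(script, urlencode(sorted(keys.items())))

def read_keys(url):
    """Recover the key fields on the server side; no session state is needed."""
    query = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in query.items()}

# The page generator writes this URL into a link or form action ...
url = action_url("/cgi-bin/view_asset", collection="Testing Tools", asset_id=42)

# ... and the program that handles the next request reads the keys back.
keys = read_keys(url)
```

Because everything the server needs travels in the URL, a user who bookmarks it can return directly to the same asset in a later session.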

It was clear that the project should change direction and pursue development of the new system using WWW client applications as its interface. The MORE database API was designed and built using an object-oriented approach. Each table in the repository has a set of operations that support data manipulation. The API was developed using C and Oracle Pro*C. The API supports data manipulation at the tuple level and above. The API was greatly influenced by the architecture of the system. The kinds of state information available in ASV3 were not available in the new design. API functions were written to support the retrieval of information previously available as state data from the interface.

The redesign and development of the new database API were also driven by the move to a stateless environment. Previous assumptions about available state information were invalid, and a new interface was required. The semantic design of the repository was also changed as features which were previously supported by the system would now be handled by the WWW client applications.

One of the temptations of working in a stateless environment is to create one executable for every page that needs to be created. In the case of the MORE system, this approach would have resulted in producing almost 150 programs, each of which was 500K bytes in size. (The large size is due in large part to linking with the Oracle interface libraries.) The design of the system is organized around functional systems and sub-systems, with most of the programs in support of repository administration. Only one-third of all the programs in the system support browsing, searching, and viewing. The other two-thirds support repository administration.
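One way to avoid an executable per page is to let a single program serve several related pages and select among them at run time. The Python sketch below illustrates the idea with a dispatch table keyed on the extra path information that CGI passes in PATH_INFO. The page names and handler functions are hypothetical, and the paper does not say that MORE dispatches in exactly this way.

```python
from html import escape

def browse_page(params):
    name = params.get("collection", "OSD CALS Repository")
    return "<h1>Browsing {}</h1>".format(escape(name))

def search_page(params):
    return "<h1>Search results for {}</h1>".format(escape(params.get("query", "")))

def view_page(params):
    return "<h1>Asset {}</h1>".format(escape(params.get("asset_id", "unknown")))

# One program serves a whole functional subsystem instead of a single page.
PAGES = {"browse": browse_page, "search": search_page, "view": view_page}

def handle_request(path_info, params):
    """Pick a page generator from the extra path that follows the program name."""
    page_name = path_info.strip("/").split("/")[0]
    generator = PAGES.get(page_name)
    if generator is None:
        return "<h1>Unknown page</h1>"
    return generator(params)

html = handle_request("/browse", {"collection": "CGM"})
```

Grouping pages this way means the large database interface libraries are linked into a handful of programs rather than into one program per page.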

An interesting side effect of moving from a state environment to the stateless environment of HTML is that the user interface became simpler. The user interface changed from complex multi-function windows to single-function, single-action HTML pages.

Conclusions and Future Work

Our experience with MORE has been an unqualified success. We have been able to develop and deploy a complete revision of our repository in dramatically less time and with substantially fewer (and simpler to fix!) error reports submitted during testing than with our previous system. MORE is comparatively lightweight, currently comprising approximately 12 MB of executables (in debug mode, with no sharing of Oracle libraries) and a database requirement one-quarter of that for ASV3. The complexity of the executables is also substantially reduced, implying that maintenance should be relatively easy.

The system is capable of administering an effectively unlimited number of assets (constrained only by the amount of disk space the database engine can manipulate for the metadata) distributed on an arbitrary number of Web servers scattered about the Internet. This leads to an interesting ability for independent repositories to point at one another without requiring substantial alteration to either system. For example, our demonstration system contains an asset with a hyperlink to COSMIC's front page, an asset with a hyperlink directly to COSMIC's author list, and an asset with a hyperlink directly to the description of a particular software system in COSMIC's catalog. Repositories can share assets and support separate organizations and classification schemes in their local systems.

Our development path for MORE includes:

MORE 1.0:  Core Functionality (completing user testing).

MORE 1.1:  Multiple data engine support [mid '94].

MORE 1.2:  Asset versioning [mid-to-late '94].

MORE 2.0:  Distributed Servers [late '94].

Subcollections spanning servers.

Related collections spanning servers.

Searches spanning servers.

The planned releases of 1.x will result in a rich, flexible, and portable repository mechanism where metadata for a single environment resides on a single server. MORE 2.0 will extend this capability with the ability to have the repository environment seamlessly span an arbitrary number of Web servers. The user's only awareness of this distribution will be the URLs that the client might display.

 

 

References

 

[1]  Andreessen, M., "A Beginner's Guide to HTML," National Center for Supercomputing Applications, Univ. of Illinois, http://www.ncsa.uiuc.edu/demoweb/html-primer.html.

[2]  Andreessen, M., "Graphical Image Map Tutorial," National Center for Supercomputing Applications, Univ. of Illinois, http://wintermute.ncsa.uiuc.edu:8080/map-tutorial/image-maps.html.

[3]  Beck, J. and D. Eichmann, "Program and Interface Slicing for Reverse Engineering," jointly appearing in Working Conference on Reverse Engineering, Baltimore, MD, May 21-23, 1993, pages 54-63 and the International Conference on Software Engineering, Baltimore, MD, May 17-21, 1993, pages 509-518.

[4]  Berners-Lee, T. (ed.), "HyperText Mark-up Language," CERN, http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html.

[5]  Berners-Lee, T., "The World Wide Web Initiative," CERN, http://info.cern.ch/hypertext/WWW/TheProject.html.

[6]  Boetticher, G. and D. Eichmann, "A Neural Network Paradigm for Characterizing Reusable Software," to appear in Proc. of The First Australian Conference on Software Metrics, Sydney, Australia, November 18-19, 1993.

[7]  Boetticher, G., K. Srinivas, and D. Eichmann, "A Neural Net-Based Approach to Software Metrics," Fifth International Conference on Software Engineering and Knowledge Engineering, San Francisco, CA, June 16-18, 1993, pages 271-274.

[8]  Eichmann, D., "The RBSE Spider -- Balancing Effective Search Against Web Load," Proceedings of the First International Conference on the World Wide Web, CERN, Geneva, Switzerland, May 25-27, 1994.

[9]  Gunawardena, S., "Mapping Problem Oriented Notations to the Implementation Oriented Notation (ION)," M. S. Thesis, University of Houston -- Clear Lake, in preparation.

[10]  McCool, R., "The Common Gateway Interface," National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, http://hoohoo.ncsa.uiuc.edu/cgi/.

[11]  McCool, R., "NCSA httpd Overview," National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, http://hoohoo.ncsa.uiuc.edu/docs/Overview.html.

[12]  Mishra, R. R., "ION -- An Implementation Oriented Notation for the Graphical Representation of OO Programs," M. S. Thesis, University of Houston -- Clear Lake, December 1993.

[13]  Montulli, L., "Lynx Users Guide v2.3," The University of Kansas, http://www.cc.ukans.edu/lynx_help/Lynx_users_guide.html.

[14]  NCSA, "Introduction to NCSA Mosaic for Microsoft Windows," National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, http://www.ncsa.uiuc.edu/SDG/Software/WinMosaic/HomePage.html.

[15]  NCSA, "Introduction to NCSA Mosaic for X," National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/Docs/d2-intro.html.

[16]  Perera, D. N., "A Software Reengineering Process for an Object Oriented Environment," University of Houston -- Clear Lake, M. S. Thesis, December 1993.

[17]  Reuse Interoperability Group, "Uniform Data Model," Applied Expertise, Arlington, VA, 1993.

[18]  Stonebraker, M. and L. A. Rowe, "The Design of POSTGRES," Proc. of SIGMOD '86 International Conference on Management of Data, Washington, D.C., May 28-30, 1986, pages 340-355.

This work supported in part by NASA cooperative agreement NCC-9-16, RICIS research activity number RB02.

MORE - End User Information

Version 1.0

General

MORE is a metadata-based repository system; that is, it does not store assets themselves, but rather information about assets. Since MORE uses the World Wide Web as its interface to the Internet, it uses URLs (Uniform Resource Locators) as its means of actually referencing assets. Any hyperlink you find in a page displayed by MORE will either invoke a program to browse or search the database, or reference some asset located somewhere in the World Wide Web. Note that on occasion servers may be down, so a particular asset may be unavailable.
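As a rough illustration of what "metadata-based" means here, the following Python sketch models a metadata record that carries only a description of an asset and the URL at which the asset lives. The field names follow the attributes listed in the librarian documentation (title, address, keywords, abstract); the class itself, the helper function, and the example URL are assumptions and do not come from MORE.

```python
from dataclasses import dataclass, field
from html import escape

@dataclass
class AssetMetadata:
    """Metadata about an asset; the asset itself lives elsewhere on the Web."""
    title: str
    address: str                      # URL of the asset, possibly on another server
    keywords: list = field(default_factory=list)
    abstract: str = ""

def asset_link(meta):
    """Render the hyperlink a page could show for this asset."""
    return f'<a href="{escape(meta.address, quote=True)}">{escape(meta.title)}</a>'

record = AssetMetadata(
    title="Example asset",
    address="http://example.org/assets/report.ps",   # placeholder URL
    keywords=["example"],
)
print(asset_link(record))
```

Following such a link leaves the repository and contacts the server named in the address, which is why an asset can be unavailable while the repository itself is running.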

Browsing

Browsing within MORE involves step-by-step manual navigation through the hierarchies of classes and/or collections defined by a site's librarians. All MORE browsers are usable by any HTML-compliant client program.

There are two specialized browsers for collections: the hierarchical view of all collections and the alphabetical view of all collections. Each of these provides hyperlinks to the general browser for a particular collection.

Collection Browser

Screen Elements

Collection Name

The name of the current collection. Invoking the collection browser from the front page (or with no collection argument) results in display of the root collection for this site.

Description

A brief explanation of the collection provided by the librarian who created it.

Parent Collection

Clicking on this link will take you to this collection's parent. This is needed for moving up the hierarchy when you've traversed a related collection link.

Subcollections

This is a list of all collections below the current collection that you have access to. Each element of the list is a hyperlink; clicking on it will make that collection the current collection.

Related Collections

These are collections at arbitrary points in the collection hierarchy that the librarian for this collection deems to be of interest or to contain related information. Each element of the list is a hyperlink; clicking on it will make that collection the current collection.

Assets

These are the assets that are members of this collection (an asset can appear in multiple collections). Each element of the list is a hyperlink; clicking on it will display the metadata for that asset.
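The screen elements of the collection browser suggest a simple underlying structure: each collection has a parent, subcollections, related collections, and assets. The Python sketch below models that structure; it is an assumption about how the relationships could be represented, not a description of MORE's database schema, and the collection names are invented.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(eq=False)
class Collection:
    name: str
    description: str = ""
    parent: Optional["Collection"] = None
    subcollections: list = field(default_factory=list)
    related: list = field(default_factory=list)
    assets: list = field(default_factory=list)   # asset titles, for brevity

    def add_subcollection(self, child):
        child.parent = self
        self.subcollections.append(child)
        return child

def path_to_root(collection):
    """Follow Parent Collection links upward, as a user would click them."""
    names = []
    while collection is not None:
        names.append(collection.name)
        collection = collection.parent
    return names

root = Collection("Root")
tools = root.add_subcollection(Collection("Testing Tools"))
standards = root.add_subcollection(Collection("Standards"))
parsers = tools.add_subcollection(Collection("Parsers"))
parsers.related.append(standards)       # a link across the hierarchy
parsers.assets.append("Example parser")

print(path_to_root(parsers))
```

A related-collection link can jump anywhere in the hierarchy, so the parent link is the only reliable way back up from wherever the jump lands.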

Collection Alphabetical Browser

The Alphabetical Browser displays the collection set in alphabetical order.

Collection Hierarchy Browser

The Hierarchy Browser displays the collection set in hierarchical order.

Class Browser

Screen Elements

Class Name

The name of the current class. Invoking the class browser from the main page displays the current class with other attributes.

Parent Class

This link takes you to the class's parent. This link is needed for moving up the hierarchy when you have traversed a sub-class link.

Sub-Classes

This is a list of all classes below the current class that you have access to. Each element of the list is a hyperlink; clicking on it will make that class the current class.

Assets

These are the assets that are members of this class. Each element of the list is a hyperlink; clicking on it will display the metadata for that asset.

Searching

Searching within MORE involves specifying particular criteria via a form; the result is a list of matching assets presented as a set of hyperlinks. Because of the form requirements for data entry, MORE search tools are usable only by HTML+ compliant client programs.

Natural Language Search

Natural language search supports relevance feedback against the complete text of the metadata. (We're planning to merge a WAIS-based index of complete textual assets in the near future.)

Screen Elements

Include Subcollections Checkbox

Toggling this box includes the current collection and all of its subcollections in the search.

Include Synonyms Checkbox

Toggling this box expands the scope of the search to include any synonyms defined for words listed in the text window.

Collection List

Click on a collection in this list to limit the search to that collection (and its subcollections if that checkbox is toggled).

Text Window

Include in this field as few or as many words as you wish to have matched against the metadata for assets. Results are ranked by their relevance.

Perform Button

Clicking on this button submits the search to MORE.

Pattern Match Search

The Pattern Match search allows the search of metadata for specified patterns of characters and wild cards.

Screen Elements

Include Subcollections Checkbox

Toggling this box includes the current collection and all of its subcollections in the search.

Include Subclasses Checkbox

Toggling this box includes the current class and its subclasses in the search.

Ignore Case Checkbox

Toggling this box causes the pattern match to ignore the difference between upper and lower case letters.

Collection List

Click on a collection in this list to limit the search to that collection (and its subcollections if that checkbox is toggled).

Class List

Click on a class in this list to limit the search to that class (and its subclasses if that checkbox is toggled).

Create Search Form Button

Clicking on this button submits the search to MORE.
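A pattern match search of this kind can be sketched in a few lines of Python. The wildcard syntax (`*` for any run of characters, `?` for a single character), the case-insensitive option, and the sample records are all assumptions made for illustration; the actual pattern syntax accepted by MORE is not specified here.

```python
import fnmatch

def pattern_search(records, pattern, ignore_case=False):
    """Return titles whose metadata text matches a wildcard pattern.

    '*' matches any run of characters and '?' matches one character;
    this wildcard syntax is an assumption made for the sketch.
    """
    matches = []
    for title, text in records:
        if ignore_case:
            subject, pat = text.lower(), pattern.lower()
        else:
            subject, pat = text, pattern
        # fnmatchcase compares exactly, without platform-specific case folding.
        if fnmatch.fnmatchcase(subject, pat):
            matches.append(title)
    return matches

# Invented sample metadata: (title, text) pairs.
records = [
    ("SGML parser", "Validating SGML parser for CALS documents"),
    ("Raster viewer", "Viewer for CALS raster images"),
]
print(pattern_search(records, "*cals*", ignore_case=True))
print(pattern_search(records, "*cals*"))
```

Restricting the search to a collection or class would amount to filtering `records` before the loop.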

Boolean Search

Still in development.

Other MORE Documentation

Introduction.

Information for Librarians.

Information for Installers.

Other General Documentation

NCSA Mosaic for X Documentation.

Serving Information to the Web.

MORE Librarian Information

Version 1.0

General

The MORE system allows administrative functions such as adding, modifying, deleting, copying, or moving to be performed on the repository. The privilege of performing these library functions, as they are called, is reserved for library administrators. A hyperlink on the MORE page leads to further hyperlinks that display the forms for the library functions.

Classes

A group of objects which have the same format. Some classes might include documents, pictures, software, data and drawings.

Screen Elements

Parent Class

The parent class, chosen from the set of classes, under which a new class is to be added or from which a class is to be deleted.

Inherited Attributes

The class attributes for the parent class selected. The attributes include object_id, title, node_type, format, address, keywords and abstract.

New Class Name

The name of the new class that is to be created.

Class Attributes

If a new class is created then at least one new attribute has to be entered. The attribute elements include Attribute name, Data type, Attribute length, Visible columns, Number of lines, Attribute type and Enumeration type.
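The relationship between inherited attributes and newly entered class attributes can be sketched as follows. The inherited attribute names are those listed above; the class names `asset` and `software` and the extra attribute `language` are invented for the example, and the Python class is only a model of the idea, not MORE's implementation.

```python
class AssetClass:
    """A class in the hierarchy; attributes accumulate from parent to child."""

    def __init__(self, name, attributes, parent=None):
        self.name = name
        self.own_attributes = list(attributes)
        self.parent = parent

    def all_attributes(self):
        """Inherited attributes first, then those defined by this class."""
        inherited = self.parent.all_attributes() if self.parent else []
        return inherited + self.own_attributes

# Attribute names taken from the Inherited Attributes description.
base = AssetClass("asset", ["object_id", "title", "node_type", "format",
                            "address", "keywords", "abstract"])
# 'software' and 'language' are hypothetical.
software = AssetClass("software", ["language"], parent=base)
print(software.all_attributes())
```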

Collections

A group of objects kept together in the repository for a specific reason such as a common purpose, topic, format, user group or source.

Screen Elements

Collection Name

The name of the collection which is to be created.

Description

A brief description of the collection provided by the librarian who created it.

Parent Collection

The parent collection under which the new collection needs to be created or deleted.

Node Class

Represents the stage of development of a repository element. The development stage is restricted to librarians. In the production stage, the elements developed in the development stage are put to the test. Tested elements are open to public view in the archive production stage, and modifications are made in the archive development stage.

Related Collection

The related collection elements contain information that is similar to other collection elements.

Groups

A list of group names which could be given access to the collection.

Groups

A collection of users or groups used for access control.

Screen Elements

Group Name

The name of the group that is to be created, modified or deleted.

Description

A brief comment about the group created by the librarian.

Access Requirements

Any requirements that a member of the group must meet to gain access.

Groups

This is a list of groups that are available.

Users

This is the list of people who can be included in a group.
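Because a group may contain other groups as well as users, deciding whether a user has access means following the nesting. The Python sketch below shows one way to do that; the `Group` class, the function names, and the sample users are assumptions for illustration and say nothing about how MORE stores or checks group membership.

```python
class Group:
    def __init__(self, name, users=(), groups=()):
        self.name = name
        self.users = set(users)
        self.groups = list(groups)

def effective_users(group, _seen=None):
    """Collect every user reachable through nested groups."""
    seen = _seen if _seen is not None else set()
    if group.name in seen:          # guard against groups that contain each other
        return set()
    seen.add(group.name)
    users = set(group.users)
    for child in group.groups:
        users |= effective_users(child, seen)
    return users

def may_access(user, collection_groups):
    """True if the user belongs to any group granted access to a collection."""
    return any(user in effective_users(g) for g in collection_groups)

testers = Group("testers", users={"alice"})
staff = Group("staff", users={"bob"}, groups=[testers])
print(sorted(effective_users(staff)))
print(may_access("carol", [staff]))
```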

Users

A person who uses the repository to browse, search or administer.

Screen Elements

Login ID

The login id of the user who is to be created, modified or deleted.

Full Name

The full name of the user who logs in.

Organization

The organization to which the user belongs.

Address

The address of the organization or user.

Phone #

The phone number of the user.

E-Mail Address

The E-Mail address of the user.

Groups

The group or groups to which the user may belong.

Librarians

A person responsible for the care and management of the library.

Screen Elements

User

Select a user from a set of users who want librarian privileges.

Collection

Select, from the set of collections, the collection for which the user is to be granted these privileges.

The Admin Page

The Admin page provides the administrative functions (add, modify, and delete) for assets, collections, classes, users, groups, enumerations, synonyms, and librarians, together with the archive export and import facility. A sample of the Admin Page is available.

Enumerations

A list of special terms used for a class. It is similar to an enumeration-type. An enumeration value is one particular member of an enumeration.

Screen Elements

Enumeration Name

The name of the enumeration-type for which enumeration values are to be specified.

Enumeration Value

A particular value belonging to the enumeration-type. A single enumeration name can have multiple values.

Synonyms

A synonym is a single word that is defined by a librarian to be a substitute for a single word specified by the user's natural language search.

Screen Elements

Keywords

A term in the repository for which synonyms are to be created, modified or deleted.

Synonyms

A single word or collection of words used to define a keyword.

Field Types

The various field types that are used in the repository representation.

Screen Elements

Character

A character field may contain upper and lower case letters, digits, and a few special characters. There is a limit of 256 characters.

Enumeration

An enumeration type can be format or node_type. A format is usually one of the following:  Encrypt, Group, Postscript, Tiff, Gif, Binary, CALS, Bitmap. A node_type is one of development, production, archive production, or archive development.

List

A list field type can be used to enter multiple values.

Long

The field type long represents the long integer.

Integer

The field type integer has a limit of 99999.

Float

The field type is float and can have a length of 99999.

Date

One to One

One to Many
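The limits stated for the field types lend themselves to a small validation sketch. Only the Character limit of 256 and the Integer limit of 99999 come from the descriptions above; reading 99999 as a bound on magnitude, the lower-case type names, and the function itself are assumptions, and the field types without a stated rule are deliberately left out.

```python
MAX_CHARACTER_LENGTH = 256   # limit stated for the Character field type
MAX_INTEGER_VALUE = 99999    # limit stated for the Integer field type

def validate_field(field_type, value):
    """Return True if value is acceptable for the given field type."""
    if field_type == "character":
        return isinstance(value, str) and len(value) <= MAX_CHARACTER_LENGTH
    if field_type == "integer":
        # Treating 99999 as an upper bound on magnitude is an assumption.
        return (isinstance(value, int) and not isinstance(value, bool)
                and abs(value) <= MAX_INTEGER_VALUE)
    if field_type == "list":
        # A list field accepts multiple values.
        return isinstance(value, (list, tuple))
    raise ValueError(f"field type not covered by this sketch: {field_type}")

print(validate_field("character", "CALS raster viewer"))
print(validate_field("integer", 100000))
```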

Assets

A file accessible through the World Wide Web about which MORE holds a metadata record.

Screen Elements

Asset Collection

A collection of assets under which assets can be created, modified or deleted.

Asset Class

The class to which the asset belongs; a class must be chosen when creating an asset.

Title

The name of the asset that is being created.

Node_Type

Represents the stage of development of a repository element. The development stage is restricted to librarians. In the production stage, the elements developed in the development stage are put to the test. Tested elements are open to public view in the archive production stage, and modifications are made in the archive development stage.

Format

The type of document that can be created. A format is usually one of the following:  Encrypt, Group, Postscript, Tiff, Gif, Binary, CALS, Bitmap.

Address

The address at which the asset can be found is also recorded.

Keywords

The keywords which can be used in connection with this asset.

Abstract

A description of the asset that has been created or modified.

Archives

A place where documents can be placed for public viewing.

Screen Elements

Database Name

The name of the database from which the document is to be imported or to which it is to be exported.

Password

The password to the database.

Output/Import Filename

The name of the file that is to be imported or exported.

MORE Documentation

General Information.

Information for End Users.

Information for Installers.


MORE -- Librarian Menu (v. 1.4)

Assets:

Create Asset

Copy Asset

Modify Asset

Delete Asset

Collections:

Create Collection

Modify Collection

Delete Collection

Asset Class:

Create Asset Class

Modify Asset Class

Delete Asset Class

Users:

Create User

Modify User

Delete User

Groups:

Create Group

Modify Group

Delete Group

Enumerations:

Create Enumeration

Modify Enumeration

Synonyms:

Create Synonym

Modify Synonym

Delete Synonym

Librarians:

Create Librarian

Modify Librarian

Delete Librarian

Archive:

Archive Export

Archive Import

Welcome to UHCL MORE for the World Wide Web

Version 1.0

This README details installation steps.

doc/MORE.ps contains our WWW'94 paper, which gives an overview of the system. More complete information and documentation on UHCL MORE is under construction and will be available on-line, via our server, rbse.jsc.nasa.gov.

Dependencies

UHCL MORE is known to compile on the following platform:

Sun 4 (SunOS 4.1.3 with stock NCSA httpd 1.3, gcc 2.5.8 and Oracle V6.0.36.0.1)

If you have to make nontrivial changes to UHCL MORE to get it to compile on a particular platform, please send a set of context diffs (e.g., "diff -c oldfile newfile") to more@rbse.jsc.nasa.gov.

Installation Instructions

To create an instance of the MORE system, you must first create the database in Oracle, and to do that you must first become the "oracle" user. Either log into the system as oracle or use the switch user (su) command to become the "oracle" user. In both cases you must know the password of the "oracle" user; if you do not, see your system administrator.

After you have become the "oracle" user, run the program "sqldba" by entering the following command:

% sqldba

You will be presented with the following prompt:

SQLDBA>

Now you need to connect to oracle internally. Do that by entering the following command:

SQLDBA> connect internal

Create the table space for the database instance by entering the following command:

SQLDBA> create tablespace <table space name> <datafile> - <size>

An example of this command would be:

SQLDBA> create tablespace elsa elsa_tables - 10M

Create the table space for the database instance index by entering the following command:

SQLDBA> create tablespace <table space name> <datafile> - <size>

An example of this command would be:

SQLDBA> create tablespace elsa elsa_index - 5M

Create a default user for the new database by entering the following command:

SQLDBA> grant connect to <user name> identified by <password>

An example of this command would be:

SQLDBA> grant connect to elsa identified by elsa

Give the new user access to the database table space just created by entering the command:

SQLDBA> grant resource on <table space name> to <user name>

An example of this command would be:

SQLDBA> grant resource on elsa_index to elsa

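Assign the table space as the default table space for the user by entering a command of the following form (this example is for a user named cosmic and a table space named cosmic_tables; substitute your own names):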

SQLDBA> alter user cosmic default tablespace cosmic_tables

Exit sqldba by entering the following command:

SQLDBA> exit

To create the database tables, log into sqlplus using the user name and password specified when you created the default user above.

An example of this command would be:

% sqlplus elsa/elsa

You will be presented with the following prompt:

SQL>

Create MORE's tables using the script Create_tables.sql listed below:

SQL> @Create_tables

Insert initial data into MORE tables using the script initial_data.sql listed below. Modify the script for the appropriate librarian/user names.

SQL> @initial_data

Commit and exit from sqlplus:

SQL> commit;

SQL> quit

Modify the Makefiles in lib and source to reflect:

The server and port.

Oracle sid, home, user id and password.

Whether you want to log librarian activity.

Location of the stop list (NonKeywordFile) for natural language search.

Type "make" and watch the fun.

After You Have Compiled

Type "make install" to copy the binaries into the bin directory.

Copy the gif files from the gif directory into your http server document root directory.

Add the appropriate definitions into access.conf in your httpd configuration directory to allow execution of the binaries and access to the files in the html directory. Here's the definition that we use for one of our instances:

<Directory /usr07/MORE/bin>

AuthType Basic

AuthUserFile /usr/local/etc/httpd_1.0/.password

AuthGroupFile /usr/local/etc/httpd_1.0/.group

AuthName RBSE MORE testbed

<Limit GET>

require user ddanley

require user pshastri

require user tmcgrego

require user eichmann

</Limit>

</Directory>

Add the definition into srm.conf to point the HTTP daemon at the executables that you build. Here's the definition that we use for the same instance shown just above:

ScriptAlias /more /usr07/MORE/bin

Run htpasswd against the AuthUserFile defined in access.conf for each of the users you have listed there.

Modify html/more.html to reflect the new URLs. We request that you retain the CR/DR URLs as they stand, so that we receive submissions as soon as possible. Remember to relink default.html into your directory structure.
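The effect of the ScriptAlias definition shown in the steps above can be pictured as a mapping from URL paths to programs in the bin directory. The Python sketch below is a simplified model of that mapping, not the server's actual algorithm; the alias values are the ones from the srm.conf example, while the function and the program name `browse_collection` are hypothetical.

```python
import posixpath

SCRIPT_ALIAS = ("/more", "/usr07/MORE/bin")   # values from the srm.conf example

def resolve_script(url_path, alias=SCRIPT_ALIAS):
    """Return the file path of the program a URL path refers to, or None."""
    prefix, directory = alias
    if url_path != prefix and not url_path.startswith(prefix + "/"):
        return None                          # URL is outside the aliased area
    remainder = url_path[len(prefix):].lstrip("/")
    program = remainder.split("/", 1)[0]     # anything after it is extra path info
    if program in ("", ".", ".."):
        return None
    return posixpath.join(directory, program)

print(resolve_script("/more/browse_collection"))   # program name is hypothetical
```

Under this model, the URLs placed in html/more.html must begin with the alias prefix for the server to run the corresponding binary.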

Bug Reports and Comments

Bug reports and other comments can be sent to MORE@rbse.jsc.nasa.gov.

If you find UHCL MORE useful or particularly interesting, please also send us a note.

Enjoy,

Dave Eichmann

Repository Based Software Engineering Group

Research Institute for Computing and Information Systems

University of Houston - Clear Lake

2700 Bay Area Blvd., Houston, TX 77058

eichmann@cl.uh.edu or eichmann@rbse.jsc.nasa.gov

MORE Documentation

General Information.

Information for End Users.

Information for Librarians.

MORE Glossary

Version 1.0

Asset

A file accessible through the World Wide Web about which MORE holds a metadata record.

Browsing

Involves step-by-step manual navigation through the hierarchies of classes and/or collections defined by a site's librarian.

Class

A group of objects which have the same format. Some classes might include documents, pictures, software, data and drawings.

Collection

A group of objects kept together in the repository for a specific reason such as a common purpose, topic, format, user group or source.

Collection Hierarchy

A hierarchy of relationships between collections.

Enumeration

A list of special terms used for a class. It is similar to an enumeration-type. An enumeration value is one particular member of an enumeration.

Group

A collection of users and groups used for access control.

Inheritance Hierarchy

The hierarchy by which a class inherits the attributes of the class above it and the class below it inherits its attributes.

Keyword

A term supplied by a librarian to describe an object's content.

Librarian

Person responsible for the care and management of the library.

Metadata

A set of information that provides a complete citation or description of an object.

Object

Member of the library collection. Could be a program source code, data file, binary file, graphics image or document.

Related Collection

A collection that contains information that is of similar topic or associated in any way to another collection.

Relevance Feedback

A method of performing information retrieval where relevant items returned by previous queries are fed back to the system as additional queries.

Repository

A place where all objects and object attributes are stored.

Stemming

Stemming is the reduction of a word to a base form so that the search mechanism can find things like plurals, possessives, etc.

Subcollection

A group of objects defined as a subset of another group, or parent collection.

Subclass

An object class which inherits all the object attributes of its parent class and which may have additional metadata attributes defined.

Synonym

A synonym is a single word that is defined by a librarian to be a substitute for a single word specified by a user's natural language search. Note that both the user's original word and the synonym(s) are reduced to a stemmed form prior to the search.

User

Person who uses the repository to browse, search or administer.
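The Stemming and Synonym entries above, together with the stop list mentioned in the installation notes, describe how a natural language query is prepared before matching. The Python sketch below illustrates that preparation under loud assumptions: the stemmer is deliberately crude, and the stop words and the synonym table are invented stand-ins for the site's NonKeywordFile and librarian-defined synonyms.

```python
STOP_WORDS = {"a", "an", "the", "for", "of"}    # stand-in for the stop list
SYNONYMS = {"program": ["software"]}            # librarian-defined; made up here

def stem(word):
    """Very crude stemmer: strips possessives and simple plurals only."""
    word = word.lower()
    if word.endswith("'s"):
        word = word[:-2]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word

def expand_query(text, include_synonyms=True):
    """Return the set of stemmed terms a search would be run with."""
    terms = set()
    for raw in text.split():
        word = raw.strip(".,;:!?\"()").lower()
        if not word or word in STOP_WORDS:
            continue
        stemmed = stem(word)
        terms.add(stemmed)
        if include_synonyms:
            # Both the original word and its synonyms are stemmed before searching.
            for synonym in SYNONYMS.get(stemmed, []):
                terms.add(stem(synonym))
    return terms

print(sorted(expand_query("Programs for the parser's tests")))
```

The `include_synonyms` flag plays the role of the Include Synonyms checkbox on the natural language search form.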

MORE Documentation

General Information.

Information for End Users.

Information for Librarians.

Information for Installers.


 

 

APPENDIX B:  NCSA MOSAIC

      

 

NCSA Mosaic for Microsoft Windows User's Guide

Copyright, Disclaimer, Trademarks, and Contact Information

Introduction

A Brief History and an Overview.

This Document and Version.

System Requirements.

Other Documentation.

Concepts and Terminology

The World Wide Web (The Web, WWW, and W3).

Client/Server Software.

HyperText Transfer Protocol (HTTP).

Uniform Resource Locators (URL).

HyperText, Hyperlinks, and Hypermedia.

HyperText Markup Language (HTML).

Inline Images and External Viewers.

Hotlists.

Quick Start

Starting the Application.

The NCSA Mosaic Document View Window.

The Toolbar.

Navigation with Hyperlinks and the Menus.

Leaving the Application.

The Document View Window and the Mouse

The Document View Window.

The Title Bar.

The Menu Bar.

The Tool Bar.

Document Title and URL Bar and the NCSA Mosaic Logo.

Document Display Area.

Status Bar.

Moving a Document within the Display Area.

Use of the Mouse and Mouse Buttons in NCSA Mosaic.

Navigating the World Wide Web

Following Hyperlinks.

Hotlists (and User-configurable Menus).

The History List -- Backward and Forward through History.

The Home Page.

The Menus

File Menu.

Edit Menu.

Options Menu.

Navigate Menu.

Annotate Menu.

User-configurable Menus and Hotlists.

Help Menu.

Special Features

Annotations.

Using, Creating, and Editing Hotlists.

Forms.

Window Size and Placement.

Specifying your Home Page.

Aborting Document Transfers.

Searching for a Character String.

Gateways and Proxy Gateways.

Installation and Configuration

Acquiring the Software.

Installing Win32s.

Checking the WinSock DLL.

Configuring NCSA Mosaic.

Installing NCSA Mosaic into the Windows Program Manager.

Executing and Testing NCSA Mosaic.

Installation Trouble-shooting.

Other Configuration Issues.

Creating Simple HTML Documents

Seven Basic HTML Commands.

A Sample HTML Document.

More Information on HTML.

A Brief Guide to URLs

Structure of a URL.

Non-standard Port Numbers.

Partial URLs.

For More Information.

Known Bugs and Bug-like Features

Feedback to NCSA

Bug Reports, Comments, and Suggestions.

Getting Help.

National Center for Supercomputing Applications/mosaic-win@ncsa.uiuc.edu

Copyright, Disclaimer, Trademark, and Contact Information

NCSA Mosaic for Microsoft Windows

Copyright (C) 1993, 1994, Board of Trustees of the University of Illinois

NCSA Mosaic software, both binary and source, (hereafter, Software) is copyrighted by the Board of Trustees of the University of Illinois (UI), and ownership remains with the UI.

The UI grants you (hereafter, Licensee) a license to use the Software for academic, research, and internal business purposes only, without a fee. Licensee may distribute the binary to third parties provided that the copyright notice and this statement appears on all copies and that no charge is associated with such copies.

Licensee may make derivative works. However, if Licensee distributes any derivative work based on or derived from the Software, then Licensee will (1) notify NCSA regarding its distribution of the derivative work, and (2) clearly notify users that such derivative work is a modified version and not the original NCSA Mosaic distributed by the UI.

Any Licensee wishing to make commercial use of the Software must contact the UI, c/o NCSA, to negotiate an appropriate license for such commercial use. Commercial use includes (1) integration of all or part of the source code into a product for sale or license by or on behalf of Licensee to third parties, or (2) distribution of the binary code or source code to third parties that need it to utilize a commercial product sold or licensed by or on behalf of Licensee.

By using or copying this Software, Licensee agrees to abide by the copyright law and all other applicable laws of the U.S. including, but not limited to, export control laws, and the terms of this license. UI shall have the right to terminate this license immediately by written notice upon Licensee's breach of, or non-compliance with, any of its terms. Licensee may be held legally responsible for any copyright infringement that is caused or encouraged by Licensee's failure to abide by the terms of this license.

Disclaimer

UI makes no representation about the suitability of this software for any purpose. It is provided "As Is" without express or implied warranty. The UI shall not be liable for any damages suffered by the users of this software.

Trademarks

AppleTalk is a registered trademark of Apple Computer Inc. Ethernet is a trademark of Xerox Corporation. Microsoft and MS-DOS are registered trademarks and Windows, Windows NT, Win32s are trademarks of Microsoft Corporation. PostScript is a registered trademark of Adobe Systems Inc. Sun is a registered trademark, and Sun Workstation and SunView are trademarks of Sun Microsystems Inc. UNIX is a registered trademark of X/Open. All other brand or product names are trademarks or registered trademarks of their respective companies or organizations.

NCSA Contacts

Mail user feedback, bug reports, questions, and software and manual suggestions to:

Software Development Group

152 Computing Applications Bldg.

605 E. Springfield Avenue

Champaign, IL 61820-5518

Send electronic mail to one of the following:

mosaic-w@ncsa.uiuc.edu

Communications regarding NCSA Mosaic for Microsoft Windows

mosaic@ncsa.uiuc.edu

Other communications regarding NCSA Mosaic

softdev@ncsa.uiuc.edu

Other communications to the Software Development Group

If you want to see more software like NCSA Mosaic, please send us a letter, E-Mail or U.S. mail, telling us what you are doing with the software. We need to know:

1. What you are working on - an abstract of your work would be fine; and

2. How NCSA Mosaic has helped you, for example, by increasing your productivity or allowing you to do things you could not do before.

We encourage you to cite the use of NCSA Mosaic, and any other NCSA software you have used, in your publications. A bibliography of your work would be extremely helpful.

Orders

All NCSA products are available without charge from NCSA's anonymous FTP server:

ftp ftp.ncsa.uiuc.edu

Hardcopy manuals and software disks and tapes can be ordered through the NCSA Technical Resources Catalog. All orders must be prepaid. For a copy of the catalog, contact NCSA Orders by E-Mail at orders@ncsa.uiuc.edu, by phone at (217) 244-4130, or by U.S. mail at:

NCSA Orders

152 Computing Applications Bldg.

605 E. Springfield Ave.

Champaign, IL 61820-5518

Introduction

NCSA Mosaic, a World Wide Web browser, is a networked information discovery and retrieval tool developed by the Software Development Group (SDG) at the National Center for Supercomputing Applications (NCSA) on the campus of the University of Illinois at Urbana-Champaign (UIUC). Those are the facts of the matter, but to understand what NCSA Mosaic can do for you, you have to broaden the horizon a bit.

The World Wide Web (the Web) is a portion of the Internet designed for the dissemination of hypermedia material. But the Web is more than just a computer network; it is that network plus the vast store of information those computers can access. NCSA Mosaic allows the user to browse the information available on the Web much as one would browse the shelves of a research library. The user can coast through material quickly or stop to delve into topics that look particularly interesting. And just like the information in a library, the information on the Web is useful to both the casual reader and the serious scientist or researcher.

NCSA Mosaic is also more than a Web browser; it provides a portal to most of the major server types on the Internet:  HTTP servers (the standard server on the Web), FTP servers, Gopher servers, and WAIS servers. With this capability, a user can access files on virtually any of the major servers on the Internet. Returning to the library analogy, if browsing the Web from a single home page is like browsing a research library, browsing the Internet with NCSA Mosaic, taking full advantage of all its server interface capabilities, is like browsing several of the world's greatest research libraries all at once from your desk.

A Brief History and an Overview

In late 1992 and early 1993, NCSA staff members were looking for a way to make the information on the Internet more accessible to the average computer user. The search for tools that would lend themselves to the sort of graphical, point-and-click interface that has proven so effective over the past decade led eventually to the World Wide Web and HTML.

The World Wide Web is a large-scale networked hypertext information system initially developed at CERN, the European Laboratory for Particle Physics in Geneva, Switzerland, in 1989. The Web was designed to distribute hypermedia information and to take advantage of developing standards for hypertext markup, hypermedia distribution, and information locators:

HTML

HyperText Markup Language, an SGML-based markup language that includes provisions for rich-text formatting, hyperlinks, inline graphics, and external viewers (for data formats that cannot be handled internally by the viewer software).

HTTP

HyperText Transfer Protocol, a fast, lightweight transfer protocol designed for the interactive, networked hypermedia environment.

URL

Uniform Resource Locator, a uniform reference system for locating individual files on virtually any computer system on the Internet.
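As an illustration of the URL definition above, the following minimal sketch splits a URL into the parts a browser needs: the access scheme (which kind of server to contact), the host name, and the path of the file on that host. It uses the Python standard library, which is not part of NCSA Mosaic. The function name describe_url and the www.example.org and gopher.example.org addresses are hypothetical and chosen only for illustration; ftp.ncsa.uiuc.edu is the anonymous FTP host named under Orders.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into scheme, host, and path (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,   # access method, e.g. http, ftp, gopher
        "host": parts.hostname,   # machine that holds the file
        "path": parts.path,       # location of the file on that machine
    }


# Example URLs: the first and third hosts are placeholders, not real servers.
examples = [
    "http://www.example.org/docs/index.html",
    "ftp://ftp.ncsa.uiuc.edu/",
    "gopher://gopher.example.org/",
]

for url in examples:
    info = describe_url(url)
    print(info["scheme"], info["host"], info["path"])
```

The same three-part reading applies to each server type listed in the Introduction; only the scheme at the front of the URL changes.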

The first