Monday, May 12, 2008
Monday, January 14, 2008
WEB-BASED METADATA ADMINISTRATION SYSTEM
Durgendra Man Kayastha
Chief Survey Officer,
NGIIP, Survey Department, Nepal
digimap@wlink.com.np
KEY WORDS: Metadata, Geospatial data search
ABSTRACT: Metadata forms the basis for exploring data. Well-structured and detailed metadata helps users discover whether a dataset exists, judge its suitability for the intended application, and find out how to obtain it. One of the fundamental purposes of establishing a geospatial information infrastructure is to provide an easy means of supporting such data mining and exploration activities. The National Geographic Information Infrastructure Programme, Survey Department (NGIIP/SD), Nepal is working towards the development of a web-based metadata administration system for geospatial data in Nepal. The system is XML-based and built upon open source software technology.
This paper describes the content standards adopted, followed by the system prototype.
1. Introduction
Geospatial data are required by users for many different purposes. Consequently, such data are usually collected with the methods most suitable for the intended application. In general, data collected for one particular application could be useful to scores of other applications. Hence, such datasets should be made available to other users as easily as possible. In practice, data collected and maintained by one agency are often unknown to other agencies, or are difficult to obtain for several reasons. In many cases, even when the data are available, other users have no way of finding out whether data maintained by another agency suit their needs. This generally leads individual agencies to collect and maintain data on their own, probably duplicating data that already exist. Such a situation can be avoided, or the problems minimized, by setting up a metadata system and thus making suitable provision for sharing information among users within the framework of a national geographic information infrastructure. The following sections describe the content of the metadata adopted by the NGIIP/SD and the implementation of the system.
2. Metadata Content
Metadata serves several important purposes, including data browsing, data transfer, and data documentation. Depending on how existing datasets are to be used, metadata can be maintained at several levels of complexity. In its basic form, metadata might consist of a simple listing describing basic information about the available data, whereas more detailed information may be included about individual datasets. The Content Standards for Digital Geospatial Metadata (CSDGM) of the Federal Geographic Data Committee, US, specifies some 334 different metadata elements for a set of geospatial data. The purpose of the content standards is to provide a common set of terminology and definitions for the documentation of these metadata. Information about which metadata elements are mandatory, optional, repeatable, or one of a choice is encoded in the production rules of the CSDGM. The ANZLIC Working Group, on the other hand, proposed a considerably smaller subset as core metadata elements.
The NGIIP/SD metadata contains only 176 elements, mostly taken from the CSDGM/FGDC, with a few taken from ESRI’s adaptation. In addition, the production rules have been modified, for instance by removing certain mandatory elements and by limiting the number of entries. For example, only two addresses are allowed, whereas the FGDC specifications allow an unlimited number of addresses. Similarly, only 10 attributes are allowed in this system, as opposed to unlimited attributes in the CSDGM/FGDC specifications. These limitations were introduced purely for practical simplicity and could be changed in future after the system has been evaluated. The basic structure of the content nevertheless remains the same as that of the CSDGM/FGDC.
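The entry limits described above can be pictured as a small validation step. The following is only an illustrative sketch in Python, not the Java-based system itself. The tag names cntinfo, cntaddr, detailed and attr follow the CSDGM short names; the check_limits helper, the choice to apply each limit per parent element, and the sample record are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Limits taken from the text: at most 2 addresses and 10 attributes.
# Applying them per parent element is an assumption of this sketch.
LIMITS = {
    ("cntinfo", "cntaddr"): 2,
    ("detailed", "attr"): 10,
}

def check_limits(xml_text):
    """Return one message for every parent element that exceeds a limit."""
    root = ET.fromstring(xml_text)
    problems = []
    for (parent_tag, child_tag), maximum in LIMITS.items():
        for parent in root.iter(parent_tag):
            count = len(parent.findall(child_tag))
            if count > maximum:
                problems.append(
                    f"{parent_tag}: {count} <{child_tag}> elements, limit is {maximum}"
                )
    return problems

# Hypothetical record with three contact addresses (one too many).
SAMPLE = """<metadata><idinfo><ptcontac><cntinfo>
<cntaddr><city>Kathmandu</city></cntaddr>
<cntaddr><city>Lalitpur</city></cntaddr>
<cntaddr><city>Bhaktapur</city></cntaddr>
</cntinfo></ptcontac></idinfo></metadata>"""

print(check_limits(SAMPLE))
```

Keeping the limits in a single table means that relaxing them after evaluation, as the text anticipates, would only require changing the numbers.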
3. System Context
The overall system context is shown in Figure 1.
Web Client: The actual user interface. It enables users to add and edit metadata, to query the network for desired data, and to use other features.
Gateway: Provides interfaces to the client, responds to client requests and connects the client to the different services available.
Catalogue Service: Provides features to search, retrieve and display metadata, and builds the metadata repository.
Metadata Service: Enables adding and editing metadata. Contains interfaces/APIs that allow users to enter metadata and create XML files.
Metadata DB: Metadata repository. Stores all XML files as a single repository.
Data Access Service: Data download and/or interactive session.
Web Mapping Service: Interactive session through the Data Access Service.

4. System Functionality
The system was developed primarily to provide the following services:
1. Metadata service,
2. Catalogue service.
4.1 Metadata Service System
This service allows any user with a valid username and password to create and edit metadata. The user starts by entering the metadata file identification details, namely File ID and Title. Metadata can be added either through the forms provided (Figure 2) or by directly uploading an XML file (Figure 3). When the user chooses form input, the service generates the XML file. The system adds the XML file to a temporary repository for approval by the administrator.
The user can edit previously submitted metadata by invoking the edit module, which involves retrieving the data, editing it and resubmitting the metadata. The system adds the edited metadata to the temporary XML repository for approval by the administrator.

Figure 2. Metadata entry using form.
Figure 3. Direct uploading of XML file.
The general process flow of the metadata service system is shown in Figure 4.
As shown in Figure 4, the user fills in the metadata entry form and submits it to create the metadata. The metadata system then converts the user-provided data into a complete set of metadata (an XML file). The creation of the metadata XML is based on an XML schema (XSD file), which defines the structure and hierarchy of the metadata along with its constraints and validation rules. Based on this schema, the Java application code uses the JAXB API to create a Java content tree holding the hierarchical metadata elements for each set of metadata. The content tree is built from the JAXB-generated Java classes and interfaces that correspond to the XML schema elements. The application code then uses the JAXB API to convert the content tree into the metadata set (XML file). This process is known as marshalling.
The generated metadata is then stored in a temporary XML repository, which holds all the metadata submitted by users. The admin user can view all newly submitted metadata sets. After verifying the validity of a metadata set, the admin user can add it to the XML database server (an eXist server in our case). All metadata visualization, search and editing operate on the metadata residing in this XML database server.
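The idea of marshalling a content tree into XML can be sketched as follows. The system described here uses Java and JAXB; this example swaps in Python's standard library purely to illustrate the concept. The classes Citation and MetadataRecord, the marshal function, the id attribute carrying the File ID, and the sample values are all hypothetical; only the CSDGM short tag names (idinfo, citation, citeinfo, origin, pubdate, title, keywords, theme, themekey) come from the standard.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

@dataclass
class Citation:
    origin: str
    pubdate: str
    title: str

@dataclass
class MetadataRecord:
    """In-memory content tree for one metadata set (heavily simplified)."""
    file_id: str
    citation: Citation
    theme_keywords: List[str] = field(default_factory=list)

def marshal(record):
    """Convert the content tree into an XML string."""
    # Storing the File ID as an attribute is an assumption of this sketch.
    root = ET.Element("metadata", {"id": record.file_id})
    idinfo = ET.SubElement(root, "idinfo")
    citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
    ET.SubElement(citeinfo, "origin").text = record.citation.origin
    ET.SubElement(citeinfo, "pubdate").text = record.citation.pubdate
    ET.SubElement(citeinfo, "title").text = record.citation.title
    theme = ET.SubElement(ET.SubElement(idinfo, "keywords"), "theme")
    for keyword in record.theme_keywords:
        ET.SubElement(theme, "themekey").text = keyword
    return ET.tostring(root, encoding="unicode")

# Made-up form input, for illustration only.
record = MetadataRecord(
    file_id="np-001",
    citation=Citation("Survey Department", "2004", "Topographic base map"),
    theme_keywords=["topography"],
)
xml_text = marshal(record)
print(xml_text)
```

Unlike JAXB, this sketch does not generate its classes from an XSD and performs no schema validation; it only shows the direction of the conversion, from objects to XML.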

Figure 4. Process flow of the metadata service system
Metadata editing involves the unmarshalling process, which is the conversion of a metadata XML file into Java content objects. The application code then accesses the data in the content objects and displays it in the web forms for editing. The user can now edit the metadata entries and submit them to the temporary XML repository. From there, the same process as for new metadata is followed until the metadata is submitted to the XML database server, where the old metadata set is replaced by the updated one.
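The edit cycle (unmarshal, change values, marshal again) can be sketched in the same spirit. This is again a Python stand-in for the Java/JAXB code, with a flat dictionary playing the role of the content objects shown in the web form. The FORM_FIELDS mapping, the function names and the stored sample document are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored metadata set.
STORED = """<metadata><idinfo><citation><citeinfo>
<origin>Survey Department</origin><pubdate>2004</pubdate>
<title>Old title</title>
</citeinfo></citation></idinfo></metadata>"""

# Form field name -> path of the element that holds its value.
FORM_FIELDS = {
    "origin": "idinfo/citation/citeinfo/origin",
    "pubdate": "idinfo/citation/citeinfo/pubdate",
    "title": "idinfo/citation/citeinfo/title",
}

def unmarshal(xml_text):
    """Read the XML into a dict of form values keyed by field name."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(path, default="")
            for name, path in FORM_FIELDS.items()}

def apply_edits(xml_text, edits):
    """Write edited form values back into the document and re-serialise it."""
    root = ET.fromstring(xml_text)
    for name, value in edits.items():
        node = root.find(FORM_FIELDS[name])
        if node is None:
            raise KeyError(f"element for field {name!r} not found")
        node.text = value
    return ET.tostring(root, encoding="unicode")

form_values = unmarshal(STORED)
updated_xml = apply_edits(STORED, {"title": "New title"})
print(unmarshal(updated_xml))
```

In the workflow described above, the re-serialised document would go back to the temporary repository for approval rather than directly replacing the published version.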
4.2 Catalogue Service
The catalogue service includes search, visualization and administration of metadata.
Search: The search service is available to all users and does not require login. Two types of search are available, namely ‘Keyword Search’ and ‘Spatial Search’.
Spatial Search is conceptually similar to a keyword search by map coordinates, but instead of typing the map coordinates into a search form, the user provides them through the spatial interface itself.
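The two search types can be illustrated with a minimal sketch, assuming each catalogue entry carries a title, theme keywords and a bounding box (the CSDGM bounding coordinates). The Entry class, the function names, the case-insensitive substring matching and the two sample entries are invented for this example and are not taken from the system; Python is used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (west, south, east, north) in decimal degrees
BBox = Tuple[float, float, float, float]

@dataclass
class Entry:
    title: str
    keywords: List[str]
    bbox: BBox

def keyword_search(entries, term):
    """Case-insensitive substring match against title and keywords."""
    term = term.lower()
    return [e for e in entries
            if term in e.title.lower()
            or any(term in k.lower() for k in e.keywords)]

def boxes_intersect(a, b):
    """True if two (west, south, east, north) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def spatial_search(entries, query_box):
    """Return entries whose bounding box intersects the query box."""
    return [e for e in entries if boxes_intersect(e.bbox, query_box)]

# Invented sample catalogue.
CATALOGUE = [
    Entry("Kathmandu valley land use", ["land use"], (85.2, 27.6, 85.5, 27.8)),
    Entry("Pokhara road network", ["roads", "transport"], (83.9, 28.1, 84.1, 28.3)),
]

print([e.title for e in keyword_search(CATALOGUE, "ROADS")])
print([e.title for e in spatial_search(CATALOGUE, (85.0, 27.5, 85.3, 27.7))])
```

In a spatial interface, the query box would come from a rectangle drawn on the map rather than from typed coordinates, but the comparison behind it stays the same.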
Visualization: The search result is a simple listing of the available metadata, showing the title of each dataset, its origin and its publication date. The user can view slightly more information by following the ‘details’ link to a partial metadata page. From this page, interested users can view the full metadata in different formats and layouts, as desired. The available stylesheets for viewing full metadata are FGDC, ESRI and core XML standards.
Metadata Administration: Newly submitted metadata is first added to a temporary repository. Publishing such metadata requires the administrator’s approval. Once approved, the metadata is added to the main repository and deleted from the temporary repository.
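The approval step can be sketched as a move between two stores. In the system described here the main repository is an eXist XML database; the in-memory dictionaries below are only a stand-in written in Python, and the MetadataStore class with its method names is hypothetical.

```python
class MetadataStore:
    """Toy model of the temporary and main repositories."""

    def __init__(self):
        self.pending = {}    # temporary repository: file ID -> XML text
        self.published = {}  # main repository: file ID -> XML text

    def submit(self, file_id, xml_text):
        """New or edited metadata always lands in the temporary repository."""
        self.pending[file_id] = xml_text

    def approve(self, file_id):
        """Move a pending set to the main repository.

        pop() removes it from the temporary repository; the assignment
        replaces any older published set with the same file ID.
        """
        self.published[file_id] = self.pending.pop(file_id)

store = MetadataStore()
store.submit("np-001", "<metadata><idinfo/></metadata>")
pending_before = list(store.pending)
store.approve("np-001")
print(pending_before, list(store.pending), list(store.published))
```

Because approval overwrites the published entry with the same file ID, the same path serves both new submissions and edited resubmissions.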
5. Conclusion
As a prototype, the system has been built for an intranet environment. Once it is ported to the web, any publisher or user with Internet access will be able to use it. The application is still being tested prior to porting to the web.
XML has been adopted as the standard for storing metadata. Various APIs and interfaces have been developed to fetch the metadata contents and create an XML document from them. Similarly, an XML document can be edited. The prototype provides a fully functional metadata entry system with partial editing capability, working in an intranet environment. Additional work is still required to provide full editing capability.
The metadata elements have been adapted from the CSDGM/FGDC, with modifications to suit the local context. The eXist server has been chosen as the XML database server for creating a single repository of XML documents. Thus, in general, the overall system is based on open source software.
References
Implementation report of National Geographic Information Infrastructure Programme (NGIIP) Clearing House Project, 2004.
www.fgdc.gov/metadata
www.ngiip.gov.np