NOTE: Region 1 metadata was imported to describe this nation-wide dataset. BioGeomancer merged the datasets from all regions, and it is unclear whether the other regions have done the same extensive work on theirs. BioGeomancer has modified the Region 1 metadata to include some information from the more generalized metadata of the other regions.
While any number of fields may be included in datasets submitted for entry into the BGSD, only a subset is uploaded. Fields not necessary for the purposes of the BGSD are deleted and, in most cases, additional fields are added during processing of the dataset. There are three main fields in the BGSD upload file: "name", which holds the feature name; "term", which holds the feature type; and "geometry", which holds the spatial information. In addition, "time_period_id" indicates whether the feature name is current, former, proposed or historical, and "related_name" holds a concatenated list of alternative names for the feature. Two fields record the date a record set was processed ("entryDate") and the date selected records were reprocessed ("modificationDate"). One further field relates each record to the dataset it came from ("g_coll_name"), and another to the metadata about that dataset ("g_coll_note").
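The field list above can be sketched as a record type. This is only an illustration: the field names come from the text, but the Python types, the use of WKT for "geometry", the string coding of "time_period_id", and the helper to_upload_record are assumptions, not the actual BGSD loader.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class UploadRecord:
    """One row of the upload file. Field names follow the text; types are assumed."""
    name: str                           # feature name
    term: str                           # feature type
    geometry: str                       # spatial information (WKT is an assumed encoding)
    time_period_id: str = "current"     # current, former, proposed or historical (coding assumed)
    related_name: str = ""              # concatenated alternative names
    entryDate: Optional[date] = None    # date the record set was processed
    modificationDate: Optional[date] = None  # date selected records were reprocessed
    g_coll_name: str = ""               # dataset the record came from
    g_coll_note: str = ""               # metadata about that dataset


UPLOAD_FIELDS = tuple(f.name for f in fields(UploadRecord))


def to_upload_record(source_row: dict, **added) -> UploadRecord:
    """Hypothetical helper: drop fields that are not uploaded, then add processing fields."""
    kept = {k: v for k, v in source_row.items() if k in UPLOAD_FIELDS}
    kept.update(added)
    return UploadRecord(**kept)
```

For example, a source row carrying an extra "county" field would lose that field, while "term" and "g_coll_name" could be supplied as keyword arguments during processing.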
Importing data into the BioGeomancer Spatial Database involves a number of checks, modifications, and transformations. If metadata is not provided, it must be created de novo using whatever information is available. When the input file is a shapefile, it must first be checked for projection. The BGSD does not use projected data; rather, data are stored with latitude and longitude as coordinates, using the WGS84 (World Geodetic System 1984) spheroid for the horizontal datum. This arrangement, sometimes known as a Geographic Projection, provides accurate and uniform storage of feature coordinates for the whole world. If the input shapefile is in any other projection (e.g. UTM, Lambert Conformal Conic), it must be reprojected to WGS84 geographic coordinates. If the feature is a polygon or a line, rather than a point, its geometry is then checked and, if necessary, repaired. The problems that can arise in the geometry of a feature include short segments, null geometries, incorrect ring orderings, incorrect segment orientations, self-intersections, unclosed rings, and empty parts. These problems are repaired with a script in ArcGIS software ver. 9.1 (ESRI, Redlands, CA, USA). If the input file is text, it must first be converted to a GIS layer using the spatial information contained in the file (X, Y) and its associated metadata.
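To make a few of the geometry problems listed above concrete, here is a minimal sketch of how some of them could be detected on a single polygon ring. It is not the ArcGIS repair script mentioned in the text; the function names, the problem labels, and the min_segment tolerance are all assumptions chosen for illustration. The range test is only a rough heuristic for spotting coordinates that may still be projected.

```python
import math


def ring_problems(ring, min_segment=1e-9):
    """Return problem labels for one ring given as [(lon, lat), ...] in decimal degrees.

    min_segment is an assumed tolerance, not a value taken from the BGSD process.
    """
    if not ring:
        return ["empty part"]
    problems = []
    if len(ring) < 4:
        # A closed ring needs at least three distinct vertices plus the closing one.
        problems.append("too few vertices")
    if ring[0] != ring[-1]:
        problems.append("unclosed ring")
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if math.hypot(x2 - x1, y2 - y1) < min_segment:
            problems.append("short segment")
            break
    for lon, lat in ring:
        if not (-180 <= lon <= 180 and -90 <= lat <= 90):
            # Values outside these bounds suggest the data are still projected.
            problems.append("coordinates outside geographic range")
            break
    return problems


def signed_area(ring):
    """Shoelace formula on a closed ring: positive if counter-clockwise, negative if clockwise."""
    return 0.5 * sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))


def close_ring(ring):
    """Repair an unclosed ring by repeating its first vertex at the end."""
    return ring if ring and ring[0] == ring[-1] else ring + ring[:1]
```

The sign of signed_area can be compared with whatever ring ordering the target format expects, and a ring with the wrong sign can be reversed. Self-intersection checks are omitted here because they need a segment-intersection test that would make the sketch considerably longer.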
The input records are assigned feature types, and all data are imported into an ArcGIS Personal Geodatabase. Incoming datasets may include their own typing system, but in almost all cases it will differ from the one used by the BGSD, which is itself based on that of the Alexandria Digital Library. During processing, a feature type field, "term", is added to the dataset. If an incoming dataset is heterogeneous and has a field for feature type, a cross-reference table is created to convert from the dataset's feature type lexicon to that of the BGSD. The use of feature types is integral to the proper functioning of the BioGeomancer Spatial Lookup Module, because it allows a default extent for the feature type to be used when an extent for the specific feature is not available. Feature types are also useful for indexing and for assigning relative uncertainty measures where geographic feature extents are unknown.
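The cross-reference step and the default-extent fallback described above might look roughly like the following sketch. Every concrete value here is hypothetical: the source codes, the BGSD terms, the extents in metres, and the "extent_m" field are invented for illustration and are not taken from the BGSD or the Alexandria Digital Library type scheme.

```python
# Hypothetical cross-reference table: source dataset type code -> BGSD "term".
CROSSWALK = {
    "PPL": "populated places",
    "LK": "lakes",
    "MT": "mountains",
}

# Hypothetical default extents per feature type, in metres.
DEFAULT_EXTENT_M = {
    "populated places": 5000.0,
    "lakes": 2000.0,
    "mountains": 3000.0,
}


def assign_term(source_type, crosswalk=CROSSWALK):
    """Translate a source feature type into the target lexicon; fail loudly if unmapped."""
    key = source_type.strip().upper()
    if key not in crosswalk:
        raise KeyError(f"no cross-reference entry for source type {source_type!r}")
    return crosswalk[key]


def extent_for(record, defaults=DEFAULT_EXTENT_M):
    """Use the feature's own extent when it has one, else the default for its term."""
    extent = record.get("extent_m")
    if extent is not None:
        return extent
    return defaults[record["term"]]
```

Raising an error for unmapped types, rather than assigning a catch-all term, is a design choice made here so that gaps in the cross-reference table surface during processing instead of silently producing mistyped features.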
Where possible, we include the original metadata information for the dataset.