Digital Library for Earth System Education

Search Service API Documentation

Service version: DDSWS v1.1

Table of Contents

  1. Overview
  2. Definitions and concepts
  3. Service requests
  4. Service responses
  5. Search fields
  6. Example search queries
  7. Configure search fields

Overview

The Digital Discovery System Search Service (DDSWS) is a search and retrieval service API for items that reside in a digital repository, and is available from the Digital Discovery System (DDS) and the NSDL Collection System (NCS). Service requests are expressed as HTTP argument/value pairs and responses may be returned as XML or JSON.

The primary service request is Search. It is implemented using the Lucene search engine and provides a wide range of Information Retrieval features: textual searching over repository metadata and content, searching within specific fields, date range searches, geospatial bounding box searches, and other functionality. The service returns metadata for the objects that reside in the repository, and these metadata may be disseminated in a number of XML formats, as indicated by the ListXmlFormats request.

Web service requests and responses are described in detail below and examples are provided for reference by developers.

Definitions and concepts

DDSWS is a Representational State Transfer (REST) style Web service API. Service requests are expressed as HTTP argument/value pairs. These requests must be in either GET or POST format. Responses are returned in XML format by default, which varies in structure and content depending on the request as shown below in the examples section of this document. Responses can also be returned as JSON (JavaScript Object Notation) as an alternate output format to XML.

  • Base URL - the base URL used to access the Web service. This is the portion of the request that precedes the request arguments. For example http://www.dlese.org/dds/services/ddsws1-1.
  • Request arguments - the argument=value pairs that make up the request and follow the base URL.
  • DDSWS response envelope - the XML container used to return data. This container returns different types of data depending on the request made.

HTTP request format

The format of the request consists of the base URL followed by the ? character followed by one or more argument=value pairs, which are separated by the & character. Each request must contain one verb=request pair, where verb is the literal string 'verb' and request is one of the DDSWS request strings defined below. All arguments must be encoded using the syntax rules for URIs. This is the same encoding scheme that is described by the OAI-PMH.

Service requests

This section defines the available requests, or verbs.

The HTTP request format has the following structure:
[base URL]?verb=request[&additional arguments].

For example:
http://www.dlese.org/dds/services/ddsws1-1?
        verb=GetRecord&id=DLESE-000-000-000-001
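
The request structure above can be sketched in a few lines of Python. This is a minimal illustration, not part of the service: the helper name build_request_url is hypothetical, and the sketch assumes that percent-encoding every argument with the standard library is an acceptable way to satisfy the URI encoding rule.

```python
from urllib.parse import urlencode, quote

BASE_URL = "http://www.dlese.org/dds/services/ddsws1-1"

def build_request_url(verb, args=None, base_url=BASE_URL):
    """Return base URL + '?' + percent-encoded argument=value pairs."""
    # The verb pair goes first; the remaining pairs keep their given order.
    pairs = [("verb", verb)] + list((args or {}).items())
    return base_url + "?" + urlencode(pairs, quote_via=quote)

print(build_request_url("GetRecord", {"id": "DLESE-000-000-000-001"}))
```

The same helper can produce any of the requests summarized in this section by changing the verb and arguments.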

Summary of available requests:

Search - Allows a client to search across resources in the repository using standard Lucene queries, which support term, field and phrase searches, term and term/field boosting, term stemming, wildcard and fuzzy searches, term proximity searches, and other functionality. The Search request has access to a wide range of search fields, and through the use of query clauses, can be used to apply custom search rank algorithms (see example search queries). The request also supports searching by XML format, date ranges, geospatial bounding box search, and other functionality.

UserSearch - Nearly identical to the Search request, except that it operates over educational resources in the ADN metadata format only and applies a default searcher that automatically performs word stemming and relevancy rank boosting for items that match higher-relevancy search indicators, such as when a matching term appears in the title field as opposed to elsewhere. These search algorithms are the same as those applied to users' searches in the DLESE library. This request is meant to be used by clients that work with ADN resources and wish to leverage the automatic word stemming and search rank algorithms.

ListFields - Accesses the fields in the index.

ListTerms - Accesses the terms in a given field or fields.

GetRecord - Accesses the metadata for a single record.

ListCollections - Accesses the list of available metadata collections in the repository.

ListGradeRanges - Accesses the list of DLESE-specific controlled vocabularies and search keys for grade ranges.

ListSubjects - Accesses the list of DLESE-specific controlled vocabularies and search keys for subjects.

ListResourceTypes - Accesses the list of DLESE-specific controlled vocabularies and search keys for resource types.

ListContentStandards - Accesses the list of DLESE-specific controlled vocabularies and search keys for content standards.

ListXmlFormats - Accesses the list of the available XML formats from this service.

UrlCheck - Allows a client to check whether a given URL is cataloged in the repository.

ServiceInfo - Accesses information about this Web service.


Search

Sample request

The following request performs a search for the term "ocean" and returns 10 search results, starting at position 0:

http://www.dlese.org/dds/services/ddsws1-1?verb=Search&q=ocean&s=0&n=10&client=ddsws-documentation

Summary and usage

The Search request allows a client to search across resources in the repository using standard Lucene queries, which support term, field and phrase searches, term and term/field boosting, term stemming, wildcard and fuzzy searches, term proximity searches, and other functionality. The Search request has access to a wide range of search fields, and through the use of query clauses, can be used to apply custom search rank algorithms (see example search queries). The request also supports searching by XML format, date ranges, geospatial bounding box search, and other functionality.

The Search and UserSearch response consists of an ordered set of metadata records, sorted by relevancy. The Search request searches over all XML formats that are available in the repository, unless otherwise specified in the 'xmlFormat' argument as described below. The UserSearch request searches over the records available in the ADN format only. Flow control is managed by the client, which may 'page through' a set of results using the 's' and 'n' arguments as described below.

The Search and UserSearch requests accept queries supplied in the standard Lucene Query Syntax (LQS). LQS supports advanced Information Retrieval query clauses such as term and field boosting, wildcard and fuzzy searches, etc. Queries are supplied in the q argument of the request.

Arguments

Textual and fielded searches: The following argument is used to conduct textual and fielded searches and may be performed independently or in combination with other search criteria described below.

  • q - (query) an optional argument that may contain plain text or field/term specifiers. Boolean logic, field/term specifiers and boosting must be specified using the Lucene Query Syntax (LQS). Plain text terms (when no field is indicated) are used to search in the default field, which contains textual metadata extracted from the title, description, keywords, grade ranges, resource types and other areas of the repository metadata. See available search fields for detailed information about the fields that are available for searching.
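
Because Lucene Query Syntax uses characters such as the colon, quotation mark and caret, the value of q needs to be percent-encoded before it is placed in the URL. The sketch below illustrates this with a hypothetical query that boosts phrase matches in the title field and falls back to the default field; the helper name encode_search is an assumption made for illustration.

```python
from urllib.parse import urlencode, quote

# Hypothetical query: a boosted phrase match in the 'title' field,
# OR a plain-text match in the default field.
query = 'title:"sea level"^2 OR ocean'

def encode_search(q, s=0, n=10):
    """Return the encoded argument string for a Search request."""
    params = [("verb", "Search"), ("q", q), ("s", s), ("n", n)]
    return urlencode(params, quote_via=quote)

print(encode_search(query))
# verb=Search&q=title%3A%22sea%20level%22%5E2%20OR%20ocean&s=0&n=10
```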

Controlled vocabulary searches: The following arguments perform a search by controlled vocabulary and may be performed independently or in combination with other search criteria. The searchKey values used with these arguments must be discovered using the vocabulary list requests; an example field/key pair is gr=07. If supplied, the controlled vocabulary portion of the search criteria must match a given record in order for it to be included in the results. Note that searching by grade range (gr), resource type (re), subject (su) or content standard (cs) is useful only for clients that wish to search over ADN records using these DLESE-specific vocabularies.

  • ky - (collection) an optional repeatable argument that limits the search to records that reside in the given metadata collection(s).

  • gr - (grade range) an optional repeatable argument that limits the search to ADN records that contain the given grade range(s). These grade ranges are a DLESE-specific controlled vocabulary that is part of the ADN framework.

  • re - (resource type) an optional repeatable argument that limits the search to ADN records that contain the given resource type(s). These resource types are a DLESE-specific controlled vocabulary that is part of the ADN framework.

  • su - (subject) an optional repeatable argument that limits the search to ADN records that contain the given subject(s). These subjects are a DLESE-specific controlled vocabulary that is part of the ADN framework.

  • cs - (content standard) an optional repeatable argument that limits the search to ADN records that contain the given content standard(s). These content standards are a DLESE-specific controlled vocabulary that is part of the ADN framework.
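
A repeatable argument is sent by repeating the argument name once per value. The following sketch shows one way to assemble such a request; the function name and the filters dictionary are hypothetical, and the search keys used in the example are the ones that appear elsewhere in this document.

```python
from urllib.parse import urlencode, quote

VOCAB_ARGS = ("ky", "gr", "re", "su", "cs")

def search_query_string(q, filters, s=0, n=10):
    """Build Search arguments, repeating each vocabulary argument per key."""
    params = [("verb", "Search"), ("q", q)]
    for arg in VOCAB_ARGS:
        for key in filters.get(arg, []):
            params.append((arg, key))  # one name=value pair per search key
    params += [("s", s), ("n", n)]
    return urlencode(params, quote_via=quote)

print(search_query_string("ocean", {"gr": ["07", "02"]}))
# verb=Search&q=ocean&gr=07&gr=02&s=0&n=10
```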

Date range searches: The following arguments instruct the service to search in a given index date field and may be performed independently or in combination with other search criteria. The values provided in the fromDate or toDate arguments must be a union date type string of the form yyyy-MM-dd or an ISO8601 UTC datestamp of the form yyyy-MM-ddTHH:mm:ssZ. Example dates include 2004-07-08 or 2004-07-26T21:58:25Z. The fields that are available for searching by date are listed below. If supplied, the date range portion of the search criteria must match a given record in order for it to be included in the results. These arguments are not supported in the UserSearch request.

  • dateField - an optional argument that indicates which index date field to search in. If supplied, one or both of either the fromDate or toDate arguments must be supplied.

  • fromDate - an optional argument that indicates a date range to search from. If supplied, the dateField argument must also be supplied.

  • toDate - an optional argument that indicates a date range to search to. If supplied, the dateField argument must also be supplied.
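
The dependency between these three arguments, and the two accepted date forms, can be checked on the client before a request is sent. The sketch below is one possible client-side check; the function names are hypothetical, and Python's strptime is slightly more lenient about zero padding than the formats stated above.

```python
from datetime import datetime

# The two date forms described above, as strptime patterns.
DATE_FORMATS = ("%Y-%m-%d", "%Y-%m-%dT%H:%M:%SZ")

def is_valid_date(value):
    """Return True if value parses in either accepted date form."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False

def date_range_args(date_field=None, from_date=None, to_date=None):
    """Validate and return the date range arguments as a dict."""
    if date_field is None and from_date is None and to_date is None:
        return {}
    if date_field is None:
        raise ValueError("fromDate/toDate require dateField")
    if from_date is None and to_date is None:
        raise ValueError("dateField requires fromDate and/or toDate")
    args = {"dateField": date_field}
    for name, value in (("fromDate", from_date), ("toDate", to_date)):
        if value is not None:
            if not is_valid_date(value):
                raise ValueError(f"{name} has an unsupported format: {value}")
            args[name] = value
    return args

print(date_range_args("wndate", from_date="2004-07-08"))
```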

Geospatial searches: Geospatial searches operate over each record that has an associated geographic footprint (a geographic region representing the record's area of relevance) in the form of a box (defined below). A geospatial query takes a query region (also in the form of a box) and a spatial predicate (one of "within", "contains", or "overlaps") and returns all documents that 1) have a geographic footprint that 2) has the predicate relationship to the query region.

Formally, a box is a geographic region defined by north and south bounding coordinates (latitudes expressed in degrees north of the equator and in the range [-90,90]) and east and west bounding coordinates (longitudes expressed in degrees east of the Greenwich meridian and in the range [-180,180]). The north bounding coordinate must be greater than or equal to the south. The west bounding coordinate may be less than, equal to, or greater than the east; in the latter case, a box that crosses the ±180° meridian is described. As a special case, the set of all longitudes is described by a west bounding coordinate of -180 and an east bounding coordinate of 180.

The following arguments instruct the service to conduct a geospatial query over the subset of records that contain a geospatial footprint. Geospatial queries may be performed independently or in combination with other search criteria. The five arguments marked as conditionally required must be supplied together: to perform a geospatial query, all five must be included; otherwise, none may be included. The optional geoClause argument may be included if desired. If an error in the request arguments is encountered, the service returns an appropriate error response and message.

  • geoPredicate - a conditionally required argument that indicates the relationship to the query region. Values must be one of [ within | overlaps | contains ].

  • geoBBNorth - a conditionally required argument that indicates the northernmost latitude of the search. Values must be a floating point number in the range [-90,90].

  • geoBBSouth - a conditionally required argument that indicates the southernmost latitude of the search. Values must be a floating point number in the range [-90,90].

  • geoBBWest - a conditionally required argument that indicates the westernmost longitude of the search. Values must be a floating point number in the range [-180,180].

  • geoBBEast - a conditionally required argument that indicates the easternmost longitude of the search. Values must be a floating point number in the range [-180,180].

  • geoClause - an optional argument that indicates the boolean clause applied to the geospatial portion of the search. Values must be one of [ must | should ], where must indicates the geospatial portion of the search criteria must match a given record in order for it to be included in the results; should indicates it should match but is not required in order to appear in the search results. Default value is must.
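
The box constraints and allowed values above lend themselves to a small client-side check. The following is a sketch under the assumption that validating before sending is useful to the client; the function names are hypothetical and the service performs its own validation regardless.

```python
PREDICATES = ("within", "overlaps", "contains")
CLAUSES = ("must", "should")

def geo_args(predicate, north, south, west, east, clause="must"):
    """Validate a query box and return the geospatial arguments as a dict."""
    if predicate not in PREDICATES:
        raise ValueError(f"geoPredicate must be one of {PREDICATES}")
    if clause not in CLAUSES:
        raise ValueError(f"geoClause must be one of {CLAUSES}")
    # North must be greater than or equal to south, both within [-90, 90].
    if not -90 <= south <= north <= 90:
        raise ValueError("need -90 <= south <= north <= 90")
    # West may be less than, equal to, or greater than east.
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be in [-180, 180]")
    return {
        "geoPredicate": predicate,
        "geoBBNorth": north,
        "geoBBSouth": south,
        "geoBBWest": west,
        "geoBBEast": east,
        "geoClause": clause,
    }

def crosses_antimeridian(west, east):
    """A west coordinate greater than east spans the +/-180 degree meridian."""
    return west > east

print(geo_args("within", 45.0, 30.0, 170.0, -170.0))
print(crosses_antimeridian(170.0, -170.0))  # True
```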

Flow control: A search client can control paging through a set of search results, and the size of each page, using the s (starting offset) and n (number returned) arguments. As an example, when a search is initially performed, the client might construct a request that supplies the arguments s=0 and n=10 to return up to the first 10 matching results. The client would then page through the set of results by issuing subsequent requests indicating s=10 and n=10 for the next ten results, s=20 and n=10 for results 20 through 29, and so forth up to totalNumResults. To retrieve each successive segment of search results, the client must supply identical search criteria in all search-related arguments (q, xmlFormat, gr, su, cs, re, so, etc.), including the sorting and date-restrictive arguments. DDS search is deterministic: the set and order of search results are guaranteed to be identical for any two identical searches (assuming the repository has not changed in the interim). Thus the s and n arguments can be thought of as defining the 'window' onto the ordered set of search results that the client wants to see.

  • s - (starting offset) - a required argument that specifies the starting offset into the result set from which metadata records should be returned. May be any integer greater than or equal to 0.

  • n - (number returned) - a required argument that specifies the number of metadata records to return, beginning at the offset specified by s. Must be an integer from 1 to maxSearchResultsAllowed, as indicated in the response to the ServiceInfo request. The maximum allowed by this server is 1000.
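
The paging scheme described above amounts to stepping s forward by n until totalNumResults is reached. A minimal sketch follows; the generator name page_windows is hypothetical, and the default limit of 1000 is the maximum stated for this server.

```python
def page_windows(total_num_results, page_size=10, max_allowed=1000):
    """Yield (s, n) pairs that cover every result exactly once."""
    if not 1 <= page_size <= max_allowed:
        raise ValueError("n must be between 1 and maxSearchResultsAllowed")
    for s in range(0, total_num_results, page_size):
        # n stays constant; the final page may simply return fewer records.
        yield s, page_size

print(list(page_windows(25)))
# [(0, 10), (10, 10), (20, 10)]
```

In practice the client would read totalNumResults from the first response and then issue one request per remaining window, keeping all other arguments identical.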

Additional arguments: The following arguments may also be supplied in the request.

  • xmlFormat - an optional argument that indicates the format the records must be returned in. If specified, searches are limited to only those records that can be disseminated in the given format. If not specified, the records will be returned in their native format using a localized version of XML (e.g. stripped of their namespace and schema declarations). The available formats may be discovered using the ListXmlFormats request. Not supported in the UserSearch request.

  • client - an optional argument that may be supplied by the client to indicate where the request originated from. Example values might be ddsExamplesSearchClient or myLibrarySearchClient. When supplied, this information is used by the services administrators to help understand how people are using the service on a client-by-client basis.

  • so - (search over) an optional argument that must contain the value allRecords or discoverableRecords. Clients that request to search over allRecords must be authorized by IP, otherwise an error is returned. Defaults to discoverableRecords. Not supported in the UserSearch request.

Sorting the response: The following two arguments instruct the service to sort the response by a given index field. The service sorts the entire result set lexically prior to returning the requested portion of the results. Only one of these two arguments may be supplied in the request. Values must be a sortable field in the index, as listed below. These arguments are not supported in the UserSearch request.

  • sortAscendingBy - an optional argument that instructs the service to sort the search results in ascending lexical order by a given index field.

  • sortDescendingBy - an optional argument that instructs the service to sort the search results in descending lexical order by a given index field.
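
Two points above are easy to overlook: the sort arguments are mutually exclusive, and the ordering is lexical rather than numeric. The sketch below illustrates both. The helper name sort_args is hypothetical, and the sorted() call only demonstrates lexical ordering in general using Python string comparison; the exact collation used by the service is not specified here.

```python
def sort_args(ascending_by=None, descending_by=None):
    """Return at most one sort argument; reject requests that supply both."""
    if ascending_by and descending_by:
        raise ValueError("supply only one of sortAscendingBy / sortDescendingBy")
    if ascending_by:
        return {"sortAscendingBy": ascending_by}
    if descending_by:
        return {"sortDescendingBy": descending_by}
    return {}

print(sort_args(descending_by="wndate"))

# Lexical order compares character by character, so numeric strings
# do not sort by magnitude.
print(sorted(["9", "10", "100"]))  # ['10', '100', '9']
```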

 


Errors and exceptions

See error and exception conditions.

Examples

Request

Search for the word ocean.

http://www.dlese.org/dds/services/ddsws1-1?
           verb=Search&q=ocean&s=0&n=10

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <Search>
    <resultInfo>
      <totalNumResults>520</totalNumResults>
      <numReturned>10</numReturned>
      <offset>0</offset>
    </resultInfo>
    <results>
      <record>
        <head>
          <id>DLESE-COLLECTION-000-000-000-018</id>
          <collection recordId="DLESE-COLLECTION-000-000-000-012">
              Science Ed Resource Center (SERC)</collection>
          <xmlFormat>dlese_collect</xmlFormat>
          <fileLastModified>2004-03-29T20:44:41Z</fileLastModified>
          <whatsNewDate type="collection">2004-03-29</whatsNewDate>
          <additionalMetadata realm="dlese_collect">
            <formatOfRecords>adn</formatOfRecords>
            <isEnabled>true</isEnabled>
            <numRecords>325</numRecords>
            <numRecordsIndexed>324</numRecordsIndexed>
            <partOfDrc>false</partOfDrc>
          </additionalMetadata>
        </head>
        <metadata>
          <collectionRecord>
            <general>
              <fullTitle>Carleton College Science Education 
               Resource Center (SERC) - Starting Point Entry 
               Level Geoscience Collection
              </fullTitle>
              ...

</DDSWebService>
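
A client typically reads the resultInfo element first, since it drives paging. The following sketch parses a trimmed, well-formed version of the response above with Python's standard library. The sample string is an abbreviation made for illustration (the XML declaration and most of the record are omitted), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the example response, reduced to a well-formed document.
SAMPLE = """<DDSWebService>
  <Search>
    <resultInfo>
      <totalNumResults>520</totalNumResults>
      <numReturned>10</numReturned>
      <offset>0</offset>
    </resultInfo>
    <results>
      <record>
        <head>
          <id>DLESE-COLLECTION-000-000-000-018</id>
          <xmlFormat>dlese_collect</xmlFormat>
        </head>
      </record>
    </results>
  </Search>
</DDSWebService>"""

def parse_result_info(xml_text):
    """Return the resultInfo counters as a dict of integers."""
    info = ET.fromstring(xml_text).find("Search/resultInfo")
    return {child.tag: int(child.text) for child in info}

def record_ids(xml_text):
    """Return the id of every record in the response, in result order."""
    root = ET.fromstring(xml_text)
    return [record.findtext("head/id") for record in root.iter("record")]

print(parse_result_info(SAMPLE))
print(record_ids(SAMPLE))
```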

Request

Search for the word ocean and limit the search to grade range High (9-12).

http://www.dlese.org/dds/services/ddsws1-1?
           verb=Search&q=ocean&gr=02&s=0&n=10

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <Search>
    <resultInfo>
      <totalNumResults>208</totalNumResults>
      <numReturned>10</numReturned>
      <offset>0</offset>
    </resultInfo>
    <results>
      <record>
        <head>
          <id>NASA-Edmall-2315</id>
          <collection recordId="DLESE-COLLECTION-000-000-000-014">
             NASA ED Mall Collection</collection>
          <xmlFormat>adn</xmlFormat>
          <fileLastModified>2004-06-17T18:24:10Z</fileLastModified>
          <whatsNewDate type="itemnew">2003-07-29</whatsNewDate>
          <additionalMetadata realm="adn">
            <accessionStatus>
               accessioneddiscoverable
            </accessionStatus>
            <partOfDrc>false</partOfDrc>
          </additionalMetadata>
        </head>
        <metadata>
          <itemRecord>
            <general>
              <title>Coriolis Force</title>
              ...

</DDSWebService>

Request

Search for all ADN records new to the repository since July 7th, 2004 and sort descending by the wndate field.

http://www.dlese.org/dds/services/ddsws1-1?
verb=Search&s=0&n=10&fromDate=2004-07-08&dateField=wndate
&sortDescendingBy=wndate&xmlFormat=adn-localized

Response

Same format as above.



UserSearch

Sample request

The following request performs a search for the term "ocean" and returns 10 search results, starting at position 0:

http://www.dlese.org/dds/services/ddsws1-1?verb=UserSearch&q=ocean&s=0&n=10&client=ddsws-documentation

Summary and usage

The UserSearch request is nearly identical to the Search request, except that it operates over educational resources in the ADN metadata format only and applies a default searcher that automatically performs word stemming and relevancy rank boosting for items that match higher-relevancy search indicators, such as when a matching term appears in the title field as opposed to elsewhere. These search algorithms are the same as those applied to users' searches in the DLESE library. This request is meant to be used by clients that work with ADN resources and wish to leverage the automatic word stemming and search rank algorithms.

The UserSearch response is identical to the Search response and consists of an ordered set of ADN records, sorted by relevancy. The default searcher that is used incorporates several Information Retrieval techniques designed to augment the search rank and total number of results. These augmentations include word stemming, boosting of records that contain search terms in their title or description and a slight boosting of records that are cataloged by two or more collections or are part of the DLESE Reviewed Collection. The default searcher's algorithms are applied automatically to all terms and phrases supplied by the client in the default field portion of the query sent in the request, and all other fields in the query are treated normally (see available search fields for examples and details). Clients may use this request as a starting point and apply additional boosting to what is provided by the default searcher by supplying additional ranking clauses in the query sent in the request. Clients wishing to implement their own search rank algorithms fully from scratch, or to search over records in formats other than ADN, should use the Search request.

Arguments

UserSearch accepts the same arguments as the Search request, with the exception of xmlFormat, sortAscendingBy, sortDescendingBy, dateField, fromDate, toDate, and so.
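
A client that switches between the two verbs may want to catch unsupported arguments before sending a UserSearch request. The sketch below shows one way to do so; the function name to_user_search is hypothetical, and raising an error (rather than silently dropping the arguments) is a design choice made for this example.

```python
# Arguments that the Search request accepts but UserSearch does not.
USER_SEARCH_UNSUPPORTED = {
    "xmlFormat", "sortAscendingBy", "sortDescendingBy",
    "dateField", "fromDate", "toDate", "so",
}

def to_user_search(args):
    """Return a copy of args with verb=UserSearch, rejecting unsupported ones."""
    rejected = sorted(USER_SEARCH_UNSUPPORTED.intersection(args))
    if rejected:
        raise ValueError(f"not supported by UserSearch: {', '.join(rejected)}")
    return {**args, "verb": "UserSearch"}

print(to_user_search({"verb": "Search", "q": "ocean", "s": 0, "n": 10}))
```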

Errors and exceptions

See error and exception conditions.

Examples

Request

Same as the Search request, except that the verb argument must be 'UserSearch' and the arguments listed above are not accepted.

Response

Identical to that of the Search request. UserSearch only returns ADN records.


GetRecord

Sample request

The following request returns the metadata for record ID DLESE-000-000-000-001 in its native XML format:

http://www.dlese.org/dds/services/ddsws1-1?verb=GetRecord&id=DLESE-000-000-000-001

Summary and usage

The GetRecord request is used to pull up the metadata for a single item in the repository. Clients should use this request to display the metadata from a single record, for example if the user has requested "more information" about a resource. The data is returned in ADN format and other formats including dlese_collect, dlese_anno, oai_dc, nsdl_dc and briefmeta. Sample ADN records are available here.

Arguments

  • id - a required argument that specifies the identifier for the record.

  • xmlFormat - an optional argument that indicates the format the record must be returned in. If specified, responses are limited to only those records that are available in the given format. If not specified, the record will be returned in its native format using a localized version of XML (e.g. stripped of its namespace and schema declaration). The available formats may be discovered using the ListXmlFormats request.

  • so - (search over) an optional argument that must contain the value allRecords or discoverableRecords. Users who request to search over allRecords must be authorized by IP, otherwise an error is returned. Defaults to discoverableRecords.

Errors and exceptions

See error and exception conditions.

Examples

Request

Request the record id DLESE-000-000-000-337 and get the response in ADN format. Shown without the required encoding, for clarity.

http://www.dlese.org/dds/services/ddsws1-1?
        verb=GetRecord&id=DLESE-000-000-000-337

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <GetRecord>
    <record>
      <head>
        <id>DLESE-000-000-000-337</id> 
        <collection recordId="DLESE-COLLECTION-000-000-000-015">
          DLESE Community Collection (DCC)</collection> 
        <xmlFormat>adn</xmlFormat> 
        <fileLastModified>2004-06-24T19:06:08Z</fileLastModified> 
        <whatsNewDate type="itemnew">2003-07-10</whatsNewDate> 
        <additionalMetadata realm="adn">
          <accessionStatus>accessioneddiscoverable</accessionStatus> 
          <partOfDrc>true</partOfDrc> 
          <alsoCatalogedBy collectionLabel="NASA ESE Reviewed 
             Collection" 
             collectionRecordId="DLESE-COLLECTION-000-000-000-023">
                  NASA-ESERevProd333</alsoCatalogedBy> 
        </additionalMetadata>
      </head>
      <metadata>
        <itemRecord>
          <general>
            <title>Earth Science Picture of the Day</title> 
            ...

</DDSWebService>
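
The head element carries values both as element text and as attributes, and the text may be wrapped across lines. The sketch below reads a trimmed, well-formed version of the response above and normalizes whitespace. The sample is abbreviated for illustration and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the example response, reduced to a well-formed document.
SAMPLE = """<DDSWebService>
  <GetRecord>
    <record>
      <head>
        <id>DLESE-000-000-000-337</id>
        <collection recordId="DLESE-COLLECTION-000-000-000-015">
          DLESE Community Collection (DCC)</collection>
        <xmlFormat>adn</xmlFormat>
        <additionalMetadata realm="adn">
          <partOfDrc>true</partOfDrc>
          <alsoCatalogedBy collectionLabel="NASA ESE Reviewed Collection"
             collectionRecordId="DLESE-COLLECTION-000-000-000-023">
                  NASA-ESERevProd333</alsoCatalogedBy>
        </additionalMetadata>
      </head>
    </record>
  </GetRecord>
</DDSWebService>"""

def clean(text):
    """Collapse runs of whitespace, including line breaks, to single spaces."""
    return " ".join((text or "").split())

def parse_head(xml_text):
    """Extract a few head fields from a GetRecord response."""
    head = ET.fromstring(xml_text).find("GetRecord/record/head")
    collection = head.find("collection")
    return {
        "id": clean(head.findtext("id")),
        "collection": clean(collection.text),
        "collectionRecordId": collection.get("recordId"),
        "xmlFormat": clean(head.findtext("xmlFormat")),
        "partOfDrc": clean(head.findtext("additionalMetadata/partOfDrc")) == "true",
        "alsoCatalogedBy": [clean(e.text) for e in head.iter("alsoCatalogedBy")],
    }

print(parse_head(SAMPLE))
```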


ListFields

Sample request

The following request lists all fields in the index:

http://www.dlese.org/dds/services/ddsws1-1?verb=ListFields

Summary and usage

The ListFields request is used to get all search fields that reside in the index. It is not necessary for the Lucene fields to be stored.

Arguments

None


Errors and exceptions

See error and exception conditions.

Examples

See link above

ListTerms

Sample request

The following request lists all terms in the index for field 'title':

http://www.dlese.org/dds/services/ddsws1-1?field=title&verb=ListTerms

Summary and usage

The ListTerms request is used to get all search terms that exist in the index for a given field or fields. It is not necessary for the Lucene fields to be stored. For each term the response indicates the number of times it appears in the index (termCount) as well as the number of documents (records) it appears in (docCount).

Arguments

  • field - a required repeatable argument that contains the name of a field. The field argument may be repeated as many times as desired within a single request. Note that response times will increase dramatically when more than one field is requested.


Errors and exceptions

See error and exception conditions.

Examples

See link above

ListCollections

Sample request

The following request lists the metadata collections that are available in the repository:

http://www.dlese.org/dds/services/ddsws1-1?verb=ListCollections

Summary and usage

The ListCollections request is used to discover the available metadata collections in the repository and to retrieve the search field/key values used to perform searches across collections. Clients should use this request to generate user interface widgets for selecting collections to search from, or to display collection information such as the number of records in a collection. This request belongs to the vocabulary list class of requests.

The response from ListCollections conforms to the vocabulary list response format but includes two additional elements: <recordId> and <additionalMetadata>

Examples

Refer to the documentation for the vocabulary list class of requests.


ListGradeRanges

Sample request

The following request lists the DLESE-specific grade range vocabularies and corresponding search keys:

http://www.dlese.org/dds/services/ddsws1-1?verb=ListGradeRanges

Summary and usage

The ListGradeRanges request is used to discover the DLESE controlled vocabularies and search field/keys for grade ranges used in the adn and dlese_collect metadata frameworks. Clients that work with these DLESE frameworks may use this request to generate user interface widgets for selecting grade ranges to search from. This request belongs to the vocabulary list class of requests.

Examples

Refer to the documentation for the vocabulary list class of requests.


ListSubjects

Sample request

The following request lists the DLESE-specific subject vocabularies and corresponding search keys:

http://www.dlese.org/dds/services/ddsws1-1?verb=ListSubjects

Summary and usage

The ListSubjects request is used to discover the DLESE controlled vocabularies and search field/keys for subjects used in the adn and dlese_collect metadata frameworks. Clients that work with these DLESE frameworks may use this request to generate user interface widgets for selecting the subjects to search from. This request belongs to the vocabulary list class of requests.

Examples

Refer to the documentation for the vocabulary list class of requests.


ListResourceTypes

Sample request

The following request lists the DLESE-specific resource type vocabularies and corresponding search keys:

http://www.dlese.org/dds/services/ddsws1-1?verb=ListResourceTypes

Summary and usage

The ListResourceTypes request is used to discover the DLESE controlled vocabularies and search field/keys for resource types used in the adn and dlese_collect metadata frameworks. Clients that work with these DLESE frameworks may use this request to generate user interface widgets for selecting the resource types to search from. This request belongs to the vocabulary list class of requests.

Examples

Refer to the documentation for the vocabulary list class of requests.


ListContentStandards

Sample request

The following request lists the DLESE-specific content standard vocabularies and corresponding search keys:

http://www.dlese.org/dds/services/ddsws1-1?verb=ListContentStandards

Summary and usage

The ListContentStandards request is used to discover the DLESE controlled vocabularies and search field/keys for content standards used in the adn and dlese_collect metadata frameworks. Clients that work with these DLESE frameworks may use this request to generate user interface widgets for selecting the content standards to search from. This request belongs to the vocabulary list class of requests.

Examples

Refer to the documentation for the vocabulary list class of requests.


Vocabulary list requests

Summary and usage

Vocabulary list requests include ListGradeRanges, ListSubjects, ListResourceTypes, ListContentStandards, and ListCollections*. Each of the vocabulary list requests uses the same request and response format.

Vocabulary list requests are used to determine the search values supplied in the gr, su, re, cs and ky arguments of the Search and UserSearch requests and should be used to construct user interface menus for selecting the grade ranges, subjects, etc. for users to limit their search by.

More specifically, vocabulary list requests represent the class of requests that expose controlled vocabularies in the repository (grade ranges, subjects, resource types, content standards and collections). Vocabulary list requests may be used to discover the vocabulary entries ('Primary elementary', etc.), the search field/key pair used to perform and limit searches across the given vocabulary in the Search and UserSearch requests ('gr=07', etc.), and a set of rendering guidelines used to determine things such as whether to display the vocabulary listing to the user and the label that should be displayed, for example 'Primary (K-2)'.

Implementation tip: Library vocabularies change very infrequently (on the order of years or months). Clients should retrieve the vocabulary values once and cache them, for example at application start up, rather than retrieving them each time a user accesses the client.
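
One simple way to follow this tip is to memoize the fetch per verb for the lifetime of the application. The sketch below uses Python's functools.lru_cache. The names cached_vocab_loader and fake_fetch are hypothetical, and fake_fetch stands in for whatever function the client uses to perform the HTTP request.

```python
from functools import lru_cache

def cached_vocab_loader(fetch):
    """Wrap fetch(verb) so each vocabulary list is retrieved only once."""
    @lru_cache(maxsize=None)
    def load(verb):
        return fetch(verb)
    return load

# Stand-in for a real HTTP fetch; records each call so reuse is visible.
calls = []

def fake_fetch(verb):
    calls.append(verb)
    return f"<{verb}/>"

load = cached_vocab_loader(fake_fetch)
load("ListGradeRanges")
load("ListGradeRanges")  # served from the cache, no second fetch
print(len(calls))  # 1
```

A long-running client could call load.cache_clear() on a schedule if it needs to pick up the occasional vocabulary change.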

*Note: ListCollections conforms to the vocabulary list response but includes two additional elements: <recordId> and <additionalMetadata>

Arguments

None.

Errors and exceptions

See error and exception conditions.

Examples

Request

Request the grade ranges that are available. Note the verb argument may contain any of the vocabulary list requests (ListGradeRanges, ListSubjects, ListResourceTypes, ListContentStandards, or ListCollections) corresponding to the vocabulary you are interested in.

http://www.dlese.org/dds/services/ddsws1-1?verb=ListGradeRanges

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <ListGradeRanges>
    <searchField>gr</searchField>
    <gradeRanges>
      <gradeRange>
        <vocabEntry>DLESE:Primary elementary</vocabEntry>
        <searchKey>07</searchKey>
        <renderingGuidelines>
          <label>Primary (K-2)</label>
          <noDisplay>false</noDisplay>
          <wrap>false</wrap>
          <divider>false</divider>
          <hasSubList>false</hasSubList>
          <isLastInSubList>false</isLastInSubList>
          <groupLevel>0</groupLevel>
        </renderingGuidelines>
      </gradeRange>
      <gradeRange>
        <vocabEntry>DLESE:Intermediate elementary</vocabEntry>
        ...

</DDSWebService>

*Note: ListCollections conforms to the vocabulary list response format shown above but includes two additional elements: <recordId> and <additionalMetadata>
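As a minimal sketch of how a client might turn a vocabulary list response into menu entries, the Python below parses a response shaped like the ListGradeRanges example above. The function name parse_vocab_entries and the trimmed inline sample are illustrative assumptions, not part of the service, and the sketch assumes the response carries no XML namespace, as in the example.

```python
import xml.etree.ElementTree as ET

# Sample shaped like the ListGradeRanges response above, trimmed to one entry.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<DDSWebService>
  <ListGradeRanges>
    <searchField>gr</searchField>
    <gradeRanges>
      <gradeRange>
        <vocabEntry>DLESE:Primary elementary</vocabEntry>
        <searchKey>07</searchKey>
        <renderingGuidelines>
          <label>Primary (K-2)</label>
          <noDisplay>false</noDisplay>
        </renderingGuidelines>
      </gradeRange>
    </gradeRanges>
  </ListGradeRanges>
</DDSWebService>"""


def parse_vocab_entries(xml_bytes):
    """Return (search_field, entries) for a vocabulary list response.

    Hypothetical helper: each entry is a dict with the vocabulary entry,
    the search key and the display label. Entries flagged noDisplay=true
    are skipped.
    """
    root = ET.fromstring(xml_bytes)
    search_field = (root.findtext(".//searchField") or "").strip()
    entries = []
    for node in root.iter():
        # A vocabulary entry is any element with a direct <searchKey> child.
        key = node.findtext("searchKey")
        if key is None:
            continue
        hidden = node.findtext("renderingGuidelines/noDisplay", "false")
        if hidden.strip() == "true":
            continue
        entries.append({
            "vocabEntry": (node.findtext("vocabEntry") or "").strip(),
            "searchKey": key.strip(),
            "label": (node.findtext("renderingGuidelines/label") or "").strip(),
        })
    return search_field, entries


field, entries = parse_vocab_entries(SAMPLE)
for entry in entries:
    # Each menu item maps a label to a search argument such as gr=07.
    print(entry["label"], "->", field + "=" + entry["searchKey"])
```

Following the implementation tip above, a client would typically call such a helper once at start up and keep the result in memory.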


ListXmlFormats

Sample request

The following request lists the XML formats that may be disseminated from this service and their corresponding search keys:

http://www.dlese.org/dds/services/ddsws1-1?verb=ListXmlFormats

Summary and usage

The ListXmlFormats request is used to discover the XML formats available from the repository as a whole or for a single record in the repository. Clients should use this request to discover the available XML formats and the keys that may be supplied in the 'xmlFormat' argument of the Search or GetRecord requests.

DDSWS is able to disseminate a number of XML formats including ADN (adn), News&Opps (news_opps), DLESE annotation (dlese_anno), DLESE collection (dlese_collect), OAI Dublin Core (oai_dc), NSDL Dublin Core (nsdl_dc), and others.

Certain records are available in multiple formats. For example, records that were originally cataloged in the ADN format may be returned in the adn, adn-localized, briefmeta, oai_dc, and nsdl_dc formats. When a record is requested in a non-native format, its XML is transformed to the requested format using XSLT or another transformation prior to being returned by the service.

Many XML formats are available in namespace-specific form or a localized form that contains no namespace or schema declaration. Localized XML is indicated by adding -localized to the end of the XML format specifier, for example adn-localized. When localized XML is returned, the XML is generally easier to read and XPath notation is greatly simplified. By default, all requests in the service return localized versions of the metadata unless a non-localized specifier is indicated.

Arguments

  • id - an optional argument that specifies an ID in the repository. If supplied the request will show only those XML formats that are available for that ID. If omitted, the response will indicate all XML formats that are available in the repository.

Errors and exceptions

See error and exception conditions.

Examples

Request

Show all XML formats available for ID DLESE-000-000-000-001.

http://www.dlese.org/dds/services/ddsws1-1?
             verb=ListXmlFormats&id=DLESE-000-000-000-001

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <ListXmlFormats>
    <xmlFormat>adn</xmlFormat>
    <xmlFormat>adn-localized</xmlFormat>
    <xmlFormat>briefmeta</xmlFormat>
    <xmlFormat>nsdl_dc</xmlFormat>
    <xmlFormat>oai_dc</xmlFormat>
  </ListXmlFormats>
</DDSWebService>
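The following Python is a small sketch of a client-side use of ListXmlFormats: it builds the request URL, reads the format keys from a response like the one above, and picks the localized variant of a format when one is listed. The helper names (list_xml_formats_url, parse_xml_formats, prefer_localized) are hypothetical, and the sketch assumes the response has no XML namespace, as in the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://www.dlese.org/dds/services/ddsws1-1"

# The example response shown above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<DDSWebService>
  <ListXmlFormats>
    <xmlFormat>adn</xmlFormat>
    <xmlFormat>adn-localized</xmlFormat>
    <xmlFormat>briefmeta</xmlFormat>
    <xmlFormat>nsdl_dc</xmlFormat>
    <xmlFormat>oai_dc</xmlFormat>
  </ListXmlFormats>
</DDSWebService>"""


def list_xml_formats_url(record_id=None):
    """Build a ListXmlFormats request; the id argument is optional."""
    params = {"verb": "ListXmlFormats"}
    if record_id is not None:
        params["id"] = record_id
    return BASE_URL + "?" + urlencode(params)


def parse_xml_formats(xml_bytes):
    """Return the format keys listed in a ListXmlFormats response."""
    root = ET.fromstring(xml_bytes)
    return [el.text.strip() for el in root.iter("xmlFormat") if el.text]


def prefer_localized(formats, key):
    """Return key + '-localized' if the repository lists it, else key."""
    localized = key + "-localized"
    return localized if localized in formats else key


formats = parse_xml_formats(SAMPLE)
print(list_xml_formats_url("DLESE-000-000-000-001"))
print(prefer_localized(formats, "adn"))
```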


UrlCheck

Sample request

The following request searches for all records in the repository that have a URL ending in '.pdf':

http://www.dlese.org/dds/services/ddsws1-1?url=http://*.pdf&verb=UrlCheck

Summary and usage

The UrlCheck request is used to check whether a given URL is in the DDS repository. This request supports the use of the * wildcard construct. The * character, or wildcard construct, indicates that any character combination is a valid match. For example, a search for http://www.dlese.org/myResource* will match the two URLs http://www.dlese.org/myResource1.html and http://www.dlese.org/myResource2.html. The wildcard construct may appear at any position in the URL argument except the first position.

Arguments

  • url - a required repeatable argument that contains a URL. The url argument may be repeated as many times as desired within a single request.
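Because the url argument is repeatable and its values must be encoded, it can help to let a library build the query string. The sketch below uses Python's standard urllib.parse for this; the function name url_check_request is hypothetical, and the check on the first character simply mirrors the wildcard rule stated above.

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

BASE_URL = "http://www.dlese.org/dds/services/ddsws1-1"


def url_check_request(urls):
    """Build a UrlCheck request with one url argument per value.

    Hypothetical helper. Values are percent-encoded by urlencode, so
    characters such as ':' and '/' in the checked URLs cannot be mistaken
    for parts of the request itself.
    """
    if not urls:
        raise ValueError("at least one url value is required")
    for value in urls:
        if value.startswith("*"):
            raise ValueError("the * wildcard may not be the first character")
    # A list of pairs (rather than a dict) allows the url key to repeat.
    pairs = [("verb", "UrlCheck")] + [("url", value) for value in urls]
    return BASE_URL + "?" + urlencode(pairs)


request = url_check_request(["http://epod.usra.edu/", "http://www.dlese.org*"])
print(request)

# Decoding the query string recovers the original values.
print(parse_qsl(urlsplit(request).query))
```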

Errors and exceptions

See error and exception conditions.

Examples

Request

Determine whether the URL 'http://epod.usra.edu/' is in the repository. Shown without the required encoding, for clarity.

http://www.dlese.org/dds/services/ddsws1-1?
    verb=UrlCheck&url=http://epod.usra.edu/

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <UrlCheck>
    <resultInfo>
      <totalNumResults>1</totalNumResults> 
    </resultInfo>
    <results>
      <matchingRecord>
        <url>http://epod.usra.edu/</url> 
        <head>
          <id>DLESE-000-000-000-337</id> 
          <collection recordId="DLESE-COLLECTION-000-000-000-015">
            DLESE Community Collection (DCC)</collection> 
          <xmlFormat>adn</xmlFormat> 
          <fileLastModified>2004-06-24T19:06:08Z</fileLastModified> 
          <whatsNewDate type="itemnew">2003-07-10</whatsNewDate> 
          <additionalMetadata realm="adn">
            <accessionStatus>accessioneddiscoverable</accessionStatus> 
            <partOfDrc>true</partOfDrc> 
            <alsoCatalogedBy collectionLabel="NASA ESE 
                 Reviewed Collection" 
              collectionRecordId="DLESE-COLLECTION-000-000-000-023">
                 NASA-ESERevProd333</alsoCatalogedBy> 
          </additionalMetadata>
        </head>
      </matchingRecord>
    </results>
  </UrlCheck>
</DDSWebService>
Note: responses to this request contain the common head element.

Request

Determine whether the URL 'http://epod.usra.edu/' or 'http://www.marsquestonline.org/index.html' is in the repository.

http://www.dlese.org/dds/services/ddsws1-1?
   verb=UrlCheck&url=http://epod.usra.edu/&
   url=http://www.marsquestonline.org/index.html

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <UrlCheck>
    <resultInfo>
      <totalNumResults>2</totalNumResults> 
    </resultInfo>
    <results>
      <matchingRecord>
        <url>http://www.marsquestonline.org/index.html</url> 
        ....
      </matchingRecord>
      <matchingRecord>
        <url>http://epod.usra.edu/</url> 
        ...
      </matchingRecord>
    </results>
  </UrlCheck>
</DDSWebService>

Request

Determine whether a URL that begins with 'http://www.dlese.org' is in the repository. The * character acts as a wildcard, which may appear at any position in the URL argument except the first position.

http://www.dlese.org/dds/services/ddsws1-1?
         verb=UrlCheck&url=http://www.dlese.org* 

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <UrlCheck>
    <resultInfo>
      <totalNumResults>2</totalNumResults> 
    </resultInfo>
    <results>
      <matchingRecord>
        <url>http://www.dlese.org/vgee/index.htm</url> 
        ...
      </matchingRecord>
      <matchingRecord>
        <url>
   http://www.dlese.org/documents/policy/CollectionsScope_final.html
        </url> 
        ...
      </matchingRecord>
    </results>
  </UrlCheck>
 </DDSWebService>

Request

Determine whether the URL 'http://epod.usra.edu/zzzz' is in the repository. In this case no matching records are found.

http://www.dlese.org/dds/services/ddsws1-1?
        verb=UrlCheck&url=http://epod.usra.edu/zzzz 

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <UrlCheck>
    <resultInfo>
      <totalNumResults>0</totalNumResults> 
    </resultInfo>
  </UrlCheck>
</DDSWebService>
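The responses above can be read with a few lines of Python. This is a sketch only: parse_url_check is a hypothetical helper, the inline samples are trimmed copies of the examples above, and the code assumes the response elements carry no XML namespace. Note that a response with no matches contains no results element, so the parser must not depend on it.

```python
import xml.etree.ElementTree as ET

ONE_MATCH = b"""<?xml version="1.0" encoding="UTF-8" ?>
<DDSWebService>
  <UrlCheck>
    <resultInfo>
      <totalNumResults>1</totalNumResults>
    </resultInfo>
    <results>
      <matchingRecord>
        <url>http://epod.usra.edu/</url>
        <head>
          <id>DLESE-000-000-000-337</id>
          <xmlFormat>adn</xmlFormat>
        </head>
      </matchingRecord>
    </results>
  </UrlCheck>
</DDSWebService>"""

NO_MATCH = b"""<?xml version="1.0" encoding="UTF-8" ?>
<DDSWebService>
  <UrlCheck>
    <resultInfo>
      <totalNumResults>0</totalNumResults>
    </resultInfo>
  </UrlCheck>
</DDSWebService>"""


def parse_url_check(xml_bytes):
    """Return (total, matches) where matches is a list of url/id dicts."""
    root = ET.fromstring(xml_bytes)
    total = int(root.findtext(".//totalNumResults", "0").strip())
    matches = []
    # iter() yields nothing when <results> is absent, as in NO_MATCH.
    for record in root.iter("matchingRecord"):
        matches.append({
            "url": (record.findtext("url") or "").strip(),
            "id": (record.findtext("head/id") or "").strip(),
        })
    return total, matches


print(parse_url_check(ONE_MATCH))
print(parse_url_check(NO_MATCH))
```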


ServiceInfo

Sample request

The following request displays information about this Web service:

http://www.dlese.org/dds/services/ddsws1-1?verb=ServiceInfo

Summary and usage

The ServiceInfo request is used to retrieve general information about this Web service including name, description, the URL used to access the service (base URL), service version, the maximum number of search results allowed by the Search and UserSearch requests, and an administrator e-mail.

Arguments

None

Errors and exceptions

See error and exception conditions.

Examples

Request

Display information about the Web service.

http://www.dlese.org/dds/services/ddsws1-1?verb=ServiceInfo

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService>
  <ServiceInfo>
    <serviceName>
      Digital Library for Earth System Education (DLESE) 
      discovery Web service
    </serviceName>
    <baseURL>http://www.dlese.org/dds/services/ddsws1-1</baseURL>
    <serviceVersion>1.1</serviceVersion>
    <adminEmail>support@dlese.org</adminEmail>
    <compression>gzip</compression>
    <maxSearchResultsAllowed>1000</maxSearchResultsAllowed>
    <description> ... description here ... </description>
  </ServiceInfo>
</DDSWebService>


Service responses

Service responses are returned in XML or JSON format and vary in structure and content depending on the request made. This section describes common response structures that are returned by the service. The content and structure of each of the request responses are described above, not here.

Common response elements

Several requests in the protocol share common XML elements in their responses. These include the <head> and <additionalMetadata> elements, which are described below.

The head element

The head element appears in the UserSearch, Search, GetRecord and UrlCheck responses. The head element is used to return information about a single record. This includes the ID of the record, the collection of which the record is a member, the XML format of the record, the date the record was last modified, the whatsNewDate and an additionalMetadata element.

Head element example:
<?xml version="1.0" encoding="UTF-8" ?> 
...
<head>
   <id>CEIS-000-000-001</id>
   <collection recordId="DLESE-COLLECTION-000-000-000-003">
          Discover Our Earth</collection>
   <xmlFormat>adn</xmlFormat>
   <fileLastModified>2004-07-02T17:32:29Z</fileLastModified>
   <whatsNewDate type="itemnew">2003-07-19</whatsNewDate>
   <additionalMetadata realm="adn">
      ...
   </additionalMetadata>
</head>
...


The additionalMetadata element

The additionalMetadata element appears in UserSearch, Search, GetRecord, UrlCheck and the vocabulary list class of responses. The additionalMetadata element is used to return additional information related to the record's format type, referred to as realms. The information realms include adn and dlese_collect, and each contains slightly different information related to the underlying format type.

additionalMetadata element example:
<?xml version="1.0" encoding="UTF-8" ?> 
...
<additionalMetadata realm="adn">
   <accessionStatus>accessioneddiscoverable</accessionStatus>
   <partOfDrc>false</partOfDrc>
   <alsoCatalogedBy collectionLabel="DLESE Community Collection (DCC)"
        collectionRecordId="DLESE-COLLECTION-000-000-000-015">
           DLESE-000-000-000-840</alsoCatalogedBy>
   <alsoCatalogedBy collectionLabel="Cutting Edge" 
        collectionRecordId="DLESE-COLLECTION-000-000-000-010">
           SERC-NAGT-000-000-000-322</alsoCatalogedBy>
</additionalMetadata>
...

Error and exception conditions

If an error or exception occurs, the service returns an <error> element with the type of error indicated by a code attribute. Clients are advised to test the value of these codes and respond with an appropriate message to users. For example, if a user conducts a search that has no matches, the code noRecordsMatch will be returned from the server and a message indicating that the search had no results can be displayed. The error codes are similar to those defined by OAI-PMH.

Error codes, with descriptions and applicable verbs:

  • noRecordsMatch - The combination of values supplied in the Search or UserSearch request resulted in a query that had no matching records. Applicable verbs: Search, UserSearch.

  • badQuery - The value supplied in the q argument of the Search or UserSearch request was malformed or syntactically incorrect. Applicable verbs: Search, UserSearch.

  • badArgument - The request includes illegal arguments, is missing required arguments, includes a repeated argument, or values for arguments have an illegal syntax. Applicable verbs: all verbs.

  • badVerb - The value of the verb argument is not legal or the verb argument is missing. Applicable verbs: N/A.

  • cannotDisseminateFormat - The metadata format identified by the value given for the xmlFormat argument is not supported by the item or by the repository. Applicable verbs: GetRecord.

  • idDoesNotExist - The value of the id argument is unknown or illegal in the repository. Applicable verbs: GetRecord.

  • notAuthorized - The client that made the request is not authorized to access the requested data from the service. Applicable verbs: all verbs.

  • internalServerError - The server for the service encountered a problem and was not able to respond to the request. Applicable verbs: all verbs.

Example error response


Request

Request a record id that does not exist in the repository using GetRecord.

http://www.dlese.org/dds/services/ddsws1-1?
          verb=GetRecord&id=BAD-ID-123

Response

<?xml version="1.0" encoding="UTF-8" ?> 
<DDSWebService 
    xmlns="http://www.dlese.org/Metadata/ddsws" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xsi:schemaLocation="http://www.dlese.org/Metadata/ddsws 
	http://www.dlese.org/Metadata/ddsws/1-1/ddsws.xsd">
	<error code="idDoesNotExist">ID "BAD-ID-123" does not exist in the repository</error>
</DDSWebService>
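One way a client might test for these codes is sketched below in Python. The exception class DDSWSError and the function raise_for_error are hypothetical names. Because the example response declares a default namespace, the sketch compares local element names rather than assuming a particular namespace, and it assumes the error element is a direct child of the root, as in the example.

```python
import xml.etree.ElementTree as ET

ERROR_SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<DDSWebService
    xmlns="http://www.dlese.org/Metadata/ddsws"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.dlese.org/Metadata/ddsws
    http://www.dlese.org/Metadata/ddsws/1-1/ddsws.xsd">
  <error code="idDoesNotExist">ID "BAD-ID-123" does not exist in the repository</error>
</DDSWebService>"""

OK_SAMPLE = b"""<DDSWebService><ServiceInfo>
  <serviceVersion>1.1</serviceVersion>
</ServiceInfo></DDSWebService>"""


class DDSWSError(Exception):
    """Hypothetical exception carrying the service's error code."""

    def __init__(self, code, message):
        super().__init__(code + ": " + message)
        self.code = code
        self.message = message


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def raise_for_error(xml_bytes):
    """Parse a response; raise DDSWSError if it holds an <error> element."""
    root = ET.fromstring(xml_bytes)
    for child in root:
        if isinstance(child.tag, str) and local_name(child.tag) == "error":
            raise DDSWSError(child.get("code", "unknown"),
                             (child.text or "").strip())
    return root


try:
    raise_for_error(ERROR_SAMPLE)
except DDSWSError as exc:
    if exc.code == "noRecordsMatch":
        print("Your search had no results.")
    else:
        print("Service error:", exc)
```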


Requesting JSON output

Each of the service responses can be returned as JSON (JavaScript Object Notation) as an alternate output format to XML. JSON is a simple data format based on the object notation of the JavaScript language and is commonly used in Ajax-style programming to bring content into Web pages asynchronously. For more information about JSON and how it is used, see Douglas Crockford's site www.json.org and the Yahoo! JSON developers page. A DDS client that illustrates its use is shown in these examples.

By default, all responses are output in XML format. To get JSON output, include the argument output=json in the request. Additionally, a callback argument callback=function may be included to wrap the JSON output in parentheses and a function name of your choosing. The JSON output by the service is a direct translation of the XML structure into JSON.

Arguments

  • output=json - An optional argument that, when used, instructs the service to return JSON output instead of XML.

  • callback=function - An optional argument that, when used in conjunction with the output=json argument, instructs the service to return the JSON output wrapped in parentheses and a function name of your choosing, as indicated by the argument value.
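The two arguments can be exercised from Python as sketched below. The helper names json_request_url and unwrap_callback are hypothetical, and so is the sample payload: this document does not show the exact JSON the service emits, only that it mirrors the XML structure, so the payload here is a stand-in for illustration.

```python
import json
import re
from urllib.parse import urlencode

BASE_URL = "http://www.dlese.org/dds/services/ddsws1-1"

# Matches "name( ... )" with an optional trailing semicolon.
_WRAPPER = re.compile(r"^\s*[A-Za-z_$][\w$.]*\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def json_request_url(verb, callback=None, **arguments):
    """Build a request that asks for JSON, optionally wrapped in a callback."""
    params = {"verb": verb, **arguments, "output": "json"}
    if callback is not None:
        params["callback"] = callback
    return BASE_URL + "?" + urlencode(params)


def unwrap_callback(body):
    """Parse a JSON body, removing a callback wrapper if one is present."""
    match = _WRAPPER.match(body)
    return json.loads(match.group(1) if match else body)


# Hypothetical payload: the real structure depends on the request made.
wrapped = 'handleResult({"DDSWebService": {"ServiceInfo": {"serviceVersion": "1.1"}}})'

print(json_request_url("ServiceInfo", callback="handleResult"))
print(unwrap_callback(wrapped))
```

The callback form is mainly useful to browser scripts; a server-side client can omit callback and parse the body directly.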



Removing namespaces from output

Namespaces can be removed from the XML and JSON output from the service, which can simplify working with and processing the output.

By default, all responses are returned with the namespaces that appear in the requested format disseminated from the repository. To remove namespaces, include the argument transform=localize in the request.

Arguments

  • transform=localize - An optional argument that, when used, instructs the service to return the XML or JSON output without namespaces.




Search fields

This section describes the search fields that are available in the DDSWS Search and UserSearch requests. The repository contains fields that are extracted from each of the XML records within, and a given repository may contain records in many different native XML formats. Searches within a given field operate over the set of records that contain that field. For example, a search in the default field operates over all records in the repository, since all records are guaranteed to contain the default field, whereas a search in the title field operates over a potentially smaller sub-set of records that contain the title field. Boolean searches may be performed across and within each of the fields using the Lucene Query Syntax (LQS) supplied in the q argument of the Search or UserSearch request. Example search queries are provided below.

Fields may contain plain text, controlled vocabularies or encoded field values.

Certain fields may be used to sort the search results when used in the sortAscendingBy or sortDescendingBy arguments of the Search request. Sortable fields are indicated below.

How search fields are generated

At index creation time, each record is inserted in the repository in its native XML format. The indexer extracts standard, XPath and custom search fields from the content of the native XML. Additional fields associated with the item may also be extracted from other sources, such as text derived from a crawl of the resource described by the metadata record. The indexer then generates a single entry containing each of the fields and inserts it into the repository. All records are guaranteed to contain certain fields such as the default and stems fields, as well as XPath fields for their native XML format. Details about the standard, XPath and custom fields are provided below.

Searching across and within specific XML formats

The Search request operates over and disseminates records in any available XML format. By default, searches operate over the available fields for all records in the repository regardless of format, and results may contain records of mixed XML formats. For example, a search for default:ocean searches for the term ocean in the default field across all records in the repository and may return records in oai_dc, adn, dlese_anno and other formats in a single result set depending on what matches are found.

Requesting search results in a specific XML format: Certain records can be disseminated from the service in multiple formats, for example records that reside natively as adn can also be disseminated in the oai_dc format. The Search request accepts an optional xmlFormat argument, which instructs the service to search over and return only those records that can be disseminated in the given format. In this case, the search still operates over the fields associated with the record's native XML format; however, the results will be returned in the requested XML format only, and records that reside in a different native format will be transformed and returned in the requested XML format.

Limiting search to specific XML formats: Each record contains the special field xmlFormat, which contains the format key associated with the native format for the record. To search over and return records that reside in specific native XML formats, include this field in the query for the Search request. For example, the query xmlFormat:oai_dc ocean will search for and return all records in the repository that reside in the native oai_dc format and that contain the term ocean in the default field.

The XML format keys that may be used in the xmlFormat argument or xmlFormat search field in the Search request may be discovered using the ListXmlFormats request.

Text versus stemmed text

When searching in a text field, exact terms are matched. For example a search for ocean will return all records that contain the exact term ocean in the given field. Where indicated, certain textual fields have stemming applied to them using the Porter stemmer algorithm (snowball variation). When searching in a field that has been stemmed, all records containing morphologically similar terms in the given field are matched. For example a search for stems:ocean will return all records that contain the terms ocean, oceans or oceanic in the stems field. Note that when searching in a stemmed field, the client should not apply stemming to the terms it supplies for search. Stemming will be applied automatically by the search engine for these fields and no pre-processing is necessary by the client.


Standard Search Fields

Standard search fields are available across all XML formats that support them, which includes oai_dc, nsdl_dc, ncs_collect, adn, dlese_collect, dlese_anno, news_opps, concepts and all other formats that have them configured in a given DDS repository.

  • default - The default field represents the text that is most appropriate for searching by humans using natural language, textual queries. For the Search request, this field is searched when no field specifier is indicated in the query (e.g. the query default:ocean and ocean are equivalent). For the UserSearch request, the default field must be explicitly indicated in the query and if no field specifier is indicated, UserSearch performs an expanded search as described above in the UserSearch description.

    The default field contains text extracted from different locations in the metadata depending on the XML format. May not be sorted. Available for all formats. Contains the following content:
    • adn: Includes title, description, keyword, resource type, subjects, event names, place names, temporal coverage names, terms found in the primary URL, and creators last name. May be sorted.
    • dlese_collect: Includes full title, short title, description, subjects, keywords, and terms extracted from the primary URL, scope URL and review process URL.
    • dlese_anno: Includes title, description and terms extracted from the URL.
    • news_opps: Includes title, description, keywords, announcements, topics, audience, diversities, location, and sponsors institution.
    • All other formats: Contains the full content from all Elements within the XML. Content from Attributes within the XML is *not* included.

  • admindefault - This field is meant to support users who are responsible for maintaining the repository. It contains the full content from all Elements as well as Attributes within the XML. May not be sorted. Available for all formats.

  • stems - Contains the same content as the default field, however all terms are stemmed. May not be sorted. Available for all formats.

  • title - Contains the titles of resources or items, as text. May be sorted. Available for formats: adn, news_opps, ncs_collect and all formats that specify this field for indexing.

  • titlestems - Contains the same content as the title field, however all terms are stemmed. May not be sorted. Available for formats: adn, news_opps, ncs_collect and all formats that specify the title field for indexing.

  • url - Contains the URL for the resource encoded as text. Useful search query examples include http*nasa.gov* or http*.edu*. May be sorted. Available for formats: adn, ncs_collect and all formats that specify this field for indexing.

  • description - Contains the descriptions of resources or items, as text. May be sorted. Available for formats: adn, dlese_collect, news_opps, ncs_collect and all formats that specify this field for indexing.

  • descriptionstems - Contains the same content as the description field, however all terms are stemmed. May not be sorted. Available for formats: adn, dlese_collect, news_opps, ncs_collect and all formats that specify the description field for indexing.

  • xmlFormat - Contains the native XML format key for the record, for example oai_dc or adn, which may be discovered via the ListXmlFormats service request. Available for formats: all formats.

  • idvalue - Contains the internal unique identifier for the record, for example MY-ID-001, indexed untokenized as a keyword. Available for formats: all formats.

  • allrecords - Special field that matches all records in the repository by applying allrecords:true to the query. This is useful for constructing certain types of queries, for example allrecords:true NOT ocean returns all records in the repository that do not contain the term ocean in the default field. Single valid value is true. This has the same effect as the Lucene query *:*. Available for formats: all formats.

  • hasBoundingBox - Boolean value that indicates whether the record has a geospatial bounding box footprint available for search. Valid values are either true or false. Available for formats: all formats.

 

XPath Search Fields

XPath search fields provide separate searchable fields for the contents of every element and attribute found in the native XML of the records. For each element and attribute there are three forms of search fields: text, stemmed text and untokenized keywords. These provide a powerful, flexible way to search for specific text or data within and across the records in the repository.

The XPath fields consist of a prefix followed by an XPath that addresses a specific XML element or attribute in the XML record. Prefixes are one of /text/, /stems/, or /key/, which specify to search over text, stemmed text or untokenized keyword forms of the data, respectively. This is followed by a namespace-free, position-free XPath addressing a specific element or attribute in the XML.

The three types of search fields are processed in the following manner:

text - Text is processed using the Lucene StandardAnalyzer.
stems - Text is processed using the Lucene SnowballAnalyzer for the English language.
key - Text is processed using the Lucene KeywordAnalyzer, which is case-sensitive and includes the entire element or attribute as a single token.

The XPaths used for the search fields are the simplest form of XPath expression, containing no namespaces or position specifiers. For more information about XPath see XPath Language 1.0. The ZVON XPath Tutorial is also useful. Note that this is not an implementation of XQuery but rather a mapping of simple XPaths to searchable Lucene fields.

For example, consider this simple XML instance document:

<book>
  <author birthDate="1955-01-25">
    <firstName>John</firstName>
    <lastName>Doe</lastName>
  </author>
  <identifier>http://books.org/catalog_123</identifier>
</book>

The index will contain the following search fields for this record:

/text//book/author/firstName
/stems//book/author/firstName
/key//book/author/firstName

/text//book/author/lastName
/stems//book/author/lastName
/key//book/author/lastName

/text//book/author/@birthDate
/stems//book/author/@birthDate
/key//book/author/@birthDate

/text//book/identifier
/stems//book/identifier
/key//book/identifier
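The mapping from a record to its field names can be approximated with the Python sketch below, which reproduces the list above for the book example. It is an illustration of the naming scheme, not the service's indexer: the function names are hypothetical, and the sketch assumes that fields are produced only for attributes and for elements that hold text and have no child elements, which is what the list above suggests.

```python
import xml.etree.ElementTree as ET

PREFIXES = ("text", "stems", "key")

BOOK = """<book>
  <author birthDate="1955-01-25">
    <firstName>John</firstName>
    <lastName>Doe</lastName>
  </author>
  <identifier>http://books.org/catalog_123</identifier>
</book>"""


def local_name(tag):
    """Drop a '{namespace}' prefix so the path is namespace-free."""
    return tag.rsplit("}", 1)[-1]


def indexed_xpaths(xml_text):
    """Return namespace-free, position-free XPaths for a record."""
    paths = []

    def walk(element, parent_path):
        path = parent_path + "/" + local_name(element.tag)
        for attribute in element.attrib:
            paths.append(path + "/@" + local_name(attribute))
        # Assumption: only leaf elements that contain text are indexed.
        if len(element) == 0 and (element.text or "").strip():
            paths.append(path)
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    # Repeated elements share one path, so drop duplicates, keeping order.
    return list(dict.fromkeys(paths))


def search_fields(xml_text):
    """Return the /text/, /stems/ and /key/ field names for a record."""
    return ["/" + prefix + "/" + xpath
            for xpath in indexed_xpaths(xml_text)
            for prefix in PREFIXES]


for field in search_fields(BOOK):
    print(field)
```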

As another example, consider the following Dublin Core oai_dc record:

<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ 
	http://www.openarchives.org/OAI/2.0/oai_dc.xsd" 
    xmlns:dc="http://purl.org/dc/elements/1.1/" 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:title xmlns:dc="http://purl.org/dc/elements/1.1/">Ocean Science Leadership Awards</dc:title>
  <dc:description xmlns:dc="http://purl.org/dc/elements/1.1/">This is a description of the 
  Ocean Science Leadership Awards... </dc:description>
  <dc:subject xmlns:dc="http://purl.org/dc/elements/1.1/">Earth system science</dc:subject>
  <dc:subject xmlns:dc="http://purl.org/dc/elements/1.1/">Education</dc:subject>
  <dc:format xmlns:dc="http://purl.org/dc/elements/1.1/">text/html</dc:format>
  <dc:type xmlns:dc="http://purl.org/dc/elements/1.1/">Text</dc:type>
  <dc:identifier xmlns:dc="http://purl.org/dc/elements/1.1/">
     http://www.usc.edu/org/cosee-west/quikscience/OceanLeadershipAwards.html
  </dc:identifier>
</oai_dc:dc>

The following Lucene queries are examples that match specific text and data in this record. As with all fielded Lucene queries, these queries consist of a field name followed by a colon ":" and then followed by the term(s) to search for. Note that XPaths do not contain namespaces or position specifiers:

/stems//dc/title:oceans - Matches the stemmed form of the term ocean found in the title element of the XML record.

/text//dc/subject:education - Matches the term education found in one of the subject elements of the XML record.

/key//dc/format:"text/html" - Matches the untokenized keyword term text/html found in the format element of the XML record.

 

Determining which XPaths have been indexed

In addition to the XPath fields, a special field named indexedXpaths contains each XPath that has been indexed for a given record, as a keyword. Using this field it is possible to search for all records that have any value assigned for a given XPath. For example, the following query:

indexedXpaths:"/dc/subject" - Matches all records that have any value in the /dc/subject field.

Conversely, the following query:

allrecords:true !indexedXpaths:"/dc/subject" - Matches all records that have no value in the /dc/subject field.
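Queries like the ones above can be assembled programmatically. The Python sketch below composes the field names and query strings shown in this section and encodes the result for use as the q argument. All function names are hypothetical, the quoting rule in fielded is a simplification chosen for this example, and any other arguments the Search request needs are outside the scope of this sketch.

```python
from urllib.parse import urlencode, parse_qs


def xpath_field(kind, xpath):
    """Return an XPath search field such as /key//dc/format."""
    if kind not in ("text", "stems", "key"):
        raise ValueError("kind must be text, stems or key")
    return "/" + kind + "/" + xpath


def fielded(field, term):
    """Return field:term, quoting the term when it is not a plain word."""
    # Simplified rule for this sketch: quote terms with spaces, '/' or ':'.
    if any(ch in term for ch in ' /:'):
        term = '"' + term.replace('"', '\\"') + '"'
    return field + ":" + term


def has_value(xpath):
    """Query matching records with any value at the given XPath."""
    return 'indexedXpaths:"' + xpath + '"'


def lacks_value(xpath):
    """Query matching records with no value at the given XPath."""
    return 'allrecords:true !indexedXpaths:"' + xpath + '"'


def q_argument(query):
    """Encode a query for use as the q argument of a request."""
    return urlencode({"q": query})


print(fielded(xpath_field("key", "/dc/format"), "text/html"))
print(q_argument(lacks_value("/dc/subject")))
```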

 

Custom Search Fields

Custom search fields are available for specific XML formats as indicated below. Additional custom search fields that are not described here may also be available for a given DDS repository configuration.


Text fields - These fields contain plain text or, where indicated, text that has been stemmed using the Porter stemmer algorithm (snowball variation).

  • keyword - Contains keywords associated with the resource or item, as text. May be sorted. Available for formats: adn, dlese_collect, news_opps.

  • creator - Contains the first, middle and last name of each contributor for the resource. May not be sorted. Available for formats: adn.

  • organizationInstName - Contains the name of the contributing institution. May be sorted. Available for formats: adn.

  • organizationInstDepartment - Contains the name of the contributing institution's department. May be sorted. Available for formats: adn.

  • personInstName - Contains the name of the contributing person's institution. May be sorted. Available for formats: adn.

  • personInstDepartment - Contains the name of the contributing person's institutional department. May be sorted. Available for formats: adn.

  • emailPrimary - The primary contributor's e-mail. May be sorted. Available for formats: adn.

  • emailOrganization - The contributing organization's e-mail. May be sorted. Available for formats: adn.

  • emailAlt - The alternate contributor's e-mail. May be sorted. Available for formats: adn.

  • placeNames - Place names, for example "colorado," "AZ," "Brazil," as text. May be sorted. Available for formats: adn.

  • eventNames - Event names, for example "windstorm," "Destruction of Pompeii," as text. May be sorted. Available for formats: adn.

  • temporalCoverageNames - Temporal coverage names, for example "cambrian," "Triassic Period," as text. May be sorted. Available for formats: adn.

  • itemAudienceTypicalAgeRange - The typical age range for this resource. Available for formats: adn.

  • itemAudienceInstructionalGoal - The instructional goals for this resource. Available for formats: adn.

  • newsOppstitle - News & Opportunities title. May be sorted. Available for formats: news_opps.

  • newsOppsdescription - News & Opportunities description. May be sorted. Available for formats: news_opps.

  • newsOppskeyword - News & Opportunities keywords. May be sorted. Available for formats: news_opps.

  • ncsCollectOaiBaseUrl - Contains the NSDL Collection OAI baseURL. Useful search clause examples include http*nasa.gov* or http*.edu*. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/collection/ingest/oai/@baseURL

Textual content - These fields contain the text of the content of the resources themselves, extracted by crawling the first page of the resource. These are available for all ADN resources in the repository whose primary content is in HTML or PDF.

  • itemContent - The full textual content of the resource. May be sorted. Available for formats: adn.

  • itemContentTitle - The HTML title element text. May be sorted. Available for formats: adn.

  • itemContentHeaders - The HTML header element (H1, H2, etc.) text. May be sorted. Available for formats: adn.

  • itemContentType - The HTTP content type header terms that were returned by the Web server that holds the resource, for example "text html", "application pdf". May be sorted. Available for formats: adn.

Textual vocabulary fields - These fields contain DLESE controlled vocabularies that have been indexed as plain text.

  • gradeRange - The DLESE grade range vocabularies verbatim as text, for example "DLESE:Primary elementary." These values may be discovered using the ListGradeRanges request within the vocabEntry element. May be sorted. Available for formats: adn, dlese_collect.

  • resourceType - The DLESE resource type vocabularies verbatim as text, for example "DLESE:Learning materials:Classroom activity." These values may be discovered using the ListResourceTypes request within the vocabEntry element. May be sorted. Available for formats: adn, dlese_collect.

  • subject - The DLESE subject vocabularies verbatim as text, for example "DLESE:Atmospheric science." These values may be discovered using the ListSubjects request within the vocabEntry element. May be sorted. Available for formats: adn, dlese_collect.

  • contentStandard - The DLESE content standard vocabularies verbatim as text, for example "NSES:K-4:Unifying Concepts and Processes Standards:Change, constancy, and measurement." These values may be discovered using the ListContentStandards request within the vocabEntry element. May be sorted. Available for formats: adn.




  • itemannotypes - Indicates the type of annotation that this item has, for example "Teaching tip," "Information on challenging teaching and learning situations," as text. These values are shown in the types schema. May be sorted. Available for formats: adn.

  • itemannostatus - Indicates the status of an annotation that this item has, for example "Text annotation completed," as text. These values are shown in the status schema. May be sorted. Available for formats: adn.

  • itemannoformats - Indicates the format of an annotation that this item has. Values include 'text', 'audio', 'video' and 'graphical'. May be sorted. Available for formats: adn.

  • itemannopathways - Indicates the pathway of an annotation that this item has, for example "CRS (Community Review System)," as text. These values are shown in the pathway schema. May be sorted. Available for formats: adn.

  • newsOppsannouncement - News & Opportunities announcements. May be sorted. Available for formats: news_opps.

  • newsOppsaudience - News & Opportunities audience. May be sorted. Available for formats: news_opps.

  • newsOppsdiversity - News & Opportunities diversity. May be sorted. Available for formats: news_opps.

  • newsOppslocation - News & Opportunities locations. May be sorted. Available for formats: news_opps.

  • newsOppstopic - News & Opportunities topics. May be sorted. Available for formats: news_opps.

  • ncsCollectEdLevel - NSDL Collection education level field. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/educational/educationLevels/nsdlEdLevel, /record/educational/educationLevels/otherEdLevel

  • ncsCollectCollectionPurpose - NSDL Collection collection purpose field. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/collection/collectionPurposes/collectionPurpose

  • ncsCollectAudience - NSDL Collection audience field. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/educational/audiences/nsdlAudience, /record/educational/audiences/otherAudience

  • ncsCollectSubject - NSDL Collection subject field. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/general/subject

  • ncsCollectCollectionSubject - NSDL Collection collection subject field. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/collection/collectionSubjects/collectionSubject

  • ncsCollectPathwayName - NSDL Collection pathway name. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/collection/pathways/name
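
Because the textual vocabulary values contain spaces and colons (for example "DLESE:Atmospheric science"), a client will usually want to quote them when placing them in a query. The following is a minimal sketch, assuming that a standard Lucene phrase clause of the form field:"value" is accepted for these fields; the helper name phrase_clause is hypothetical and not part of the service.

```python
def phrase_clause(field: str, value: str) -> str:
    """Build a Lucene phrase clause such as subject:"DLESE:Atmospheric science".

    Inside a quoted phrase only the backslash and the double quote need
    escaping, so the colons in vocabulary values can be left as they are.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# Example: a clause for the 'subject' field described above.
print(phrase_clause("subject", "DLESE:Atmospheric science"))
```

The resulting string would be supplied as all or part of the 'q' argument of a Search request.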

Encoded vocabulary fields - These fields contain DLESE-specific controlled vocabularies used in the adn and dlese_collect metadata frameworks that have been encoded into keys. Corresponding textual vocabulary fields are listed above; that is, the same information is indexed both as keys and as text for these fields: gr - gradeRange; re - resourceType; su - subject; cs - contentStandard.

  • gr - The DLESE grade range vocabularies encoded as a two or three character key, for example "05." These values may be discovered using the ListGradeRanges request within the searchKey element. May be sorted. Available for formats: adn.

  • re - The DLESE resource type vocabularies encoded as a two or three character key, for example "05." These values may be discovered using the ListResourceTypes request within the searchKey element. May be sorted. Available for formats: adn.

  • su - The DLESE subject vocabularies encoded as a two or three character key, for example "05." These values may be discovered using the ListSubjects request within the searchKey element. May be sorted. Available for formats: adn.

  • cs - The DLESE content standard vocabularies encoded as a two or three character key, for example "05." These values may be discovered using the ListContentStandards request within the searchKey element. May be sorted. Available for formats: adn.

Defined key fields - These fields contain finite sets of key values that may be used to limit searches to a sub-set of records.

  • ky - Contains the search key for the collection in which the record resides, which may be used to limit search to within one or more collections of records. These values may be discovered using the ListCollections request within the searchKey element. May be sorted. Available for formats: adn.

  • collection - Similar to ky, contains the record's collection vocabulary entry prefixed with a 0, for example "0dcc," "0comet." These values may be discovered using the ListCollections request within the vocabEntry element. May be sorted. Available for all formats.

  • itemhasanno - Indicates whether an item has an annotation. Values are either "true" or "false." May be sorted. Available for formats: adn.

  • partofdrc - Indicates whether the item or collection is part of the DLESE Reviewed Collection (DRC). Values are either "true" or "false." May be sorted. Available for formats: adn, dlese_collect.

  • multirecord - Indicates whether the resource that the record catalogs is also cataloged by other records in other collections. Values are either "true" or "false." May be sorted. Available for formats: adn.

  • wntype - Indicates the reason the item is new to the repository, corresponding to the 'wndate' field. Possible values are: itemnew, itemannocomplete, itemannoinprogress, annocomplete, drcannocomplete, drcannoinprogress, collection. May be sorted. Available for all formats.

  • ncsCollectHasOai - Boolean value that indicates whether the NSDL Collection metadata contains OAI information (an OAI baseURL). Possible values are: true, false. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/collection/ingest/oai/@baseURL

  • ncsCollectOaiVisibility - The NSDL Collection OAI visibility field value. Possible values are: public, protected, private. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/collection/OAIvisibility

  • ncsCollectIsPathway - Boolean value that indicates the NSDL Collection pathway value. Possible values are: true, false. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/collection/pathway

Fields available for searching by value or range of value - These fields may be searched by exact value or by range of value:

  • itemannoaveragerating - Contains the average of all star ratings assigned to a given resource. Values range from 1.000 to 5.000. Example search syntax: itemannoaveragerating:[3.500 TO 5.000] - returns all resources with an average star rating of 3.5 to 5.0. May be sorted. Available for formats: adn.

  • itemannoratingvalues - Contains all star ratings assigned to a given resource. Values range from 1 to 5. Example search syntax: itemannoratingvalues:[3 TO 5] - returns all resources that have one or more ratings of 3, 4, or 5 stars assigned to them. May be sorted. Available for formats: adn.

  • itemannonumratings - Contains the number of star ratings that have been assigned to a given resource. Values are encoded to 5 digits, for example 00000 or 00014. Example search syntax: itemannonumratings:[00004 TO 99999] - returns all resources that have from 4 to 99999 star ratings assigned to them. May be sorted. Available for formats: adn.

  • annorating - Contains the star rating of a given annotation record. Values are integers from 1 to 5. Example search syntax: annorating:[3 TO 5] - returns all annotations that have a star rating of 3 to 5. May be sorted. Available for formats: dlese_anno.

  • ncsCollectOaiFrequency - Integer and float values that indicate the NSDL Collection OAI harvest frequency in months. Range queries are not supported (search by value only). Possible values are: 1, 2, ... n; 0.5. May be sorted. Available for formats: ncs_collect.
    From xpath: /record/collection/ingest/oai/@frequency
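
The range fields above expect values in a fixed encoding: three decimal places for itemannoaveragerating and five zero-padded digits for itemannonumratings. The sketch below shows one way a client might format such range clauses; the helper names are hypothetical and exist only for illustration.

```python
def average_rating_range(low: float, high: float) -> str:
    """Format a range clause over itemannoaveragerating (1.000 to 5.000)."""
    if not (1.0 <= low <= high <= 5.0):
        raise ValueError("ratings must satisfy 1.0 <= low <= high <= 5.0")
    # Values are written with three decimal places, e.g. 3.500.
    return f"itemannoaveragerating:[{low:.3f} TO {high:.3f}]"


def num_ratings_range(low: int, high: int = 99999) -> str:
    """Format a range clause over itemannonumratings (5-digit encoding)."""
    if not (0 <= low <= high <= 99999):
        raise ValueError("counts must satisfy 0 <= low <= high <= 99999")
    # Values are zero-padded to five digits, e.g. 00004.
    return f"itemannonumratings:[{low:05d} TO {high:05d}]"


print(average_rating_range(3.5, 5))  # itemannoaveragerating:[3.500 TO 5.000]
print(num_ratings_range(4))          # itemannonumratings:[00004 TO 99999]
```

Padding matters because an unpadded value such as 4 would not match the 5-digit form in which the counts are described as being encoded.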

Fields available for searching by date - These fields may be supplied in the 'dateField' argument of the Search request:

  • wndate - A date field that indicates the date the item was new to the repository, corresponding to the 'wntype' field. May be sorted. Available for all formats.

  • accessiondate - The ADN accession date for the record. May be sorted. Available for formats: adn.

  • collaccessiondate - The dlese_collect accession date for the collection. May be sorted. Available for formats: dlese_collect.

  • modtime - A date field that corresponds to the time the item's file was last modified or touched. This does not necessarily indicate that the content in the record changed. May be sorted. Available for all formats.

  • newsOppsapplyBydate - News & Opportunities applyBy date. May be sorted. Available for formats: news_opps.

  • newsOppsarchivedate - News & Opportunities archive date. May be sorted. Available for formats: news_opps.

  • newsOppsduedate - News & Opportunities due date. May be sorted. Available for formats: news_opps.

  • newsOppseventStartdate - News & Opportunities eventStart date. May be sorted. Available for formats: news_opps.

  • newsOppseventStopdate - News & Opportunities eventStop date. May be sorted. Available for formats: news_opps.

  • newsOppspostdate - News & Opportunities post date. May be sorted. Available for formats: news_opps.

  • newsOppsrecordCreationdate - News & Opportunities recordCreation date. May be sorted. Available for formats: news_opps.

  • newsOppsrecordModifieddate - News & Opportunities recordModified date. May be sorted. Available for formats: news_opps.


Example search queries


This section shows some examples of performing searches using the Search or UserSearch request. To perform these searches, the values shown below should be supplied in the 'q' argument, using the Lucene Query Syntax (LQS). Additional arguments may be supplied to the Search or UserSearch request to further limit the search, such as xmlFormat, dateField and the vocabulary fields gr, su, re and cs.
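
Since the query is passed as an HTTP argument, characters such as colons, parentheses, brackets and spaces need to be URL-encoded before the request is sent. The following is a minimal sketch of assembling a request URL. It uses the base URL given earlier in this document; the argument names verb, s (starting offset) and n (number of results) are assumptions here and should be checked against the Search request description, and the function name search_url is hypothetical.

```python
from urllib.parse import urlencode

# Base URL as given in the "Definitions and concepts" section.
BASE_URL = "http://www.dlese.org/dds/services/ddsws1-1"


def search_url(q: str, start: int = 0, count: int = 10, **extra: str) -> str:
    """Return a Search request URL with the Lucene query URL-encoded.

    'verb', 's' and 'n' are assumed argument names; any additional
    arguments (for example xmlFormat or dateField) can be passed as
    keyword arguments.
    """
    params = {"verb": "Search", "q": q, "s": start, "n": count}
    params.update(extra)
    return f"{BASE_URL}?{urlencode(params)}"


print(search_url("stems:(currents in the oceans)"))
print(search_url("ocean", xmlFormat="adn"))
```

The example queries that follow can be passed to such a helper unchanged, because the encoding step takes care of the special characters in the Lucene syntax.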

Search for the term 'ocean' in the default field:
ocean

Search for the term 'ocean' in the stems field. This will return documents containing morphologically similar terms including ocean, oceans and oceanic:
stems:ocean

Search for the terms 'currents in the oceans' in the stems field. Note that the client should supply the plain English version of the terms without pre-stemming them. In this example the resulting search matches documents that contain one of currents, current or currently AND one of oceans, ocean or oceanic (the terms 'in' and 'the' are stop words that are dropped for the purpose of search):
stems:(currents in the oceans)

Search for resources that have an average star rating of 3.5 to 5.0:
itemannoaveragerating:[3.500 TO 5.000]

Search for resources that contain 'noaa.gov' in their URL:
url:http*noaa.gov*

Search for the term ocean within resources from 'noaa.gov':
url:http*noaa.gov* AND ocean

Search for the term 'estuary' in the stems field, and limit the search to the subject biological oceanography (subject key 02):
stems:estuary AND su:02

Search for the term 'ocean' in the default field, and boost the ranking of results that contain 'ocean' in their title (stemmed) (uses the special clause allrecords:true to select the set of all records). Note that this clause returns the same number of results as if the search were performed only over the word 'ocean' in the default field, but it applies additional boosting for records that contain the term 'ocean' in their title (stemmed), which augments the search rank of the results that are returned. This example illustrates the kind of search rank augmentation that is applied automatically in the UserSearch request.
ocean AND (allrecords:true OR titlestems:ocean^2)

Show all records with subject biological oceanography, and boost results that contain florida in the title (stemmed), description or placeNames fields (uses the clause allrecords:true to select the set of all records):
su:02 AND (allrecords:true OR titlestems:florida*^20 
           OR description:florida*^20 OR placeNames:florida^20) 
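
The two boosting examples above share one pattern: the base query is ANDed with a group that always matches (allrecords:true) and that also contains optional boosted clauses, so the result set is unchanged while matching records rank higher. A minimal sketch of generating such queries follows; the function name boosted_query is hypothetical and not part of the service.

```python
from typing import Mapping


def boosted_query(base: str, boosts: Mapping[str, float]) -> str:
    """Combine a base query with rank-boosting clauses.

    'boosts' maps a Lucene clause (e.g. "titlestems:ocean") to its boost
    factor. The allrecords:true clause keeps the result set identical to
    that of the base query alone.
    """
    if not boosts:
        raise ValueError("at least one boost clause is required")
    boosted = " OR ".join(f"{clause}^{factor:g}" for clause, factor in boosts.items())
    return f"{base} AND (allrecords:true OR {boosted})"


# Reproduces the first boosting example above.
print(boosted_query("ocean", {"titlestems:ocean": 2}))

# Reproduces the second boosting example above.
print(boosted_query("su:02", {
    "titlestems:florida*": 20,
    "description:florida*": 20,
    "placeNames:florida": 20,
}))
```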


Glossary

whatsNewDate - A date that describes when an item was new to the repository. Generally this corresponds to the item's accession date or the date in which the item first became accessible in the repository.

 

Configure search fields

The following document provides information for system administrators who are installing and managing a DDS repository system, which includes the Digital Discovery System (DDS) and the NSDL Collection System (NCS).

  • Configure Search Fields - This document describes how to configure the search fields for specific XML frameworks in the repository.


John Weatherley
Last revised: 2010/04/27 23:14:53
University Corporation for Atmospheric Research (UCAR) National Science Foundation (NSF) National Science Digital Library (NSDL)