Comment: Migrated to Confluence 5.3

...

Some names have a different nomenclatural status (e.g. whether the name was validly published under the code; if not, the status is nom. inval.). A name that was not validly published can subsequently be validly published. The details of the name will be the same, but the name now has a different nomenclatural status. At first it seems best to treat these as separate names that are linked. However, this may cause issues during integration: for example, if a provider does not supply the nomenclatural status, is its data for the validly published name or the invalid one? It was therefore decided to treat these names as the same name and pool all nomenclatural status values for that name; it is then up to the consumer or viewer of the name to determine its status. These name instances therefore result in the same name, although two nomenclatural acts led to it.
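As an illustration of pooling, the following is a minimal sketch, not the actual integration code. The `ConsensusName` class, the `pool_statuses` function, the sample name "Aus bus Smith" and the status label "valid" are all hypothetical; only "nom. inval." comes from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class ConsensusName:
    """One name, with every nomenclatural status reported for it pooled together."""
    full_name: str
    statuses: set = field(default_factory=set)


def pool_statuses(instances):
    """Merge (full_name, status) instances into one record per name.

    A status of None means the provider did not supply one.
    """
    pooled = {}
    for full_name, status in instances:
        record = pooled.setdefault(full_name, ConsensusName(full_name))
        if status:
            record.statuses.add(status)
    return pooled


# Hypothetical name instances from three providers.
instances = [
    ("Aus bus Smith", "nom. inval."),  # original, invalid publication
    ("Aus bus Smith", "valid"),        # later valid publication (label assumed)
    ("Aus bus Smith", None),           # provider gave no status
]
pooled = pool_statuses(instances)
```

All three instances end up in a single record, and the record carries both status values, so the consumer decides which status applies.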

Integration By SQL

A simpler and much faster approach to building an initial set of integrated names is to use SQL queries.

The idea is based on the fact that most names are distinct (about 98%). It is therefore much more efficient to generate a "backbone" of names directly from these distinct names than to iterate through every name, perform a match only to discover there are no matches, and then insert the name as a new consensus name.

This approach works with the most complete names first, on the theory that a name with less detail will match multiple names with more detail.

The fields that are used for defining a distinct name are:

  • Canonical
  • Rank
  • Authors
  • Year 
  • Genus
  • Species
  • GoverningCode
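The backbone step can be sketched with a single set-based statement over the fields listed above. This is only a sketch using an in-memory SQLite database; the table names `provider_name` and `consensus_name` and the column names (including `name_rank` for Rank) are assumptions, not the real schema, and missing values are stored as empty strings.

```python
import sqlite3

# Assumed column names for the seven fields that define a distinct name.
KEY_FIELDS = ["canonical", "name_rank", "authors", "year",
              "genus", "species", "governing_code"]
cols = ", ".join(KEY_FIELDS)
placeholders = ", ".join("?" for _ in KEY_FIELDS)

conn = sqlite3.connect(":memory:")
conn.execute(f"CREATE TABLE provider_name (id INTEGER PRIMARY KEY, {cols})")
conn.execute(f"CREATE TABLE consensus_name (id INTEGER PRIMARY KEY, {cols})")

# Hypothetical provider rows; empty strings stand for missing values.
rows = [
    ("cus", "var.", "", "", "Aus", "bus", "ICBN"),
    ("cus", "var.", "", "", "Aus", "bus", "ICBN"),  # same key as the row above
    ("Lecanorales", "order", "Nannf.", "", "", "", "ICBN"),
]
conn.executemany(
    f"INSERT INTO provider_name ({cols}) VALUES ({placeholders})", rows
)

# One statement builds the backbone: one consensus row per distinct key,
# with no per-name match-then-insert loop.
conn.execute(
    f"INSERT INTO consensus_name ({cols}) SELECT DISTINCT {cols} FROM provider_name"
)

backbone = conn.execute(f"SELECT {cols} FROM consensus_name").fetchall()
```

The two identical `cus` rows collapse into one consensus row, so three provider rows yield two backbone names.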

Genus and Species are not fields of a name, but are calculated fields based on parent concepts. The reason for including these fields is to ensure that any subgeneric or subspecific name matches other subgeneric or subspecific names that do not have exactly the same parent hierarchy.

For example:

Name 1: Aus bus var. cus

  • Aus, genus
    • bus, species
      • cus, variety

Name 2: Aus bus xus var. cus

  • Aus, genus
    • bus, species
      • xus, subspecies
        • cus, variety

The fields for these 2 names will be:

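| Name | Canonical | Rank | Authors | Year | Genus | Species | Governing Code |
|------|-----------|------|---------|------|-------|---------|----------------|
| 1    | cus       | var. |         |      | Aus   | bus     | ICBN           |
| 2    | cus       | var. |         |      | Aus   | bus     | ICBN           |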

So according to these fields, the names will match even though the direct parents of the 2 'cus' names are different, which is correct.
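The calculated Genus and Species fields for this example can be sketched as follows. The functions `genus_species` and `match_key` and the lineage representation (a list of name and rank pairs, root first) are hypothetical, chosen only to illustrate the idea; the real fields are derived from parent concepts in the database.

```python
def genus_species(parents):
    """Return (genus, species) found among the parent (name, rank) pairs."""
    by_rank = {rank: name for name, rank in parents}
    return by_rank.get("genus", ""), by_rank.get("species", "")


def match_key(lineage, authors="", year="", code="ICBN"):
    """Build the tuple of fields that defines a distinct name.

    `lineage` lists (name, rank) pairs from the root down to the name itself.
    Intermediate parents other than genus and species are ignored.
    """
    canonical, rank = lineage[-1]
    genus, species = genus_species(lineage[:-1])
    return (canonical, rank, authors, year, genus, species, code)


# Name 1: Aus bus var. cus
name1 = [("Aus", "genus"), ("bus", "species"), ("cus", "variety")]
# Name 2: Aus bus xus var. cus
name2 = [("Aus", "genus"), ("bus", "species"),
         ("xus", "subspecies"), ("cus", "variety")]

key1 = match_key(name1)
key2 = match_key(name2)
```

Because the subspecies `xus` never enters the key, `key1` and `key2` are equal, and the two 'cus' names are treated as the same name.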

Another example:

Name 1: Lecanorales Nannf., order

  • Ascomycetes, class
    • Lecanorales Nannf., order

Name 2: Lecanorales, order

  • Ascomycetes, class
    • Lecanoromycetidae, subclass
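      • Lecanorales, order

| Name | Canonical   | Rank  | Authors | Year | Genus | Species | Governing Code |
|------|-------------|-------|---------|------|-------|---------|----------------|
| 1    | Lecanorales | order | Nannf.  |      |       |         | ICBN           |
| 2    | Lecanorales | order |         |      |       |         | ICBN           |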

Again, these names will match even though their parent names are defined differently, which is correct.
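The "most complete names first" ordering can be sketched with this example. The matching rule used here is an assumption, since the page does not spell it out: a less detailed name is taken to match a record when every field it does populate agrees with that record. The function names are hypothetical.

```python
def completeness(key):
    """Count the populated fields in a key tuple."""
    return sum(1 for value in key if value)


def is_compatible(sparse, detailed):
    """True when every populated field of `sparse` equals that field of `detailed`."""
    return all(s == d for s, d in zip(sparse, detailed) if s)


def integrate(keys):
    """Process keys from most to least complete; return (backbone, links)."""
    backbone, links = [], {}
    for key in sorted(keys, key=completeness, reverse=True):
        matches = [record for record in backbone if is_compatible(key, record)]
        if matches:
            links[key] = matches  # may match several more detailed records
        else:
            backbone.append(key)
    return backbone, links


# Fields: canonical, rank, authors, year, genus, species, governing code.
with_author = ("Lecanorales", "order", "Nannf.", "", "", "", "ICBN")
without_author = ("Lecanorales", "order", "", "", "", "", "ICBN")

backbone, links = integrate([without_author, with_author])
```

The name with the author enters the backbone first, and the name without an author is then linked to it rather than added as a new consensus name.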

 

Generating Consensus Records

...