...
Names can differ in nomenclatural status (e.g. whether the name was validly published under the code; if not, the status is nom. inval.). A name that was not validly published can subsequently be validly published. The details of the name will be the same, but the name now has a different nomenclatural status. At first it seems best to treat these as separate, linked names. However, this may cause issues during integration: for example, if a provider does not supply the nomenclatural status, is the data for the validly published name or the invalid one? It was therefore decided to treat these names as the same name and pool all nomenclatural status values for it; it is then up to the consumer or viewer of that name to determine its status. These name instances therefore resolve to the same name, although two nomenclatural acts led to it.
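The pooling decision described above can be illustrated with a minimal sketch. The record fields (`full_name`, `authors`, `year`, `nom_status`), the function name `pool_statuses`, and the placeholder name "Aus bus" are all hypothetical and are not taken from the actual integration schema; the sketch only assumes that records with identical name details are treated as one name.

```python
def pool_statuses(records):
    """Group provider records by name details and pool their status values."""
    pooled = {}
    for rec in records:
        # Hypothetical identity key: records with the same details are one name.
        key = (rec["full_name"], rec["authors"], rec["year"])
        statuses = pooled.setdefault(key, set())
        # A provider may omit the status; the record still maps to the same name.
        if rec.get("nom_status"):
            statuses.add(rec["nom_status"])
    return pooled


# Illustrative records only ("Aus bus" is a placeholder name).
records = [
    {"full_name": "Aus bus", "authors": "Smith", "year": "1900", "nom_status": "nom. inval."},
    {"full_name": "Aus bus", "authors": "Smith", "year": "1900", "nom_status": "valid"},
    {"full_name": "Aus bus", "authors": "Smith", "year": "1900", "nom_status": None},
]

pooled = pool_statuses(records)
for key, statuses in pooled.items():
    print(key, sorted(statuses))
```

All three records resolve to a single name whose pooled set holds both status values, and the record without a status no longer forces a choice between the valid and the invalid name.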
Integration By SQL
A simpler and much faster approach to building an initial set of integrated names is to use SQL queries.
The idea is based on the fact that most names are distinct (about 98%). It is therefore much more efficient to generate a "backbone" of names from these distinct names in bulk than to iterate through them all, performing a match for each one only to discover there are no matches, and then inserting the name as a new consensus name.
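As a rough sketch of the backbone idea, the following uses Python's built-in `sqlite3` module so that the SQL can be run as-is. The tables `provider_name` and `consensus_name`, their columns, and the sample rows are assumptions made for illustration, and "distinct" is interpreted here simply as a full name that occurs only once among the provider records; the real schema and matching rules may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE provider_name (   -- hypothetical table of names as supplied
    id INTEGER PRIMARY KEY,
    full_name TEXT NOT NULL,
    authors TEXT,
    year TEXT
);
CREATE TABLE consensus_name (  -- hypothetical table of integrated names
    id INTEGER PRIMARY KEY,
    full_name TEXT NOT NULL,
    authors TEXT,
    year TEXT
);
""")

# Illustrative rows only ("Aus ..." are placeholder names).
conn.executemany(
    "INSERT INTO provider_name (full_name, authors, year) VALUES (?, ?, ?)",
    [
        ("Aus bus", "Smith", "1900"),
        ("Aus cus", "Jones", "1950"),
        ("Aus dus", "Brown", "1970"),
        ("Aus dus", None, None),  # less detailed record of an existing name
    ],
)

# One set-based statement inserts every distinct name as a consensus name,
# with no per-name matching step.
conn.execute("""
INSERT INTO consensus_name (full_name, authors, year)
SELECT full_name, authors, year
FROM provider_name
WHERE full_name IN (
    SELECT full_name FROM provider_name
    GROUP BY full_name HAVING COUNT(*) = 1
)
""")

backbone = conn.execute(
    "SELECT full_name FROM consensus_name ORDER BY full_name"
).fetchall()

# Only the names left out of the backbone still need the matching process.
remaining = conn.execute("""
SELECT full_name, authors, year FROM provider_name
WHERE full_name NOT IN (SELECT full_name FROM consensus_name)
ORDER BY id
""").fetchall()

print(backbone)
print(remaining)
```

With these sample rows, the two names that occur once go straight into the backbone, and only the two "Aus dus" records are left for the slower matching step.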
This approach works with the most complete names first, on the theory that a name with less detail will match multiple names with more detail. If a name is integrated with less detail then
Generating Consensus Records
...