Database Table Naming
Introduction
The International Edition of SNOMED CT contains three release types (Full, Snapshot and Delta), and each of these release types includes 20 files (as of the 2019-07-31 release).
When designing a database to accommodate SNOMED CT release files, decisions need to be made about the names to give to each of the database tables. One option is to give the tables exactly the same names as the release files they represent. However, analysis of the release file naming conventions indicates that these conventions are not directly applicable to table names.
SNOMED CT release file naming conventions include some elements that represent information about the provenance, language and release date of a specific file. This information is useful, and in some cases essential, as a way of distinguishing release files. However, this information is neither essential nor helpful when naming tables that may contain data from different SNOMED CT versions, editions and extensions.
The release file naming conventions do, however, include some essential elements that relate directly to the nature and structure of the data the files contain. The following sections provide a summary of the release file naming conventions, identify the elements in release file names that are relevant to database table naming, and describe a set of rules that can be applied to derive consistent table names from release file names.
Release File Naming
All SNOMED CT release files are named in accordance with the Release File Naming Convention. The naming conventions result in names that can be decomposed into parts, as illustrated by the examples in the table below.
Table 1: Illustrations of the Release File Naming Conventions
| Description of the pattern or file illustrated | Example release file names |
| --- | --- |
| General release file naming pattern | prefix_[refsetPattern]componentType_[refsetType][extensionName]releaseType[-language]_country_releaseDate.txt |
| International edition full release concepts file for 2019-07-31 | sct2_Concept_Full_INT_20190731.txt |
| International edition snapshot release English descriptions file | sct2_Description_Snapshot-en_INT_20190731.txt |
| Spanish extension full release Spanish descriptions file | sct2_Description_SpanishExtensionFull-es_INT_20190430.txt |
| International edition snapshot release extended maps reference set file | der2_iisssccRefset_ExtendedMapSnapshot_INT_20190731.txt |
| Spanish extension full release Spanish language reference set file | der2_cRefset_LanguageSpanishExtensionFull-es_INT_20190430.txt |
| International edition snapshot release English language reference set file | der2_cRefset_LanguageSnapshot-en_INT_20190731.txt |
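The decomposition illustrated in Table 1 can be sketched with a regular expression. This is a minimal illustration based only on the pattern and examples shown above, not an official parser. The function name `parse_release_file_name` and the group name `refsetTypeAndExtension` are assumptions made for this sketch. Because the refsetType and extensionName elements are written without a separator, the sketch leaves them fused in one group; separating them would need a list of known extension names.

```python
import re

# Regex mirroring the naming pattern in Table 1. Group names are labels chosen
# for this sketch; "refsetTypeAndExtension" holds refsetType and extensionName
# together because the file name has no separator between them.
RELEASE_FILE_PATTERN = re.compile(
    r"^(?P<prefix>[a-z]+\d)_"
    r"(?P<refsetPattern>[a-z]*)(?P<componentType>[A-Z][A-Za-z]*)_"
    r"(?P<refsetTypeAndExtension>[A-Za-z]*?)"
    r"(?P<releaseType>Full|Snapshot|Delta)"
    r"(?:-(?P<language>[A-Za-z-]+))?_"
    r"(?P<country>[A-Z0-9]+)_"
    r"(?P<releaseDate>\d{8})\.txt$"
)


def parse_release_file_name(file_name: str) -> dict:
    """Split a release file name into its named elements.

    Optional elements that are absent are returned as None (language)
    or as an empty string (refsetPattern, refsetTypeAndExtension).
    """
    match = RELEASE_FILE_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"Not a recognised release file name: {file_name}")
    return match.groupdict()


if __name__ == "__main__":
    for name in (
        "sct2_Concept_Full_INT_20190731.txt",
        "der2_iisssccRefset_ExtendedMapSnapshot_INT_20190731.txt",
        "der2_cRefset_LanguageSpanishExtensionFull-es_INT_20190430.txt",
    ):
        print(parse_release_file_name(name))
```

For the Spanish language reference set file, the fused group contains `LanguageSpanishExtension`, which shows why the extension name cannot be recognised from the file name alone.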
File Name Element Relevance to Table Names
The table below identifies the elements of the release file naming pattern that are relevant to the naming of the database tables containing content from those files. It also outlines the reasons why some elements that form an important part of the release file names can or should be omitted from the relevant database table names.
Table 2: Relevance of File Name Pattern Elements to Database Table Names
| Element | Relevant to table name | Rationale |
| --- | --- | --- |
| prefix | No | The prefix sct2 or der2 distinguishes components from derivatives (refsets). This information is present in the componentType and refsetType. |
| refsetPattern | No | This information relates to the datatypes of additional columns in the file and the table. The table structure includes the required columns, so there is no reason to include this in the table name. |
| componentType | Yes | This is essential as it indicates either the type of components represented in the table or that this is a reference set. |
| refsetType | Yes | This is essential to distinguish the tables representing different reference set types (and is not present in other file names). |
| extensionName | No | This is not required as data from extension files should be included in the same tables as the equivalent data from the international release. Individual records maintained in extensions can be distinguished by moduleId. |
| releaseType | Yes | This is essential if importing data from both the full and snapshot release. However, since this is a fundamental grouping, it is probably sensible for this to be a prefix to the table name. Otherwise, with long table names, this key distinction may be easy to miss. A short prefix denoting the release type, with a convention that also allows database views to be named in a similarly consistent manner, is recommended. |
| language | No | This is not applicable to the description table name. All descriptions should be accommodated in a single table, with the languageCode column indicating the language of the associated term. Similarly, it is not applicable to a language reference set table name. All language reference sets should be accommodated in a single table, with the refsetId column indicating the language and dialect of each language preference. |
| country | No | This is not required in the table name as the country or other point of origin of the components and reference set members is indicated by the moduleId. |
| releaseDate | No | This is not required as data from many releases is included in the full release file tables. In the case of the snapshot, it would be possible to include the date of the snapshot in the table name. However, this is not recommended because, as noted in Release Type Options, multiple sets of tables representing different snapshot releases multiply the storage capacity required. |
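The reasoning for omitting language and extensionName from table names can be illustrated with a small in-memory SQLite sketch: rows from different source files sit in one table and remain distinguishable by the languageCode and moduleId columns. The table name `snap_description`, the reduced column set, and every value in the rows are assumptions made for illustration. In particular, the identifiers are placeholders, not real SNOMED CT identifiers.

```python
import sqlite3


def build_description_table() -> sqlite3.Connection:
    """Create one description table holding rows from two hypothetical sources."""
    conn = sqlite3.connect(":memory:")
    # Reduced column set for illustration only; a real description table
    # would carry all columns of the description release file.
    conn.execute(
        """
        CREATE TABLE snap_description (
            id TEXT PRIMARY KEY,
            moduleId TEXT NOT NULL,
            languageCode TEXT NOT NULL,
            term TEXT NOT NULL
        )
        """
    )
    # Placeholder rows: one as if loaded from an English international file,
    # one as if loaded from a Spanish extension file.
    rows = [
        ("placeholder-1", "international-module", "en", "Example term"),
        ("placeholder-2", "spanish-extension-module", "es", "Término de ejemplo"),
    ]
    conn.executemany("INSERT INTO snap_description VALUES (?, ?, ?, ?)", rows)
    return conn


def count_by_language(conn: sqlite3.Connection) -> list:
    """Count rows per language within the single shared table."""
    cursor = conn.execute(
        "SELECT languageCode, COUNT(*) FROM snap_description "
        "GROUP BY languageCode ORDER BY languageCode"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    connection = build_description_table()
    print(count_by_language(connection))
```

Filtering on moduleId in the same way separates extension content from international content without needing a separate table per extension.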
Deriving Table Names from Release File Names
The analysis in Table 2 identifies three elements in the release file name that are relevant to table names. There are various ways in which table names could be derived by combining these elements, and one of these is shown in Table 3. The end result (shown in Table 4) is a set of table names that:
- Are as short as possible while clearly identifying:
  - The release type from which they are derived
  - The component or reference set type specification to which they conform
- Are not specific to a particular SNOMED CT release or edition.
Note
The rules shown here are those applied to the example SNOMED CT database. Alternative table naming patterns may be preferred by those developing their own SNOMED CT database. However, it is important to ensure that the table naming pattern is consistently applicable to all release files. Furthermore, it should also be readily applicable to any additional reference set types that may be added to future releases of the International Edition (or included in other SNOMED CT editions and/or extensions).
Table 3: Rules Applied to Release File Names to Generate Table Name for the Example Database
| Rule | Resulting name pattern |
| --- | --- |
| Start with file name pattern | prefix_[refsetPattern]componentType_[refsetType][extensionName]releaseType[-language]_country_releaseDate.txt |
| Remove elements that are not required | componentType[_refsetType]releaseType |
| Make release type the prefix | releaseType_componentType[_refsetType] |
| Abbreviate the prefix to 4 characters (full or snap) | rtyp_componentType[_refsetType] |
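The rules in Table 3 can be applied mechanically. The sketch below is one possible reading of those rules, not the code used to build the example database. The function name `derive_table_name` and the `extension_names` parameter are assumptions for this sketch. The sketch keeps the letter case found in the file name, and the example database may use a different case convention. Delta files are rejected because the rules only define the prefixes full and snap.

```python
import re

# Rule 4: abbreviate the release type to a 4-character prefix.
RELEASE_TYPE_PREFIX = {"Full": "full", "Snapshot": "snap"}

# Captures only the three elements that Table 2 marks as relevant.
_FILE_NAME = re.compile(
    r"^[a-z]+\d_[a-z]*(?P<componentType>[A-Z][A-Za-z]*)_"
    r"(?P<refsetType>[A-Za-z]*?)(?P<releaseType>Full|Snapshot)"
    r"(?:-[A-Za-z-]+)?_[A-Z0-9]+_\d{8}\.txt$"
)


def derive_table_name(file_name, extension_names=("SpanishExtension",)):
    """Apply the Table 3 rules to a Full or Snapshot release file name.

    extension_names is a hypothetical list of known extension names; it is
    needed because the extension name is fused to the refset type.
    """
    match = _FILE_NAME.match(file_name)
    if match is None:
        raise ValueError(f"Cannot derive a table name from: {file_name}")
    refset_type = match["refsetType"]
    # Rule 2: drop the extension name if one is present.
    for extension in extension_names:
        if refset_type.endswith(extension):
            refset_type = refset_type[: -len(extension)]
    # Rules 3 and 4: abbreviated release type first, then the other elements.
    parts = [RELEASE_TYPE_PREFIX[match["releaseType"]], match["componentType"]]
    if refset_type:
        parts.append(refset_type)
    return "_".join(parts)


if __name__ == "__main__":
    for name in (
        "sct2_Concept_Full_INT_20190731.txt",
        "der2_cRefset_LanguageSnapshot-en_INT_20190731.txt",
        "der2_cRefset_LanguageSpanishExtensionFull-es_INT_20190430.txt",
    ):
        print(name, "->", derive_table_name(name))
```

Under these assumptions, the international and Spanish extension language reference set files for the same release type map to the same table name, which matches the intent that extension data shares tables with international data.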
Table 4: Results of Mapping Release File Names to Example Database Table Names
sct2_Concept_Full_INT_20190731.txt
sct2_Description_Full-en_INT_20190731.txt
der2_cRefset_AssociationFull_INT_20190731.txt
der2_cRefset_AttributeValueFull_INT_20190731.txt
der2_ciRefset_DescriptionTypeFull_INT_20190731.txt
der2_iisssccRefset_ExtendedMapFull_INT_20190731.txt
der2_cRefset_LanguageFull-en_INT_20190731.txt
der2_ssRefset_ModuleDependencyFull_INT_20190731.txt
der2_cissccRefset_MRCMAttributeDomainFull_INT_20190731.txt
der2_ssccRefset_MRCMAttributeRangeFull_INT_20190731.txt
der2_sssssssRefset_MRCMDomainFull_INT_20190731.txt
der2_cRefset_MRCMModuleScopeFull_INT_20190731.txt
sct2_sRefset_OWLExpressionFull_INT_20190731.txt
der2_cciRefset_RefsetDescriptorFull_INT_20190731.txt
der2_Refset_SimpleFull_INT_20190731.txt
der2_sRefset_SimpleMapFull_INT_20190731.txt
sct2_Relationship_Full_INT_20190731.txt
sct2_StatedRelationship_Full_INT_20190731.txt
sct2_TextDefinition_Full-en_INT_20190731.txt
sct2_Concept_Snapshot_INT_20190731.txt
sct2_Description_Snapshot-en_INT_20190731.txt

... list continues for all Snapshot release files
... list continues for all the snap tables