Release File Naming Convention
Overall Naming Pattern
The basic pattern for SNOMED CT release file names consists of five elements, separated from one another by underscores ("_") and followed by a full stop (".") and a file extension:
[FileType]_[ContentType]_[ContentSubType]_[CountryNamespace]_[VersionDate].[FileExtension]
Each element in the above structure is described in more detail in the tables in the following sections.
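As an illustration of the overall pattern, the following minimal sketch splits a file name into its five elements and the file extension. The class and function names are hypothetical and are not part of the SNOMED CT specification. The sketch checks only the number of elements, not the permitted values of each one.

```python
from typing import NamedTuple


class ReleaseFileName(NamedTuple):
    """The five underscore-separated elements plus the file extension."""
    file_type: str
    content_type: str
    content_sub_type: str
    country_namespace: str
    version_date: str
    file_extension: str


def split_release_file_name(name: str) -> ReleaseFileName:
    """Split a release file name on the last full stop and on underscores."""
    stem, dot, extension = name.rpartition(".")
    if not dot:
        raise ValueError(f"no file extension in {name!r}")
    elements = stem.split("_")
    if len(elements) != 5:
        raise ValueError(
            f"expected 5 underscore-separated elements, got {len(elements)}"
        )
    return ReleaseFileName(*elements, extension)


parts = split_release_file_name("sct2_Description_Snapshot-en_INT_20180131.txt")
print(parts.content_type, parts.version_date)
```

Splitting on the last full stop rather than the first keeps the sketch tolerant of any full stop that might appear earlier in the name, although the pattern above does not provide for one.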
FileType Element
The FileType element of the filename designates the type and intended use of the release file. It consists of a code of 3 to 5 alphanumeric characters, with letters in lowercase.
The code comprises the following three sub-elements. The Type sub-element is required in all cases; the other sub-elements are required where relevant and otherwise omitted.
FileType Element - Sub-elements and Permitted Values
Sub-element
Values
Description
Status
<blank>
General release file
"
x
Provisional release file (e.g. part of an alpha or beta release package).
"
z
Archival or unsupported file
Type
sct
Terminology Data File
"
der
Derivative Work Data File (e.g. Reference set release file)
"
doc
Documentation
"
res
Implementation Resource Data File (e.g. a data file not following a SNOMED CT standard release file format)
"
tls
Implementation Resource Tool (e.g. scripts or other software made available to process a release file)
Format
1
Release Format 1
"
2
Release Format 2
"
<blank>
Not specific to a release version
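The three sub-elements can be separated with a single regular expression. The sketch below assumes they appear in the order Status, Type, Format, as in examples such as "sct2" and "der2". The names FILE_TYPE_RE and parse_file_type are hypothetical.

```python
import re

# Optional Status (x or z), required Type, optional Format (1 or 2).
FILE_TYPE_RE = re.compile(
    r"(?P<status>[xz]?)(?P<type>sct|der|doc|res|tls)(?P<format>[12]?)"
)


def parse_file_type(file_type: str) -> dict:
    """Return the Status, Type and Format sub-elements of a FileType code."""
    match = FILE_TYPE_RE.fullmatch(file_type)
    if match is None:
        raise ValueError(f"not a valid FileType code: {file_type!r}")
    # A blank sub-element is returned as an empty string.
    return match.groupdict()


print(parse_file_type("xder2"))
```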
ContentType Element
The ContentType element is mandatory for all FileTypes. It describes the content and purpose of the file. It consists of 2-48 alphanumeric characters in camel case.
The content of this element depends on the first element (FileType) of the filename, as described below:
ContentType Element - Permitted Values for FileType "sct"
Value
Usage
Concept
The file conforms to the Concept File Specification and contains data related to a set of concepts.
Relationship
The file conforms to the Relationship File Specification and contains relationships that represent the distribution normal form inferred view of a set of concept definitions.
sRefset
The file conforms to the single string reference set format. This applies only to the OWL Expression Reference Set, in which case the ContentType is followed by the content sub-element _OWLExpression. The file contains stated concept definitions represented as OWL axioms and additional OWL ontology information.
Description
The file conforms to the Description File Specification and contains a set of descriptions with description types |Synonym| and |Fully specified name|.
Note that both these description types have a maximum term length of 255 characters.
TextDefinition
The file conforms to the Description File Specification and contains a set of descriptions with description type |Definition|.
Note: This description type has a maximum term length of 4096 characters.
StatedRelationship
The file conforms to the Relationship File Specification and contains relationships that represent the stated view of a set of concept definitions.
Note: It is likely that this file will be phased out and replaced with a reference set containing a richer OWL representation of stated concept definitions.
Identifier
The file conforms to the Identifier File Specification.
Note: This file does not contain any data rows in the International Edition.
ContentType Element - Permitted Values for FileType "der"
Value
Description
Refset
The file conforms to the Simple Reference Set specification and contains the members of one or more simple reference sets.
<pattern>Refset
The file conforms to the Basic Reference Set Member File Format and includes one or more additional columns. The number and order of the columns and their basic data types are specified by the <pattern> which precedes Refset.
The <pattern> consists of a sequence of lowercase letters, each of which represents an additional column with a datatype specified by the letter, as listed below:
Pattern letter c
A SNOMED CT component identifier (SCTID) referring to a concept, description or relationship.
Pattern letter i
A signed integer.
Pattern letter s
A UTF-8 text string.
Examples
cRefset : A refset with one additional column containing a component identifier. This pattern supports refset types including: Attribute Value Reference Set, Language Reference Set and Association Reference Set.
ciRefset : A refset with two additional columns, one containing a component identifier and one containing an integer. This pattern supports refset types including: Ordered Association Reference Set.
sRefset : A refset with one additional column containing a string. This pattern supports refset types including: Simple Map from SNOMED CT Reference Set, Simple Map to SNOMED CT Reference Set and the deprecated Annotation Reference Set.
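The pattern letters can be decoded mechanically. The following sketch maps a "der" ContentType value to the datatypes of its additional columns, using only the three pattern letters listed above. The function name and the wording of the datatype labels are assumptions made for illustration.

```python
from typing import List

# Pattern letters described above; labels are illustrative wording.
PATTERN_LETTERS = {
    "c": "component identifier (SCTID)",
    "i": "signed integer",
    "s": "UTF-8 text string",
}


def additional_columns(content_type: str) -> List[str]:
    """List the datatypes of the additional columns, in column order."""
    suffix = "Refset"
    if not content_type.endswith(suffix):
        raise ValueError(f"{content_type!r} does not end with {suffix!r}")
    pattern = content_type[: -len(suffix)]
    unknown = sorted(set(pattern) - set(PATTERN_LETTERS))
    if unknown:
        raise ValueError(f"unknown pattern letter(s): {unknown}")
    return [PATTERN_LETTERS[letter] for letter in pattern]


print(additional_columns("ciRefset"))
```

A plain "Refset" value yields an empty list, which corresponds to a simple reference set with no additional columns.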
ContentType Element - Permitted Values for FileTypes "doc","res" and "tls"
FileType
Value and Description
doc
The title of the document in CamelCase, abridged if necessary to fit within the length constraint.
Note: Abbreviations should not be used unless they are essential to fit the title within the available length.
Examples of ContentType for Documents
doc_SnomedDecisionSupport_Current-en-US_INT_20170331.pdf (Title: Decision Support with SNOMED CT)
doc_SearchDataEntryGuide_Current-en-US_INT_20171122 (Title: SNOMED CT Search and Data Entry Guide)
res
tls
The value of the ContentType element may be determined on a case-by-case basis but, in conjunction with the ContentSubType element, should be adequate to identify the content and purpose of the file.
ContentSubType Element
The ContentSubType element is mandatory for all FileTypes. It provides additional information to describe the content and purpose of the file, including the language/dialect, where appropriate. Its format is 2-48 alphanumeric characters in camel case (except for the capitalization rules specified below for LanguageCode). Hyphen ("-") is a permitted character in conjunction with a language code, as described below.
ContentSubType Element - Sub-elements and Permitted Values for FileTypes "sct" and "der"
Sub-elements
Values
Description
Summary
An optional short camel case summary of the usage of the file. The value of this sub-element may be determined on a case-by-case basis but, in conjunction with the ContentType element, should be adequate to identify the content and purpose of the file.
Examples:
For reference sets, a brief indication of the type or purpose of the reference set(s) in the file.
Note: If there is a Summary, the ReleaseType or DocStatus follows this Summary sub-element immediately without a space or other separator.
ReleaseType
Full
The file contains the Full view of the components or refset members within its scope (i.e. every version ever released).
"
Snapshot
The file contains the Snapshot view of the components or refset members within its scope (i.e. only the most recent version released).
"
Delta
The file contains the Delta view of the components or refset members within its scope (i.e. only additions/changes since the previous release).
LanguageCode
Where it is necessary to specify the language or dialect used in a file, the appropriate language code must be included as the final sub-element of the ContentSubType. If a Summary or DocStatus sub-element is also included, the LanguageCode must be added after the last of those sub-elements and must be separated from it by a hyphen.
Representation of the LanguageCode
The language is specified with a 2-character ISO 639-1 language code (e.g. es = Spanish, fr = French, da = Danish). If necessary, a dialect code is added after the language code and separated from it by a hyphen.
Depending on the specificity required, the dialect code comes from one of two sources:
If the dialect is general to an entire country, the two-letter ISO-3166 alpha-2 country code is used to specify the dialect (e.g. en-US = US English, en-GB = British English).
If the dialect is less common or not country specific, the IANA language subtag should be used. Note that this code consists of a string of lowercase letters. IANA is the Internet Assigned Numbers Authority.
This approach follows Internet conventions.
ContentSubType Element - Sub-elements and Permitted Values for FileType "doc"
Sub-elements
Values
Description
Summary
An optional short camel case addition to the ContentType title.
If there is a Summary, the DocStatus follows this Summary sub-element immediately without a space or other separator.
DocStatus
Current
The document is up-to-date and complete for the current release of SNOMED CT, as indicated by the VersionDate element.
"
Draft
The document is a draft version; it may be incomplete and has not been approved in a final version.
"
Review
The document has been released for review and comments from SNOMED International Members, Affiliates and other stakeholders.
LanguageCode
Where it is necessary to specify the language or dialect used in a file, the appropriate language code must be included as the final sub-element of the ContentSubType. If a Summary or DocStatus sub-element is also included, the LanguageCode must be added after the last of those sub-elements and must be separated from it by a hyphen.
ContentSubType Element - Sub-elements and Permitted Values for FileTypes "res" and "tls"
Sub-elements
Values and Description
Summary
The value of this sub-element may be determined on a case-by-case basis but, in conjunction with the ContentType element, should be adequate to identify the content and purpose of the file.
LanguageCode
If it is necessary to specify the language or dialect used in a resource data file or tool, the appropriate language code must be included as the final sub-element of the ContentSubType. If a Summary sub-element is also included, the LanguageCode must be added after the Summary sub-element and must be separated from it by a hyphen.
Examples of ContentSubType
der2_cRefset_AttributeValueSnapshot_INT_20180131.txt
Summary=AttributeValue (type of refset),
Release type=Snapshot,
Language not stated
sct2_Description_Snapshot-en_INT_20180131.txt
Release type=Snapshot,
Language=English
der2_cRefset_LanguageSnapshot-en_INT_20180131.txt
Summary=Language (type of refset),
Release type=Snapshot,
Language=English
doc_IhtsdoGlossary_Current-en-US_INT_20170817.pdf
DocStatus=Current,
Language=en-US.
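For "sct" and "der" files, the ContentSubType rules can be expressed as one regular expression: an optional Summary, a ReleaseType, and an optional hyphen-separated LanguageCode. This is a sketch under the assumption that the LanguageCode is two lowercase letters optionally followed by a dialect of either two uppercase letters or a lowercase subtag, as described above. The names used are hypothetical.

```python
import re
from typing import Dict, Optional

CONTENT_SUB_TYPE_RE = re.compile(
    r"(?P<summary>[A-Za-z0-9]*?)"            # optional camel case Summary
    r"(?P<release_type>Full|Snapshot|Delta)"  # ReleaseType follows immediately
    r"(?:-(?P<language_code>[a-z]{2}(?:-(?:[A-Z]{2}|[a-z]+))?))?"
)


def parse_content_sub_type(value: str) -> Dict[str, Optional[str]]:
    """Split a ContentSubType into Summary, ReleaseType and LanguageCode."""
    match = CONTENT_SUB_TYPE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid sct/der ContentSubType: {value!r}")
    # An absent Summary is "", an absent LanguageCode is None.
    return match.groupdict()


for example in ("AttributeValueSnapshot", "Snapshot-en", "LanguageSnapshot-en"):
    print(example, parse_content_sub_type(example))
```

The Summary group is matched lazily so that the ReleaseType is taken from the end of the camel case string, which matters if a Summary happens to begin with a word such as "Full".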
CountryNamespace Element
The CountryNamespace element is mandatory for all FileTypes. It identifies the organization responsible for developing and maintaining the file. It is a string of 2 to 10 alphanumeric characters consisting of the two sub-elements described below. At least one of these two sub-elements must be present. SNOMED International or a National Release Center (NRC) may optionally include both sub-elements where they consider this to be appropriate.
CountryNamespace Element - Sub-elements and Permitted Values
Sub-element
Values
Description
CountryCode
INT
The file is maintained and distributed by SNOMED International.
"
AA to ZZ
The file is maintained and distributed by the NRC for the country represented by this ISO-3166 alpha-2 country code. The code consists of exactly two uppercase characters from the Latin alphabet.
"
<blank>
The file is maintained and released by a SNOMED CT extension provider that is not an NRC.
NamespaceId
0000000 to 9999999
The file is maintained and released by a SNOMED CT extension provider that is not an NRC. In this case, the value is the 7-digit namespace identifier allocated to that organization by SNOMED International.
The file is maintained and distributed by either SNOMED International or an NRC and the distributing organization has chosen to include the namespace identifier to indicate that this is part of a release restricted to content in a single namespace.
"
<blank>
The file is maintained and distributed by either SNOMED International or an NRC, and the distributing organization has chosen not to include the namespace identifier.
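The CountryNamespace rules above (an optional CountryCode, an optional 7-digit NamespaceId, and at least one of the two present) can be checked as follows. This is a minimal sketch; the function name is hypothetical and the namespace identifier in the usage line is an illustrative value only.

```python
import re
from typing import Dict, Optional

# CountryCode is "INT" or two uppercase Latin letters; NamespaceId is 7 digits.
COUNTRY_NAMESPACE_RE = re.compile(
    r"(?P<country_code>INT|[A-Z]{2})?(?P<namespace_id>[0-9]{7})?"
)


def parse_country_namespace(value: str) -> Dict[str, Optional[str]]:
    """Split a CountryNamespace element into CountryCode and NamespaceId."""
    match = COUNTRY_NAMESPACE_RE.fullmatch(value)
    # The empty string matches the pattern, so reject it explicitly:
    # at least one of the two sub-elements must be present.
    if match is None or not value:
        raise ValueError(f"not a valid CountryNamespace: {value!r}")
    return match.groupdict()


print(parse_country_namespace("INT"))
print(parse_country_namespace("US1000124"))  # illustrative namespace value
```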
VersionDate Element
The VersionDate element is mandatory for all FileTypes. It identifies the SNOMED CT version with which the file is intended to be used. Its format is an 8-digit number in the pattern "YYYYMMDD", in compliance with the ISO-8601 standard.
For Data Files (sct, der or res), and for Documentation (doc) with a status tag value of "Current", the value of this element should always be the same as the SNOMED CT version date with which the file is associated.
For other file types, the VersionDate element will identify the (past) date of the SNOMED CT release for which the file was intended. A file distributed with a past version date has not been updated to reflect changes to SNOMED CT since that date, nor has it been validated as correct or appropriate for current use.
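A VersionDate can be checked against the "YYYYMMDD" pattern with the Python standard library. The sketch below uses a hypothetical function name and checks only that the value is a real calendar date in that format, not that it corresponds to an actual SNOMED CT release.

```python
from datetime import date, datetime


def parse_version_date(value: str) -> date:
    """Parse an 8-digit YYYYMMDD VersionDate into a date object."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        raise ValueError(f"VersionDate must be 8 digits: {value!r}")
    # strptime raises ValueError for impossible dates such as month 13.
    return datetime.strptime(value, "%Y%m%d").date()


print(parse_version_date("20180131").isoformat())
```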
File Extension
The extension element of the filename identifies the file format (encoding convention) of the file, such as "txt", "pdf" or "zip". It has a format of 1-4 alphanumeric characters.
File Extensions Applicable to Different FileTypes
FileType
Values
Description
sct or der
txt
All RF2 formatted release files are distributed as plain text UTF-8 files with the .txt suffix.
doc
pdf
Portable Document Format is the default format for documents distributed and made available for download in a format suitable for local viewing or printing.
"
<other>
Other document formats, including plain text (.txt) and HTML (.html), may be used where deemed appropriate. In all cases the file extension (suffix) used should be that of a widely recognized format. Unless there are exceptional requirements, the format should be accessible using freely available software.
res
txt
Most resources should be provided as plain text UTF-8 files with the .txt suffix.
"
zip
Where appropriate, a resource file, or a collection of such files, may be distributed as a zip archive.
"
<other>
Other data formats may be used where appropriate.
tls
<any>
No specific statements are made about the file extensions to be used for tooling files. However, in general such tools should be provided in a format that does not compromise system security. In most cases, tools should be provided through an interface such as GitHub and should not be included as part of general releases of the terminology.