Language and Translation Facets
The following documentation explains how each record populates the Language and Translations facets.
Language Codes
The language facets use language labels for language codes derived from the Codes for the Representation of Names of Languages (ISO 639-2 Bibliographic). Only the three-letter codes from this list are valid for the language facets.
No linguistic content or None means the record does not contain any language content. The record uses the language code ‘zxx’. Pika will use ‘zxx’ as the grouping language.
Undetermined means the language of the record hasn’t been determined. The record uses a language code ‘und’ (Undetermined). Pika will use ‘und’ as the grouping language.
Unknown is used for unrecognized language codes. A few sites translate these unrecognized language codes to Other.
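The label rules above can be sketched as a simple lookup. This is a minimal illustration, not Pika's actual code: the LANGUAGE_LABELS table is a small hypothetical subset of the ISO 639-2 Bibliographic list, and the function name and unknown_label parameter are assumptions made for the example.

```python
# Hypothetical subset of the ISO 639-2/B table; a real table would be much longer.
LANGUAGE_LABELS = {
    "eng": "English",
    "fre": "French",
    "ger": "German",
    "spa": "Spanish",
    "zxx": "No linguistic content",
    "und": "Undetermined",
}


def language_label(code, unknown_label="Unknown"):
    """Return the facet label for a three-letter code.

    Codes missing from the table fall back to unknown_label, which a site
    could set to "Other" instead of "Unknown".
    """
    return LANGUAGE_LABELS.get(code, unknown_label)


print(language_label("zxx"))
print(language_label("n/a"))
print(language_label("n/a", unknown_label="Other"))
```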
Primary Language
Pika attempts to determine the primary language of every record. The primary language is used as a sorting factor when displaying the editions table in search results and on grouped work pages.
The primary language for a record is derived from the language code (positions 35-37) of the MARC tag 008 – Fixed-Length Data Elements-General Information. Members should have the MARC tag 008 or the Sierra language fixed field populated.
When the 008 is missing, contains an invalid code, has the code in the wrong position, or contains a code that translates to ‘Unknown’, we use the Sierra language fixed field code on Sierra systems, provided that code is translatable.
If neither of those fields yields a usable code, the first three-letter code present in 041a is used.
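The fallback order described above can be sketched as follows. This is an illustration under stated assumptions, not Pika's implementation: VALID_CODES is a hypothetical stand-in for the full code table, the function names are invented for the example, and each 041a value is assumed to hold a single code.

```python
# Hypothetical stand-in for the full list of recognized ISO 639-2/B codes.
VALID_CODES = {"eng", "fre", "ger", "spa", "zxx", "und"}


def code_from_008(field_008):
    """Return the code at 008 positions 35-37 if it is recognized, else None."""
    if not field_008 or len(field_008) < 38:
        return None
    code = field_008[35:38]
    return code if code in VALID_CODES else None


def primary_language(field_008, sierra_language=None, subfields_041a=()):
    """Apply the fallback order: 008, then Sierra fixed field, then 041a."""
    code = code_from_008(field_008)
    if code:
        return code
    if sierra_language in VALID_CODES:
        return sierra_language
    for value in subfields_041a:
        if value in VALID_CODES:
            return value
    return None


# Sample 008 values built so the language code lands at positions 35-37.
good_008 = "850423s1985    nyu".ljust(35) + "eng d"
pipes_008 = "850423s1985    nyu".ljust(35) + "||| d"

print(primary_language(good_008, "fre", ["ger"]))   # 008 is valid, so it wins
print(primary_language(pipes_008, "fre", ["ger"]))  # falls back to the fixed field
print(primary_language(None, None, ["ger"]))        # falls back to 041a
```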
008
The Sierra 008 field has a language field with a dropdown menu to pick a language.
If the record contains an 008 field but is not grouping correctly, please check for the following errors.
Language Code Typo
Position of Language Code Error
This happens when the 008 is incorrect somewhere before the language code, because of one or more missing or extra characters. The language code then no longer falls in the expected position.
N/A, not a valid code
This happens when someone who does not know the language code enters ‘n/a’. In this case, use the code ‘und’ for Undetermined.
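The three errors above can be checked with a short script. This is a rough sketch, not a Pika tool: it relies on the MARC 21 convention that the 008 is 40 characters long with the language code at positions 35-37, and the function name and valid_codes parameter are hypothetical.

```python
def diagnose_008(field_008, valid_codes):
    """Return a list of likely problems with the language code in an 008 field."""
    problems = []
    if len(field_008) != 40:
        # Missing or extra characters shift the language code out of place.
        problems.append(
            f"008 is {len(field_008)} characters long; expected 40, "
            "so the language code may be out of position"
        )
    code = field_008[35:38]
    if code not in valid_codes:
        # Covers typos such as 'egn' and placeholders such as 'n/a'.
        problems.append(
            f"'{code}' at positions 35-37 is not a recognized language code"
        )
    return problems


valid = {"eng", "und"}
ok_008 = "850423s1985    nyu".ljust(35) + "eng d"
shifted_008 = ok_008[:10] + ok_008[11:]        # one character missing early on
na_008 = ok_008[:35] + "n/a" + ok_008[38:]     # 'n/a' entered as the code

for sample in (ok_008, shifted_008, na_008):
    print(diagnose_008(sample, valid))
```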
Sierra language fixed field
For Sierra systems, when a record has no language code in the 008, has a placeholder like “|||” (three pipe characters), or otherwise has an invalid code, we will look at the Sierra language fixed field.
The Sierra language fixed field has a dropdown menu to choose a language.
The fixed field will not override a valid 008-language code for the primary language.
041a
For all systems, if the 008 language code is unrecognized (and the Sierra language fixed field is not present or is unrecognized), we will use the first valid three-letter code in subfield a of the first 041 that has a first indicator of 0. Subfield 041a is the Language Code of text/sound track or separate title.
In short, if the 008 and the Sierra language fixed field are missing or invalid, we will use the 041|a for the primary language.
Language facet
The language facet is populated with the primary language and with the valid language codes found in subfields a, b, and d of 041 tags that have a first indicator of 0.
The 041 tags with a second indicator of 7 are ignored. These tags are used to hold non-standard language codes.
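The selection rules above can be sketched in a few lines. This is an illustration only: the representation of a 041 field as a tuple of (first indicator, second indicator, list of subfield pairs) is an assumption made for the example, and the function name is hypothetical.

```python
FACET_SUBFIELDS = {"a", "b", "d"}


def facet_language_codes(primary, fields_041, valid_codes):
    """Collect facet codes: the primary language plus valid 041 a/b/d codes."""
    codes = [primary] if primary else []
    for ind1, ind2, subfields in fields_041:
        # Only first indicator 0 is used; second indicator 7 is ignored.
        if ind1 != "0" or ind2 == "7":
            continue
        for sub_code, value in subfields:
            if (
                sub_code in FACET_SUBFIELDS
                and value in valid_codes
                and value not in codes
            ):
                codes.append(value)
    return codes


valid = {"eng", "fre", "ger", "spa", "ita"}
fields = [
    ("0", " ", [("a", "eng"), ("b", "fre"), ("b", "ger"), ("b", "spa")]),
    ("0", "7", [("a", "ita")]),  # skipped: second indicator is 7
]
print(facet_language_codes("eng", fields, valid))
```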
041a Language Code of text/sound track or separate title
041b Language Code of summary or abstract
Language(s) are recorded in English alphabetical order. For textual resources, record the language of the summary regardless of whether it is the same as or different from the language recorded in subfield $a.
041 0#$aeng$bfre$bger$bspa [Text is in English with summaries in French, German, and Spanish.]
For music, subfield $b contains the language code(s) of material accompanying sound recordings if the accompanying material contains summaries of the contents of a non-music sound recording or summaries of songs or other vocal works (not translations of the text(s)) contained on a music sound recording.
041d Language Code of sung or spoken text
Language code(s) for the audible portion of an item, usually the sung or spoken content of a sound recording or computer file. The language code in the first occurrence of subfield $d, if there is no subfield $a, may also be recorded in field 008/35-37.
Note: The language code(s) for the textual portion of an item is entered in subfield $a.
041 0#$deng$eeng$efre$eger [Recording in English with accompanying libretto in English, French, and German.]
041 subfields occasionally have multiple language codes in a single subfield for multi-language titles, including sideloaded eContent sources. We will process subfields with multiple language codes within a single subfield for each valid 3-letter code.
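Handling several codes in one subfield can be sketched as below. This is a minimal example that assumes the codes are concatenated with no separator (for example ‘engfre’); other separators would need extra handling, and the function name and valid_codes parameter are hypothetical.

```python
def split_language_codes(value, valid_codes):
    """Split a subfield value such as 'engfre' into its valid three-letter codes."""
    # Walk the value in three-character steps and keep only recognized codes.
    chunks = [value[i:i + 3] for i in range(0, len(value), 3)]
    return [chunk for chunk in chunks if chunk in valid_codes]


valid = {"eng", "fre", "ger", "spa"}
print(split_language_codes("engfrespa", valid))
print(split_language_codes("eng", valid))
```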
Marmot recommends that sites using the Language facet have it open by default so that foreign language readers have a better chance of selecting their preferred language.