Language and Translation Facets

The following documentation explains how each record populates the Language and Translations facets.


Language Codes

The language facets display labels for language codes taken from the Codes for the Representation of Names of Languages (ISO 639-2 Bibliographic). Only the three-letter codes from this list are valid for the language facets.

  • No linguistic content or None means the record does not contain any language content. The record uses the language code ‘zxx’. Pika will use ‘zxx’ as the grouping language.

Screenshot highlighting no linguistic content display in the catalog
  • Undetermined means the language of the record hasn’t been determined. The record uses the language code ‘und’ (Undetermined). Pika will use ‘und’ as the grouping language.

  • Unknown is used for unrecognized language codes. A few sites translate these unrecognized language codes to Other.
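The label rules above can be sketched as a simple lookup. This is a minimal illustration, not Pika's actual translation map: the `LANGUAGE_LABELS` table is a hypothetical excerpt, and the `language_label` function and its `unknown_label` parameter are names made up for this example.

```python
# Hypothetical excerpt of an ISO 639-2 Bibliographic code-to-label table.
# A real table would cover the full code list.
LANGUAGE_LABELS = {
    "eng": "English",
    "fre": "French",
    "ger": "German",
    "spa": "Spanish",
    "zxx": "No linguistic content",
    "und": "Undetermined",
}


def language_label(code: str, unknown_label: str = "Unknown") -> str:
    """Return the facet label for a three-letter code.

    Anything that is not a recognized three-letter code gets the fallback
    label. Sites that prefer 'Other' can pass unknown_label="Other".
    """
    normalized = code.strip().lower()
    if len(normalized) != 3:
        return unknown_label
    return LANGUAGE_LABELS.get(normalized, unknown_label)
```

For example, `language_label("zxx")` gives the "No linguistic content" label, while an unrecognized value such as `"n/a"` falls through to the fallback label.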


Primary Language

Pika attempts to determine the primary language of every record. The primary language is used as a sorting factor when displaying the editions table in search results and on grouped work pages.

The primary language for a record is derived from the MARC tag 008 – Fixed-Length Data Elements-General Information. Members should have the MARC tag 008 or the Sierra language fixed field populated.

On Sierra systems, when the 008 is missing, contains an invalid code, has the code in the wrong position, or contains a code that translates to ‘Unknown’, we use the Sierra language fixed field code, provided it is a translatable code.

If neither of those fields supplies a usable code, the first three-letter code present in 041a is used.
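The fallback order described above can be sketched as follows. This is an illustration under stated assumptions rather than Pika's implementation: `VALID_CODES` is a hypothetical subset of the code list, the function names are invented for the example, and returning `None` when nothing usable is found is an assumption (the source does not say what happens in that case).

```python
from typing import Optional, Sequence

# Hypothetical subset of valid three-letter codes, for illustration only.
VALID_CODES = {"eng", "fre", "ger", "spa", "und", "zxx"}


def is_translatable(code: Optional[str]) -> bool:
    """True when the value is a recognized three-letter language code."""
    return code is not None and code.strip().lower() in VALID_CODES


def primary_language(
    code_008: Optional[str],
    sierra_fixed_field: Optional[str],
    codes_041a: Sequence[str],
) -> Optional[str]:
    """Pick the primary language: 008 first, then Sierra fixed field, then 041a."""
    if is_translatable(code_008):
        return code_008.strip().lower()
    if is_translatable(sierra_fixed_field):
        return sierra_fixed_field.strip().lower()
    for code in codes_041a:
        if is_translatable(code):
            return code.strip().lower()
    return None  # Assumption: the source does not describe this case.
```

With these assumptions, `primary_language("|||", "fre", ["ger"])` falls back to the fixed field value, and a valid 008 code always wins over the other two sources.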

008

In Sierra, the 008 field includes a language element with a dropdown menu for picking a language.

If the record contains the 008 field but is not grouping correctly, check for the following errors.

  • Language Code Typo

  • Position of Language Code Error

    • This happens when the 008 is incorrect somewhere before the language code, for example because one or more characters are missing or there are too many characters.

  • N/A, not a valid code

    • This happens when someone who does not know the language code enters n/a. In this case, use the code ‘und’ for Undetermined instead.
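The three errors above can be reproduced with a short sketch that reads the language code from positions 35-37 of the 008. The sample 008 strings are made up for the example, and `VALID_CODES` and `language_from_008` are hypothetical names, not part of Pika.

```python
from typing import Optional, Set

# Hypothetical subset of valid three-letter codes, for illustration only.
VALID_CODES: Set[str] = {"eng", "fre", "ger", "spa", "und", "zxx"}


def language_from_008(field_008: str, valid_codes: Set[str] = VALID_CODES) -> Optional[str]:
    """Return the code at 008/35-37 if it is valid, otherwise None."""
    code = field_008[35:38].lower()
    return code if code in valid_codes else None


# A made-up, well-formed 40-character 008 with 'eng' at positions 35-37.
good_008 = "210101" + "s" + "2020" + "    " + "nyu" + " " * 17 + "eng" + " " + "d"

# Position error: one character is missing before the language code,
# so positions 35-37 now hold 'ng ' instead of 'eng'.
short_008 = good_008[:10] + good_008[11:]

# Typo: the letters are transposed.
typo_008 = good_008[:35] + "egn" + good_008[38:]

# 'n/a' entered in place of a code.
na_008 = good_008[:35] + "n/a" + good_008[38:]
```

Only `good_008` yields a code here. The other three return `None`, which is the situation in which the fallback sources are consulted.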

 

 



Sierra language fixed field

For Sierra systems, when a record has no language code in the 008, has a placeholder such as “|||” (three pipe characters), or otherwise has an invalid code, we look at the Sierra language fixed field.


The Sierra language fixed field has a dropdown menu to choose a language. 

The fixed field will not override a valid 008-language code for the primary language.


041a

For all systems, if the 008 is unrecognized (and the Sierra language fixed field is not present or unrecognized), we will use the first valid 3-letter code in subfield a of the first 041 (with an indicator of 0). Subfield 041a is the Language Code of text/sound track or separate title.

  • If the 008 and the Sierra language fixed field are missing or invalid, we will use the 041a for the primary language.


Language facet

The Language facet is populated with the primary language plus the valid language codes found in subfields a, b, and d of 041 tags that have a first indicator of 0.

  • The 041 tags with a second indicator of 7 are ignored. These tags are used to hold non-standard language codes.

  • 041a Language Code of text/sound track or separate title

  • 041b Language Code of summary or abstract

Language(s) are recorded in English alphabetical order. For textual resources, record the language of the summary regardless of whether it is the same as or different from the language recorded in subfield $a.

041 0#$aeng$bfre$bger$bspa [Text is in English with summaries in French, German, and Spanish.]

For music, subfield $b contains the language code(s) of material accompanying sound recordings if the accompanying material contains summaries of the contents of a non-music sound recording or summaries of songs or other vocal works (not translations of the text(s)) contained on a music sound recording.

  • 041d Language Code of sung or spoken text

Language code(s) for the audible portion of an item, usually the sung or spoken content of a sound recording or computer file. The language code in the first occurrence of subfield $d, if there is no subfield $a, may also be recorded in field 008/35-37.

Note: The language code(s) for the textual portion of an item is entered in subfield $a.

041 0#$deng$eeng$efre$eger [Recording in English with accompanying libretto in English, French, and German.]

  • 041 subfields occasionally hold multiple language codes in a single subfield for multi-language titles, including those from sideloaded eContent sources. When a single subfield holds multiple language codes, we process each valid 3-letter code in it.
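The Language facet rules above can be sketched in a few lines. This is a rough illustration, not Pika's code: the dictionary layout for a 041 field, the `language_facet_codes` and `split_codes` names, and the choice to split multi-code subfields with a regular expression are all assumptions made for the example.

```python
import re
from typing import Dict, List, Optional, Set

FACET_SUBFIELDS = {"a", "b", "d"}


def split_codes(value: str, valid_codes: Set[str]) -> List[str]:
    """Split a subfield value into 3-letter chunks and keep the valid ones."""
    return [c for c in re.findall(r"[a-z]{3}", value.lower()) if c in valid_codes]


def language_facet_codes(
    primary: Optional[str],
    fields_041: List[Dict],
    valid_codes: Set[str],
) -> List[str]:
    """Collect facet codes: the primary language plus 041 0# subfields a, b, d."""
    codes: List[str] = [primary] if primary else []
    for field in fields_041:
        # Only first indicator 0; tags with second indicator 7 are ignored.
        if field["ind1"] != "0" or field["ind2"] == "7":
            continue
        for subfield_code, value in field["subfields"]:
            if subfield_code not in FACET_SUBFIELDS:
                continue
            for code in split_codes(value, valid_codes):
                if code not in codes:
                    codes.append(code)
    return codes


# The two 041 examples from the text, in the hypothetical dictionary layout.
summaries = {"ind1": "0", "ind2": " ",
             "subfields": [("a", "eng"), ("b", "fre"), ("b", "ger"), ("b", "spa")]}
libretto = {"ind1": "0", "ind2": " ",
            "subfields": [("d", "eng"), ("e", "eng"), ("e", "fre"), ("e", "ger")]}
```

Under these assumptions the first example contributes English, French, German, and Spanish, while the second contributes only English, because subfield e is not one of the facet subfields.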

Marmot recommends that, if you use the Language facet, you set it to be open by default so that foreign language readers have a better chance of selecting their preferred language.


Translations facet

The Translations facet is populated with languages from subfields b, d, and j of 041 tags that have a first indicator of 1. These appear in the Translation/Subtitles facet.

  • 041j Language code of subtitles: language codes for written languages providing access to moving image materials in the form of subtitles. It does not include the languages of the credits, packaging, or accompanying material. If needed, the language of credits is recorded in field 546 (Language Note) and the language of packaging or accompanying material is recorded in 041 subfield $g (Language code of accompanying material other than librettos and transcripts).

041 1#$aeng$bger$jger [An English language video contains a German language summary on its package and German subtitles.]

The original language code is not populated into the translations facet.
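As a rough sketch of the Translations facet rule, the example below reads subfields b, d, and j from 041 tags with a first indicator of 1. The field layout and the `translation_facet_codes` name are hypothetical, and splitting multi-code subfields the same way as for the Language facet is an assumption.

```python
import re
from typing import Dict, List, Set

TRANSLATION_SUBFIELDS = {"b", "d", "j"}


def translation_facet_codes(fields_041: List[Dict], valid_codes: Set[str]) -> List[str]:
    """Collect valid codes from subfields b, d, j of 041 tags with first indicator 1."""
    codes: List[str] = []
    for field in fields_041:
        if field["ind1"] != "1":
            continue
        for subfield_code, value in field["subfields"]:
            # Other subfields, including the original language, are skipped.
            if subfield_code not in TRANSLATION_SUBFIELDS:
                continue
            for code in re.findall(r"[a-z]{3}", value.lower()):
                if code in valid_codes and code not in codes:
                    codes.append(code)
    return codes


# The 041 example from the text, in the hypothetical dictionary layout.
video = {"ind1": "1", "ind2": " ",
         "subfields": [("a", "eng"), ("b", "ger"), ("j", "ger")]}
```

For the video example, only German ends up in the facet: subfield a is not read for translations, and the duplicate German code is kept once.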
