Grouping Language

This page describes the grouping language, one of the grouping factors in Pika’s grouped work grouping logic.



Language Codes

The grouping language uses language codes derived from the Codes for the Representation of Names of Languages (ISO 639-2 Bibliographic). Only the three-letter codes from this list are valid for the grouping language.

  • No linguistic content (or None) means the record does not contain any language content. The record uses the language code ‘zxx’, and Pika uses ‘zxx’ as the grouping language.

  • Undetermined means the language of the record has not been determined. The record uses the language code ‘und’, and Pika uses ‘und’ as the grouping language.

  • Unknown is used for unrecognized language codes. A few sites translate these unrecognized codes to Other.
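The code handling described above can be sketched roughly as follows. This is a minimal illustration, not Pika's actual implementation: the function name `translate_language_code`, the small `KNOWN_CODES` sample table, and the choice to trim and lowercase the input are all assumptions made for the example.

```python
# Hypothetical sample table. A real table would hold the full
# ISO 639-2 Bibliographic list; only a few entries are shown here.
KNOWN_CODES = {
    "eng": "English",
    "spa": "Spanish",
    "fre": "French",  # bibliographic form of the French code
    "zxx": "No linguistic content",
    "und": "Undetermined",
}


def translate_language_code(code):
    """Return a display name for a three-letter code, or 'Unknown'.

    Trimming and lowercasing the input is an assumption of this sketch.
    """
    normalized = (code or "").strip().lower()
    return KNOWN_CODES.get(normalized, "Unknown")


for sample in ("eng", "zxx", "und", "n/a", "|||"):
    print(repr(sample), "->", translate_language_code(sample))
```

In this sketch, ‘zxx’ and ‘und’ are ordinary entries in the table, so they pass through as valid grouping languages, while anything not in the table falls through to Unknown.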


Grouping Language

The grouping language for a record is derived from MARC tag 008 (Fixed-Length Data Elements: General Information). Members should make sure that either the MARC tag 008 or the Sierra language fixed field is populated.

On Sierra systems, we use the Sierra language fixed field code (if it is a translatable code) when the 008 is missing, contains an invalid code, has the code in the wrong position, or has a code that translates to ‘Unknown’.

If neither of those fields provides a usable code, the first valid three-letter code found in the 041a is used.
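The order of precedence described in this section can be sketched as a simple fallback chain. This is only an illustration under stated assumptions: the function name `choose_grouping_language`, its parameters, and the sample `VALID_CODES` set are hypothetical, and returning `None` when nothing usable is found is a placeholder, since this page does not say what happens in that case.

```python
# Hypothetical sample of recognized codes; a real list would be the
# full set of three-letter bibliographic codes.
VALID_CODES = {"eng", "spa", "fre", "ger", "zxx", "und"}


def choose_grouping_language(code_008, sierra_code, codes_041a):
    """Pick a grouping language using the 008, then the Sierra
    language fixed field, then the 041a codes, in that order.

    sierra_code should be None on non-Sierra systems.
    """
    # A valid 008 code always wins; the Sierra fixed field is only a fallback.
    for candidate in (code_008, sierra_code):
        if candidate in VALID_CODES:
            return candidate
    # Otherwise use the first valid code found in the 041a values.
    for candidate in codes_041a:
        if candidate in VALID_CODES:
            return candidate
    return None  # placeholder: behavior here is not specified in this page


print(choose_grouping_language("|||", "spa", ["fre"]))
print(choose_grouping_language("eng", "spa", []))
```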



008

The Sierra 008 field has a language field with a dropdown menu to pick a language. 

If the record contains the 008 field but is not grouping correctly, please check for the following errors.

  • Language Code Typo

  • Position of Language Code Error

    • This happens when the 008 is incorrect somewhere before the language code, for example when characters are missing or extra characters are present.

  • N/A, not a valid code

    • This happens when someone who does not know the language code enters ‘n/a’. In this case, use the code ‘und’ for Undetermined instead.
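To see why a position error breaks grouping, recall that in MARC 21 the 008 is 40 characters long and the language code occupies character positions 35-37 (counting from 0). The sketch below builds a sample 008 and shows what is read from those positions when a character is missing or added earlier in the field. The function name `language_from_008` and the sample values are illustrative assumptions, not Pika code.

```python
def language_from_008(field_008):
    """Return characters 35-37 (zero-based) of an 008 field,
    or None if the field is missing or too short to contain them."""
    if field_008 is None or len(field_008) < 38:
        return None
    return field_008[35:38]


# Build a 40-character sample 008 piece by piece so the positions are clear.
good_008 = (
    "850423"    # 00-05 date entered on file
    + "s"       # 06    type of date
    + "1985"    # 07-10 date 1
    + "    "    # 11-14 date 2
    + "nyu"     # 15-17 place of publication
    + " " * 17  # 18-34 material-specific positions (left blank here)
    + "eng"     # 35-37 language code
    + " "       # 38    modified record
    + "d"       # 39    cataloging source
)

missing_char = good_008[:20] + good_008[21:]     # one character dropped
extra_char = good_008[:20] + " " + good_008[20:]  # one character added

print(repr(language_from_008(good_008)))      # 'eng'
print(repr(language_from_008(missing_char)))  # 'ng '
print(repr(language_from_008(extra_char)))    # ' en'
```

Neither ‘ng ’ nor ‘ en’ is a valid code, so a record with a shifted 008 would fall back to the other language sources.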


Sierra language fixed field

On Sierra systems, when a record has no language code in the 008, has a placeholder such as “|||” (three pipe characters), or otherwise has an invalid code, we look at the Sierra language fixed field.


The Sierra language fixed field has a dropdown menu to choose a language. 

The fixed field will not override a valid 008-language code for the grouping language.


041a

For all systems, if the 008 code is unrecognized (and the Sierra language fixed field is not present or is unrecognized), we use the first valid three-letter code from subfield a of the first 041 (with an indicator of 0). The 041a is the Language code of text/sound track or separate title.

  • If the 008 and the Sierra language fixed field are missing or invalid, we will use the 041|a for the grouping language.

  • The 041 tags with a second indicator of 7 are ignored.

    • These tags are used to hold non-standard language codes.

  • 041 subfields occasionally have multiple language codes in a single subfield for multi-language titles, including sideloaded eContent sources. We have handling to process each valid 3-letter code within the subfield.
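The 041 rules above can be sketched as follows. This is a rough illustration only: the dictionary layout for a 041 field (`ind2` for the second indicator, `a` for the list of subfield a values), the function names, and the sample `VALID_CODES` set are all hypothetical. The sketch also assumes that multiple codes in one subfield are run together without separators (for example ‘engfre’), and it models only the second-indicator rule, not the indicator-of-0 condition.

```python
# Hypothetical sample of recognized codes.
VALID_CODES = {"eng", "spa", "fre", "ger"}


def split_codes(subfield_value):
    """Split a subfield value into consecutive three-character chunks."""
    text = subfield_value.strip().lower()
    return [text[i:i + 3] for i in range(0, len(text), 3)]


def first_valid_041a(fields_041):
    """Return the first valid code found in any 041 subfield a,
    skipping fields whose second indicator is 7."""
    for field in fields_041:
        if field.get("ind2") == "7":
            continue  # non-standard language codes, ignored
        for value in field.get("a", []):
            for code in split_codes(value):
                if code in VALID_CODES:
                    return code
    return None


fields = [
    {"ind2": "7", "a": ["xyz"]},     # ignored: second indicator 7
    {"ind2": " ", "a": ["engfre"]},  # two codes in a single subfield
]
print(first_valid_041a(fields))
```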
