Pika Indexing Profiles
This documentation describes how to configure indexing profiles used in Pika.
Table of Contents
- 1 Indexing Profile Basics
- 1.1 Display Name
- 1.2 Source Name
- 1.3 Record URL Component
- 1.4 MARC File Settings
- 1.5 Pika Driver Settings
- 1.6 Format Determination Settings
- 1.7 Record Settings
- 1.8 Item Tag Settings
- 1.8.1 Call Number Settings
- 1.8.2 Item Tag Settings continued
- 1.9 Item Statuses Settings
- 1.10 Non-holdable Settings
- 1.11 Suppression Settings
- 1.12 Translation Maps
- 1.13 Time to Reshelve
- 1.14 Sierra API Item Field Mappings
- 1.15 Related Documentation
Indexing Profile Basics
Indexing Profiles are typically configured by Marmot and rarely need to be changed. Only Pika OPAC Admins will have access to indexing profiles. All Discovery Partners will have at least one person who has access to Indexing Profiles.
Marmot recommends that any changes to indexing profiles are performed by Marmot staff with input from libraries on their needs.
The Indexing Profiles are located under the Indexing Information menu.
Click the Add New Indexing Profile button, or search for an existing profile with the search bar. The option to Add New Indexing Profile should be limited to use by Marmot staff.
Display Name
The Display Name displays as the eContent Collection source and facet label (for profiles that describe eContent). The Display Name can be edited as needed.
Source Name
Source Name is the internal value for the indexing profile. Any edits to the Source Name will require regrouping and reindexing of the collection. The Source Name displays in the grouped work staff information in the Bib Id column of the item_details section.
Record URL Component
The Record URL Component controls what displays in the URL. Each collection needs a unique Record URL Component.
Here is an example of the Record URL Component for the Creativebug eContent sideload.
Group unchanged files forces the indexing to regroup records even if no changes have been made to them. The default behavior is to skip grouping/regrouping files that have not changed, in order to save resources. This option is on by default, particularly for new sideloaded collections, but is typically toggled off once the sideload configuration is complete.
MARC File Settings
MARC Path is the directory where the files live on the Pika server. This directory will be provided by Marmot and should not be changed without letting Marmot know. All files in this directory and any subdirectories within the path will be processed.
Filenames to Include is a regular expression that controls which filenames are included in the indexing. It allows us to include very specific filenames. The pictured regular expression allows for anything ending in .marc or .mrc (the a? means the letter a is optional). It is not case-sensitive.
MARC Encoding has a dropdown menu. This is the expected encoding of the original file(s) found in the MARC Path directory. When record grouping splits everything out into individual MARC records, each record is converted to UTF8 for consistency. UTF8 is the preferred encoding method for MARC files.
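As a rough illustration of how a pattern of this kind behaves, the sketch below uses Python's re module. The pattern string, the function name, and the sample filenames are assumptions made for this example rather than values taken from a real profile, and Pika's own matching may differ in detail.

```python
import re

# Hypothetical pattern: any name ending in ".marc" or ".mrc" (the "a" is optional).
FILENAMES_TO_INCLUDE = r".*\.ma?rc"


def is_included(filename: str) -> bool:
    """Return True when the whole filename matches the pattern, ignoring case."""
    match = re.fullmatch(FILENAMES_TO_INCLUDE, filename, flags=re.IGNORECASE)
    return match is not None


# Made-up filenames, only to show which ones the pattern accepts.
for name in ["creativebug_full.mrc", "UPDATES.MARC", "notes.txt"]:
    print(name, is_included(name))
```

Because the whole filename has to match, a file such as export.mrc.bak would be left out under this assumed pattern.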
Minimum size of the MARC full export file to process
Individual Record Files settings
A full MARC export is usually a large file. Record grouping breaks the larger file into individual files based on the ID for faster loading within Pika.
The Individual MARC Path is the directory where those individual files are stored. Marmot will set up these directories. Part of sideloading is setting up directories and copying files to the right places.
Number of characters to create folder from and the Create Folder From Leading Characters settings determine the names of sub-folders created within the Individual MARC Path. There is a performance issue with having a very large number of files stored in a single folder. The use of subfolders helps mitigate performance issues.
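The effect of these two settings can be pictured with a minimal sketch. It covers only the case where sub-folders are named from the leading characters of the record ID; the base directory, the record ID, the .mrc file naming, and the function name are all hypothetical.

```python
from pathlib import PurePosixPath


def individual_marc_path(base_dir: str, record_id: str, num_chars: int) -> PurePosixPath:
    """Build a file path whose sub-folder is named after the leading characters of the ID."""
    subfolder = record_id[:num_chars]
    return PurePosixPath(base_dir) / subfolder / f"{record_id}.mrc"


# Hypothetical directory and record ID, with 4 leading characters used for the folder name.
print(individual_marc_path("/data/example/individual_marcs", "b12345678", 4))
```

With four leading characters, every record whose ID begins with b123 lands in the same sub-folder, which keeps any single folder from growing too large.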
Pika Driver Settings
Grouping Class controls how grouping is done for this collection. For example, Sideloaded eContent or Hoopla have their own Grouping Class. It is mainly used to determine the grouping category of a record.
Indexing Class controls how indexing is handled for the collection.
Record Driver controls how holdings display in Pika.
Cover Source is the method that determines where the indexing profile should fetch cover art from. Some sideloads might have unique options.
Patron Driver is used to connect the indexing profile to a circulation system for the given ILS.
Format Determination Settings
The Determine Format based on setting controls whether the formats are loaded from the Bib Record, Item Record, or an assigned Specified Value. Specified Value is typically only used for indexing profiles for sideloads.
Bib Format Determination Settings
The Format Determination Method drop-down menu contains Bib Record and Material Type.
Bib Record format determination method is the process as detailed in the Format Facet Logic document.
Material Type format determination process is based on a bib record’s material type, usually only found in Sierra.
Material Type Values to Ignore (ils profile only) is used to list the Material Type (MatType) values that should be ignored when using the Material Type format determination. The bib format determination method will be used for any Material Type values that are ignored.
Specified Format Settings
Specified Format is the format to set when using a defined format.
Specified Format Category is the category to which the specified value should belong. If the collection does not match any of the 5 listed categories, then Other is used. This can also be left blank for a specified value that should not match any of the categories.
When the format determination method is Specified Value, the value of the Specified Format Boost will be applied for format boosting for every record in the collection.
A Boost ranges from 1 to 12. The higher the number, the higher records of that format rank in search results. Consult the format boosting maps of other indexing profiles for guidance on setting the value for your chosen format.
For example, in a Translation Map (as pictured below), if a book has a boost of 10 and an eBook also has a boost of 10, the total boost would be 20. Grouped works with multiple formats rank higher in the results.
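The arithmetic in that example can be written out as a small sketch. The boost map, the format names, and the fallback boost of 1 for unlisted formats are assumptions for illustration; real values come from a profile's format boost translation map.

```python
# Hypothetical format boost map, not copied from any real translation map.
FORMAT_BOOSTS = {"Book": 10, "eBook": 10, "Audio CD": 6}


def total_boost(formats: list[str], default: int = 1) -> int:
    """Sum the boost of every format attached to one grouped work."""
    return sum(FORMAT_BOOSTS.get(fmt, default) for fmt in formats)


print(total_boost(["Book", "eBook"]))  # 20
print(total_boost(["Book"]))           # 10
```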
Specified Grouping Category is the Grouping Category to which the specified format should belong. Choose one of five available options.
Record Settings
Record Number Tag is the MARC tag where the bib record number can be found. The record number is a unique identifier for each record in the indexing profile’s collection. Typically, sideloaded collections designate the 001 as the Record Number Tag, but this isn’t always the case.
The record number cannot contain slash characters, since the record number contributes to the URL for the record view page.
Record Number Field is the subfield of the record number MARC field where the record number can be found when the record number tag is not a control field, e.g. the 001. For example, it could have a designated Record Number subfield that has a default value of 'a'.
Record Number Prefix is a prefix used to identify the bib record number if multiple MARC tags exist. For example, a prefix of .b indicates that the identifier begins with .b, followed by the numeric part.
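One way to picture how the prefix and the slash restriction interact is the sketch below. It assumes the candidate values have already been read from the record number subfield of each matching tag; the function name and sample values are hypothetical, and skipping a value that contains a slash is an assumption, since the handling of such values is not described here.

```python
from typing import Optional


def pick_record_number(candidates: list[str], prefix: str = "") -> Optional[str]:
    """Return the first candidate that starts with the prefix and contains no slash."""
    for value in candidates:
        value = value.strip()
        if value.startswith(prefix) and "/" not in value:
            return value
    return None


# Two made-up values from repeated tags; only the second carries the ".b" prefix.
print(pick_record_number(["ocm/0001", ".b12345678"], prefix=".b"))
```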
Sierra Record/Bib level Fixed Field Tag (ils profile only) is the MARC tag where the Sierra fixed fields can be found, specifically the BCode3. It is also the field in Sierra where the bib level data is stored (typically the 998). The tag stores the bib level call number, mat types, etc.
Material Type Subfield is the bib level subfield for Material Type. This depends on the setting in the Sierra Record/Bib level Fixed Fields Tag field.
Sierra Language Fixed Field is the bib-level subfield for language. This depends on the setting in the Sierra Record/Bib level Fixed Fields Tag field.
Item Tag Settings
Item Tag Settings (ils profile only) are settings related to data points in item records and their corresponding subfields.
Item Tag designates which MARC tag individual items are located in.
Item Record Number is the subfield for the record number for the item.
If information is not exported from the ILS, then this field would be left blank.
Call Number Settings
Call Number Settings are located in the Item Tag Settings section.
Some libraries enable Use Item Based Call Numbers, while others use the call number from the bib record. Generally, consortia use item-level call numbers.
The following values are set in the Call Number Settings:
Call Number Prestamp
Call Number
Call Number Cutter
Call Number Poststamp
Volume
Item Tag Settings continued
When MARC records are exported from the ILS, there is typically an export profile that the library can either change or, sometimes, only view. Sometimes the codes coming from vendors may be unclear and need to be mapped appropriately. These codes show as single letters like a, b, d, h, j, etc. These will usually be subfields. These letters will likely be different with each implementation.
Location: Information in an item’s location subfield is used to determine an item’s ownership. Location codes are used in the Library and Location settings Records Owned and Records to Include to determine which items (and records) are included in the library’s or location’s search results.
Shelving Location and Collection are the subfields used to determine the shelving location of an item and the collection it belongs to. These are separate values, but they are usually determined by the same subfield and end up translated into different facets.
Item URL is the subfield code for the eContent item external link.
Barcode is the subfield code for the item barcode.
Status is the subfield code for the item status.
Total Checkouts, Last Year Checkouts, and Year To Date Checkouts are circulation statistics that contribute to logic for the relevance search sort option.
The item popularity is added to the grouped work popularity, which contributes to relevance ranking. If a title is checked out frequently, it is likely something that users want to see higher in the search results.
Due Date is the item subfield that contains the Due Date information.
Due Date Format is the Java date pattern needed to interpret the Due Date.
Date Created is the item subfield that contains the date the item was created.
Date Created Format is the Java date pattern needed to interpret the Date Created.
It is important to write the month as "MM". Do not use "mm", which stands for minutes in Java date patterns. Find more information about Java date formats at the linked resource here.
Last Check in Date: This field is required to use Time to Reshelve features.
Last Check In Format is the Java date pattern needed to interpret the Last Check in Date. This is necessary because the Last Check in Date may not match the Date Created. This field is also required to use Time to Reshelve features.
Item Suppression Field is for any library that uses item suppression codes. This is typically the icode2 subfield for Sierra libraries.
Opac Message Field (Sierra Only) is the subfield where Sierra OPAC messages are stored, e.g. 'On Display'.
Format subfield is used when determining formats when the Load Format from is set to “Item Record.” The Format will have its own field because sometimes it will be a location code, an itype, or something completely different depending on the collection.
iType is the subfield in the item record that contains the item type code from the ILS.
eContent Descriptor is used by Marmot to handle Marmot’s ILS eContent – records stored in the circulation system that are meant to describe eContent rather than physical items. It determines ownership.
Item URL is the subfield for a URL specific to the item. This is for libraries using the Marmot ILS eContent standard.
Item Statuses Settings
Available Statuses is a list of status values that are valid Available item statuses.
Checked Out Statuses is a list of status values that are valid Checked Out item statuses.
Library Use Only Statuses is a list of status values that are valid Library Use Only statuses.
These are items that are not checked out, but are not available for patrons to check out or take out of the library, e.g. staff use collections.
Non-holdable Settings
Non Holdable Statuses is a regular expression for any status that should not allow holds.
Non Holdable Locations is a list that uses regular expressions for any location that should not allow holds.
Non Holdable ITypes determines which itypes are not holdable and is typically used for non-Sierra ILSs.
For Sierra, loan rules and loan rule determiners are loaded directly into Pika to determine holdability.
Suppression Settings
Suppression Settings (ils profile only) contains Item Level Suppression Settings and the Bib Level Suppression Settings.
Item Level Suppression Settings
Statuses To Suppress (use regex) is a regular expression for any statuses that should be suppressed. Some of the statuses that might be suppressed are lost or missing.
Itypes To Suppress (use regex) is a regular expression for any itypes that should be suppressed.
Locations To Suppress are the locations we want to suppress.
Collections To Suppress is used to suppress specific collections.
Use Item Suppression Field suppression for items is the option that determines whether items are suppressed based on the item suppression field.
Item Suppression Field Values To Suppress (use regex) is a regular expression for any item suppression field values that should be suppressed.
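The regex-based settings above can be combined roughly as in the following sketch. It covers only the settings described as regular expressions plus the Use Item Suppression Field toggle. The dictionary keys, the item codes, and the choice to require a match on the whole value are assumptions for illustration, not a description of Pika's internals.

```python
import re
from typing import Optional


def matches(pattern: Optional[str], value: str) -> bool:
    """True when a pattern is configured and the whole value matches it."""
    return bool(pattern) and re.fullmatch(pattern, value) is not None


def is_suppressed(item: dict, settings: dict) -> bool:
    """Decide whether one item is suppressed under the hypothetical settings."""
    if matches(settings.get("statuses_to_suppress"), item.get("status", "")):
        return True
    if matches(settings.get("itypes_to_suppress"), item.get("itype", "")):
        return True
    # The suppression field is only consulted when the toggle is switched on.
    if settings.get("use_item_suppression_field"):
        return matches(settings.get("suppression_values"), item.get("suppression_code", ""))
    return False


# Placeholder codes: statuses l and m, itypes 90 to 99, suppression code n.
settings = {
    "statuses_to_suppress": r"[lm]",
    "itypes_to_suppress": r"9\d",
    "use_item_suppression_field": True,
    "suppression_values": r"n",
}
print(is_suppressed({"status": "l", "itype": "12", "suppression_code": "-"}, settings))
print(is_suppressed({"status": "-", "itype": "12", "suppression_code": "-"}, settings))
```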
Bib Level Suppression Settings
Suppress Itemless Bibs will suppress any Bibs that do not have any items, volumes, or order records attached.
Do Automatic eContent Suppression decides whether or not eContent suppression for OverDrive and Hoopla records is done automatically for ILS records. This suppresses both grouping and indexing.
It will exclude any ILS records for OverDrive. It will also exclude Hoopla records if an item’s eContent descriptor field starts with “hoopla:”.
BCode3 Subfield is the subfield that holds the BCode3, which is a suppression field.
BCode3 To Suppress (use regex) is a regular expression for any BCode3s that should be suppressed.
Translation Maps
Translation Maps are referenced in indexing profiles to determine how codes and values found in MARC will be displayed to users in the catalog. Typically, translation maps are only set in the indexing profile for the ILS for a global site.
For more information about Translation Maps, please review the documentation here.
Time to Reshelve
Time to Reshelve is used to override the status displayed in the catalog for recently checked-in items. There are several scenarios that Time to Reshelve can support, e.g. giving staff a longer time to shelve items, or preventing patrons from seeing an item as available when they cannot yet find it on the shelf.
The Time to reshelve function requires the Last Check in Date and Last Check In Format values to be set in the Item Tag Settings section.
When the last check-in date has a time component, it must be exported in Coordinated Universal Time (UTC), not the local time zone, for the override period to be calculated correctly. For ILSs that export only the last check-in date, without a time component, the Num. Hours to Override value needs to be a multiple of 24 hours.
Multiples of 24 hours are the recommended values for Num. Hours to Override because the overridden statuses are not replaced at the end of the override period. The overnight full reindex will reset those statuses.
This process will still work for libraries that do not have a time component in their Last Check In Format.
Locations are the item locations to which the status override should be applied.
A group of locations can be matched by entering .* after the leading location letters. This allows for varying overrides to be made for different branches and libraries.
Status Code To Override will default to the dash value (-) for available, since this is the default value for available in Sierra.
Num. Hours to Override is the number of hours you want the items to have a certain status and should be added in multiples of 24 hours.
Status is the text displayed in the detail copies section at the bib level while the item’s status is overridden.
Grouped Status sets how the override should be treated when collected with other statuses in search results and Grouped Work views.
Grouped status options:
Currently Unavailable
Available to Order
On Order
Coming Soon
In Processing
Checked Out
Available by Request
Shelving
Recently Returned
Library Use Only
Available Online
In Transit
On Display
On Shelf
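The override settings described above can be sketched as a single decision made at the moment an item is indexed. This is a minimal illustration, not Pika's implementation: the dictionary keys, the location code, and the status text are hypothetical, the last check-in value is assumed to be in UTC, and matching the Locations pattern against the whole location code is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone


def displayed_status(item: dict, rule: dict, now: datetime) -> str:
    """Return the override status while the override window is open, else the ILS status."""
    in_location = re.fullmatch(rule["locations"], item["location"]) is not None
    has_status = item["status"] == rule["status_code_to_override"]
    window_ends = item["last_check_in"] + timedelta(hours=rule["num_hours_to_override"])
    if in_location and has_status and now < window_ends:
        return rule["status"]
    return item["status"]


# Hypothetical rule: locations starting with "ab", available status "-", 24-hour window.
rule = {
    "locations": r"ab.*",
    "status_code_to_override": "-",
    "num_hours_to_override": 24,
    "status": "Being shelved",
}
item = {
    "location": "abjuv",
    "status": "-",
    "last_check_in": datetime(2024, 3, 1, 15, 0, tzinfo=timezone.utc),
}

print(displayed_status(item, rule, datetime(2024, 3, 1, 20, 0, tzinfo=timezone.utc)))  # Being shelved
print(displayed_status(item, rule, datetime(2024, 3, 2, 16, 0, tzinfo=timezone.utc)))  # -
```

The sketch only decides what to show when a record is indexed. As noted above, an overridden status is not replaced the instant the window closes; it is reset by the overnight full reindex.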
Time to Reshelve status and grouped status catalog displays
There is unique handling for certain statuses for libraries using the Time to Reshelve functionality in a consortium setting. The Shelving and Recently Returned grouped statuses have specialized display logic that indicates to library patrons a more accurate status of recently returned items in a number of scenarios.
If the local library owns a copy on a bib with multiple other copies that are unavailable and the local library’s copy is recently returned and in Being shelved status, it will display as Shelving.
If the local library owns a copy on a bib with no other library holdings and the local library’s copy is recently returned and in Being shelved status, it will display as Shelving.
If the local library does not own a copy on a bib with multiple copies that are available or recently returned at other libraries using the Time to reshelve function, the local library will see Available at another library.
If there are no local items on shelf, at least one copy was recently checked in, and at least one non-local copy is on shelf, the local library will see Shelving/Available elsewhere.
If the local library does not own a copy on a bib with either multiple copies or a single copy that are/is recently returned at a library that is using the Time to reshelve function and all other items are unavailable/checked out, the local library will see Shelving at another library.
The Pika team will need to compare the actual local check-in time against what is exported to Pika in the MARC record. The check-in time is very likely displayed within the ILS as local time, while the timestamp is likely exported in MARC as UTC.
The easiest way to determine this is to send the Pika team the item record ID (not the barcode) of an item checked in on the day you send it (probably a test record, so you can handle the check-in locally), along with the actual local time it was checked in. The Pika team will then compare this with the MARC export to determine the time zone.
Sierra API Item Field Mappings
Please reference the Sierra Field Mapping documentation for more information about Sierra API Item Field Mappings.
Related Documentation
- Enable Pika Offline Mode (Marmot Knowledge Base)
- Pika Translation Maps (Marmot Knowledge Base)
- Pika Indexing Profiles (Marmot Knowledge Base)
- Service/Topic and Roles Label Key (Marmot Knowledge Base)