Pika Indexing Profiles

This documentation describes how to configure indexing profiles used in Pika.


Indexing Profile Basics

Indexing Profiles are typically configured by Marmot and rarely need to be changed. Only Pika OPAC Admins will have access to indexing profiles. All Discovery Partners will have at least one person who has access to Indexing Profiles.

Marmot recommends that any changes to indexing profiles be performed by Marmot staff with input from libraries on their needs.

The Indexing Profiles are located under the Indexing Information menu.

Click the Add New Indexing Profile button, or search for an existing profile with the search bar. The option to Add New Indexing Profile should be limited to use by Marmot staff.

Display Name

The Display Name displays as the eContent Collection source and facet label (for profiles that describe e-Content). The Display Name can be edited as needed.

 

 

Source Name

Source Name is the internal value for the indexing profile. Any edits to the Source Name will require regrouping and reindexing of the collection. The Source Name displays in the grouped work staff information in the Bib Id column of the item_details section.

 

Record URL Component

The Record URL Component controls what displays in the URL. Each collection needs a unique Record URL Component.

Here is an example of the Record URL Component for the Creativebug eContent sideload.

 

MARC File Settings

MARC File Settings drop-down menu 

MARC Path is the directory where the files live on the Pika server.  This directory will be provided by Marmot and should not be changed without letting Marmot know.  All files in this directory and any subdirectories within the path will be processed. 

Filenames to Include is a regular expression that controls which filenames are included in the indexing, so that very specific filenames can be targeted.  A typical regular expression matches any filename ending in .marc or .mrc (the a? means the letter a is optional).  It is not case-sensitive.
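As a minimal sketch of how such a filter behaves, the Python snippet below applies an assumed pattern, .*\.ma?rc, to a few made-up filenames without regard to case. The pattern, the function name, and the choice to match the whole filename are illustrative assumptions, not values taken from a real profile.

```python
import re

# Assumed example pattern: any name ending in .marc or .mrc
# ("a?" makes the letter a optional). Matching ignores case.
FILENAMES_TO_INCLUDE = re.compile(r".*\.ma?rc", re.IGNORECASE)

def is_included(filename: str) -> bool:
    """Return True when the whole filename matches the include pattern."""
    return FILENAMES_TO_INCLUDE.fullmatch(filename) is not None

# Made-up filenames for illustration only.
candidates = ["full_export.mrc", "UPDATES.MARC", "readme.txt", "export.mrc.bak"]
included = [name for name in candidates if is_included(name)]
print(included)
```

Only the first two names pass the filter; the backup file is skipped because it does not end in .mrc or .marc.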

MARC Encoding has a dropdown menu.  This is the expected encoding of the original file(s) found in the MARC Path directory. When record grouping splits the files into individual MARC records, each record is converted to UTF8 for consistency. UTF8 is the preferred encoding for MARC files.

Group unchanged files forces the indexing to regroup records even if no changes have been made to them.  Without it, files that have not changed are not grouped/regrouped, which saves resources. This option starts out on, particularly for new sideloaded collections, but is typically toggled off once the sideload configuration is complete.

Individual Record Files settings

A full MARC export is usually a large file.  Record grouping breaks the larger file into individual files based on the ID for faster loading within Pika. The Individual MARC Path is the directory where those items go.  Marmot will set up these directories.  Part of sideloading is to set up directories and copy things to the right places. 

Number of characters to create folder from and the Create Folder From Leading Characters settings determine the names of sub-folders created within the Individual MARC Path.  There is a performance issue with having a very large number of files stored in a single folder. The use of subfolders helps mitigate performance issues.
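The effect of these two settings can be sketched roughly as follows. The helper name and the sample IDs are hypothetical, and the behavior when leading characters are not used (dropping the last characters of the ID) is an assumption made for illustration rather than confirmed Pika behavior.

```python
def subfolder_name(record_id: str, num_chars: int, from_leading: bool) -> str:
    """Pick a sub-folder name for one record file (hypothetical helper)."""
    if from_leading:
        # Use the first num_chars characters of the ID.
        return record_id[:num_chars]
    # Assumption: otherwise drop the last num_chars characters and use the rest.
    return record_id[:-num_chars] if num_chars > 0 else record_id

# Made-up record IDs, grouped by the folder they would land in.
ids = ["b1234567", "b1234999", "b1299001"]
by_folder = {}
for record_id in ids:
    by_folder.setdefault(subfolder_name(record_id, 4, True), []).append(record_id)
print(by_folder)
```

Spreading the files over sub-folders this way keeps any single directory from holding the whole collection.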

Pika Driver Settings

Pika Driver Settings drop-down menu 

Grouping Class controls how grouping is done for this collection.  For example,  Sideloaded eContent or Hoopla have their own Grouping Class. It is mainly used to determine the grouping category of a record.

Indexing Class controls how indexing is done for the collection.  Each Discovery Partner may have different processes, so Marmot creates a custom index depending on their needs. This is set up by Marmot.

Record Driver controls how things display in Pika.

Cover Source is the method that determines where the indexing profile should fetch cover art from. Some sideloads might have unique options.

 

 

Patron Driver is used to connect the indexing profile to a circulation system for the given ILS.

Format Determination Settings

Format Determination Settings drop-down menu 

Determine Format based on is a dropdown menu.  It controls whether formats are loaded from the Bib Record or the Item Record, or whether a Specified Value is assigned. Specified Value is only useful for sideloading, for example when loading a collection like Kanopy that only has eVideos. Because there will be no format information in the Bib Record, the format is hard coded to eVideo.  How Bib Record and Item Record behave depends on the Indexing Class.

Bib Format Determination Settings

The Format Determination Method drop-down menu contains Bib Record and Material Type.

Bib Record format determination method is the process as detailed in the Format Facet Logic document.

Material Type format determination process is based on a bib record’s material type, usually only found in Sierra.

Material Type Values to Ignore (ils profile only) is used to list the Material Type (MatType) values that need to be ignored when using the Material Type format determination.  The bib format determination method will be used for any Material Type values that are ignored.
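The fallback described above can be sketched as a small decision function. The function name and the sample material type codes are hypothetical; only the rule that ignored values fall back to the bib record method comes from the text.

```python
def choose_format_method(method: str, mat_type: str, ignored: set) -> str:
    """Return which logic decides the format for one bib record (sketch)."""
    if method == "Material Type" and mat_type not in ignored:
        return "Material Type"
    # Ignored material types fall back to the bib record method.
    return "Bib Record"

# Made-up material type codes to ignore.
ignored_values = {"-", "z"}
print(choose_format_method("Material Type", "a", ignored_values))
print(choose_format_method("Material Type", "z", ignored_values))
```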

Specified Format Settings

Specified Format is the format to set when using a specified value.  For example, this is where a value like eMagazine would be entered.

Specified Format Category is the category to which the specified value should belong. It is a dropdown menu.  There are only 5 categories.  If the item does not match any of the 5 listed categories, then Other is used.  This can also be left blank for a specified value that should not match any of the categories.

When the format determination method is Specified Value, the value of the Specified Format Boost will be applied for format boosting for every record in the collection.

A Boost has a value ranging from 1 – 12.  The higher the number, the more that format is boosted in search results. Consult the format boosting maps of other indexing profiles as a guide for what the value should be set to for the specified format you have chosen.

In a format boost Translation Map, the boosts add up to a total sum, so a title that is a book (10) and an eBook (10) has a total boost of 20.  Grouped works with more formats show higher in the results.
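The arithmetic can be sketched in a few lines. The map below is hypothetical apart from the book and eBook values of 10 mentioned in the text, and treating unmapped formats as contributing nothing is an assumption of this sketch.

```python
# Hypothetical format boost map; only Book and eBook (10 each) come from the text.
FORMAT_BOOSTS = {"Book": 10, "eBook": 10, "Audio CD": 6}

def total_boost(formats) -> int:
    """Sum the boost of every distinct format in a grouped work."""
    # Assumption: formats missing from the map add nothing.
    return sum(FORMAT_BOOSTS.get(fmt, 0) for fmt in set(formats))

print(total_boost(["Book", "eBook"]))
```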

Record Settings

Record Number Tag is the MARC tag where the bib record number can be found. The record number is a unique identifier for each record in the indexing profile’s collection. Typically, sideloaded collections designate the 001 as the Record Number Tag, but this isn’t always the case.

The record number cannot contain slash characters, since the record number contributes to the URL for the record view page.

Record Number Field is the subfield of the Record Number Tag where the record number can be found when that tag is not a control field (e.g. the 001). The default value is 'a'.

Record Number Prefix is a prefix used to identify the bib record number if multiple MARC tags exist.  For example, a prefix of .b indicates that the bib record number is the value that starts with .b, followed by the numeric part.

Sierra Record/Bib level Fixed Field Tag (ils profile only) is the MARC tag where the Sierra fixed fields can be found, specifically the BCode3.  It is the field where Sierra bib level data is stored (typically the 998), such as the bib level call number, material types, etc.

Material Type Subfield is the bib level subfield for Material Type. This depends on the setting in the Sierra Record/Bib level Fixed Fields Tag field.

Sierra Language Fixed Field is the bib-level subfield for language. This depends on the setting in the Sierra Record/Bib level Fixed Fields Tag field.

Item Tag Settings

Item Tag Settings (ils profile only) drop-down menu is for settings related to data points in item records and their corresponding subfields. 

Item Tag tells us which MARC tag the items are located in.

Item Record Number is the subfield for the record number for the item. Its value depends on the export profile within the ILS. If this information is not exported from the ILS, leave this field blank. Some libraries have an item number, others use the barcode, and in other scenarios no such number exists.

Call Number Settings

Call Number Settings drop-down is located inside the Item Tag Settings.

Some libraries Use Item Based Call Numbers, while others use the call number from the Bib record. Generally, consortia are going to use item-level call numbers.  The non-consortia will do either, or a mix of both.

The call number choices are Call Number Prestamp, Call Number, Call Number Cutter, Call Number Poststamp, and Volume.

Item Tag Settings continued

When MARC records are exported from the ILS, there is typically an export profile that the library can either change or, sometimes, only view.  Sometimes the codes coming from vendors may be unclear and need to be mapped appropriately. These codes show as single letters like a, b, d, h, j, etc.  These will usually be subfields.  These letters will likely be different with each implementation.

Information in an item’s location subfield is used to determine where an item “belongs.”  Location codes are used in the Library and Location settings Records Owned and Records to Include to determine which items (and records) are included in the library’s or location’s search results.  The Location setting specifies the subfield where the location code is stored.

Shelving Location and Collection are the subfields used to determine the shelving location of an item and the collection it belongs to.  These are separate values, but they are usually taken from the same subfield and then translated into different facets.

Item URL is the subfield code for the eContent item external link. Barcode is the subfield code for the item barcode.  Status is the subfield code for the item status.

Total Checkouts, Last Year Checkouts, and Year To Date Checkouts are all different circulation statistics that are used partially for relevance. The item popularity is added to the grouped work popularity that contributes to the relevance by popularity. If a title is checked out frequently, it is likely something that users want to see higher in the search results.


Due Date is the item subfield that contains the Due Date information.   

Due Date Format is the Java date pattern needed to interpret the Due Date.

Date Created is when the item was created.

Date Created Format is the Java date pattern needed to interpret the Date Created.

Sometimes an ILS will export the Last Check in Date. This field is required to use Time to Reshelve features.

Last Check In Format is the Java date pattern needed to interpret the Last Check in Date. This is necessary because the format of the Last Check in Date may not match that of the Date Created. This field is also required to use Time to Reshelve features.
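To illustrate what a date pattern does, the sketch below translates a small subset of Java pattern letters into Python's strptime codes and parses two made-up values. The patterns yyMMdd and yyyy-MM-dd HH:mm are assumed examples, not the patterns of any particular ILS export, and quoted literals or other Java pattern letters are not handled.

```python
import re
from datetime import datetime

# Subset of Java date pattern letters and their strptime equivalents.
JAVA_TO_STRPTIME = {
    "yyyy": "%Y", "yy": "%y", "MM": "%m", "dd": "%d",
    "HH": "%H", "mm": "%M", "ss": "%S",
}
# Longest tokens first so that "yyyy" is not read as two "yy" tokens.
_TOKEN = re.compile("|".join(sorted(JAVA_TO_STRPTIME, key=len, reverse=True)))

def parse_with_java_pattern(value: str, java_pattern: str) -> datetime:
    """Parse value using a Java-style pattern limited to the tokens above."""
    strptime_format = _TOKEN.sub(lambda m: JAVA_TO_STRPTIME[m.group()], java_pattern)
    return datetime.strptime(value, strptime_format)

# Made-up field values and assumed patterns.
due = parse_with_java_pattern("240305", "yyMMdd")
checked_in = parse_with_java_pattern("2024-03-05 14:30", "yyyy-MM-dd HH:mm")
print(due, checked_in)
```

If the pattern does not describe the exported value exactly, parsing fails, which is why each date subfield has its own format setting.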

Item Suppression Field is for any library that uses item suppression codes. This is typically the icode2 subfield for Sierra libraries.

The Opac Message Field (Sierra Only) is the subfield where Sierra OPAC messages are stored, e.g. ‘On Display.’

The Format subfield is used to determine formats when the Load Format from is set to “Item Record.”  The Format has its own setting because it will sometimes be a location code, an itype, or something completely different depending on the collection.

iType is the subfield in the item record that contains the item type code from the ILS.  

The eContent Descriptor is used by Marmot to handle Marmot’s ILS eContent – records stored in the circulation system that are meant to describe eContent rather than physical items. It determines ownership.

The Item URL is the subfield for a URL specific to the item. This is for libraries using the Marmot ILS eContent standard.

Item Statuses Settings

The Item Statuses Settings (Sierra ils profiles only) dropdown menu.

The Available Statuses is a list of status values that are valid ‘Available’ item statuses.

The Checked Out Statuses is a list of status values that are valid ‘Checked Out’ item statuses.

The Library Use Only Statuses is a list of status values that are valid ‘Library Use Only’ statuses. These are items that are not checked out, but not available for patrons to check out or take out of the library, e.g. staff use collections.

Non-holdable Settings

The Non-holdable Settings (ils profile only) drop-down menu

The Non Holdable Statuses is a regular expression for any status that should not allow holds. Any status it matches is treated as not holdable.

The Non Holdable Locations is a list that uses regular expressions for any location that should not allow holds.

We can specifically say that some iTypes are Non Holdable ITypes. For Sierra, we have a copy of the loan rules and loan rules determiners to determine holdability.  For other systems, we may not have that information, or it is very convoluted logic, so it would not be worth copying.   
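A minimal sketch of how these three settings might combine is shown below. The function name and every pattern are hypothetical, and matching each pattern against the whole value is an assumption. The sketch does not attempt to model Sierra loan rules.

```python
import re

def is_holdable(status, location, itype,
                non_holdable_statuses, non_holdable_locations, non_holdable_itypes):
    """Return False when any non-holdable pattern matches the item (sketch)."""
    checks = [
        (non_holdable_statuses, status),
        (non_holdable_locations, location),
        (non_holdable_itypes, itype),
    ]
    for pattern, value in checks:
        # An empty pattern is treated as "setting not used".
        if pattern and re.fullmatch(pattern, value):
            return False
    return True

# Made-up patterns: statuses m or o, locations starting with ref, iTypes 20 or 21.
settings = ("[mo]", "ref.*", "20|21")
print(is_holdable("-", "main", "1", *settings))
print(is_holdable("-", "refdesk", "1", *settings))
```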

Suppression Settings

The Suppression Settings (ils profile only) drop-down menu contains Item Level Suppression Settings and the Bib Level Suppression Settings drop-down menus.

Item Level Suppression Settings

The Item Level Suppression Settings drop-down menu 

Statuses To Suppress (use regex) is a regular expression for any statuses that should be suppressed.  Statuses that might be suppressed include lost or missing.

Itypes To Suppress (use regex) is a regular expression for any itypes that should be suppressed.

Locations To Suppress are the locations we want to suppress. 

Collections To Suppress is used to suppress specific collections.

Use Item Suppression Field suppression for items is the option that decides whether items are suppressed based on the Item Suppression Field.

Item Suppression Field Values To Suppress (use regex) is a regular expression for any Item Suppression Field values that should be suppressed.
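The item level settings can be pictured as a filter over exported items. In the sketch below, the settings dictionary, the field names, the codes, and the sample items are all made up for illustration, and whole-value matching is an assumption.

```python
import re

# Hypothetical suppression settings; every pattern here is made up.
SUPPRESS = {
    "status": r"[lm]",            # e.g. codes standing for lost or missing
    "itype": r"99",
    "suppression_field": r"n",
}
USE_ITEM_SUPPRESSION_FIELD = True

def is_suppressed(item: dict) -> bool:
    """Return True when any active suppression pattern matches the item."""
    for field, pattern in SUPPRESS.items():
        if field == "suppression_field" and not USE_ITEM_SUPPRESSION_FIELD:
            continue  # the field is only checked when the option is on
        if re.fullmatch(pattern, item.get(field, "")):
            return True
    return False

items = [
    {"barcode": "1001", "status": "-", "itype": "1", "suppression_field": "-"},
    {"barcode": "1002", "status": "m", "itype": "1", "suppression_field": "-"},
    {"barcode": "1003", "status": "-", "itype": "1", "suppression_field": "n"},
]
visible = [item["barcode"] for item in items if not is_suppressed(item)]
print(visible)
```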

Bib Level Suppression Settings

The Bib Level Suppression Settings drop-down menu

Suppress Itemless Bibs will suppress any Bibs that do not have any items, volumes, or order records attached.

Do Automatic eContent Suppression decides whether eContent suppression for OverDrive and Hoopla records is done automatically for ILS records. Suppressed records are excluded from both grouping and indexing. It will exclude any ILS records for OverDrive.  It will also exclude Hoopla records if an item’s eContent descriptor field starts with “hoopla:”.

BCode3 Subfield is the subfield that holds the BCode3, which is a suppression field.

BCode3 To Suppress (use regex) is a regular expression for any BCode3s that should be suppressed.

Translation Maps

The Translation Maps are for the indexing profiles. Click on the Add New button to create a new translation map.

Translation Maps are used to determine how codes and values found in MARC will be displayed to users in the catalog. For example, in the collection Translation Map for Marmot, anything with the value of “as” is translated as Main Collection.

Different Translation Maps are used for different types of mapping. For example, in the shelf_location Translation Map for Marmot, anything with the value of “as” is translated as ASU Nielsen Library.

Sometimes when a Translation Map (like itype or shelf_location) has more than a few hundred values, it cannot be updated with the updating tool at the bottom of the page. The easiest way to edit larger maps is to use a CSV file or an INI file. View as INI allows the list to be copied and pasted.

Use Load From CSV/INI to paste information into the box. Either a comma or an equal sign can separate a value from its translation, and quotation marks are ignored.  Append/Overwrite Values updates the map in place: a value that already exists is overwritten with the new translation, and a value that does not exist is inserted. Reload Map Values deletes everything already in the map and loads it again with the values pasted into the box.
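The difference between the two load modes can be sketched as follows. The function names and the pasted sample text are hypothetical. The parser is a simplification that splits each line on its first comma or equal sign and strips surrounding quotation marks, and it is not Pika's actual loader.

```python
import re

def parse_map_text(text: str) -> dict:
    """Parse lines of 'value,translation' or 'value=translation' into a dict."""
    parsed = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first comma or equal sign only.
        parts = re.split(r"[,=]", line, maxsplit=1)
        if len(parts) != 2:
            continue  # skip lines without a separator
        value, translation = (part.strip().strip('"') for part in parts)
        parsed[value] = translation
    return parsed

def load_values(existing: dict, text: str, reload_all: bool = False) -> dict:
    """Append/overwrite by default; with reload_all=True replace the whole map."""
    incoming = parse_map_text(text)
    return incoming if reload_all else {**existing, **incoming}

existing = {"as": "Main Collection", "zz": "Old Collection"}
pasted = 'as=Main Collection (Updated)\n"ab","Branch Collection"'
appended = load_values(existing, pasted)
reloaded = load_values(existing, pasted, reload_all=True)
print(appended)
print(reloaded)
```

In append mode the untouched "zz" entry survives, while a reload keeps only the pasted values.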

Time to Reshelve

Time to Reshelve is used to override the status displayed in the catalog for recently checked-in items. It supports a few scenarios, e.g. giving staff a longer time to shelve items, or preventing patrons from seeing an item as available when they cannot yet find it on the shelf.

Click on the Add New button to add a new indexing profile Id.

This requires the Last Check in Date and Last Check In Format values to be set in the Item Tag Settings section.  When the last check-in date has a time component, it must be exported in Coordinated Universal Time (UTC), not the local time zone, for the override period to be calculated correctly. For ILSs that export just the last check-in date, without a time component, the Num. Hours to Override must be a multiple of 24 hours.

Locations are the item locations to which the status override should be applied. A group of locations can be matched by adding .* after the location letters (for example, pc.*). This allows for varying overrides to be made for different branches and libraries.

Status Code To Override will default to the dash (-) for available.

Num. Hours to Override is the number of hours you want the items to have a certain status and should be added in multiples of 24 hours.

Status is the text to display while the item’s status is overridden (for example, On shelving cart or In Transit).

The Grouped Status section has a dropdown menu that sets how the override should be treated when combined with other statuses in search results and Grouped Work views. There is logic in Pika that picks the most available status as the grouped status.
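Taken together, a Time to Reshelve entry behaves roughly like the sketch below. The function name, the dictionary keys, and the sample location code pcjuv are hypothetical, while the pc.* pattern, the dash status code, and the On shelving cart text come from the examples above. Both timestamps are kept in UTC so the elapsed time is computed consistently.

```python
import re
from datetime import datetime, timedelta, timezone

def displayed_status(item_status, location, last_check_in_utc, now_utc, rule):
    """Return the override status while the rule's window is open (sketch)."""
    in_window = now_utc - last_check_in_utc < timedelta(hours=rule["num_hours"])
    if (item_status == rule["status_code_to_override"]
            and re.fullmatch(rule["locations"], location)
            and in_window):
        return rule["status"]
    return item_status

rule = {"locations": r"pc.*", "status_code_to_override": "-",
        "num_hours": 24, "status": "On shelving cart"}
check_in = datetime(2024, 3, 5, 14, 0, tzinfo=timezone.utc)

soon_after = displayed_status("-", "pcjuv", check_in, check_in + timedelta(hours=3), rule)
next_day = displayed_status("-", "pcjuv", check_in, check_in + timedelta(hours=30), rule)
print(soon_after, next_day)
```

Three hours after check-in the item shows the override text; thirty hours after, the 24 hour window has closed and the exported status is shown again.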

Sierra API Item Field Mappings

Please reference the Sierra Field Mapping documentation for more information about Sierra API Item Field Mappings.
