Export-program requirements for Eric Willis
(last revised October 29, 1997)
Bibliographic/Holdings Extract
Extractor capabilities
1. Extra-large records.
-
The extractor should identify MARC records whose overall size (bib+holdings)
exceeds 99,999 characters.
-
The program should write to tape the bibliographic record and as much of the
holdings information as fits within the 99,999-character limit.
-
Multiple bibliographic records will be written for MARC records that exceed
the 99,999 character limit.
-
Each duplicate MARC record should have an identical tag 001 to assist Endeavor
in consolidating these records later. An identifier (a GRSN, or 245$a,
etc) of these records should be output to error logs.
-
The 930 tags on the first MARC record will be written to all MARC records
that describe the same GRSN.
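The splitting rule above can be sketched as a greedy packing of holdings fields. This is a minimal illustration, not the extractor's actual logic: the function name `split_holdings` and its parameters are hypothetical, and it assumes each length already includes any MARC directory and leader overhead. Every output record repeats the bibliographic data and the 930 tags, so both are subtracted from the budget first.

```python
LIMIT = 99_999  # maximum MARC record size stated in the requirement


def split_holdings(bib_len, tags_930_len, holding_lens, limit=LIMIT):
    """Return one list of holding indexes per output record.

    Assumption: lengths are in characters and already include overhead.
    """
    budget = limit - bib_len - tags_930_len
    if budget <= 0:
        raise ValueError("bib data and 930 tags alone exceed the limit")
    chunks, current, used = [], [], 0
    for index, length in enumerate(holding_lens):
        if length > budget:
            raise ValueError(f"holding {index} cannot fit in any record")
        if used + length > budget:
            # Close the current record and start a duplicate (same tag 001).
            chunks.append(current)
            current, used = [], 0
        current.append(index)
        used += length
    if current or not chunks:
        chunks.append(current)
    return chunks


# Small limit used only to keep the illustration readable.
print(split_holdings(50, 10, [20, 20, 20], limit=100))
```

A result with more than one chunk marks a record whose identifier would go to the error log.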
2. Corrupt records
The GRSN of any record that cannot be written to tape should be output
to error logs.
3. Extract file
The extractor should be able to extract records based on an input range
of GRSN numbers or the extract files **MXTR, **EXTR and **HMTR. This will
permit extraction of specific records or record ranges for phase-2 testing
in the Voyager Schedule.
4. Summary report
The extractor should output the number of records written during a
run, with a tally of GRSN and 966 tags, and a count of error records.
5. Run-time option re secondary location
At run time allow the operator to choose to write secondary locations
or not when creating 966 tags. See table below.
6. Run-time option re selection of MBIB or EBIB as the bibliographic source
file
At run time allow the operator to select the bibliographic source file
MBIB through the GRSN pointer GMREC or the EBIB file through the GRSN pointer
GEREC.
7. Output sequence
For each bibliographic record, output the data-elements to tape in
this order: Bib data, **HOLD (930) data, 966 data. It is important that
the following BIBCONVERT program encounter all **HOLD data before any 966
data.
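The required output order can be sketched with a stable sort. The `(kind, payload)` representation and the function name `sequence` are assumptions made for illustration; the only rule taken from the text is bib data first, then 930 data, then 966 data.

```python
# Rank of each data-element kind in the output order required above.
ORDER = {"bib": 0, "930": 1, "966": 2}


def sequence(elements):
    """Order (kind, payload) pairs as bib, then 930, then 966.

    sorted() is stable, so elements of the same kind keep their
    original relative order.
    """
    return sorted(elements, key=lambda element: ORDER[element[0]])


mixed = [("966", "item 1"), ("bib", "record"), ("930", "hold 1"),
         ("966", "item 2"), ("930", "hold 2")]
for kind, payload in sequence(mixed):
    print(kind, payload)
```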
8. Format
The data should be written in standard MARC format such that standard
UNIX tape utilities can dump the data to a UNIX file system.
MARC Leader
1. Convert leader values:
byte 6: "x" and "y" to "a"
byte 22: any non-zero value to "0"
byte 23: any non-zero value to "0"
bytes 20-21: any value other than "45" to "45"
Bibliographic information
1. GRSN record number
Systems tag 001 RSN value on UoW bib database is corrupted; many do
not have a check-digit. Generate a new GRSN record number, without the
NUC prefix "CaOWtU", on extract to tag 901. Regenerate tag 001 in
the form "CaOWtUxxxxxxx" where "xxxxxxx" is the GRSN of the GEAC record
without a check-digit. Endeavor's BIBCONVERT program will do its usual 001-->035$9
mapping.
2. Deletion of tags
Provide a $include file, such as RITAGS in RPTAPE, so the site can specify
tags to be deleted on export.
Current list includes tags 019,088,091,265,899,996,998,999.
3. Tag mappings
-
212 --> 246. Change indicators to 3<blank>
-
315 --> 310
-
570 --> 500. Change indicators to <blank><blank>
4. Delete and move tags
For serials format (bytes 6 and 7 of the leader = "as"), do:
-
Delete tag 088
-
Move tag 087-->088
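Items 2 to 4 above can be sketched together. This is an illustration under stated assumptions: fields are modelled as `(tag, indicators, data)` tuples, the function name `apply_tag_rules` is hypothetical, and the delete list is the current list quoted above rather than the contents of a real $include file. Deletion is tested on the original tag, so an existing 088 is dropped while an 087 moved to 088 on a serial survives.

```python
DELETE_TAGS = {"019", "088", "091", "265", "899", "996", "998", "999"}
# old tag -> (new tag, new indicators); None means keep existing indicators.
RETAG = {"212": ("246", "3 "), "315": ("310", None), "570": ("500", "  ")}


def apply_tag_rules(leader, fields):
    """Apply the delete, map and serials-move rules to one record."""
    is_serial = leader[6:8] == "as"
    out = []
    for tag, indicators, data in fields:
        if tag in DELETE_TAGS:
            continue  # includes any original 088
        if tag in RETAG:
            tag, new_indicators = RETAG[tag]
            if new_indicators is not None:
                indicators = new_indicators
        elif is_serial and tag == "087":
            tag = "088"
        out.append((tag, indicators, data))
    return out  # note: fields are not re-sorted by tag here


record = [("088", "  ", "$aold"), ("087", "  ", "$anew"),
          ("212", "0 ", "$aVariant title"), ("570", "1 ", "$aNote")]
for field in apply_tag_rules("00000nas a2200000 a 4500", record):
    print(field)
```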
5. Tag 007 padding for cartographic format
Tag 007 is fixed in length for each bibliographic record upon conversion
and cannot be changed later online (same data-conversion problem exists
for tag 008 below).
For all records with byte 6 of the leader equal to "e" whose tag 007
field length is less than 8, pad out the field with blanks to 8 characters
in length.
6. Tag 008 truncation of length
Tag 008 is fixed in length for each bibliographic record upon conversion
and cannot be changed later online. Analysis of the UW db indicates that
there are thousands of 008 tags that are 41 characters in length, and about
200 records whose 008 tags are 49 characters in length. All others are
40 characters in length. Truncate ALL 008s on conversion to USMARC
length of 40 characters: character positions 0 to 39.
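The two fixed-field length rules in items 5 and 6 above amount to a pad and a slice. A minimal sketch, with hypothetical function names and fields held as plain strings:

```python
def fix_007(leader, field_007):
    """Pad a cartographic 007 (leader byte 6 == "e") with blanks to 8 characters."""
    if leader[6] == "e" and len(field_007) < 8:
        return field_007.ljust(8)
    return field_007


def fix_008(field_008):
    """Keep character positions 0 to 39 of an 008; shorter fields are unchanged."""
    return field_008[:40]


print(repr(fix_007("00000nem a2200000 a 4500", "aj ca")))
print(len(fix_008("x" * 49)))
```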
7. Tag 007 creation
Specifications for creating 007 fixed fields
General rules:
-
Do not create 007 fields if the bibliographic record has one or more 007s
in the existing GEAC record already.
-
007 is a fixed-length field. The length is determined by the first character
position.
-
Create 007 fields based on the material type in the item record (IDLOAN)
-
If more than one holding is on a record, create a 007 field for every group
of material types according to the groups specified below:
Microform group
Material types included in this group: MICCRD, MICFIC, MCF3DY, MCF2WK,
MICFLM, MCL3DY, MCL2WK, MICPRT, THSMCF (IDLOAN=6,7,8,34,33,36,35,9,38).
The 007 will be fixed at 13 characters, offset 0-12.
c.p. 0 h
c.p. 1
d if MICFLM, MCL3DY, MCL2WK (IDLOAN=8,36,35)
e if MICFIC, MCF3DY, MCF2WK, THSMCF (IDLOAN=7,34,33,38)
g if MICCRD, MICPRT (IDLOAN=6,9)
c.p. 2 r
c.p. 3 u
c.p. 4
f if MICFLM, MCL3DY, MCL2WK (IDLOAN=8,36,35)
o if MICPRT (IDLOAN=9)
m if THSMCF (IDLOAN=38)
m if 966 $s begins with MFCL
l if 966 $s begins with MFC<space> or MED or MCD
u if none of the above
c.p. 5 u
c.p. 6-8 ---
c.p. 9 b
c.p. 10 u
c.p. 11 c
c.p. 12 u
Computer File Group
The material types included in this group are: CD2WK, CD1HR, CDLIB, CDNET,
CDPER, DSK2WK, DSKLIB, DSKPER, ELEC, CPTAPE. (IDLOAN=58,53,54,60,57,56,4,55,59,2)
The 007 will be fixed at 6 characters, offset 0-5.
c.p. 0 c
c.p. 1
f if DSK2WK, DSKLIB, DSKPER (IDLOAN=56,4,55)
h if CPTAPE (IDLOAN=2)
o if CD1HR,CDLIB,CDPER,CD2WK,CDNET (IDLOAN=53,54,57,58,60)
r if ELEC (IDLOAN=59)
c.p. 2-5 uuuu
Text Group
The material types included in this group are: STACKS, SSTACK, STACK3,
GENNC, DOC, DOCNC, OVRSZE, THESIS, UNGESS, REF, ABS, PER, PERUNB, PERNCB,
PERNCU, NEWS, GAZET, SCORE (IDLOAN=1,52,49,39,21,40,10,17,19,15,23,12,37,50,51,32,30,16)
This group should also contain items with the material type LIBUSE
(IDLOAN=47) but only if 966 $s does not begin with CGC.
The 007 will be fixed at 2 characters, offset 0-1
c.p. 0 t
c.p. 1 a
Sound Recording Group
The material types included in this group are: PHNDSC, PHNTAP (IDLOAN=14,13).
This group should also contain items with the material type LIBUSE
(IDLOAN=47) but only if 966 $s begins with CGC.
The 007 will be fixed at 14 characters, offset 0-13.
c.p. 0 s
c.p. 1
d if PHNDSC or LIBUSE (IDLOAN=14,47)
s if PHNTAP (IDLOAN=13)
c.p. 2 o
c.p. 3-8 uuuuuu
c.p. 9 m
c.p. 10-13 uuuu
Video Recording Group
The material type included in this group is: VIDEO (IDLOAN=20)
The 007 will be fixed at 9 characters, offset 0-8.
c.p. 0 v
c.p. 1 f
c.p. 2-3 uu
c.p. 4 b
c.p. 5 a
c.p. 6 h
c.p. 7-8 uu
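The 007 creation rules above can be sketched as a lookup on the IDLOAN code plus the start of 966 $s. The function names are hypothetical. One interpretation is an assumption rather than something the specification states: when several items fall in the same group, the first item seen decides that group's 007.

```python
MICROFORM = {6, 7, 8, 9, 33, 34, 35, 36, 38}
COMPUTER = {2, 4, 53, 54, 55, 56, 57, 58, 59, 60}
TEXT = {1, 10, 12, 15, 16, 17, 19, 21, 23, 30, 32, 37, 39, 40, 49, 50, 51, 52}
SOUND = {13, 14}
VIDEO = {20}
LIBUSE = 47


def group_of(idloan, sub_s=""):
    """Return the material group name, or None if the code is in no group."""
    if idloan == LIBUSE:
        return "sound" if sub_s.startswith("CGC") else "text"
    for name, members in (("microform", MICROFORM), ("computer", COMPUTER),
                          ("text", TEXT), ("sound", SOUND), ("video", VIDEO)):
        if idloan in members:
            return name
    return None


def make_007(idloan, sub_s=""):
    """Build the 007 string for one item, or None if no group applies."""
    group = group_of(idloan, sub_s)
    if group == "microform":
        cp1 = "d" if idloan in {8, 35, 36} else "e" if idloan in {7, 33, 34, 38} else "g"
        if idloan in {8, 35, 36}:
            cp4 = "f"
        elif idloan == 9:
            cp4 = "o"
        elif idloan == 38 or sub_s.startswith("MFCL"):
            cp4 = "m"
        elif sub_s.startswith(("MFC ", "MED", "MCD")):
            cp4 = "l"
        else:
            cp4 = "u"
        return "h" + cp1 + "ru" + cp4 + "u---bucu"
    if group == "computer":
        cp1 = "f" if idloan in {4, 55, 56} else "h" if idloan == 2 else "r" if idloan == 59 else "o"
        return "c" + cp1 + "uuuu"
    if group == "text":
        return "ta"
    if group == "sound":
        return "s" + ("s" if idloan == 13 else "d") + "o" + "u" * 6 + "m" + "u" * 4
    if group == "video":
        return "vfuubahuu"
    return None


def new_007_fields(existing_007s, items):
    """items: (idloan, 966 $s) pairs. Returns one 007 per group present."""
    if existing_007s:
        return []  # never add 007s to a record that already has one
    by_group = {}
    for idloan, sub_s in items:
        group = group_of(idloan, sub_s)
        if group is not None and group not in by_group:
            by_group[group] = make_007(idloan, sub_s)  # first item wins (assumption)
    return list(by_group.values())


print(new_007_fields([], [(1, ""), (8, ""), (20, "")]))
```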
8. Tag mapping
-
Tag 009, c.p. 8: map to tag 008, c.p. 25.
-
in all cases delete tag 009 after this mapping
-
if there is no tag 008 to map to, just delete tag 009
-
note: BIBCONVERT of Endeavor will reject any bibliographic record that
has a fixed field whose tag number is greater than "008" so all 009s must
be deleted.
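The 009 rule above can be sketched as follows. Control fields are modelled as `(tag, data)` pairs and the function name `fold_009` is hypothetical. The length guards are an added assumption so that a short 008 or 009 is left untouched instead of raising an error.

```python
def fold_009(control_fields):
    """Copy 009 c.p. 8 into 008 c.p. 25, then drop every 009."""
    data_009 = next((data for tag, data in control_fields if tag == "009"), None)
    out = []
    for tag, data in control_fields:
        if tag == "009":
            continue  # 009 is always deleted, mapped or not
        if (tag == "008" and data_009 is not None
                and len(data_009) > 8 and len(data) > 25):
            data = data[:25] + data_009[8] + data[26:]
        out.append((tag, data))
    return out


print(fold_009([("001", "CaOWtU00000100"), ("008", "a" * 40), ("009", "0123456789")]))
```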
9. Deletion of bibliographic records
-
do not write to tape any bibliographic record if any of the items' ILOC
fields contain "31" (location "delete")
-
do not write to tape any bibliographic record if the bibliographic record
status is "deleted"
Summary Holdings
1. Extract summary holdings information tags 930 in **HOLD -as-is-.
(Endeavor to do special MFHD work on this).
2. Extract 930 data before 966 data for each bibliographic
record. BIBCONVERT needs the 930 data first.
3. Do not extract **HOLD tags 001 or 995; process only the 930 tags.
Detailed Holdings
1. Extract detailed holdings information in MARC $966 format according
to GEAC usage
Here is the 966 translation table.
2. Convert volume-level notes
During conversion discussions with Tom Owens of Endeavor, it was decided
that existing circulation item-level notes (ILOCAL $n) that are extracted
to the 966$n note fields should be mapped by Endeavor into MFHD 866 $z
tags, and that any volume-level notes (CNOTE) on the GEAC database should
be mapped to 966$a instead of 966$n; this will permit the volume-level
notes to be mapped during conversion to the item-level note on the Voyager
item record.
-
if a number in the format nnn/a/nnn (storage number) appears in the volume-level
note field (CNOTE) on the GEAC circulation system, write the entire volume-level
note field into 966 $a for every item on that volume that has the location
STOR (ILOC=5). If there already is an existing item-level note field, prefix
the volume-level note so that the storage number is always at the front of
the subfield. Do not write the storage number to a 966 if the location is not 5.
-
please note that all storage numbers are at the beginning of the CNOTE
field; the whole CNOTE field need not be examined to determine if a storage
number is present.
-
for all other volume-level notes append to the item-level note field 966$a
for all copies regardless of location
-
if an item has the missing status, append the missing type long name to
the 966$a.
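The volume-level note rules above can be sketched as one function that assembles 966 $a for a single item. Several details are assumptions, not part of the specification: the function name `build_966a`, the single-space separator, and the regular expression, which reads nnn/a/nnn as digits, one letter, digits and tests only the start of CNOTE.

```python
import re

# Assumed reading of "nnn/a/nnn": digits, one letter, digits, at the start of CNOTE.
STORAGE_RE = re.compile(r"\d+/[A-Za-z]/\d+")
STOR = 5  # ILOC code for location STOR


def build_966a(existing, cnote, iloc, missing_long_name=None):
    """Return the 966 $a text for one item."""
    parts = [existing] if existing else []
    if cnote:
        if STORAGE_RE.match(cnote):
            if iloc == STOR:
                parts.insert(0, cnote)  # storage number always leads the subfield
            # any other location: the storage note is not written
        else:
            parts.append(cnote)  # ordinary volume-level note, all locations
    if missing_long_name:
        parts.append(missing_long_name)
    return " ".join(parts)


print(build_966a("fragile", "123/b/456 basement", 5))
print(build_966a("fragile", "123/b/456 basement", 3))
```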
3. Bound-with considerations on export
On the lowest-numbered GRSN of a boundwith structure, output all the
GRSNs linked to the same item in repeatable subfields $j of the 966 item.
Use the same format as in systems tag 001. For all subsequent GRSNs linked
to that item do not write a 966 item record at all.
E.g., if a boundwith item is linked to GRSNs 100, 200 and
300, on the 966 tag for GRSN 100 ONLY write a 966:
966 $jCaOWtU00000100$jCaOWtU00000200$jCaOWtU00000300
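The bound-with rule can be sketched as below. The function name `boundwith_966j` is hypothetical, and the eight-digit zero padding is copied from the example line above rather than from a stated field width.

```python
PREFIX = "CaOWtU"  # NUC prefix used in systems tag 001


def boundwith_966j(linked_grsns, current_grsn, width=8):
    """Return the $j subfields for the lowest-numbered GRSN, else None.

    None means: write no 966 for this item on the current record.
    """
    if current_grsn != min(linked_grsns):
        return None
    return "".join(f"$j{PREFIX}{grsn:0{width}d}" for grsn in sorted(linked_grsns))


print(boundwith_966j([300, 100, 200], 100))
print(boundwith_966j([300, 100, 200], 200))
```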
4. Items to be deleted on export:
-
items with location "delete" (ILOC=31).
-
items with status "erase".
5. Mapping of item statuses from GEAC to Voyager (output the status given
in capitals in column 3.)
| GEAC item status | Missing category (IMISS) | Endeavor status |
| delete & missing | 70,76,82,87,90,94,97 | WITH (the long name of the GEAC missing status is appended to item level note field, 966 $n) |
| delete & missing | all others | do NOT extract (unless BFR exists) |
| delete | none | do NOT extract item |
| bfr & missing | other than MS2,MS3,MS4 (100,98,99) | WITH (the long name of the GEAC missing status is appended to item level note field, 966 $n) |
| bfr & missing | MS2 (100) | MISS (missing-status long-name appended to item-level notes) |
| bfr & missing | MS3 (98) | LOST (if missing, append missing-status long-name to item level notes) |
| bfr & missing | MS4 (99) | LOST (if missing, append missing-status long-name to item level notes) |
| missing | other than MS2,MS3,MS4 (100,98,99) | WITH (the long name of the GEAC missing status is appended to item level note field, 966 $n) |
| missing | MS2 (100) | MISS (missing-status long-name appended to item-level notes) |
| missing | MS3 (98) | LOST (if missing, append missing-status long-name to item level notes) |
| missing | MS4 (99) | LOST (if missing, append missing-status long-name to item level notes) |
| bfr | none | LOST |
| erase | | do not extract item (report) |
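The status table can be sketched as a small decision function. The function name, the string flags and the two sentinel values are hypothetical. Two readings are assumptions: "unless BFR exists" is taken to mean that a delete & missing item with a BFR falls through to the bfr & missing rows, and a delete item that is not missing is never extracted.

```python
DO_NOT_EXTRACT = "do-not-extract"
NOT_IN_TABLE = "not-in-table"  # combination the table does not cover
WITH_ON_DELETE = {70, 76, 82, 87, 90, 94, 97}
BY_MISSING_CODE = {100: "MISS", 98: "LOST", 99: "LOST"}  # MS2, MS3, MS4


def voyager_status(flags, imiss=None):
    """Map GEAC status flags and a missing category to a Voyager status."""
    flags = set(flags)
    if "erase" in flags:
        return DO_NOT_EXTRACT  # and report the item
    missing = "missing" in flags
    if "delete" in flags:
        if missing and imiss in WITH_ON_DELETE:
            return "WITH"
        if not (missing and "bfr" in flags):
            return DO_NOT_EXTRACT
        # delete & missing with a BFR: assumed to use the bfr & missing rows
    if missing:
        return BY_MISSING_CODE.get(imiss, "WITH")
    if "bfr" in flags:
        return "LOST"
    return NOT_IN_TABLE


print(voyager_status({"bfr", "missing"}, 98))
print(voyager_status({"delete", "missing"}, 12))
```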
6. NBAR records
The NBAR file on GEAC is exactly the same as the ITEM file; NBAR records
relate to records that have no bar-codes.
Output the 966 data for these records in the same manner as you output
records for item records; they will have the same field values except that
there will be no barcode. The pointer from CALL to NBAR is CNBAR; the back-pointer
is NCALLP. (Note: there are fewer than 500 of these NBAR records; if this is
difficult to do, we could have them created online.)