Project
Data-Conversion Profiles
1. Bibliographic-Holdings Load
The largest single component of the data-conversion process, measured in
volume of data, complexity of data manipulation and processing time, is
the load of bibliographic and holdings information. A description of the
process resides at http://library.uwaterloo.ca/~wcwoods/datacon.htm.
The contributions of the three libraries were placed on the IST server
l1.uwaterloo.ca by March 1, along with an LC Names Authority file and an
LC Subject Authority file. Voyager conversion processes began March 2 and
continued through March 7. Cataloguers and Systems staff began analysis
on Monday, March 9. A reload of MFHD data was conducted on March 12.
The TRELLIS Cataloguing Module went live Monday, March 23. Restrictions
on deletion of records were lifted March xx, 1998.
2. Vendor Load
The load of vendor records from the University of Guelph and the University
of Waterloo occurred on March xx, 1998.
3. Patron Load
- Input Profile
- Output Profile
- Analysis
4. Transaction Load
- Input Profile
- Output Profile
- Analysis
1. Input Profiles for Bibliographic-Holdings Load
- Authority files:
| LC Authority Files | Subject | Names |
| TUG-Wide totals | 238,962 | 4,209,780 |
- University of Waterloo:
| Record Type | MARC records | Items | Barcodes |
| LC pt. 1 | 441,806 | 854,461 | |
| LC pt. 2 | 662,421 | 847,966 | |
| CODOC | 213,123 | 399,314 | |
| Rare Books | 33,580 | 38,521 | |
| Reserves | 5,018 | 8,157 | |
| Totals: | 1,315,948 | 2,148,419 | 2,148,259 |
- The final UW MasterFile report on the GEAC WATCAT database yields
a more detailed analysis of the contributing database.
- University of Guelph:
| Type | MARC records | Items | Barcodes |
| Regular | 688,052 | 1,219,401 | 787,736 |
| Docs | 343,581 | 476,075 | 360,100 |
| Other | 59,105 | 66,329 | 0 |
| Reserve | 26,100 | n/a | 0 |
| Totals: | 1,116,838 | 1,761,805 | 1,147,836 |
- Wilfrid Laurier University:
| MARC Records | MFHDs | Items | Barcodes |
| 582,748 | 602,435 | 765,523 | 598,464 |
TUG-Wide Totals of Incoming MARC and Item records:
| Contributor | MARC Records | Item Records |
| University of Waterloo: | 1,315,948 | 2,148,419 |
| University of Guelph: | 1,116,838 | 1,761,805 |
| Wilfrid Laurier University: | 582,748 | 765,523 |
| Totals: | 3,015,534 | 4,675,747 |
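A column-sum check is a quick way to validate the input tables in this section. The sketch below is illustrative only: the figures are copied from the University of Waterloo input table, and the helper name `column_gaps` is hypothetical. With the figures as printed, the Items column agrees with its Totals row, while the five MARC rows add up to 40,000 more than the printed total of 1,315,948. The later sections consistently use 1,315,948, so one of the row figures may be misprinted; this page does not say which.

```python
# Rows copied from the University of Waterloo input table: (MARC records, Items).
# Barcodes are reported only as a total, so that column is omitted here.
uw_rows = {
    "LC pt. 1": (441_806, 854_461),
    "LC pt. 2": (662_421, 847_966),
    "CODOC": (213_123, 399_314),
    "Rare Books": (33_580, 38_521),
    "Reserves": (5_018, 8_157),
}
uw_totals = (1_315_948, 2_148_419)  # printed Totals row


def column_gaps(rows, totals):
    """Return (sum of rows - printed total) for each column."""
    sums = [sum(col) for col in zip(*rows.values())]
    return [s - t for s, t in zip(sums, totals)]


marc_gap, item_gap = column_gaps(uw_rows, uw_totals)
print(f"MARC records gap: {marc_gap:,}")
print(f"Items gap: {item_gap:,}")
```

The same helper can be pointed at the University of Guelph table, where all three columns agree with their printed totals.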
Output Profile for Bibliographic-Holdings Load:
Bibliographic Records per Owning Library:
| Library | Bibliographic Records | MFHDs | Items | Barcodes |
| Guelph | 1,116,550 | 1,212,445 | 1,802,829 | 1,192,247 |
| Laurier | 582,747 | 602,387 | 765,475 | 598,418 |
| Waterloo | 1,315,784 | 1,373,851 | 2,174,630 | 2,167,519 |
| Totals: | 3,015,081 | 3,188,683 | 4,742,934 | 3,958,184 |
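The differences examined in the Analysis section follow directly from the input and output tables. As a minimal sketch (the `Counts` type and `deltas` helper are hypothetical names; the figures are those printed in the tables), output minus input per library can be computed as:

```python
from typing import NamedTuple


class Counts(NamedTuple):
    bibs: int
    items: int


# Input totals from the TUG-wide input table; output totals from the output profile.
loaded_in = {
    "Waterloo": Counts(1_315_948, 2_148_419),
    "Guelph": Counts(1_116_838, 1_761_805),
    "Laurier": Counts(582_748, 765_523),
}
loaded_out = {
    "Waterloo": Counts(1_315_784, 2_174_630),
    "Guelph": Counts(1_116_550, 1_802_829),
    "Laurier": Counts(582_747, 765_475),
}


def deltas(before, after):
    """Output minus input per library; a negative value means records were lost."""
    return {
        lib: Counts(after[lib].bibs - before[lib].bibs,
                    after[lib].items - before[lib].items)
        for lib in before
    }


result = deltas(loaded_in, loaded_out)
for lib, d in result.items():
    print(f"{lib:9} bibs {d.bibs:+,}  items {d.items:+,}")
```

Bibliographic counts fall slightly for every library, while item counts rise for Waterloo and Guelph and fall for Laurier.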
Further Profiles:
Analysis:
University of Waterloo:
MARC records
- Input MARC records = 1,315,948
- Output records = 1,315,784
- Difference = 164 records
- Reconciliation:
All 164 can be accounted for. Tom Owens identified 164 records that
were missing the mandatory tag 245; the conversion process cannot load records
missing this essential tag. Of these, 19 were reserves records that
should not have been submitted; the other 145 were "pseudo" MARC
records created for individual titles whose combined record
size exceeded the MARC standard maximum. These titles were split into multiple
MARC records by the UW extract program and reassembled into single MARC
records by the Endeavor conversion program.
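The split-and-reassemble step can be illustrated with a small sketch. This is not the UW extract program or the Endeavor conversion program, whose internals are not described here. It assumes a record is a list of (tag, value) pairs, uses a simplified size measure (tag plus value bytes, ignoring the leader, directory and terminators), and uses tag 966 for item fields purely as an example. The 99,999-byte ceiling reflects the five-digit record-length field in the MARC leader.

```python
MAX_RECORD_BYTES = 99_999  # the MARC leader stores record length in 5 digits


def split_fields(fields, limit=MAX_RECORD_BYTES):
    """Greedily pack (tag, value) fields into chunks that fit within the limit.

    A single field larger than the limit still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for tag, value in fields:
        field_size = len(tag) + len(value.encode("utf-8"))
        if current and size + field_size > limit:
            chunks.append(current)
            current, size = [], 0
        current.append((tag, value))
        size += field_size
    if current:
        chunks.append(current)
    return chunks


def reassemble(chunks):
    """Concatenate the chunks back into one field list, preserving order."""
    return [field for chunk in chunks for field in chunk]


# Toy record with a deliberately small limit so that the split is visible.
fields = [("245", "Title")] + [("966", "x" * 40)] * 5
chunks = split_fields(fields, limit=100)
print([len(c) for c in chunks])
print(reassemble(chunks) == fields)
```

In this sketch only the first chunk carries the 245 field. If the real extract behaved similarly, continuation records would lack tag 245, which may be why the pseudo records appear among those missing it.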
MFHD records
- Input MFHDs = 0
- Output MFHDs = 1,373,851
- Reconciliation:
MFHD record structures did not exist on the GEAC system; the MFHD is a new
data structure on the Voyager system.
Item Records
- Input Item records: 2,148,419
- Output Item records: 2,174,630
- Difference: 26,211 (increase)
- Reconciliation:
The load appears to have created 26,211 new items:
  - 6,960 items in the "no966" error log represent MFHD records that have
    no item records; a pseudo item record with no barcode was created for
    each of them.
  - 19,260 records are "boundwith" items. These reverse the normal bib-item
    relationship: instead of one bib with many items, a boundwith has many
    bibs sharing one item. When items are counted per owning library via
    Cognos, these records are counted at least twice when they should be
    counted once.
  - Together these two categories total 26,220, which is 9 more than the
    observed increase; these 9 items remain unaccounted for.
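The item reconciliation above is simple arithmetic, sketched here with the figures from the text (the variable names are illustrative):

```python
observed_increase = 2_174_630 - 2_148_419  # output items minus input items

explained = {
    "no966 pseudo items": 6_960,
    "boundwith items": 19_260,
}
explained_total = sum(explained.values())

# A negative residual means the explanations cover more than the observed increase.
residual = observed_increase - explained_total
print(f"observed {observed_increase:,}, explained {explained_total:,}, residual {residual:+,}")
```

The residual is negative: the two categories together overshoot the observed increase by the 9 items noted above.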
Barcodes
- Input barcode records: 2,148,259
- Output barcode records: 2,167,519
- Difference: 19,260 (increase)
- Reconciliation: this matches the 19,260 boundwiths in the
UW contribution described above. Again, these are counted at least twice
when quantified by bibliographic owning library.
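The double counting of boundwiths can be shown with a toy example. The link table below is hypothetical and does not reflect the actual Voyager schema or the Cognos report: it assumes one row per bib-item link, so a boundwith item appears once for every bib it is attached to.

```python
from collections import Counter

# Hypothetical link rows: (bib_id, owning_library, item_id).
# Item 9001 is a boundwith: one physical volume linked to three bibs.
bib_item_links = [
    (1, "Waterloo", 5001),
    (1, "Waterloo", 5002),
    (2, "Waterloo", 9001),
    (3, "Waterloo", 9001),
    (4, "Waterloo", 9001),
]

# Counting link rows per owning library counts the boundwith once per bib.
per_link = Counter(lib for _, lib, _ in bib_item_links)

# Counting distinct (library, item) pairs counts each physical item once.
distinct_pairs = {(lib, item) for _, lib, item in bib_item_links}
per_item = Counter(lib for lib, _ in distinct_pairs)

print(per_link["Waterloo"], per_item["Waterloo"])
```

Counting by link rows gives a larger total than counting distinct items, which is the kind of inflation described for items and barcodes when they are quantified by owning library.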
Wilfrid Laurier University:
MARC records
- Input MARC records = 582,748
- Output records = 582,747
- Difference = 1 record
- Reconciliation:
MFHD records
- Input MFHDs = 602,435
- Output MFHDs = 602,387
- Difference = 48
- Reconciliation:
WLU MFHD counts for input and output are remarkably close because WLU programming
staff created the MFHD data before submission to Endeavor.
Item Records
- Input Item records: 765,523
- Output Item records: 765,475
- Difference: 48
- Reconciliation: this is the same difference of 48 seen in the MFHD reconciliation.
Barcodes
- Input barcode records: 598,464
- Output barcode records: 598,418
- Difference: 46
- Reconciliation: this approximately equals the differences in MFHDs and
Items (48 each).
University of Guelph:
MARC records
- Input MARC records = 1,116,838
- Output records = 1,116,550
- Difference = 288 records
- Reconciliation: An analysis of the printout of the records that failed
the load reveals problems such as a missing $a in tags 110, 245, 260, 440,
and 800; two 110s or two 245s in one record; and blanks or digits before
the $a or other subfield codes.
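The kinds of failure listed above lend themselves to a pre-load check. The sketch below is an assumption-laden illustration, not the Endeavor loader's actual validation: it treats a record as a list of (tag, data) pairs in which `data` is the field text after the indicators, written with `$` as the subfield delimiter as in this report. The function name `load_problems` is hypothetical.

```python
from collections import Counter

NEEDS_SUBFIELD_A = {"110", "245", "260", "440", "800"}  # tags named in the analysis
NOT_REPEATABLE = {"110", "245"}                          # at most one per record


def load_problems(fields):
    """Return a list of problems found in one record given as (tag, data) pairs."""
    problems = []
    tag_counts = Counter(tag for tag, _ in fields)
    if tag_counts["245"] == 0:
        problems.append("missing 245")
    for tag in sorted(NOT_REPEATABLE):
        if tag_counts[tag] > 1:
            problems.append(f"repeated {tag}")
    for tag, data in fields:
        # Blanks or digits ahead of the first subfield code.
        if not data.startswith("$"):
            problems.append(f"{tag}: text before first subfield code")
        if tag in NEEDS_SUBFIELD_A and "$a" not in data:
            problems.append(f"{tag}: missing $a")
    return problems


good = [("245", "$aA title"), ("260", "$aGuelph")]
bad = [("245", " 1$aA title"), ("245", "$bSubtitle only"), ("260", "$bPublisher")]
print(load_problems(good))
print(load_problems(bad))
```

Running a check of this kind on the extract before submission would surface such records ahead of the load rather than in the failure printout.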
MFHD records
- Input MFHDs = 0
- Output MFHDs = 1,212,445
- Reconciliation:
MFHD record structures did not exist on the Guelph system; the MFHD is a
new data structure on the Voyager system.
Item Records
- Input Item records: 1,761,805
- Output Item records: 1,802,829
- Difference: 41,024 (increase)
- Reconciliation:
Probably due to the pseudo MFHD process described above. L. Edwards reports:
The increase may be accounted for by the following:
A few categories, such as some Career Centre and map records with 930 but no
966's, will have created extra items, but the numbers would be quite small.
The Guelph item id count by item type for UGOFFRES, which is the item type
for Reserve personal materials, is 42,919. In "1. Input Profiles for
Bibliographic-Holdings Load", Reserve items are listed as n/a.
Because these 42,919 items exceed the observed increase of 41,024, this would
mean an overall decrease of 1,895 items. I can account for 263
items that could not be loaded from the 32 "badbib" records in the
list that was received. Perhaps the remaining items were attached to the
288 bibliographic records that did not load.
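The arithmetic behind this reconciliation can be sketched as follows, using only the figures quoted above (the variable names are illustrative):

```python
net_increase = 1_802_829 - 1_761_805  # output items minus input items
reserve_items = 42_919                # UGOFFRES items, absent from the input profile

# If the Reserve items had been counted on input, the load would show a net loss.
shortfall = reserve_items - net_increase

badbib_items = 263                    # items on the 32 "badbib" records
still_unexplained = shortfall - badbib_items

print(f"net increase {net_increase:,}, shortfall {shortfall:,}, "
      f"still unexplained {still_unexplained:,}")
```

The last figure is the number of items that the report suggests may have been attached to the 288 bibliographic records that did not load.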
Barcodes
- Input barcode records: 1,147,836
- Output barcode records: 1,192,247
- Difference: 44,411 (increase)
- Reconciliation:
2. Vendor Load
| University of Waterloo Input | 6,284 records |
| University of Guelph Input | 9,101 |
| Total Input | 15,385 |
| Total on TRELLIS | xx,xxxx |
| TRELLIS Project |
March 18, 1998