WA Secretary of State Wikis
Modified on 2011/01/31 14:43 by Claire. Categorized as Documentation.

NDNP Batch File Structure

Directory structure: batchID/snXXXXXXX/reel_barcode_number/issue_date_edition
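
As a rough illustration of this layout, the sketch below joins the four levels into one relative path. The helper name `issue_dir` and the batch ID `batch_example` are hypothetical; the LCCN, reel barcode, and issue values are the examples used on this page.

```python
from pathlib import PurePosixPath


def issue_dir(batch_id, lccn, reel_barcode, issue_date_edition):
    """Join the four levels as batchID/LCCN/reel barcode/issue (hypothetical helper)."""
    return PurePosixPath(batch_id, lccn, reel_barcode, issue_date_edition)


# "batch_example" is a made-up batch ID used only for illustration.
path = issue_dir("batch_example", "sn87093449", "00211100503", "1896022601")
print(path)  # batch_example/sn87093449/00211100503/1896022601
```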

Naming schemes and examples

  • batchID = the group of reels scanned together
    • BATCH.xml = delivery batch manifest data
    • BATCH_1.xml = validated version of delivery batch manifest data

  • snXXXXXXX (7 to 10 digits) = LCCN, the Library of Congress Control Number for the newspaper title (e.g. sn87093449: Daily Republican)

  • 00211100503 (11 digits) = NDNP reel barcode number
    • 00211100503.xml = reel metadata
    • 00211100503_1.xml = validated version of reel metadata

  • 1896022601 = YYYYMMDDEE: issue date (e.g. 18960226 for 1896-02-26) followed by the edition number (e.g. 01)
    • 1896022601.xml = issue metadata (where you can find the date, title, volume, issue, and page correlations)
    • 1896022601_1.xml = validated version of issue metadata
    • 0016.xml = OCR data for the page (the file name is the sequence number of the scan)
    • 0016_1.xml = validated version of page data
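
The naming schemes above can be sketched as two small parsers. This is a minimal sketch, not part of any NDNP tooling: the names `parse_issue_dir` and `split_xml_name` are hypothetical, and it assumes the edition number is always two digits and that validated files always end in `_1.xml`, as in the examples on this page.

```python
import re
from datetime import date

# Assumption: issue directories are 8 date digits plus a 2-digit edition.
ISSUE_DIR = re.compile(r"(\d{4})(\d{2})(\d{2})(\d{2})")
# Assumption: a trailing "_1" before ".xml" marks the validated version.
XML_NAME = re.compile(r"(?P<stem>.+?)(?P<validated>_1)?\.xml")


def parse_issue_dir(name):
    """Split an issue directory name into (issue date, edition number)."""
    m = ISSUE_DIR.fullmatch(name)
    if m is None:
        raise ValueError(f"not an issue directory name: {name!r}")
    year, month, day, edition = (int(g) for g in m.groups())
    return date(year, month, day), edition


def split_xml_name(filename):
    """Return (base name, is_validated) for a batch, reel, issue, or page file."""
    m = XML_NAME.fullmatch(filename)
    if m is None:
        raise ValueError(f"not an XML file name: {filename!r}")
    return m.group("stem"), m.group("validated") is not None


print(parse_issue_dir("1896022601"))   # (datetime.date(1896, 2, 26), 1)
print(split_xml_name("0016_1.xml"))    # ('0016', True)
print(split_xml_name("1896022601.xml"))  # ('1896022601', False)
```

Parsing the date with `datetime.date` also rejects impossible dates (for example month 13), which may be a useful sanity check before validation.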

Batch Validation

See Workflow::Validation

Technical Documentation

See Grant Admin::Technical Documentation
