WA Secretary of State Wikis

Page History: Grant 2 Batches

Page Revision: 2012/03/14 09:14



Batch details

american

  • Includes Wenatchee Daily World and Labor Journal
    • See titles and reel list on spreadsheet
  • Due to an error on doc# 38, had to leave reel 00211108630 out of the batch (deleted its reel and issue data from BATCH.xml)
  • Manually named the batch in BATCH.xml (batch_wa_american)
  • Validation reported integrity errors. I ran validation on the command line and learned that jp2 files were missing (see Sharepoint issue)
  • Re-exported 7 reels and added the missing jp2s one at a time.
  • Noticed some oddly named issues (batch_wa_american\sn88085620\00211107601\2010081901 and batch_wa_american\sn88085620\00211107601\2011062601)
    • After running QA_Report, realized that many issue dates are wrong - see QA Notes for each title.
    • Wenatchee and Labor Journal
  • Six reels are missing their tech targets, e.g. SEVERE: VALIDATION: Failure: np:techtarget div must have 5 np:target div children; /mets1/structMap1/div1/div1 (F:\docWORKS\OUT\WA_2010\batch_wa_american\sn88085620\00211107625\00211107625.xml, 2011-07-06T05:12:56.890-0700) (see Sharepoint issue)
    • Targets need to be part of an issue container. (I don't fully understand this: tested it with Shawn, and sometimes all targets are included in the reel and sometimes not. It actually seems like a bug; see the unhelpful screen shot.)
  • Caption, Edition labels and Date As Labeled are not exported from DB to METS - see Sharepoint discussion.
    • Decided to leave Spokane Press out of the batch for now since it has most of the edition needs. Sent back the VH and added more Wenatchee reels.
    • To add Caption data to reels, follow these directions:
      For each section, insert a dmdSec tag after the issuemodsBib1 tag. Example:
      <dmdSec ID="sectionModsBib1">
        <mdWrap MDTYPE="MODS" LABEL="Section metadata">
          <xmlData>
            <mods:mods>
              <mods:part>
                <mods:detail type="section label">
                  <mods:number>sectionLabel goes here (optional)</mods:number>
                </mods:detail>
              </mods:part>
            </mods:mods>
          </xmlData>
        </mdWrap>
      </dmdSec>

      Then modify the structure map for the issue (found further below in the mets xml file). Revising the structure map will allow you to designate where the section label occurs in the hierarchy of the issue, as well as pinpoint which particular pages are part of that section. Example:
      <structMap xmlns:np="urn:library-of-congress:ndnp:mets:newspaper">
        <div TYPE="np:issue" DMDID="issueModsBib">
          <div TYPE="np:page" DMDID="pageModsBib1">
            <fptr FILEID="masterFile1" />
            <fptr FILEID="serviceFile1" />
            <fptr FILEID="otherDerivativeFile1" />
            <fptr FILEID="ocrFile1" />
          </div>
          <div TYPE="np:page" DMDID="pageModsBib2">
            <fptr FILEID="masterFile2" />
            <fptr FILEID="serviceFile2" />
            <fptr FILEID="otherDerivativeFile2" />
            <fptr FILEID="ocrFile2" />
          </div>
          <div TYPE="np:page" DMDID="pageModsBib3">
            <fptr FILEID="masterFile3" />
            <fptr FILEID="serviceFile3" />
            <fptr FILEID="otherDerivativeFile3" />
            <fptr FILEID="ocrFile3" />
          </div>
          <div TYPE="np:page" DMDID="pageModsBib4">
            <fptr FILEID="masterFile4" />
            <fptr FILEID="serviceFile4" />
            <fptr FILEID="otherDerivativeFile4" />
            <fptr FILEID="ocrFile4" />
          </div>
          <!-- add this opening section div -->
          <div TYPE="np:section" DMDID="sectionModsBib1">
            <div TYPE="np:page" DMDID="pageModsBib5">
              <fptr FILEID="masterFile5" />
              <fptr FILEID="serviceFile5" />
              <fptr FILEID="otherDerivativeFile5" />
              <fptr FILEID="ocrFile5" />
            </div>
            <div TYPE="np:page" DMDID="pageModsBib6">
              <fptr FILEID="masterFile6" />
              <fptr FILEID="serviceFile6" />
              <fptr FILEID="otherDerivativeFile6" />
              <fptr FILEID="ocrFile6" />
            </div>
            <div TYPE="np:page" DMDID="pageModsBib7">
              <fptr FILEID="masterFile7" />
              <fptr FILEID="serviceFile7" />
              <fptr FILEID="otherDerivativeFile7" />
              <fptr FILEID="ocrFile7" />
            </div>
            <div TYPE="np:page" DMDID="pageModsBib8">
              <fptr FILEID="masterFile8" />
              <fptr FILEID="serviceFile8" />
              <fptr FILEID="otherDerivativeFile8" />
              <fptr FILEID="ocrFile8" />
            </div>
          </div> <!-- add this closing section div -->
        </div>
      </structMap>

      The first four pages are part of the traditional newspaper issue and the last four pages are part of a new section label. You will notice that after pageModsBib4 there is an opening tag for the sectionModsBib1 section label. If you scroll down you will also notice that the section label closes after pageModsBib8, which associates pages 5, 6, 7, and 8 with the section label.
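
If many issues need a section label, the two manual edits above (a new dmdSec plus a np:section wrapper in the structure map) could be scripted. The following is only a rough sketch, not part of the docWORKS export or the NDNP tools: the function name add_section, its parameters, and the default issue ID "issueModsBib" (taken from the structMap example) are assumptions, and it only handles page divs that sit directly under the np:issue div.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"


def local(tag):
    """Return an element's tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def add_section(root, first_page, last_page, label,
                section_id="sectionModsBib1", issue_dmd_id="issueModsBib"):
    """Add a section dmdSec and wrap pages first_page..last_page (1-based).

    Hypothetical helper: the IDs default to the names used in the wiki example.
    """
    parent_of = {child: parent for parent in root.iter() for child in parent}

    # 1. Build the section dmdSec and insert it right after the issue dmdSec.
    issue_dmd = next(e for e in root.iter()
                     if local(e.tag) == "dmdSec" and e.get("ID") == issue_dmd_id)
    ns = issue_dmd.tag[:-len("dmdSec")]  # '{...}' if namespaced, else ''
    dmd = ET.Element(ns + "dmdSec", {"ID": section_id})
    wrap = ET.SubElement(dmd, ns + "mdWrap",
                         {"MDTYPE": "MODS", "LABEL": "Section metadata"})
    xml_data = ET.SubElement(wrap, ns + "xmlData")
    mods = ET.SubElement(xml_data, f"{{{MODS_NS}}}mods")
    part = ET.SubElement(mods, f"{{{MODS_NS}}}part")
    detail = ET.SubElement(part, f"{{{MODS_NS}}}detail",
                           {"type": "section label"})
    ET.SubElement(detail, f"{{{MODS_NS}}}number").text = label
    container = parent_of[issue_dmd]
    container.insert(list(container).index(issue_dmd) + 1, dmd)

    # 2. Wrap the chosen page divs in a new np:section div, keeping page order.
    issue_div = next(e for e in root.iter()
                     if local(e.tag) == "div" and e.get("TYPE") == "np:issue")
    pages = [d for d in issue_div if d.get("TYPE") == "np:page"]
    chosen = pages[first_page - 1:last_page]
    if not chosen:
        raise ValueError("page range selects no np:page divs")
    section = ET.Element(issue_div.tag,
                         {"TYPE": "np:section", "DMDID": section_id})
    issue_div.insert(list(issue_div).index(chosen[0]), section)
    for page in chosen:
        issue_div.remove(page)
        section.append(page)
    return root
```

For the example above, the call would be add_section(root, 5, 8, "sectionLabel goes here") on the parsed issue METS file. Writing the tree back out with ElementTree may change namespace prefixes, so the result should be re-validated before it goes into a batch.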

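The missing jp2 files noted in the validation bullets above could be listed before running the validator. This is a minimal sketch under one assumption that should be checked against the actual export: every page .tif in the batch folder is expected to have a .jp2 with the same file name stem in the same folder. The function name missing_jp2s is made up for illustration.

```python
from pathlib import Path


def missing_jp2s(batch_dir):
    """Return sorted .tif paths under batch_dir that have no sibling .jp2.

    Assumption: a page's jp2 sits next to its tif and shares its stem
    (for example 0001.tif and 0001.jp2).
    """
    missing = []
    for tif in Path(batch_dir).rglob("*.tif"):
        if not tif.with_suffix(".jp2").exists():
            missing.append(tif)
    return sorted(missing)
```

Calling missing_jp2s on the batch folder (for example the batch_wa_american directory) and printing each returned path gives a list of pages to re-export.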

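The tech target failure quoted above (np:techtarget div must have 5 np:target div children) can also be checked per reel before validation. The sketch below is an illustration only: the function name techtarget_problems is hypothetical, and the only rule it encodes is the one stated in the validator message.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return an element's tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def techtarget_problems(root, expected=5):
    """Return a list of problems found with np:techtarget divs in a reel METS."""
    techtargets = [e for e in root.iter()
                   if local(e.tag) == "div" and e.get("TYPE") == "np:techtarget"]
    if not techtargets:
        return ["no np:techtarget div found"]
    problems = []
    for techtarget in techtargets:
        # Count only direct children typed as np:target.
        count = sum(1 for child in techtarget
                    if child.get("TYPE") == "np:target")
        if count != expected:
            problems.append(
                f"np:techtarget div has {count} np:target children, "
                f"expected {expected}")
    return problems
```

A reel file such as 00211107625.xml could be checked with techtarget_problems(ET.parse(path).getroot()); an empty list means the count matches what the validator asks for.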
bumping


columbia

  • Currently in progress
  • Includes The Evening Statesman (Walla Walla) and the Spokane Press
  • Contains nine reels


2010-2012 batch data

See Metadata spreadsheet on skydrive
