Biographies

hermann_hahn.png

A biography of Hermann Hahn, taken from page 617 of Priester unter Hitlers Terror: Eine Biographische und Statistische Erhebung, Volume II. Hermann Hahn was a German priest who also appears as an entry in the Excel spreadsheet with which I am working.

As a secondary dataset, I have mined the book Priester unter Hitlers Terror: Eine Biographische und Statistische Erhebung by Ulrich von Hehl, Christoph Kösters, Petra Stenz-Maur, and Elisabeth Zimmermann. The title translates to Priests under Hitler's Terror: A Biographical and Statistical Inquiry. This two-volume work contains, among other statistical information, a series of biographies of German priests who were victims of the Holocaust, including both those who were murdered and those who survived. Significantly, 445 of the clergy listed in the Excel spreadsheet were German nationals, and a fraction of these clergy have biographies in the two volumes.

To the right, I have included a picture of the biography of Hermann Hahn, who is listed in the Excel spreadsheet. Both volumes are written in German; fortunately, I have a working reading proficiency in German and can translate the essentials of the text, though I am not fluent.

Extracting clergy biographies from the two volumes of Priester unter Hitlers Terror: Eine Biographische und Statistische Erhebung requires more steps than any of my other datasets, because the source consists of physical books that must first be converted into a digital data structure. The main steps are as follows:

  • identifying and scanning the pages of the volumes containing the relevant biographies
  • using ABBYY FineReader to perform OCR on these scanned pages
  • using regular expressions to extract biographical information from the OCR text files into a data structure that is well suited to being merged with the Excel file (my primary dataset)
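
To give a rough sense of what the regular-expression step might look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not the code used in this project: the entry layout ("Surname, Given names, * DD.MM.YYYY Place"), the sample text with its placeholder names, and the names ENTRY_RE and extract_entries are all hypothetical. The actual layout of the OCR output will differ and the pattern would need to be adapted to it.

```python
import re

# ASSUMPTION: each biography begins on its own line in the form
#   "Surname, Given names, * DD.MM.YYYY Place"
# This layout is invented for illustration; the real OCR text will differ.
ENTRY_RE = re.compile(
    r"^(?P<surname>[^\W\d_]+(?:[ -][^\W\d_]+)*),[ \t]+"   # letters, incl. umlauts
    r"(?P<given>[^\W\d_]+(?:[ -][^\W\d_]+)*),[ \t]+"
    r"\*[ \t]*(?P<day>\d{1,2})\.(?P<month>\d{1,2})\.(?P<year>\d{4})"
    r"(?:[ \t]+(?P<place>[^,;\n]+))?",                    # optional birthplace
    re.MULTILINE,
)


def extract_entries(ocr_text):
    """Return one dict per matched entry, ready to become a table row."""
    rows = []
    for m in ENTRY_RE.finditer(ocr_text):
        # Normalise the German-style date to ISO format (YYYY-MM-DD).
        birth_date = "{:04d}-{:02d}-{:02d}".format(
            int(m.group("year")), int(m.group("month")), int(m.group("day"))
        )
        rows.append({
            "surname": m.group("surname"),
            "given_names": m.group("given"),
            "birth_date": birth_date,
            "birth_place": (m.group("place") or "").strip(),
        })
    return rows


# Placeholder text only; these are not real entries from the book.
SAMPLE = (
    "Mustermann, Max, * 1.2.1900 Musterstadt; Pfarrer\n"
    "a continuation line that should be ignored\n"
    "Beispiel, Johann Georg, * 15.11.1895 Beispielheim\n"
)

rows = extract_entries(SAMPLE)
for row in rows:
    print(row)
```

Each resulting dictionary corresponds to one row, so a list of them can be written to a CSV file or loaded into a table and then matched against the Excel file on fields such as surname and birth date. OCR output often contains misread characters, so in practice a pattern like this would probably need to be loosened, and its matches checked by hand.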

Because it is instructive to understand how these steps are performed, I elaborate on what each of them entails in the relevant sub-pages of this section.

IMG_1977.JPG

Physical Copies of Both Volumes of "Priester unter Hitlers Terror: Eine Biographische und Statistische Erhebung"