Sunday, March 4, 2012

Checking The Whole Page


In my last post, I shared how I was discovering new bits and pieces of my family's history through a more thorough search of local newspapers. The newspaper, the Toronto Star in this case, has digitized every page of every edition from 1894 onward, spanning about 116 years. The newspaper is important to my family because Toronto is the city in which several generations of both my maternal and paternal families lived.

The digital copies of the newspaper pages are in PDF format and are searchable by keyword, exact phrase or Boolean query (like a Google search). When I find a page with an article or a birth, marriage or death notice about a family member, I save it as a PDF on my computer and then attach the file to the person and event or fact in my genealogy software program. Overall, it's a labour-intensive process to go through the hundreds of pages of 'hits' in the various search results I receive, but well worth it.

One of the search features is the highlighting in yellow of the search term on a viewed newspaper page. For example, if I were searching for "Hadden," the search engine would, or should, provide me with all pages in the time period (a maximum of five years) containing my search term. I've come to learn that my tendency to quickly find the highlighted reference, determine its connection to my family, and then move on limits the potential for results.

The best example I can offer occurred when I was searching for "O'Neill" (my maternal family) references. In the Saturday, August 24, 1957 edition of the newspaper, the search engine provided me with an O'Neill 'hit.' The search term O'Neill was highlighted in an obituary for a person who is not connected at all to my family, but whose funeral was being held in the chapel of the "L.E. O'Neill" funeral home.

If I had quickly moved on to examine the next search result, I would not have noticed, elsewhere on the page, an article about the death of Herbert Caskey, the father-in-law of my wife's cousin, Louis Orville Breithaupt. The headline for the article, "Herbert Caskey, 94 Dies At U.S. Home," takes up almost as much space as the two short paragraphs that follow.




Dated August 24 at Asheville, North Carolina, the article states: "Herbert K. Caskey, father-in-law of Ontario's lieutenant-governor, Louis O. Breithaupt, died at his home here today. He was 94. Mr. Caskey, who lived in Toronto in his early years, had spent many years of retirement here. His wife died here a year ago. Besides Mrs. Breithaupt, he is survived by a son, Paul, of Rockport, Ill."

Experience tells me that OCR, the optical character recognition technology used in this sort of newspaper database, is not yet perfect, and neither is the quality of the scanned images that are included for searching. Only by examining the whole of a page can I really determine if it contains information that is of importance to me. Lesson learned - so no more short cuts!
