Optical character recognition is improving, making it easier for medical laboratories to scan paper documents and convert that data into digital information

Endless flows of paper are the curse of clinical laboratories and anatomic pathology groups everywhere. Few medical laboratory organizations in the United States have successfully transitioned to a fully paperless environment.

But there is good news for pathologists and clinical lab managers who feel overwhelmed by the flood of paper test requisitions and other documents that flows into their labs every day. Several active trends could allow more medical laboratories to eliminate paper and achieve a true digital working environment.

How Clinical Laboratories Can Capture Data Contained on Paper Documents

One trend involves new technologies specifically designed to make it easier to digitally capture information currently only found on paper documents. Associated with this is the trend of increased use of the Internet and cloud-based solutions by physicians, payers, and patients. When individuals log into such a digital system to provide information and access data, it eliminates the need for paper.

Most clinical laboratories and pathology groups would like to reduce and even eliminate paper records. New capabilities for optical character recognition (OCR) are improving the accuracy and speed of capturing data from paper and converting it into structured data that can be archived and accessed digitally. (Image by HealthcareIT.com.)

Another trend that complements the two trends mentioned above is the movement of hospitals and physicians away from paper records and onto electronic medical record (EMR) systems. To accelerate this trend, the federal government is offering financial incentives for providers who adopt electronic health record (EHR) systems and meet “meaningful use” criteria.

Collectively, these trends will have a direct consequence for how office-based physicians practice medicine. Experts say that adoption of EHRs by large numbers of physicians will force other physicians to do the same thing. “If practitioners all around you are using health IT at a certain level, you are going to have a hard time collaborating with your peers if you don’t [also use health IT],” stated Robert Kocher, M.D., co-author of an article in the New England Journal of Medicine (NEJM), to a reporter for a story published by The New York Times.

This creates an opportunity for clinical laboratories and pathology groups. By developing strategies that allow them to evolve into fully-paperless environments, they will be prepared to add value to client physicians who use EMRs within their daily practice environment.

Paper Abounds Across Healthcare and Clinical Laboratories

Paper continues to abound in all sectors of healthcare for a simple reason. A great deal of information is still produced on paper. Integrating this paper-based information into an EHR for digital archiving, search, retrieval and analysis is a difficult, expensive and time-consuming task.

“Different types of medical information can be broadly categorized as ‘structured data’ and ‘unstructured data’,” stated David Rasmussen, President of Extract Systems, based in Madison, Wisconsin. “Medical laboratories need different capabilities to capture and handle each of these two types of data in order to become a fully digital organization.”

Structured data is information that comes in numbers, tables and rows. This is the easiest type of data for computers to capture, archive, and manipulate. Unstructured data is generally found in a narrative, but can also include handwritten notes, charts and various medical images like digital pathology images, x-rays and CT scans.

“You can further break down unstructured data into ‘textual unstructured data’ and ‘non-textual unstructured data’,” observed Rasmussen. “Another way to define unstructured data is any data that does not reside in an organization’s information systems.”

“External documents are one source of unstructured data,” he continued. “Any type of document or information that comes from outside the system’s format is external. In other words, it is an external document if the information comes via paper and must be manually handled and/or significantly modified before it can be used in the electronic system.”

Paper and “external documents” present a significant challenge to modern healthcare institutions and clinical laboratories. Tom Murray and Laura Berberian, in a Computerworld blog from March 31, 2011 titled, “The Importance of Structured Data Elements in EHRs,” stated, “The general physician and hospital population have hastily ‘automated’ the paper process and in doing so, have compromised creating standards and charting using structured data. The end result is what exists today:

  • “patient medical records comprised of ‘data dumps’ from different systems;
  • “the seemingly insurmountable task of meeting meaningful use requirements set forth by the American Recovery and Reinvestment Act of 2009; and,
  • “major obstacles in sharing healthcare information seamlessly between the entities within HIEs.”

So what are the solutions when clinical laboratories must handle external documents? The first, and most basic, is to do nothing. Take the paper and dump it into a filing cabinet or a file box that is picked up and sent to a storage facility to live for many years, with the lab required to pay for that document storage. The information on this paper is essentially worthless that way, but it’s easy and cheap.

The second solution typically involves having a person manually enter the data into the health IT system. But this is a costly and time-consuming task. It also often generates additional errors.

A third solution is to scan the documents and attach them to the patient’s laboratory test requisition. “Although this provides a solution, it is a halfway solution,” noted Rasmussen. “After all, the scanned information is not searchable by the computer system. It is a ‘blob’ of unstructured data and likely can only be viewed as an image on a computer screen.”

A number of companies now offer a fourth solution. First, documents are scanned. Second, an optical character recognition (OCR) engine is applied to the scanned document to convert the image into machine-readable text. This process does not necessarily integrate the data into an electronic health record or the LIS, but it does make it possible to do a word search of the scanned document. The accuracy of the OCR output depends on the quality of the original document and the resolution of the scan.
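As a rough illustration of why OCR output makes a scanned document searchable, the sketch below builds a simple word index over recognized text. It assumes the OCR step has already run; the sample page text and the helper names build_index and search are invented for this example and do not represent any vendor's product.

```python
import re
from collections import defaultdict

# Hypothetical OCR output: page number -> recognized text.
# In practice an OCR engine would produce these strings from scanned images.
ocr_pages = {
    1: "Patient: Jane Doe  DOB 01/02/1960  Test ordered: Lipid Panel",
    2: "Physician notes: fasting sample collected. Glucose test requested.",
}

def build_index(pages):
    """Map each lowercase word to the set of page numbers containing it."""
    index = defaultdict(set)
    for page_no, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_no)
    return index

def search(index, term):
    """Return the sorted page numbers where the term appears."""
    return sorted(index.get(term.lower(), set()))

index = build_index(ocr_pages)
print(search(index, "glucose"))  # pages that mention "glucose"
print(search(index, "test"))     # pages that mention "test"
```

A scanned image with no OCR text offers nothing to index, which is why the third solution above leaves the document as an unsearchable "blob."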

Pathologists and clinical laboratory managers dealing with these issues may be interested in the just-published White Paper titled “Closing The Medical Data Gap: Using IT To Close The Gap Between Health Information Systems And External Documents.” Published by The Dark Report and Dark Daily, it is available free to laboratory professionals as a PDF download at http://darkdaily.com/white-papers/closing-the-medical-data-gap-using-it-to-close-the-gap-between-health-information-systems-and-external-documents-23012.

Recently, several companies have entered the market with an enhanced solution. Their respective technologies take the information from the OCR engine and apply some sort of structure to it, typically with proprietary algorithms. The procedure is fairly straightforward: once the documents are scanned, the images undergo OCR, and the recognized text is then organized into structured fields. A number of companies and researchers, such as MAVRO Imaging, Profdoc (Compugroup), and KnowleSys, have developed approaches to this problem. Each focuses on a particular type of data or platform, e.g., websites or email.
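To make the idea of "applying structure" to OCR text more concrete, here is a minimal sketch that uses plain regular expressions to pull named fields and test results out of recognized text. The vendors' actual algorithms are proprietary and not described in this article, so the patterns, field names, and sample report below are assumptions made purely for illustration.

```python
import re

# Hypothetical OCR text from a scanned lab report; layout and values are invented.
ocr_text = """
Patient Name: Jane Doe
Date of Birth: 01/02/1960
Glucose 98 mg/dL
Cholesterol 182 mg/dL
"""

# Simple patterns standing in for a vendor's proprietary extraction rules.
FIELD_PATTERNS = {
    "patient_name": re.compile(r"Patient Name:\s*(.+)"),
    "date_of_birth": re.compile(r"Date of Birth:\s*(\d{2}/\d{2}/\d{4})"),
}
# A result line is assumed to look like: <test name> <number> <unit>
RESULT_PATTERN = re.compile(
    r"^([A-Za-z ]+?)[ \t]+(\d+(?:\.\d+)?)[ \t]+(\S+)$", re.MULTILINE
)

def extract_record(text):
    """Turn free-form OCR text into a dictionary of structured fields."""
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[name] = match.group(1).strip() if match else None
    record["results"] = [
        {"test": test.strip(), "value": float(value), "unit": unit}
        for test, value, unit in RESULT_PATTERN.findall(text)
    ]
    return record

record = extract_record(ocr_text)
print(record["patient_name"])
for result in record["results"]:
    print(result["test"], result["value"], result["unit"])
```

Once the text is in this form, each field could be validated and passed to an LIS or EHR interface. Real documents vary far more in layout and scan quality than this sample, which is presumably where the commercial products differentiate themselves.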

One company that has designed a solution specifically for clinical laboratories is Extract Systems. It states that its LabDE is an automated technology solution for extracting structured medical laboratory data and integrating it into laboratory information systems (LIS).

On the road to the fully-paperless medical laboratory, all signs point to expanded use of OCR technology and proprietary algorithms, applied to external documents and paper records, as a way to capture unstructured medical data and integrate this data into the electronic health record. Once such data is integrated into an LIS or EHR, it can then be searched and utilized for a wide array of patient care purposes.

What is important to recognize is that advances in OCR capabilities and associated technologies are giving clinical laboratory managers and pathologists more accurate and faster ways to capture the unstructured data found on paper and convert it into a digital form. It brings the clinical laboratory testing industry closer to the day when all functions and activities can be handled digitally and paper cannot be found anywhere within the lab facility.

—By Mark Terry 

Related Information:

What ‘Big Medicine’ Means for Doctors and Patients

Closing The Medical Data Gap: Using IT To Close The Gap Between Health Information Systems And External Documents

Integrity Health Network goes paperless: All 15 primary care clinics have shifted to an electronic health records system to improve patient care and make it more efficient

HITECH Act Enforcement Interim Final Rule. U.S. Department of Health & Human Services 

The importance of structured data elements in EHRs

Electronic Exchange of Clinical Laboratory Information: Issues and Opportunities

Optical Character Recognition (OCR): What is OCR, how does it work, and what should you expect from it? LegalScans: Paper to Searchable Databases. 

Data extraction from a semi-structured electronic medical record system for outpatients: A model to facilitate the access and use of data for quality control and research

THE DARK REPORT Laboratory Intelligence
