subject: Ten Things You Should Know About Data Capture
Capturing data is what creates documents, whether they are paper or electronic. The better designed the data capture exercise, the quicker and less expensive it will be. The kinds of data that need to be captured vary, and data capture methods need to be tailored to the type of data.
1. Computerized business applications, such as accounting software, typically capture data through data entry forms. Operators might complete these forms from paper documents, or enter the data directly into the system when a transaction takes place. Point-of-sale (POS) terminals are an example of the latter.
2. Data from paper documents can be captured more easily if the details are bar-coded on the document. A barcode reader reads the code and converts it into transaction data with a minimum of operator action.
3. Paper-based correspondence, contracts and reports are also data documents. Data is captured in print format as these are created. Print documents can be converted into electronic documents by scanning them.
4. Correspondence can be electronic if electronic means are used to create it, as in the case of word-processed correspondence, e-mails and recorded instant-messaging chats. Even contracts can be electronic if digital signatures are affixed to the electronic document that sets out the agreement.
5. Scanning paper documents does not immediately convert them into full-fledged electronic documents. Scanning only creates a digital image of the document in a format such as JPEG. Text characters saved in a graphic format are not readable as text by computers. Hence, further processing with Optical Character Recognition (OCR) software is needed to convert the text into machine-readable character encodings such as ASCII or Unicode.
6. Where the volumes of paper documents to be scanned are large, batch scanning will typically be used. Batch scanning involves automated scanning and processing of many documents fed in a batch to the scanner.
7. Batch scanning typically involves a lot of preparation. Staples and paper clips have to be removed. Where documents are not uniform, as when some are single-page while others are multi-page, a method has to be adopted to indicate where each document starts and ends. Scanner settings have to accommodate single-sided or double-sided scanning as needed, and accept documents of different sizes.
8. OCR software does not always provide accurate results. Humans often check the OCR outputs and clean up errors made by the software. Where the source document is of poorer quality, as when it is handwritten, Intelligent Character Recognition (ICR) software is typically used.
9. Data capture can also take the form of data conversion, as when legacy system data is converted into modern formats to make it readable in current system environments or usable with current applications.
10. Captured data can be stored on several different kinds of media. It is typically stored on hard disks, but can also be saved on magnetic tape, CDs or DVDs, flash drives, or the Web, for example. Some media are easy to use, others are highly portable, and yet others are low cost. The requirements in each context will determine the media used.
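To make item 2 concrete, here is a minimal Python sketch of how the string emitted by a barcode reader might become transaction data. It assumes the code is an EAN-13 retail barcode, which the article does not specify. The `SaleLine` record and `capture_scan` function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass


def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit for the first 12 digits of a code."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Digits in odd positions (1st, 3rd, ...) weigh 1; even positions weigh 3.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


@dataclass
class SaleLine:
    # Hypothetical transaction record, invented for this example.
    product_code: str
    quantity: int


def capture_scan(scanned: str, quantity: int = 1) -> SaleLine:
    """Turn the text a barcode reader emits into a transaction line."""
    code = scanned.strip()  # readers often append a newline
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"not an EAN-13 code: {code!r}")
    if ean13_check_digit(code[:12]) != int(code[12]):
        raise ValueError(f"check digit mismatch: {code!r}")
    return SaleLine(product_code=code, quantity=quantity)
```

The check digit is what lets the system reject a misread without operator involvement: `capture_scan("4006381333931")` returns a `SaleLine`, while a code with a wrong final digit raises `ValueError`.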
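The distinction in item 5 between an image of text and actual text can be sketched in Python as follows. The function names are hypothetical. The sketch checks only the well-known leading bytes of JPEG, PNG and TIFF files, so it is a rough heuristic for deciding whether a file still needs OCR, not a full format detector.

```python
from typing import Optional

# Leading "magic" bytes of common scan output formats.
IMAGE_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"II*\x00": "tiff",  # little-endian TIFF
    b"MM\x00*": "tiff",  # big-endian TIFF
}


def image_format(data: bytes) -> Optional[str]:
    """Return the image format suggested by the file's first bytes, if any."""
    for signature, name in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def needs_ocr(data: bytes) -> bool:
    """A scanned image holds pixels, not characters, so it needs OCR."""
    return image_format(data) is not None
```

A file that starts with the JPEG signature is reported as needing OCR, whereas the bytes of `"Invoice 42".encode("utf-8")` are already character data and are not.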
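One common way to mark where documents start and end in a mixed batch (item 7) is to place a separator sheet between them. The Python sketch below assumes each scanned page has already been reduced to a string and that separator sheets carry a known marker. The `SEPARATOR` value and the `split_batch` function are hypothetical, chosen only for illustration.

```python
from typing import Iterable, List

SEPARATOR = "##DOC-BREAK##"  # hypothetical marker printed on separator sheets


def split_batch(pages: Iterable[str]) -> List[List[str]]:
    """Group a scanned batch into documents, dropping the separator sheets."""
    documents: List[List[str]] = []
    current: List[str] = []
    for page in pages:
        if page.strip() == SEPARATOR:
            # A separator closes the document in progress, if there is one.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:  # the last document has no trailing separator
        documents.append(current)
    return documents
```

With this approach a single-page letter and a ten-page contract can go through the feeder in the same batch, as long as a separator sheet sits between them.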
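The human check described in item 8 is often focused on the words the software was least sure about. The following Python sketch assumes a hypothetical OCR engine that reports a confidence between 0.0 and 1.0 for each word. The `Word` type, the `words_needing_review` function and the 0.85 threshold are all illustrative assumptions, not taken from any particular product.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Word(NamedTuple):
    text: str
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical OCR engine


def words_needing_review(
    words: Sequence[Word], threshold: float = 0.85
) -> List[Tuple[int, Word]]:
    """Return (position, word) pairs whose confidence is below the threshold."""
    return [(i, w) for i, w in enumerate(words) if w.confidence < threshold]
```

An operator would then correct only the flagged positions instead of proofreading the whole page. The right threshold depends on the quality of the source documents and on how costly an uncaught error is.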
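As a rough illustration of the conversion in item 9, the Python sketch below turns a fixed-width legacy record into JSON. It assumes the legacy system stored text in EBCDIC (Python's built-in `cp037` codec), and the field layout is entirely hypothetical. A real conversion would take both from the legacy system's documentation.

```python
import json

# Hypothetical fixed-width layout: (field name, start offset, end offset).
LAYOUT = [("customer_id", 0, 6), ("name", 6, 26), ("balance_cents", 26, 35)]
RECORD_LENGTH = 35


def convert_record(raw: bytes, encoding: str = "cp037") -> dict:
    """Decode one legacy record and split it into named fields."""
    text = raw.decode(encoding)
    if len(text) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} characters, got {len(text)}")
    record = {name: text[start:end].strip() for name, start, end in LAYOUT}
    # Numeric fields arrive as zero-padded text in this assumed layout.
    record["balance_cents"] = int(record["balance_cents"])
    return record


def to_json(raw: bytes) -> str:
    """Serialize a converted record in a format current applications can read."""
    return json.dumps(convert_record(raw), sort_keys=True)
```

The length check matters in practice: a record that is too short or too long usually means the assumed layout is wrong, and it is better to stop than to write misaligned fields into the new system.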
Documents are created by capturing data. Both structured and unstructured data are captured, and the two are stored in different ways. Numerous data capture methods exist, such as POS terminals, scanning of paper documents and barcode reading, in addition to the obvious one of typing the data in.
About Author:
Ademero, Inc. develops paperless office software. Based largely on user experience, the company's flagship product, Content Central, is a browser-based document management software system created to provide businesses and other organizations with a convenient way to capture, retrieve, and manage information originating in hard copy or digital form. Access a live preview of this document management solution by visiting the Ademero web site.