The recognition phase of capture may seem to be the most important step relevant to automated indexing – since it is, after all, the phase where OCR is performed. But at least half of the factors relevant to successful indexing occur during the pre-recognition steps, particularly in obtaining appropriate image quality for OCR and indexing.
I found the latest research paper from AIIM, “Winning the Paper Wars – capture the content and mobilize the process troops,” to be an interesting read, especially since I have seen many of the points discussed when working with our customers, specifically the growth areas in mobile capture, OCR, BPM/workflow and AP/AR processes.
I am happy to say that the report addressed not just scanning paper records but also looked at the hundreds or thousands of external paper documents that flow into a company every day.
At ImageSource we help address capture via ILINX Capture®. Many of our customers have made progress towards “paper-free processes” and are achieving payback on their investment. If you are looking at addressing some of the issues that were discussed in the report, feel free to reach out to ImageSource…or me…
Inside Sales Account Executive
Does your organization waste valuable time and resources manually prepping documents? Are you tired of manually typing in data that often isn’t entered accurately? If you want to move away from these tedious, slow processes, there are solutions out there! Advanced capture technologies will streamline and automate the transformation of documents into structured electronic information for your business processes.
Accounts Payable is a document-intensive department with complicated workflows and approval processes. Although our world today is going digital, AP departments are still bogged down by a majority of documents (purchase orders, invoices and supporting documentation) coming in paper form, followed by manual routing and approval processes, keyed data entry to the Enterprise Resource Planning (ERP) system, and the physical filing and storage of documents.
While ERP systems handle the information they are designed to process very well, they do not have the capacity to manage data and images simultaneously or perform complicated workflows. By automating the entire AP process, organizations can reduce labor requirements, lessen storage costs, decrease late fees, increase early pay discounts and improve vendor satisfaction. And because Accounts Payable touches almost every department, it is a great launch pad for ECM technology, with a large impact on organizational business processes and substantial ROI.
ImageSource is an ECM integrator with vast experience implementing Accounts Payable solutions and an industry leader in implementing the Oracle 11g technology. Most recently we completed a project for a large manufacturing company, integrating Oracle 11g ECM technology with the Accounts Payable system, Oracle E-Business Suite (EBS). As a manufacturer poised for growth due to multiple acquisitions, they were looking for a solution that could handle the increase in invoices without increasing staff.
With Oracle EBS already in place, Oracle 11g was the perfect ECM solution. Oracle’s Solution Accelerator couples the two technologies so tightly that they are an ideal pairing. The result is an Accounts Payable system that efficiently images, processes and displays AP-related documents with a reduction in overall labor.
The manufacturer receives over 8,000 invoices a month, 80% in paper form and the other 20% through email and fax. All are now processed by Oracle Document Capture and Oracle Forms Recognition (OFR) when received. Oracle Document Capture scans the document, and Oracle Forms Recognition automatically extracts relevant data and does some basic validation. The document is then automatically sent to Oracle ECM for more validation and storage of the image. Thanks to sophisticated logic in code and BPEL workflows, the system performs a range of checks and is able to move information from Oracle ECM to Oracle EBS without human assistance. Knowledge workers can stay in their familiar EBS system and reference any invoice they need by clicking on the recognizable paper-clip. EBS pushes the parameters to the ECM repository and brings back the search results in a web link.
The ease of processing invoices has improved dramatically. What used to be a manual process of keying in information is now completely automated. Barring exceptions, the only manual step an invoice needs is the physical act of scanning it in. After it’s scanned, the Oracle OCR technology adds the appropriate index fields and the Solution Accelerator automatically moves it through the system for validation and storage until it is available to view through Oracle EBS.
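For readers who think in code, the flow described above (capture, extract, validate, then either post to the ERP or hold the invoice as an exception) can be sketched roughly as follows. This is a minimal illustration in Python, not Oracle's implementation: the `Invoice` class, the `REQUIRED_FIELDS` list, the validation rules and the routing labels are all hypothetical names invented for this sketch. The last function simply applies the 80%/20% split quoted above to 8,000 invoices a month.

```python
from dataclasses import dataclass, field

# Hypothetical index fields; a real deployment would define its own.
REQUIRED_FIELDS = ("invoice_number", "vendor", "amount")


@dataclass
class Invoice:
    source: str                                  # "paper", "email" or "fax"
    fields: dict = field(default_factory=dict)   # values extracted by recognition
    errors: list = field(default_factory=list)   # filled in by validate()


def validate(invoice: Invoice) -> bool:
    """Basic checks: every required field is present and the amount is positive."""
    for name in REQUIRED_FIELDS:
        if invoice.fields.get(name) in (None, ""):
            invoice.errors.append(f"missing {name}")
    amount = invoice.fields.get("amount")
    if isinstance(amount, (int, float)) and amount <= 0:
        invoice.errors.append("amount must be positive")
    return not invoice.errors


def route(invoice: Invoice) -> str:
    """Clean invoices go straight through; anything else waits for a person."""
    return "post_to_erp" if validate(invoice) else "exception_queue"


def monthly_split(total: int, paper_percent: int) -> tuple:
    """Split a monthly invoice count into (paper, electronic)."""
    paper = total * paper_percent // 100
    return paper, total - paper


good = Invoice("paper", {"invoice_number": "INV-1001", "vendor": "Acme", "amount": 250.0})
bad = Invoice("fax", {"invoice_number": "INV-1002", "amount": 99.0})
print(route(good))               # post_to_erp
print(route(bad), bad.errors)    # exception_queue ['missing vendor']
print(monthly_split(8000, 80))   # (6400, 1600)
```

The point of the sketch is the shape of the process: only invoices that fail a check need a person, which is why the scan itself ends up being the one routine manual step.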
Times are changing now, and there are demands for student portals, integrations with Student Information Systems, the ability to capture electronic transmissions, the need for extensive workflows, and quite frankly, the ability to work more efficiently. More work has to be completed by fewer people. In most cases, staffing levels have been reduced.
The problem is that most systems currently installed in these institutions were built by Hershey, ImageNow, Nolij and other client-server software manufacturers, and were built for specific purposes, primarily to free up physical storage space.
However, in order to meet the demands for campus-wide ECM solutions, there is a need to move from client-server platforms to true n-tier architecture products, where standard document imaging is integrated with Web Content Management, and where records management and digital asset management are all part of a suite of products. Content needs to be ingested at the beginning of the process and pushed to knowledge workers.
In today’s world, content is not about scanned pieces of paper; it is not about bar code indexing or OCR technology. Universities are demanding products that are versatile and can be supported by a light IT staff. The cost of repeatedly introducing an RFP for every IT need has been realized. The requirements now set forth in an RFP for content management solutions far surpass the abilities of the traditional scan, store, retrieve systems of old.
The new technologies available in Enterprise Content Management platforms will be demonstrated at the AACRAO conference in Seattle, the 14th through 16th of March. At booth 106, you will see how a student portal will work. At the ImageSource presentation Tuesday evening at 5pm, you will see the inner workings of the system designed for San Jose State University, due to go live early spring.
Chief Solutions Officer
As with any new job there is a learning curve: learning about the industry, specific products and just how things are run. Having come from outside the ECM industry, I expected my first few months at ImageSource to be a huge learning curve. No one could have prepared me for how big that learning curve, or rather mountain, was going to be.
I thought that the biggest challenge for me was going to be understanding the variety of products available and technical aspects of them. I could not have been more wrong. While the products and technical foundation are complex, there was a whole other mountain that I needed to conquer before I could even begin to understand what it is we do.
Those of you in the ECM industry know what I’m talking about. ECM, BPM, OCR, EMR…I could go on for days!
I remember on my second day of work someone tried to explain to me what was going on with one of our customers: “Client XYZ is upgrading from IBPM 10g to IPM 11g and their AP department needs an ECM solution implemented.”
I nodded in agreement, like I knew what I was talking about. In my head I really was thinking…WOAH! Who needs a what now?? Are 10g and 11g a type of car? Since when do we deal with cars? I thought we were a tech company…
A few more conversations like this and I quickly realized that I had a new challenge in front of me: learning all of the acronyms.
I used to think that the military had a lot of acronyms, but I think that the ECM industry could give them a run for their money. There are so many industry-specific acronyms. The tricky part is that they are constantly changing, new ones are being created, and several acronyms can mean the same thing.
To keep on top of all the acronyms I heard people using all day long I started a list. It started on a Post-it and has now grown into three pages front and back of a legal pad. I am constantly adding to the list and looking up new ones too.
I still don’t always know what people are talking about. If you look at the top of my notes from meetings, I have all kinds of three-letter acronyms written across them.
How do you handle all of the acronyms? What is your secret? How do you keep up with the ever-changing and growing list of ECM acronyms?
Apparently I’m not the only one clueless about them either…
Check out our own Ruben Kerson at Nexus ’10 asking people what ECM is!
I know I’m not the only person this has happened to. You have a friend, loved one or even a stranger ask you what you do for a living or what industry you work in. What I’ve found is that being in the Enterprise Content Management (ECM) space sometimes makes it difficult to describe what ECM is exactly (in plain English).
My grandma has asked me at least five times what exactly I do and more specifically, what the industry is about, and the dialogue goes something like this:
Grandma: “So Kristina, what exactly does your company do? I know you’ve told me before, but I can’t really remember.”
Kristina: “We are an enterprise content management solutions provider, an integrator and we have our own line of products called ILINX®. We also have this really cool conference every year called Nexus® with presenters, networking time, a vendor expo, etc.”
At this point, my grandma is just staring at me, blankly.
Grandma: “Um, so what does that mean?”
Kristina: “It basically just means we help people try to automate their business processes by scanning in documents, implementing automated workflows, utilizing capture software, storing information in electronic databases, things like that.”
My grandma’s eyes are beginning to glaze over…
Kristina: “In a nutshell, grandma, we help businesses become more efficient and more paper-conscious.”
After being in this industry for a few years, living and breathing it every day, I can sometimes easily forget that others don’t understand what capture is, how workflows come together, or why you’d ever need to get rid of fax machines and paper. They don’t throw acronyms around like OCR, BPM and ERM, which have become part of my everyday language.
I am glad my grandma isn’t the only person who doesn’t know or understand what ECM is. Check out this awesome video played at Nexus ’10 this year where people were asked to answer the question, “What is ECM?”
My grandparents just mastered text messaging. I’ll continue to work on a good description of my industry that doesn’t make my grandma ask me the question every time she sees me, “Kristina, what does your company do again?”
What was your “ah-ha” moment in communicating ECM?
Having worked in Enterprise Content Management for over 12 years, I have often found it somewhat difficult to explain what we do and/or sell. Have you?
I have found that your audience often dictates how you explain it. To an IT group I have described ECM in terms of storage and retrieval of images in a database/repository with search capability, the ability to apply rules for authentication and accessibility, removing silos of information, the ability to do workflow and BPM, and other things like metadata, networks, throughput and HA/DR. Sometimes their eyes glaze over and other times they “understand.”
To some business folks, when I’m talking ECM I usually reference things like accessibility of their documentation, being able to search on key fields and automatically route work/documents/content without the use of email or paper files (in its simplest form), and the fact that it’s all stored in a database otherwise known as a “repository.” Or, when describing workflow, I use the old analogy of a restaurant. When you go into the establishment a hostess seats you, then you get a menu, a waiter comes up and you order, that order goes back to the kitchen and your meal is prepared, then after you have dessert you get a bill, pay and get a receipt, and then the bus boy comes and cleans everything up. That’s a workflow.
But what do you say to your mother or father, sister or brother and even children (aka the layman)? I’ve tried things like, “I sell software that lifts information off paper or documents and puts that data in a database that allows people to find it. Then the people can see the documents on their computer necessary to do their job.” But I still get a ‘blank stare.’
Then one day, maybe three or four months ago, my dad was asking me for his usual P.C. help and he said, “My printer/scanner isn’t reading the words as well as it used to.” Of course, that got my attention! Could my dad know what O.C.R. is? After 12 years of me talking about IBM, FileNet, EMC/Documentum, Microsoft, Captiva, Kofax, ImageSource and ILINX® and him saying, “I still don’t get what you do”? NO WAY! How could my dad possibly know about O.C.R.?
So I asked him, “Dad, you know what OCR is?” Guess what, he replied YES! “It’s that software that I use when I want to take words off my documents that are PDFs or TIFFs.” BAM! He knew! Finally, after 12 years, he had partially “figured out” what I did for a living. Putting this in context, my dad is an automotive guy, first in sales and then as an executive, who never had a need to do any “computing” for most of his professional career.
We have a lot of acronyms in our ECM vocabulary: OCR, ICR, OMR, BPM, OSR, ODAR, HIPI, TIFF, etc etc etc. (I can go on for a lifetime of our acronyms). But what do you say so that IT people get what ECM is? What do YOU say to a business user who never ever ever thought of this stuff day to day? What do you tell your mom, dad, brother, sister about what you do every day? What have you said that brings blank stares? But, most importantly, what have you said to a customer that made you see the “light bulb” go off? It appears O.C.R. is making it into the mainstream vocabulary, if my dad is any example, because he knows his “HP MFP does OCR.”
Scanning your documents into a Document Management System is a great way to improve efficiency and reduce the amount of paper in your office. And, depending on what kind of paper you are scanning, there are a lot of document capture tools available. Tools such as Full Text OCR, Zonal OCR, and ICR can greatly reduce time spent indexing and validating your documents once they are scanned.
Full Text OCR (Optical Character Recognition) is software that captures every character in the document being scanned and processes it into a fully searchable PDF. One effective use of this technology is when government users need to search hundreds of pages of agendas and meeting minutes for a certain topic. Processing these documents is quite time-consuming, so most OCR processing occurs overnight, when it interferes less with the organization’s day-to-day activities.
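To make the “search hundreds of pages for a certain topic” idea concrete, here is a small sketch in Python of what becomes possible once every page has been turned into text. It assumes the OCR step has already produced plain text per page; the page ids, the sample text and the `build_index` and `search` helpers are hypothetical and only illustrate the idea of a word index, not any particular product.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Made-up OCR output for three scanned pages.
pages = {
    "minutes-p1": "Council approved the zoning variance.",
    "agenda-p3": "Discussion of Zoning and parking",
    "minutes-p2": "Budget review",
}
index = build_index(pages)
print(sorted(search(index, "zoning")))          # ['agenda-p3', 'minutes-p1']
print(sorted(search(index, "zoning parking")))  # ['agenda-p3']
```

Building the index is the slow part and can run overnight along with the OCR itself; each search afterwards is a quick lookup.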
Zonal OCR is similar in that it captures information, but in this case the software is programmed to look in the same location, or “zone,” every time. This is helpful when scanning documents where the information is always in the same location, such as an invoice number. Most invoices follow a set format, so zonal OCR is very effective in this scenario. When the operator validates the information, the software zooms in on the predetermined zone the data was captured from, so the operator can easily see whether the captured value is correct. Hence, one of the biggest advantages of Zonal OCR is that it improves the efficiency of searches, which translates into savings of both time and money. Some Zonal OCR software also allows the user to draw a box around the required text to establish a zone, rather than typing in keywords, so that the document can be indexed automatically.
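The zone idea can be illustrated with a few lines of Python. This sketch assumes an OCR engine has already returned each recognized word together with its bounding box in pixels; the `Box` and `Word` types, the `read_zone` helper and the coordinates are hypothetical values chosen for the example, not the output of a specific product.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


class Word(NamedTuple):
    text: str
    box: Box


def inside(inner: Box, outer: Box) -> bool:
    """True when the inner box lies completely within the outer box."""
    return (inner.left >= outer.left and inner.top >= outer.top
            and inner.right <= outer.right and inner.bottom <= outer.bottom)


def read_zone(words, zone: Box) -> str:
    """Join the words that fall inside the zone, in reading order."""
    hits = [w for w in words if inside(w.box, zone)]
    hits.sort(key=lambda w: (w.box.top, w.box.left))
    return " ".join(w.text for w in hits)


# Made-up OCR result for the top of an invoice page.
words = [
    Word("Invoice", Box(400, 40, 470, 60)),
    Word("INV-1042", Box(480, 40, 560, 60)),
    Word("Total", Box(400, 700, 450, 720)),
]

# The "box drawn by the user" around where the invoice number always appears.
invoice_number_zone = Box(475, 30, 600, 70)
print(read_zone(words, invoice_number_zone))  # INV-1042
```

Because the zone is fixed, the same coordinates can be reused for every invoice in that format, and the validation screen knows exactly which part of the image to zoom in on.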
ICR (Intelligent Character Recognition) is the ability of software to read handwritten information and process it into searchable information. This is especially beneficial in the financial industry. Although this tool can be very useful in some situations, the error rate is much higher because handwriting varies so much from person to person.
Andrea Latham, CDIA+
Content Management Systems are one of the most useful resources companies have available to keep their managers, staff, and customers informed. Managing those files effectively is an ongoing challenge, but a well-planned, best-practices implementation makes it significantly easier. Most Content Management Systems treat scanning as the starting point in the lifecycle of any document. The decision of whether to go with a centralized or distributed scanning model must be carefully evaluated to see which is the better fit for the organization. Often a hybrid model combining remote and centralized scanning is required, and this approach is becoming more popular. When it is designed and implemented correctly, scanning ensures that the data stored in the document management repository is valid, readable, secure, accessible, and useful throughout the enterprise.
Some important things to remember when deploying a document scanning system:
- Establish clear goals and objectives before you start or deploy a Document Scanning System.
- Establish clear and concise business rules around your company’s requirements.
- Consult a well-established Systems Integrator with the knowledge and expertise to help you with defining “Best Practices for Document Scanning” and always check references.
- Understand the nature of your documents. The quality of many documents may be poor, which will require you to use Image Enhancement Technologies that automatically clean up the document and improve its readability. These types of technologies are a must, especially when utilizing OCR or any advanced form of capture.
- Scanning and especially the Indexing of documents can be somewhat laborious, so anything to help automate these tasks such as Bar Coding, OCR, database lookups and electronic forms will make life a lot easier.
- Use the KISS Principle in dealing with data taxonomy and avoid capturing too many fields, but make sure it’s enough to do valuable searches. Here at ImageSource we try to have 10 document types maximum and 8 data fields which allows for effective searches, retrieval and reporting.
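As a rough illustration of the KISS guideline in the last bullet, the limits can be written down as a simple check to run against a proposed taxonomy. This Python sketch is only an example: the `check_taxonomy` helper and the sample document types are hypothetical, and it assumes the figure of 8 data fields applies to each document type, which is one reading of the guideline.

```python
MAX_DOCUMENT_TYPES = 10   # from the guideline above
MAX_FIELDS_PER_TYPE = 8   # assumption: the limit of 8 applies per document type


def check_taxonomy(taxonomy):
    """Return a list of problems; an empty list means the taxonomy stays simple.

    taxonomy maps each document type name to its list of index field names.
    """
    problems = []
    if len(taxonomy) > MAX_DOCUMENT_TYPES:
        problems.append(
            f"{len(taxonomy)} document types (maximum {MAX_DOCUMENT_TYPES})"
        )
    for doc_type, fields in taxonomy.items():
        if not fields:
            problems.append(f"{doc_type}: no index fields, so it cannot be searched")
        elif len(fields) > MAX_FIELDS_PER_TYPE:
            problems.append(
                f"{doc_type}: {len(fields)} fields (maximum {MAX_FIELDS_PER_TYPE})"
            )
    return problems


# Hypothetical taxonomy for an AP department.
taxonomy = {
    "Invoice": ["invoice_number", "vendor", "amount", "invoice_date"],
    "Purchase Order": ["po_number", "vendor", "order_date"],
}
print(check_taxonomy(taxonomy))  # []
```

A check like this is easy to run each time someone proposes a new document type or index field, before the change reaches the scanning operators.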
Lastly, don’t lose sight of your short- and long-term goals. Do your homework, study your documents and see how they fit into your business lifecycle and corporate governance. Talk with people throughout your organization and get their input to better understand how your documents are used. Finally, if you’re unsure, get help; this is not an area where you can afford a mistake. Remember, it all starts with getting information into the system.
Senior Account Executive
“This is going to be the year that document imaging really takes off. This is the year…” The practice of storing files with a standard naming convention on shared drives is still being pushed around corporate America, even though the technology to securely scan, index, store, and retrieve documents in a single repository has been around for 20 years. What makes us think that the concept of having scanners at every desk, or even on the same physical floor, is going to catch on in the next 20 years?
“When it is easy enough for the CEO to scan documents from his office, we know that document imaging is mainstream.” In the industry, we have all heard this claim over the past 20 years. We have had visions of every organization, big or small, regardless of industry, deploying scanners and efficiently capturing paper documents at the source of receipt. Multi-Function Devices (copiers that scan and fax) have now become mainstream, and most people are comfortable scanning a document. This is the easy part. The reason we still fight with adoption is that traditional document capture software for scanning and indexing has been difficult to understand and use AND had to be loaded on the desktop of every person who wanted to scan and index their documents.
Internet security, reduced file sizes, increased bandwidth, web services, and web parts have all been in high-gear development. Companies that have been focusing on these technologies and disciplines have made significant breakthroughs in the Distributed Capture / Remote Scanning marketplace. See Kofax, Cardiff, EMC / Captiva, ImageSource / ILINX, Oracle / Captovation, ReadSoft. Out are the complex interfaces that require understanding of terms such as batches, document classes, OCR. In are the simple interfaces that allow for scanning at the push of a single button, simple drop-down menus, and a few keystrokes. Web-based scanning applications should be commonplace, where the user can scan a document from anywhere, provide simple indexing, start a workflow, and have the document committed to the secure repository. The concept of “collecting” documents and taking / sending / FedEx’ing them to the mail room or scanning supervisor for processing is now not necessary / antiquated / a big waste of time. The technology is here and now.
Distributed Capture solutions should be thin client, scalable, not linked to a page count, able to support multiple ECM systems, and require little to no training. This shouldn’t be too much to ask.
How long is it going to take to get the message out to the masses? Will this be the year of distributed capture? Comments and feedback welcome.
When you own a scanner, or multiple scanners, you are responsible for keeping that equipment running efficiently by keeping consumables on hand. The components of a scanner that touch the paper and are designed to wear out and be replaced every 3-6 months are called “consumables.” They are different from what are referred to as “parts” of a scanner. Consumables are designed this way to maximize the performance of the scanner and are end-user replaceable, meaning you don’t have to be tech savvy to perform the operation.
The most common types of consumables are rollers, lamps, and pad assemblies. Depending on the scanner manufacturer (Fujitsu, Bell & Howell, Canon, Panasonic, Kodak, etc.), you may have to replace one or more at least a couple of times a year. When a scanner starts jamming or double-feeding paper, the most common cause is worn-out consumables. Other imaging problems, such as no longer reading bar codes, poor OCR results, or an optical alarm, can usually be solved by replacing the lamps.
When a scanner has a maintenance contract in place, it usually covers just the parts and not the consumables. ImageSource receives a lot of calls from customers asking why the parts are covered and the consumables are not. The answer is that the consumables are almost always end-user replaceable and must be replaced much more often than parts. And if your scanner is under maintenance, parts are usually required to be replaced by a certified technician. See our blog on benefits of having a maintenance contract.
Not sure where to get parts or consumables for your scanner? Contact ImageSource. They are happy to help!
Andrea Latham, CDIA+