Identify junk pages with Decipher

AbiramiTS
Level 2
Is it possible to use Blue Prism Decipher to identify the junk pages in a PDF document and separate them from the relevant pages?

------------------------------
Abirami TS
Architect
Cognizant
Europe/London
------------------------------
5 REPLIES

BenLyons
Staff
Hi Abirami,

We can do this with an 'attachment' document type. There are a few steps, but the result is that Decipher will 'ignore' any attachments when looking for the specified fields.

  1. Create a new Document Type, e.g. "Email Attachment". Don't update any configuration at this stage, and there is no need to mark it as an attachment yet.
  2. Create a Classification Model e.g. "Email Classification" and set as extensible. Don't mark for training just yet.
  3. In your configuration page, enable "Classification" and "Class Verify". You can set this stage to be automatically skipped when you're happy with it.
  4. Update your Batch Type to include your new Document Type and Classification Model.
  5. Split out some examples of the document to be read and of the attachment document. These will need to be uploaded separately to train the classification model. Depending on variety, 10 of each might be enough.
  6. In Blue Prism, upload separate batches of each document type to be used for classification training. This is a separate action to the usual create new batch one.
  7. In the Admin panel > Batches, wait for your batches to reach the stage "Waiting for classification training".
  8. Go to your classification model and select the "Mark for training" box. Click Save.
  9. It may take a few minutes, but these batches will now train Decipher to recognise the different documents. Wait until the batches say that training is complete.
  10. Return to your new Document Type "Email Attachment" and set it as an "Attachment Document Type"
  11. Now when you upload your next document, any pages identified as an "Email Attachment" will have a paperclip on them. This means your training has been successful.
Let me know if any of that needs clarification and how you get on.

Thanks

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------
Ben Lyons Senior Product Specialist - Decipher SS&C Blue Prism UK based

Thanks for the reply. I still have queries with respect to my original question. The problem statement is not specific to email attachments. Let me explain it below.

Consider a set of PDF documents that contain a name, an address, and insurance details. The bot has to add a new page with the current address to each document and send the document back to the recipient. Before sending it, the bot has to identify any junk pages that already existed in the document. I would like to understand whether Decipher can report the quality of each page with some confidence, so that we can have the bot delete those pages from the document.
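
To make the intended flow concrete, here is a minimal Python sketch of the page-selection step only. It assumes some upstream check (Decipher or anything else) has already produced one junk/not-junk flag per page. The function names, the flag list, and the "original:N" labels are all hypothetical and only illustrate the logic; nothing here is a Decipher or Blue Prism API.

```python
from typing import List, Sequence


def pages_to_keep(junk_flags: Sequence[bool]) -> List[int]:
    """Return the 1-based page numbers of pages that are not flagged as junk."""
    return [page for page, is_junk in enumerate(junk_flags, start=1) if not is_junk]


def plan_output(junk_flags: Sequence[bool]) -> List[str]:
    """Keep the non-junk original pages, then append the new address page."""
    kept = [f"original:{page}" for page in pages_to_keep(junk_flags)]
    return kept + ["new_address_page"]


# Hypothetical flags for a 5-page document in which pages 2 and 4 are junk.
flags = [False, True, False, True, False]
print(plan_output(flags))
```

The open question in this thread is where the per-page flags would come from; the sketch deliberately leaves that part out.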

------------------------------
Abirami TS
Architect
Cognizant
Europe/London
------------------------------

Hi Abirami,

Decipher's really just designed for extracting written data from documents, not so much determining the image quality or comparing documents.

We have had a look at ways to determine image quality, but you would face a challenge with PDFs, as the DPI does not necessarily indicate a high-quality image.
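
As a rough illustration of why DPI metadata is not enough, the Python sketch below scores a page from its pixel content instead. It is not Decipher functionality. It assumes each page has already been rendered to a grid of 0-255 grayscale values by some other tool, and the function names and thresholds are arbitrary placeholders that would need tuning on real documents.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale values, 0 = black, 255 = white


def ink_ratio(page: Gray, dark_threshold: int = 128) -> float:
    """Fraction of pixels darker than the threshold (near 0 for a blank page)."""
    total = sum(len(row) for row in page)
    if total == 0:
        return 0.0
    dark = sum(1 for row in page for px in row if px < dark_threshold)
    return dark / total


def edge_strength(page: Gray) -> float:
    """Mean absolute difference between horizontally adjacent pixels.

    Crisp text gives large jumps; blank or heavily blurred pages give small ones.
    """
    diffs = [abs(row[i + 1] - row[i]) for row in page for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs) if diffs else 0.0


def looks_like_junk(page: Gray, min_ink: float = 0.005, min_edge: float = 2.0) -> bool:
    """Flag a page that is almost empty or has almost no contrast (assumed thresholds)."""
    return ink_ratio(page) < min_ink or edge_strength(page) < min_edge


blank = [[255] * 10 for _ in range(10)]      # an all-white page
striped = [[0, 255] * 5 for _ in range(10)]  # a stand-in for high-contrast content
print(looks_like_junk(blank), looks_like_junk(striped))
```

Both measures depend only on the rendered pixels, so a scan saved at a high nominal DPI but containing a blank or washed-out page would still be flagged.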

You may need to think outside the box on this one, as it certainly sounds like an interesting use case.

Thanks

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------

Hello! I need some clarification!

On stage 1 - When creating the Document Type, should I include the original Document Form Definition of the invoices that have junk pages attached?
On stage 4 - I added the additional Document Type to my batch type, but I can't add multiple Classification Models. Should I create another batch type for attachments?
On stage 7 - The "Waiting for classification training" status does not show up. I am on Decipher 1.2.

------------------------------
Tushigbayar Tseveenbayar
------------------------------

Hi Tushigbayar,

Have you seen the video on our help pages that walks through the process? It may answer your questions, but I'm happy to revisit if not.

Thanks

------------------------------
Ben Lyons
Product Consultant - Decipher Specialist
Blue Prism
UK
------------------------------