On my current project, we have set up robots that read PDFs daily from Outlook through MAPIEx. The text is captured with Ctrl + A and Ctrl + C, and the clipboard contents are then stored in Blue Prism.
However, there's a problem with this technique: Ctrl + A and Ctrl + C don't preserve vital formatting information from the PDF.
For example, if a table in the PDF is set up like this:
A B C
1 23 4
Ctrl + A and Ctrl + C often store the information as:
ABC
1234
This can make it difficult, if not impossible, to retrieve all the information from larger or more complex PDF tables, because in some cases there is no way to tell which column a value belongs to.
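
To make the ambiguity concrete, here is a small sketch (in Python purely for brevity; the function name possible_splits is made up for this illustration and is not part of our robots). It assumes the only thing known is that the row has three non-empty columns, and lists every way the collapsed row "1234" could be cut back into cells:

```python
from itertools import combinations


def possible_splits(collapsed: str, columns: int) -> list[tuple[str, ...]]:
    """Return every way to cut `collapsed` into `columns` non-empty cells."""
    splits = []
    # Pick (columns - 1) cut positions between the characters of the string.
    for cuts in combinations(range(1, len(collapsed)), columns - 1):
        bounds = (0, *cuts, len(collapsed))
        # Slice the string between each pair of neighbouring boundaries.
        splits.append(tuple(collapsed[a:b] for a, b in zip(bounds, bounds[1:])))
    return splits


# The collapsed data row from the example above, with three columns (A, B, C).
for cells in possible_splits("1234", 3):
    print(cells)
```

This prints three candidates: ('1', '2', '34'), ('1', '23', '4') and ('12', '3', '4'). Only the second one matches the original table, and nothing in the clipboard text says which one is right.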
I've tested iTextSharp, a .NET PDF library, and it works perfectly at preserving formatting information (tabs, spaces, etc.)! However, it's not free for closed-source projects (at least from v5 onwards, and v4 doesn't have the functionality I need).
So I'm wondering if anyone here knows of another .NET PDF library that's free and works well for reading PDFs? Alternatively, is there a way to keep the formatting using the .NET library in a code stage (preferably in C#)?
Thank you.