PDF to XML

AimanNishat
Level 3
Guys, I need help: how can I convert a PDF file to XML?

------------------------------
Aiman Nishat
------------------------------
11 REPLIES

PvD_SE
Level 12
Hi Aiman,

Would opening the PDF in Word, and then having Word save the document in .XML format do the trick?

------------------------------
Happy coding!
---------------
Paul
Sweden
------------------------------
(By all means, do not mark this as the best answer!)

sonsharm
Staff

Hi @AimanNishat ,

One way of achieving this is by using a code stage in Blue Prism.

1. Make sure the namespaces below are imported in your object:

Imports System
Imports System.Collections.Generic
Imports System.Text
Imports Bytescout.PDFExtractor
Imports System.Diagnostics

2. Call the code below in your code stage:

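' Create the extractor and set the registration details
Dim extractor As New XMLExtractor()
extractor.RegistrationName = "demo"
extractor.RegistrationKey = "demo"
' Load sample PDF document (a relative path resolves against the working
' directory of the running process, so a full path is safer)
extractor.LoadDocumentFromFile("test.pdf")
' Write the extracted content as XML
extractor.SaveXMLToFile("output.XML")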
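
The snippet above can be wrapped with some basic path checks. This is only a rough sketch, not the official Bytescout usage: the module name, the function `ConvertPdfToXml`, and its error strings are hypothetical, and it uses only the extractor calls already shown in this post. In Blue Prism, the function itself (without the `Module` wrapper) would likely go in the object's Global Code, with the paths passed in from a code stage.

```vbnet
Imports System
Imports System.IO
Imports Bytescout.PDFExtractor

Module PdfToXmlSketch
    ' Hypothetical helper: returns "" on success, otherwise an error message.
    Function ConvertPdfToXml(pdfPath As String, xmlPath As String) As String
        ' Reject paths with no root (such as "test.pdf"); they would resolve
        ' against the working directory of the running process.
        If Not Path.IsPathRooted(pdfPath) OrElse Not Path.IsPathRooted(xmlPath) Then
            Return "Use full paths for both files."
        End If
        If Not File.Exists(pdfPath) Then
            Return "Input PDF not found: " & pdfPath
        End If
        Try
            ' Same calls as in the original snippet, with the paths passed in.
            Dim extractor As New XMLExtractor()
            extractor.RegistrationName = "demo"
            extractor.RegistrationKey = "demo"
            extractor.LoadDocumentFromFile(pdfPath)
            extractor.SaveXMLToFile(xmlPath)
            Return ""
        Catch ex As Exception
            ' Hand the message back so the calling stage can log or branch on it.
            Return ex.Message
        End Try
    End Function
End Module
```

Returning a message instead of throwing keeps the calling process in control; a decision stage could then check whether the returned text is empty.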
 



------------------------------
Sonam Sharma
------------------------------
Sonam Sharma Manager, Blue Prism SS&C

Hi @Sonam Sharma ,

Can you please share the DLL file for PDFExtractor as well?



------------------------------
Manpreet Kaur
Manager
Deloitte
*If you find this post helpful mark it as Best Answer
------------------------------

Hi @ManpreetKaur1 

Please find it attached!



------------------------------
Sonam Sharma
Manager, Blue Prism
SS&C
------------------------------

Attached here!



------------------------------
Sonam Sharma
Manager, Blue Prism
SS&C
------------------------------

Great!!

Thanks @Sonam Sharma!!

Are there any specific guidelines for using the attached code?



------------------------------
Manpreet Kaur
Manager
Deloitte
*If you find this post helpful mark it as Best Answer
------------------------------

Hi Sonam,

I have imported all the namespaces and added the attached DLL, but I am getting the error below. Could you please help?

Internal : Could not execute code stage because exception thrown by code stage: Could not load file or assembly 'Bytescout.PDFExtractor, Version=13.3.0.4514, Culture=neutral, PublicKeyToken=f7dd1bd9d40a50eb' or one of its dependencies. The system cannot find the file specified.



------------------------------
Saumitra Kumar Sharma
------------------------------

Hi @Saumitra_KumarS 

Did you place the DLL file in "C:\Program Files\Blue Prism Limited\Blue Prism Automate"?
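
A quick way to check this is to look for the file and read its assembly identity, which can then be compared with the name and version in the error message. This is a minimal sketch using only standard .NET calls; the module name, the function `DescribeAssembly`, and the file name `Bytescout.PDFExtractor.dll` are assumptions (the file name is guessed from the assembly name in the error).

```vbnet
Imports System
Imports System.IO
Imports System.Reflection

Module AssemblyCheckSketch
    ' Hypothetical helper: reports whether the DLL is in the folder and, if so,
    ' which assembly identity (name, version, public key token) it carries.
    Function DescribeAssembly(folder As String, dllName As String) As String
        Dim fullPath As String = Path.Combine(folder, dllName)
        If Not File.Exists(fullPath) Then
            Return "Missing: " & fullPath
        End If
        Try
            ' Reads the identity from the file without loading the assembly.
            Dim name As AssemblyName = AssemblyName.GetAssemblyName(fullPath)
            Return "Found: " & name.FullName
        Catch ex As BadImageFormatException
            Return "Not a valid .NET assembly: " & fullPath
        End Try
    End Function
End Module
```

For example, `DescribeAssembly("C:\Program Files\Blue Prism Limited\Blue Prism Automate", "Bytescout.PDFExtractor.dll")` would return either a "Missing:" line or the full assembly name to compare against the version quoted in the exception.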

------------------------------
Sonam Sharma
Manager, Blue Prism
SS&C
------------------------------


Hi @Sonam Sharma, it's already resolved. Thanks!



------------------------------
Saumitra Kumar Sharma
------------------------------