PDF TO EXCEL

delacochesilver
Level 4
Hi all, I hope you're doing well.
I have a problem. My project consists of connecting to a platform and performing some activities there. At the end I download a file, but the file is a PDF, and I need to convert that PDF into an Excel file.
Does anyone have code or a solution for converting a PDF file into an Excel file in Blue Prism?
I found one solution, but it relies on an API that is not free.
Does anyone have another solution, or some code for Blue Prism? Thanks.

------------------------------
delacoche silver
------------------------------
11 REPLIES

Hello there,

There is a VBO on the Digital Exchange that converts PDF to Excel; you can check it out here.

If you run into any difficulties, also check out the recent discussion on the same PDF to Excel conversion.


------------------------------
If I was of assistance, please vote for it to be the "Best Answer".

Thanks & Regards,
Tejaskumar Darji
Sr. Consultant-Technical Lead
------------------------------

Hi Tejas, I am receiving an extension error while trying to open the .mht file in Excel, as shown below. I saw the threads but could not reply or make progress on resolving the issue. Were you able to get past the error below? 11786.png



------------------------------
Mukesh Kumar
------------------------------

Hi Mukesh, 

Instead of the VBO, you can perform the same task with global send keys.

1.) Launch the PDF file.

2.) Activate the PDF window and send Ctrl+Shift+S; this opens the Save As window.

3.) Select the XLSX option. The PDF is then saved in Excel format.



------------------------------
Sahil Chankotra
------------------------------

Hi @Mukesh Kumar ,

As I can see in your screenshot, the file has somehow not been saved in the .mht format, since its name still says .xls. I have seen this issue in the past when importing the release files, possibly because of version differences. I suggest debugging the workflow up to the point where the save operation is performed on the file, in the Open file in Word page, as shown below:

11789.png

Here, we are saving the PDF file in .mht format, so your problem most likely lies in this step. It can have several causes; one is that the value in the file name data item does not end in .pdf, and that extension is what gets replaced with .mht. When you step through this action stage, check whether a .mht file is generated. If it is, the later stages should work and your issue should be resolved.

In short, the PDF file is saved as a .mht file using the Word.Application package, and the .mht file is then read in Excel using the Excel.Application package, which strips out the images and keeps only the text, in a readable table format, in the destination spreadsheet file.
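
The failure mode described above, where the file name does not end in .pdf and the swap to .mht never happens, can be caught before the Word step runs. The following is a minimal standalone sketch in Python, not the VBO's actual code; the function name `mht_path_for` is hypothetical, and the same check could be ported to a code stage.

```python
from pathlib import PureWindowsPath


def mht_path_for(pdf_path: str) -> str:
    """Return the .mht path that a .pdf -> .mht extension swap should produce.

    Raises ValueError when the input does not end in .pdf, because in that
    case there is no .pdf extension to replace and no .mht name results.
    """
    path = PureWindowsPath(pdf_path)
    # Compare case-insensitively so that "REPORT.PDF" is accepted too.
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file name, got: {pdf_path!r}")
    return str(path.with_suffix(".mht"))
```

Calling `mht_path_for(r"C:\Temp\report.pdf")` gives the path to look for after the save action; if the call raises, the file name data item is the thing to fix first.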



------------------------------
----------------------------------
Hope it helps you out and if my solution resolves your query, then please mark it as the 'Best Answer' so that the others members in the community having similar problem statement can track the answer easily in future

Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
WonderBotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------
------------------------------

---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

sonsharm
Staff

Hi @delacoche silver 

You can try the code below. I am also attaching the DLL, which needs to be placed in "C:\Program Files\Blue Prism Limited\Blue Prism Automate".

Imports System.IO
Imports Bytescout.PDFExtractor
Imports System.Diagnostics

' Create a Bytescout.PDFExtractor.XLSExtractor instance
Dim extractor As New XLSExtractor()
extractor.RegistrationName = "demo"
extractor.RegistrationKey = "demo"
' Load the sample PDF document
extractor.LoadDocumentFromFile("test.pdf")
' Save the spreadsheet to file
extractor.SaveToXLSFile("test.xls")



------------------------------
Sonam Sharma
Manager, Blue Prism
SS&C
------------------------------

Hi @Sonam Sharma 

Is this part of the free trial, or does a license agreement need to be in place between the vendor and the firm that would use this DLL commercially?

I ask because, if it is not supposed to be used commercially without a proper agreement, that could be a legal concern for many organizations before they implement such a service. From what I understand, using this at production scale might also require a quote, as mentioned on their licensing page:

11793.png



------------------------------
Devneet Mohanty
------------------------------

Hi @devneetmohanty07 

It's like the project dependencies that we import into our projects: all the DLLs we need are available on NuGet (https://www.nuget.org/), which is the package manager for .NET and the central package repository used by all package authors and consumers. Since Blue Prism allows .NET as a coding language, we won't require special licensing to leverage packages available in the marketplace.



------------------------------
Sonam Sharma
Manager, Blue Prism
SS&C
------------------------------

Hi @Sonam Sharma

As far as I am aware, NuGet is an open-source package manager that is compatible with VB.NET, and it hosts all the packages registered with it.

However, that does not necessarily mean the hosted packages are free to use. I faced a similar issue with iTextSharp at a previous client: it was under a GNU license that required a commercial agreement, which became a major issue, and iTextSharp is on NuGet too. That said, I am aware that many companies still use such packages without a proper agreement. I am not sure whether that is authorized, or whether they simply are not audited, but in my experience many clients do pick up on these things.

I just wanted to check whether this utility has an MIT license or something similar, since that was not clear to me from their website, but it does not seem to be completely free. It probably needs some research. I do not agree, though, that any package on NuGet is freely available, since the license page I showed earlier came from the NuGet repository itself. The following post makes the same point: licensing - do all packages in nuget have free licence to use? - Stack Overflow
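
One way to start that research: a NuGet package is a zip archive that contains a .nuspec manifest, and the manifest can carry `license`, `licenseUrl`, and `requireLicenseAcceptance` elements. The sketch below, in Python, reads those fields from manifest text. The function name `declared_license` is hypothetical, and the sample manifest is made up for illustration; it is not the manifest of any real package. The declared fields only point at the terms and do not replace reading them.

```python
import xml.etree.ElementTree as ET


def declared_license(nuspec_xml: str) -> dict:
    """Collect the license-related fields from .nuspec manifest text."""
    root = ET.fromstring(nuspec_xml)
    info = {
        "license": None,
        "license_type": None,
        "license_url": None,
        "require_acceptance": False,
    }
    for elem in root.iter():
        # Drop the XML namespace, which differs between nuspec schema versions.
        tag = elem.tag.rsplit("}", 1)[-1]
        text = (elem.text or "").strip()
        if tag == "license":
            info["license"] = text
            info["license_type"] = elem.get("type")
        elif tag == "licenseUrl":
            info["license_url"] = text
        elif tag == "requireLicenseAcceptance":
            info["require_acceptance"] = text.lower() == "true"
    return info


# Made-up manifest for illustration only.
SAMPLE_NUSPEC = """
<package xmlns="http://schemas.microsoft.com/packaging/2013/05/nuspec.xsd">
  <metadata>
    <id>Example.Package</id>
    <version>1.0.0</version>
    <license type="file">LICENSE.txt</license>
    <requireLicenseAcceptance>true</requireLicenseAcceptance>
  </metadata>
</package>
"""

result = declared_license(SAMPLE_NUSPEC)
```

A `license` of type `file`, or a `licenseUrl` pointing at a vendor page, usually means vendor-specific terms that someone has to read, whereas type `expression` carries a standard license identifier such as MIT.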



------------------------------
Devneet Mohanty
------------------------------

@Sonam Sharma 

@devneetmohanty07 is correct here. The Bytescout DLL should not be attached to your post. Instead, you should direct people to the associated NuGet or project pages, as there is a provider-specific license associated with that utility. This is not open-source software and is not licensed under MIT, BSD, or any of the other myriad open-source licenses.

FWIW - Bytescout are a provider on the Digital Exchange too. You can find their DX asset here. This uses Bytescout's REST API for extracting information from PDFs.

Cheers,



------------------------------
Eric Wilson
Director, Integrations and Enablement
Blue Prism Digital Exchange
------------------------------