18-04-24 04:53 PM
Hi,
Is there any VBO available to extract images from PDF files?
Thanks,
Maheshwar
18-04-24 06:22 PM
Hi Maheshwar,
You can check the "Connector for Blue Prism - Adobe PDF Services - API - 2.0.0" asset on the Digital Exchange.
https://digitalexchange.blueprism.com/dx/entry/3439/solution/blue-prism---adobe-pdf-services---export
19-04-24 08:42 AM
Hi Harish,
Thank you for your reply.
Unfortunately, the connector requires the creation of a paid account in Adobe PDF Services, which isn't suitable for my current needs.
I'm searching for a simpler utility, similar to those available in Power Automate and UiPath.
My business case is to extract the images from a text PDF and store them in a required folder.
Thursday
I searched every forum I could find on this. Finding a native Blue Prism VBO for PDF image extraction can be tough, since it's not a common out-of-the-box feature. Your best bet will probably be a free Digital Exchange VBO that wraps a simple .NET library (like an older iTextSharp version) or one that uses VBA/VBScript for basic file manipulation.
Thursday
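You can install the Poppler pdfimages.exe utility. In Blue Prism, use the Start Process action of the Environment object, specify pdfimages.exe as the application, and pass the command-line arguments below. The last argument is the image root: pdfimages writes files named outputpath-000, outputpath-001, and so on, and with -all each file keeps the extension of its native format.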
pdfimages -all yourfile.pdf outputpath
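If you would rather test the same command outside Blue Prism first, here is a minimal Python sketch that builds and runs it. It assumes Poppler's pdfimages is on PATH; the function names and the "img" prefix are my own placeholders, not part of Poppler or Blue Prism.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdfimages_command(pdf_path, output_folder, prefix="img", exe="pdfimages"):
    """Return the argument list for: pdfimages -all <pdf> <image-root>."""
    # The image root is a folder plus a file-name prefix ("img" is a placeholder).
    image_root = Path(output_folder) / prefix
    return [exe, "-all", str(pdf_path), str(image_root)]


def extract_images(pdf_path, output_folder, prefix="img"):
    """Run pdfimages if it is installed and return the files it produced."""
    exe = shutil.which("pdfimages")
    if exe is None:
        raise FileNotFoundError("pdfimages not found on PATH; install Poppler first")
    Path(output_folder).mkdir(parents=True, exist_ok=True)
    cmd = build_pdfimages_command(pdf_path, output_folder, prefix, exe)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return sorted(Path(output_folder).glob(f"{prefix}-*"))
```

The list returned by build_pdfimages_command is the same application and arguments you would give to Start Process, so it is a quick way to check the paths before wiring them into a process.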
Thursday - last edited Thursday
Another way of doing this is with Python code. Install Python and the pypdf library, save the code below in a file with a .py extension, and run that file from Blue Prism using the Environment object's Start Process action.
from pypdf import PdfReader  # page.images also needs Pillow: pip install pypdf pillow
import os

pdf_path = r"C:\yourpath\testPDFImage\image-doc.pdf"
output_folder = r"C:\yourpath\outputimage"
os.makedirs(output_folder, exist_ok=True)

reader = PdfReader(pdf_path)
img_count = 1
for page_num, page in enumerate(reader.pages, start=1):
    # page.images decodes each image on the page and names it with its real extension
    for image in page.images:
        ext = os.path.splitext(image.name)[1] or ".png"
        file_name = f"image_{page_num}_{img_count}{ext}"
        with open(os.path.join(output_folder, file_name), "wb") as f:
            f.write(image.data)
        img_count += 1
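Since the script is launched through Start Process, it may be handier to pass the PDF path and output folder as arguments instead of editing the file each time. A minimal sketch with the standard argparse module; parse_args, main and the argument names are placeholders I made up for illustration.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Read the PDF path and output folder from the command line."""
    parser = argparse.ArgumentParser(description="Extract images from a PDF")
    parser.add_argument("pdf_path", type=Path, help="PDF file to read")
    parser.add_argument("output_folder", type=Path, help="folder for the extracted images")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    args.output_folder.mkdir(parents=True, exist_ok=True)
    # The extraction loop from the script would go here, using
    # args.pdf_path and args.output_folder instead of the fixed paths.
    return args

# In the real script, call main() as the last line.
```

In Start Process you would then give python.exe as the application and something like extract_images.py "C:\yourpath\testPDFImage\image-doc.pdf" "C:\yourpath\outputimage" as the arguments, where extract_images.py is just an example file name.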