18-04-24 04:53 PM
Hi,
Is there any VBO available to extract images from PDF files?
Thanks,
Maheshwar
18-04-24 06:22 PM
Hi Maheshwar,
You can check the "Connector for Blue Prism - Adobe PDF Services - API - 2.0.0" asset on the Digital Exchange.
https://digitalexchange.blueprism.com/dx/entry/3439/solution/blue-prism---adobe-pdf-services---export
19-04-24 08:42 AM
Hi Harish,
Thank you for your reply.
Unfortunately, the connector requires the creation of a paid account in Adobe PDF Services, which isn't suitable for my current needs.
I'm searching for a simpler utility, similar to those available in Power Automate and UiPath.
My business case is to extract the images from a text PDF and store them in a required folder.
2 weeks ago
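You can install Poppler, which includes the pdfimages.exe utility. In Blue Prism, use the Start Process action of the Environment object to run pdfimages.exe with the command below.
pdfimages -all yourfile.pdf outputpath
Note that outputpath is treated as an image root (folder plus file name prefix) rather than a plain folder, so C:\output\img produces files such as img-000.png or img-000.jpg in C:\output, depending on the image type.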
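If you would rather wrap the call in a script, a minimal Python sketch could look like the one below. It assumes pdfimages is on the PATH; the function names and the "image" prefix are made up for illustration and are not part of Poppler.

```python
import subprocess
from pathlib import Path


def build_pdfimages_command(pdf_path, output_folder, prefix="image", exe="pdfimages"):
    """Return the argument list for: pdfimages -all <pdf> <folder>/<prefix>."""
    # pdfimages expects an image root (folder plus file name prefix),
    # not a bare folder, so the prefix is appended to the output folder.
    image_root = Path(output_folder) / prefix
    return [exe, "-all", str(pdf_path), str(image_root)]


def extract_images(pdf_path, output_folder, prefix="image", exe="pdfimages"):
    """Run pdfimages and return the files it wrote into output_folder."""
    Path(output_folder).mkdir(parents=True, exist_ok=True)
    command = build_pdfimages_command(pdf_path, output_folder, prefix, exe)
    # check=True raises CalledProcessError if pdfimages exits with an error.
    subprocess.run(command, check=True)
    return sorted(Path(output_folder).glob(f"{prefix}-*"))


# Example call (placeholder paths, replace with your own):
# extract_images(r"C:\yourpath\yourfile.pdf", r"C:\yourpath\outputimage")
```

Passing the arguments as a list avoids quoting problems when the paths contain spaces.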
2 weeks ago - last edited 2 weeks ago
Another way of doing this is with Python. Install Python and the pypdf library, save the code below in a file with a .py extension, and run that file using the Environment object's Start Process action.
from pypdf import PdfReader
import os

# Placeholder paths: change these to your PDF and output folder.
pdf_path = r"C:\yourpath\testPDFImage\image-doc.pdf"
output_folder = r"C:\yourpath\outputimage"
os.makedirs(output_folder, exist_ok=True)

reader = PdfReader(pdf_path)
img_count = 1
for page_num, page in enumerate(reader.pages):
    # Images are stored as XObjects in the page resources.
    if "/XObject" in page["/Resources"]:
        xObject = page["/Resources"]["/XObject"].get_object()
        for obj in xObject:
            if xObject[obj]["/Subtype"] == "/Image":
                # get_data() returns the decoded stream bytes as-is (for example
                # JPEG data for /DCTDecode images), so the .png extension may
                # not match the actual content.
                data = xObject[obj].get_data()
                file_name = f"image_{page_num+1}_{img_count}.png"
                with open(os.path.join(output_folder, file_name), "wb") as f:
                    f.write(data)
                img_count += 1
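A possible variation, if your pypdf version is recent enough to have the page.images property, is to let pypdf work out the image format for you. This is only a sketch: it assumes Pillow is also installed (pip install pypdf[image]), and the function names and the page1_img1 naming scheme are my own choices.

```python
import os


def image_file_name(page_number, index, original_name):
    """Build a name like 'page1_img1.jpg', keeping the extension pypdf reports."""
    extension = os.path.splitext(original_name)[1] or ".bin"
    return f"page{page_number}_img{index}{extension}"


def extract_images(pdf_path, output_folder):
    """Save every image pypdf finds in pdf_path and return the written paths."""
    # Imported here so the naming helper above works without pypdf installed.
    from pypdf import PdfReader

    os.makedirs(output_folder, exist_ok=True)
    reader = PdfReader(pdf_path)
    written = []
    for page_number, page in enumerate(reader.pages, start=1):
        # Each item in page.images has .name (with extension) and .data (file bytes).
        for index, image in enumerate(page.images, start=1):
            target = os.path.join(
                output_folder, image_file_name(page_number, index, image.name)
            )
            with open(target, "wb") as f:
                f.write(image.data)
            written.append(target)
    return written


# Example call (placeholder paths, replace with your own):
# extract_images(r"C:\yourpath\testPDFImage\image-doc.pdf", r"C:\yourpath\outputimage")
```

Because the extension comes from the name pypdf reports, a JPEG is saved as .jpg and a PNG as .png instead of everything being written as .png.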