Extract value from the HTML body of an Email

MadalinaAcsinte
Level 3

Hello,

I have a question regarding the following table:

Name  | Number | Hours | Date       | Approve? (Y/N)
Maria | 123    | 8     | 11/16/2020 | Y
Ana   | 456    | 10    | 11/16/2020 | N


I read this table from the HTML body of an email, so I have to work with the table in HTML format.
I want to extract the value of the last column ("Y" or "N") for each row, and I also need to know which name each value belongs to.

Any ideas on how I should do that?

Thanks a lot,



------------------------------
Madalina Acsinte
------------------------------
13 REPLIES

MichalSzumski
Level 6
Hi Madalina,

HTML is usually tricky. Is the format of the table always the same? If it is, you can read the email (including the HTML code), split the text on the HTML elements (for example the <tr> and <td> tags), and then loop over the rows.
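
The text-splitting idea could look roughly like the sketch below. It is not from the original post: the class and method names (ApprovalTableSplitter, Extract) are made up for illustration, and it assumes the name is in the first cell of each row and the Y/N flag in the last one. It only uses regular expressions, so it will break on nested tables or unusual markup.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class ApprovalTableSplitter
{
    // Returns (name, flag) pairs taken from the first and last cell of each data row.
    // Hypothetical helper: the names here are illustrative, not part of Blue Prism.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var result = new List<KeyValuePair<string, string>>();
        var options = RegexOptions.IgnoreCase | RegexOptions.Singleline;
        var rowPattern = new Regex(@"<tr\b[^>]*>(.*?)</tr>", options);
        var cellPattern = new Regex(@"<td\b[^>]*>(.*?)</td>", options);

        foreach (Match row in rowPattern.Matches(html))
        {
            var cells = new List<string>();
            foreach (Match cell in cellPattern.Matches(row.Groups[1].Value))
            {
                // Drop inner tags such as <p> or <span>, then decode entities like &nbsp;
                string text = Regex.Replace(cell.Groups[1].Value, "<[^>]+>", "");
                cells.Add(WebUtility.HtmlDecode(text).Trim());
            }
            if (cells.Count < 2) continue;

            // Keep only rows whose last cell is Y or N; this also skips the header row
            string flag = cells[cells.Count - 1].ToUpperInvariant();
            if (flag == "Y" || flag == "N")
                result.Add(new KeyValuePair<string, string>(cells[0], flag));
        }
        return result;
    }
}
```

Filtering on the Y/N value rather than on row position means it does not matter whether the header row is built from <th> or <td> cells.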

------------------------------
Michal Szumski
RPA developer
Rockwell Automation
------------------------------

In addition to parsing the HTML text, another possibility is saving the HTML to a local file and opening it via a web browser. Then the normal Blue Prism spying methods could be used to extract the table.
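
As a rough illustration of the first half of that idea, a code stage could write the HTML body to a temporary file whose path is then handed to the browser. The helper name (HtmlBodySaver.SaveToTempFile) and the file naming are assumptions for this sketch; launching and spying the browser would still be done with the usual Blue Prism stages.

```csharp
using System;
using System.IO;
using System.Text;

public static class HtmlBodySaver
{
    // Writes the HTML body to a uniquely named .html file in the user's temp folder
    // and returns the full path. Hypothetical helper, not a Blue Prism action.
    public static string SaveToTempFile(string html)
    {
        string fileName = "email_body_" + Guid.NewGuid().ToString("N") + ".html";
        string path = Path.Combine(Path.GetTempPath(), fileName);
        File.WriteAllText(path, html, Encoding.UTF8);
        return path;
    }
}
```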

------------------------------
Nicholas Zejdlik
RPA Developer
------------------------------

ShashikantPatil
Level 3
Hi Madalina Acsinte,

If you are able to extract that data into a BP collection, you can then filter it with the Filter Collection action of the Utility - Collection Manipulation business object.
The URL below describes the filter criteria syntax:

https://portal.blueprism.com/customer-support/support-center#/path/Automation-Design/Studio/Visual-Business-Objects/1194312962/What-is-the-syntax-for-an-expression-used-by-the-Filter-Collection-action...
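
For the table in the question, the filter expression would probably be [Approve? (Y/N)] = 'Y'. The sketch below shows that expression with the .NET DataTable.Select method; that Filter Collection accepts the same expression syntax is my assumption, so check it against the article above. The class and method names are made up for illustration.

```csharp
using System;
using System.Data;

public static class ApprovalFilter
{
    // Returns the rows whose "Approve? (Y/N)" column equals "Y".
    // The square brackets are required because the column name contains
    // spaces and punctuation.
    public static DataRow[] ApprovedRows(DataTable table)
    {
        return table.Select("[Approve? (Y/N)] = 'Y'");
    }
}
```

Each returned row still carries the Name column, so you can see who was approved.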



------------------------------
Shashikant Patil
Senior Associate
Cognizant
America/New_York
------------------------------

AndreyKudinov
Level 10
You can just parse it yourself if it is simple enough, although in general you are better off using a proper parser, because it can get tricky.
HtmlAgilityPack.dll is part of Blue Prism, so you can do something like this:
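// Ref: HtmlAgilityPack.dll, System.Core.dll
// NS:  HtmlAgilityPack, System.Linq
// In: html (Text), Out: dt (Collection)
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);

dt = new DataTable();

// Pick the first table; SelectSingleNode returns null when there is none
HtmlNode table = doc.DocumentNode.SelectSingleNode("//table");
if (table == null) throw new Exception("No <table> found in the HTML");

// Columns: the leading "." keeps the XPath relative to this table
HtmlNodeCollection headers = table.SelectNodes(".//tr/th");
if (headers == null) throw new Exception("Table has no <th> header cells");
foreach (HtmlNode header in headers)
    dt.Columns.Add(HtmlEntity.DeEntitize(header.InnerText).Trim(), typeof(string));

// Rows: only <tr> elements that contain <td> cells, so the header row is skipped
HtmlNodeCollection rows = table.SelectNodes(".//tr[td]");
if (rows != null)
    foreach (HtmlNode row in rows)
        dt.Rows.Add(row.Descendants("td").Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim()).ToArray());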


Assuming the table has header cells (<th>) and no colspans, this should work with any number of columns and rows.

You can do it without Linq too; it is just a bit longer.
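
A version without Linq might look like the sketch below. It is not the code from the original post: the class and method names (HtmlTableReader, ReadFirstTable) are invented for illustration, and it treats the first non-empty row as the header whether it uses <th> or <td>, in case the email builds its header row from <td> cells. It still assumes no colspans and unique column names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using HtmlAgilityPack;

public static class HtmlTableReader
{
    // Reads the first <table> in the HTML into a DataTable of strings.
    public static DataTable ReadFirstTable(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var dt = new DataTable();
        HtmlNode table = doc.DocumentNode.SelectSingleNode("//table");
        if (table == null) return dt; // no table: return an empty collection

        HtmlNodeCollection rows = table.SelectNodes(".//tr");
        if (rows == null) return dt;

        foreach (HtmlNode row in rows)
        {
            // Collect the text of every <th>/<td> that is a direct child of this row
            var cells = new List<string>();
            foreach (HtmlNode cell in row.ChildNodes)
            {
                if (cell.Name == "th" || cell.Name == "td")
                    cells.Add(HtmlEntity.DeEntitize(cell.InnerText).Trim());
            }
            if (cells.Count == 0) continue;

            if (dt.Columns.Count == 0)
            {
                // The first non-empty row supplies the column names
                foreach (string name in cells)
                    dt.Columns.Add(name, typeof(string));
            }
            else
            {
                dt.Rows.Add(cells.ToArray());
            }
        }
        return dt;
    }
}
```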

------------------------------
Andrey Kudinov
Project Manager
MobileTelesystems PJSC
Europe/Moscow
------------------------------

Hi Andrey Kudinov,

I think parsing HTML with HtmlAgilityPack.dll is a great idea. Can you please provide the download source?

------------------------------
Shashikant Patil
Senior Associate
Cognizant
America/New_York
------------------------------

If you don't have the DLL file already, you can download it from their website: https://html-agility-pack.net/

The download link will send you to NuGet; if you download the package, the DLL file should be inside it.

------------------------------
Nicholas Zejdlik
RPA Developer
------------------------------

OK, scratch the part about it being part of Blue Prism; it seems I had just toyed around with it before, or maybe it came with some other package I installed.
Either way, Nicholas has already provided the download link. A .nupkg file is just a zip archive.


------------------------------
Andrey Kudinov
Project Manager
MobileTelesystems PJSC
Europe/Moscow
------------------------------

Hi Andrey,

Could you please provide the BP release file for the HTML parsing code above? I have a similar requirement: I want to read the first table from the body of an email.

Thanks
Ashis

------------------------------
Ashis Kumar Ray
RPA Developer
TCS
Europe/London
------------------------------

Make sure you put HtmlAgilityPack.dll in your Blue Prism folder.
This is mostly a proof of concept. I left both the Linq and non-Linq code stages in, just in case.

------------------------------
Andrey Kudinov
Project Manager
MobileTelesystems PJSC
Europe/Moscow
------------------------------