Help figuring out a way to get text information from a website.

JariCarlson
Level 2

Hello, 

I am looking for advice on a project I am working on. The project searches a website for information about court hearings based on an attorney's name and the possible dates of the hearings.

Searching the website by attorney and date works fine, but I run into trouble when I try to grab the data. There could be anywhere from 1 to 150 court dates that I want to grab information for.

So far I have grabbed the total number of results and stored it in a data item, so I can set up a loop with that as the maximum number of iterations. I am unsure how to proceed from here, however. I could spy each DIV individually, up to 150 of them, loop through grabbing the text from each, and then use the Blue Prism text functions to trim the text down to what I need before adding it to a collection.

This seems inefficient and would take a very long time to set up, though. Does anyone have any efficient ideas on how I could loop through each result and export the text I need into a collection?

Thanks in advance, I appreciate it!



------------------------------
Jari Carlson
------------------------------
2 REPLIES

LeonardoSQueiroz
Level 10
Hello,
 
Does the system have the ability to filter results or generate reports? Either option could drastically reduce the number of results or allow faster capture.
 
Another possibility, depending on how the information is displayed, is to extract the page's source code and use a regex to pull out only the text you want, then add that data to a collection.
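
As a rough illustration of the regex idea, here is a minimal Python sketch. The markup in SAMPLE_HTML, the class names "result", "date" and "case", and the function name extract_hearings are all made up for the example, since the real page's structure is not known. In Blue Prism the same pattern would have to be applied in a code stage or a regex utility instead.

```python
import re
from html import unescape

# Hypothetical page source: the real site's markup is unknown,
# so these tags and class names are assumptions for illustration.
SAMPLE_HTML = """
<div class="result"><span class="date">01/15/2024</span> <span class="case">Case 12-345</span></div>
<div class="result"><span class="date">02/03/2024</span> <span class="case">Case 67-890</span></div>
"""

# One pattern that matches every result block and captures the two fields.
RESULT_PATTERN = re.compile(
    r'<div class="result">\s*'
    r'<span class="date">(?P<date>.*?)</span>\s*'
    r'<span class="case">(?P<case>.*?)</span>\s*'
    r'</div>',
    re.DOTALL,
)


def extract_hearings(page_source):
    """Return one dict per result block found in the page source."""
    rows = []
    for match in RESULT_PATTERN.finditer(page_source):
        rows.append({
            "date": unescape(match.group("date").strip()),
            "case": unescape(match.group("case").strip()),
        })
    return rows


if __name__ == "__main__":
    for row in extract_hearings(SAMPLE_HTML):
        print(row["date"], row["case"])
```

The list of dicts plays the role of the collection: one row per result, one field per captured group. A single pattern covers all results, so nothing has to be spied per DIV, but it will break if the site changes its markup.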
 
If the system has an API or database access, that would also be worth investigating.
Regards,


------------------------------
Leonardo Soares
RPA Developer Tech Leader
Bridge Consulting
América/Brazil
------------------------------

JonathanPeters
Level 2

It's difficult to say without seeing the webpage, but are the results displayed as a table? If so, you may be able to spy the entire table and then use a Read stage with "Get Table Items".

Alternatively, spy the first line and then untick the Web Path/XPath attribute so that the element matches ALL 150 lines (you may need to adjust which attributes are selected until it works). Having done that, tick the Match Index attribute and make it dynamic. If it works, you should be able to read line 1 by passing Match Index 1, line 2 by passing 2, and so on.
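
The loop around that dynamic Match Index could look roughly like the Python sketch below. It only shows the control flow, not Blue Prism itself: read_row stands in for the Read stage that takes the index, and fake_page and fake_read are made-up test data, so all of these names are assumptions.

```python
from typing import Callable, List, Optional


def read_all_rows(read_row: Callable[[int], Optional[str]], total: int) -> List[str]:
    """Read rows 1..total by index, stopping early if a row is missing."""
    rows = []
    # Indexes start at 1, matching "line 1 by passing Match Index 1" above.
    for index in range(1, total + 1):
        text = read_row(index)
        if text is None:  # element not found: fewer rows than expected
            break
        rows.append(text.strip())
    return rows


# Stand-in for the page: in Blue Prism this would be the spied element
# read with a dynamic Match Index (hypothetical data for illustration).
fake_page = ["  Hearing A ", "Hearing B", "Hearing C  "]


def fake_read(index: int) -> Optional[str]:
    """Return the row text at a 1-based index, or None if it does not exist."""
    if 1 <= index <= len(fake_page):
        return fake_page[index - 1]
    return None


if __name__ == "__main__":
    print(read_all_rows(fake_read, total=3))
```

Here total is the result count already stored in the data item. Stopping when a read returns nothing guards against the count and the page getting out of sync.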

I have only been working in Blue Prism for 18 months and have found that not every idea works with every web page, but these are a couple of suggestions to try.



------------------------------
Jonathan Peters
------------------------------