Google search

AleksandrNikuli
Level 3
Hi,

Are there any APIs for Google search that we can use with BP? We plan to automate Google searches and already have a draft of the process, but we now have doubts about whether Google might simply ban us if we do it our way.

Here is a short explanation of the process:

- Every day we have to Google around 1-2k customers.
- Each search uses a customer name plus some specific keywords (with OR operators), like "John Johnson AND (drugs OR crime OR ...)".
- The search is performed via the browser URL, not the search field, e.g. https://www.google.com/search?as_q=John%20Johnson%20AND%20(drugs%20OR%20crime)
- When the search is done, the bot saves the results page in HTML format and proceeds to the next customer.
- Firefox is used for the automation (we do not have the latest version of BP, so Chrome is out of scope, and Explorer is not able to save all the files), with hotkeys for all the steps: CTRL+L to activate the address bar, CTRL+V to paste the link, CTRL+S to save the HTML, etc.
- Each search takes around 20 seconds, I think (including saving the file).

So my questions are:

- Is there any risk that we get banned, or is a CAPTCHA the first thing Google will apply here?
- Are there any Google APIs we can link with a new version of BP, or any other "official" solutions? :)

Thanks in advance!
Aleksandr
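For reference, the URL pattern and the timing described in the post can be sketched roughly as follows. This is only an illustration: the class and method names (SearchUrlSketch, BuildSearchUrl, EstimateRunTime) are made up, and the name and keywords are the ones from the example in the post. Note that on current .NET versions Uri.EscapeDataString also percent-encodes the parentheses, so the result will not be character-for-character identical to the hand-written URL above.

```csharp
using System;

public static class SearchUrlSketch
{
    // Builds a URL of the form used in the post:
    // <customer name> AND (<keyword> OR <keyword> ...), passed in the as_q parameter.
    public static string BuildSearchUrl(string customerName, string[] keywords)
    {
        string query = customerName + " AND (" + string.Join(" OR ", keywords) + ")";
        // Percent-encode the query so spaces become %20.
        return "https://www.google.com/search?as_q=" + Uri.EscapeDataString(query);
    }

    // Total sequential run time for a nightly batch.
    public static TimeSpan EstimateRunTime(int customers, int secondsPerSearch)
    {
        return TimeSpan.FromSeconds((double)customers * secondsPerSearch);
    }

    public static void Main()
    {
        Console.WriteLine(BuildSearchUrl("John Johnson", new[] { "drugs", "crime" }));
        Console.WriteLine(EstimateRunTime(1000, 20)); // 05:33:20
        Console.WriteLine(EstimateRunTime(2000, 20)); // 11:06:40
    }
}
```

At roughly 20 seconds per search, 1-2k customers works out to about 5.5 to 11 hours of back-to-back searching, so the upper end only just fits into a single night on one machine.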
5 REPLIES

KenChastain
Level 2
What happens when you try to download the Google results programmatically? I wonder if you would still get hit with the CAPTCHA requirement if the WebClient object were always closed after each iteration.

What are you hoping to get out of this? Just links? I would think you could take the string value the code below produces and pass it to an HTML parser that pulls all of the links out of the string, and then store them in a database.

    using System.IO;
    using System.Net;

    public static string downloadWebPage(string theURL)
    {
        // Download a web page to a string. The using blocks close the
        // client, stream and reader after every call.
        using (WebClient client = new WebClient())
        {
            client.Headers.Add("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)");
            using (Stream data = client.OpenRead(theURL))
            using (StreamReader reader = new StreamReader(data))
            {
                return reader.ReadToEnd();
            }
        }
    }
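As a rough sketch of the "pull all of the links out of the string" step: the snippet below uses a regular expression as a stand-in for a real HTML parser, which is an assumption made only to keep it dependency-free. A regex will miss unusual markup, so a proper parser library would be the safer choice in practice. The class and method names (LinkExtractorSketch, ExtractLinks) and the sample HTML are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractorSketch
{
    // Matches the quoted href value inside an <a ...> tag (single or double quotes).
    private static readonly Regex HrefPattern = new Regex(
        "<a\\b[^>]*?\\bhref\\s*=\\s*[\"']([^\"']+)[\"']",
        RegexOptions.IgnoreCase);

    // Returns every href value found in the HTML string, in document order.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in HrefPattern.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }

    public static void Main()
    {
        // Sample input only; in the scenario above this would be the string
        // returned by downloadWebPage.
        string sample = "<p><a href=\"https://example.com/a\">A</a> "
                      + "<a class='x' href='https://example.com/b'>B</a></p>";
        foreach (string link in ExtractLinks(sample))
        {
            Console.WriteLine(link);
        }
    }
}
```

The returned list could then be written to a database table keyed by customer, as suggested above.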

BenKirimlidis
Level 7
GDPR

We considered using Google to help us with a task, but we realised early on that this would mean processing customer data with a third party without the customers' permission, so that idea was shut down very early.

GDPR means that sending data to Google to look up information in this way is totally off the table for any commercial operation in the EU, as it should be anywhere in the world. We have no idea what Google will do with that data, and we have no idea what the ISP we are using is going to do with it. There are too many cases of ISPs and other data collectors selling data to third parties, or logging and retaining it for any number of purposes that the data subject has not consented to.

Any data being sent externally in this fashion should be run by your legal and/or compliance departments.

AleksandrNikuli
Level 3
@kchastain75, thank you for the feedback!

The thing is that my task is to save the search exactly the way a person sees it: links, the article snippets below the links, pictures, etc. The idea is that during the night the robot prepares those HTML files for specialists, who proceed with further analysis in the morning. An employee looks at the HTML and decides whether anything bad is written about the customer and, if needed, clicks on the link and continues the investigation online.

Thus, having just the links or the web page as a string does not seem to be a solution here. And yes, I would like to do this task with Blue Prism, as that is my only tool in daily work.

AleksandrNikuli
Level 3
Ben_M31, GDPR is a different topic, but I still appreciate you mentioning it. My opinion is that googling just customer names, without any other related data like an address or ID, is OK?

BenKirimlidis
Level 7
Hi Aniku,

I work in the banking sector, and we have to rigorously and zealously protect customers' privacy in everything we do. Make sure you run it by your compliance/legal team, because this definitely exposes you to some risk. It may be minimal, but many organizations have very low or zero tolerance for customer data breach risk events.

Any processing of EU customer data is subject to GDPR, and a Google search does count, because you are effectively handing over customer data without entering into a mandatory data processing contract with the third party, which is a breach of privacy. GDPR doesn't care if it's only a name or a phone number: unless your customers consent to having their details put into a Google search, at the bare minimum you might have a problem.

Google is under no obligation to keep that data private. You would have freely handed over customer data without consent, indicating to Google that these individuals have contacted your company for X services. Google can then target ads at those customers for similar services, or sell that data on to other third parties, which means customers who trusted your company with their data can make a claim against you, and your company could be fined. That data has also been given to your ISP, and now they have it, which they will most certainly sell.

All of this definitely falls under GDPR and would definitively constitute a breach. GDPR is the legislation that just keeps on giving.

Check out this PDF: http://gdprandyou.ie/wp-content/uploads/2018/05/Guidance-for-Data-Processing-Contracts-GDPR.pdf