String Manipulation-Fetch data in between the string

NirbhayMishra
Level 2
Hi Community,

I have a string in a variable that looks like the following:
"Name : Nirbhay Mishra
Profile : Intelligent Automation
Age : 42
abc@xyz.in"
I want to fetch the values of Name, Profile and Age into separate variables for further processing.
What are the possible ways to do that?
8 REPLIES 8

EslamGhandour
Level 4

Hi!

Maybe you can provide a screenshot of the value inside the data item you're talking about? 

I ask because I need to know whether the value is a single-line string holding all the values, which would look something like "Name : Nirbhay MishraProfile : Intelligent Automation", or whether the values are on separate lines.

If they are on separate lines, you can use the "Split Text" action from Utility - Strings, which gives you a collection with one value per row for easy manipulation.
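
For anyone who wants to see the line-splitting idea outside Blue Prism, here is a minimal Python sketch. It is not the "Split Text" action itself, only the same logic: split on line breaks, then split each line on " : ". The variable names are made up for illustration, and it assumes the labels and values are separated by " : " exactly as in the question.

```python
# Sample input copied from the question; each field is on its own line.
text = "Name : Nirbhay Mishra\nProfile : Intelligent Automation\nAge : 42\nabc@xyz.in"

fields = {}
for line in text.splitlines():
    # partition() returns an empty separator when " : " is absent,
    # so lines such as the e-mail address are skipped.
    label, sep, value = line.partition(" : ")
    if sep:
        fields[label.strip()] = value.strip()

print(fields)
```

The e-mail line has no " : " separator, so it is left out of the result.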

If it's a single-line string, you can use the expression below in a calculation stage. It assumes your input is stored in a Data Item called "Data1" and that there is a space before and after each colon, as in the example you provided. It should return the name, since that is what lies between the words "Name" and "Profile". I've added a screenshot and the exact syntax for easier copying.

Mid([Data1], InStr(Upper([Data1]), "NAME")+7, InStr(Upper([Data1]), "PROFILE")-(InStr(Upper([Data1]), "NAME"))-7)

35241.png

You can change the words and the numbers to extract other values. The 7 is the length of "Name : ", including the spaces around the colon.
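
To make the offset arithmetic easier to follow, here is a rough Python sketch of the same "find the start label, skip past it, stop at the next label" logic. The helper name text_between is hypothetical, and it derives the offset from the label length instead of a hard-coded number. Note that Python indexes from 0 while Mid and InStr count from 1.

```python
def text_between(data: str, start_label: str, end_label: str) -> str:
    """Return the text between two labels, matching case-insensitively."""
    # Assumes upper() does not change the string length (true for ASCII input).
    upper = data.upper()
    start = upper.find(start_label.upper())
    if start == -1:
        raise ValueError(f"start label not found: {start_label!r}")
    start += len(start_label)  # skip past the label, like the +7 in the expression
    end = upper.find(end_label.upper(), start)
    if end == -1:
        raise ValueError(f"end label not found: {end_label!r}")
    return data[start:end].strip()


# Single-line variant of the sample data from the question.
data1 = "Name : Nirbhay MishraProfile : Intelligent AutomationAge : 42"
name = text_between(data1, "Name : ", "Profile")
profile = text_between(data1, "Profile : ", "Age")
print(name)
print(profile)
```

Because the offset comes from the label itself, a different label only needs a different argument, not a recalculated number.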

MichealCharron
Level 8
@NirbhayMishra

If you look in Blue Prism's "Utility - Strings" VBO, there is a nice little action called "Extract Regex All Matches" (depending on your version of the VBO, you might have to pull a later one from the Digital Exchange) which allows you to extract values into a collection using a Regex pattern with capture groups. For the example above, the following Regex pattern would extract the values:
Name\s:\s(?<Name>.*)[\r\n]+Profile\s:\s(?<Profile>.*)[\r\n]+Age\s:\s(?<Age>.*)[\r\n]+

The action would look like the following:

35242.png
The output collection would be:
35243.png

With that you can simply refer to the collection fields [Regex Matches.Name], [Regex Matches.Profile] and [Regex Matches.Age] to get your data.

The really neat thing about this action, though, is that if you have a document with multiple sets of the same formatted data, the action can extract each set into its own row in the collection.
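
As a rough illustration of the "one row per set" behaviour, here is a Python sketch of the same pattern run over a document with two records. This is not the VBO action, just the regex logic. Two assumptions to flag: Python writes named groups as (?P<Name>...) rather than the .NET (?<Name>...) form used above, and the second record (Jane Doe) is invented sample data.

```python
import re

# Same pattern as above, with Python's (?P<name>...) named-group syntax.
PATTERN = re.compile(
    r"Name\s:\s(?P<Name>.*)[\r\n]+"
    r"Profile\s:\s(?P<Profile>.*)[\r\n]+"
    r"Age\s:\s(?P<Age>.*)[\r\n]+"
)

# First record is from the question; the second is made up for the example.
document = (
    "Name : Nirbhay Mishra\n"
    "Profile : Intelligent Automation\n"
    "Age : 42\n"
    "abc@xyz.in\n"
    "Name : Jane Doe\n"
    "Profile : Finance\n"
    "Age : 35\n"
    "jane@example.com\n"
)

# One dictionary per matched record, keyed by the capture group names.
rows = [match.groupdict() for match in PATTERN.finditer(document)]
for row in rows:
    print(row)
```

Each dictionary plays the role of one collection row, with the group names as the column names.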
Micheal Charron
RBC
Toronto, Ontario
Canada

Soumya21
Level 6
Hi @NirbhayMishra

You can try the Extract Text Field action from the Utility - Strings VBO. Give it the inputs pretext "Name : " and post text "Profile".
Use separate actions to extract Name, Profile and Age, giving each the appropriate pretext and post text.
I hope this helps,

Thanks,
Soumya

NirbhayMishra
Level 2
35246.jpg

Hi @EslamGhandour, above is the screenshot of the value inside the data item I am talking about.
I need to fetch Contact Person, Telephone number and Number of cartons from this data item.

Can you please help me with how I can achieve that?

NirbhayMishra
Level 2
Hi @MichealCharron
Using the solution you suggested, I tried "Extract Regex All Matches" to extract the data from the data item, as in the snip below:
35251.jpg
I used the following regex:
"Contact person\s:\s(?<Contact person>.*)[\r\n]+Telephone number\s:\s(?<Telephone number>.*)[\r\n]+Number of cartons\s:\s(?<Number of cartons>.*)[\r\n]+"

It errors out with the following exception:
Internal : Could not execute code stage because exception thrown by code stage: parsing "Contact person\s:\s(?<Contact person>.*)[\r\n]+Telephone number\s:\s(?<Telephone number>.*)[\r\n]+Number of cartons\s:\s(?<Number of cartons>.*)[\r\n]+" - Invalid group name: Group names must begin with a word character.
See the snip below:
35252.jpg

I tried finding a solution and made some changes here and there, but nothing seems to work. Can you please suggest how we can resolve this, or any workaround to fetch Contact Person, Telephone number and Number of cartons?

EslamGhandour
Level 4

Hi @NirbhayMishra

You can use the "Split Text" action from the "Utility - Strings" Object. If you don't have the object, you can get it from here.

It should look like this: 35256.png

Your output will be a collection like this: 35257.png
Alternatively, you can use a calculation stage that looks like this: 35258.png

Mid([Data1], InStr(Upper([Data1]), "CONTACT PERSON")+16, InStr(Upper([Data1]), "TELEPHONE")-(InStr(Upper([Data1]), "CONTACT PERSON"))-16)

This calculation returns the name: it gives you everything between the words "CONTACT PERSON" and "TELEPHONE".
You can apply the same logic by changing the words you're searching for, as well as the number of characters to skip, which is 16 in this example. You might need to reduce or increase it.

Hope that helps.

MichealCharron
Level 8
@NirbhayMishra

The following Regex pattern would probably work for you:
Contact\sperson:\s(?<ContactPerson>.*)[\r\n]+Telephone\snumber:\s(?<TelephoneNumber>.*)[\r\n]+Number\sof\scartons:\s(?<NumberOfCartons>.*)[\r\n]+

The differences between my pattern and yours are:
  • The names of Named Capture Groups "?<CaptureGroupName>" cannot contain spaces. A name can contain an underscore instead of a space, but I am a CamelCase person so you never see me using underscores.
  • The "\s" is a shorthand character class for white space. Most of the time you can just use a simple space, but I have been caught so many times by a tab that I've conditioned myself to use "\s" instead.
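
If it helps to see the group-name rule in isolation, here is a small Python sketch. Python's regex engine is not the .NET one Blue Prism uses, and it writes named groups as (?P<name>...), but it rejects a space in a group name in the same way. The helper is_valid_pattern and the sample value "John Smith" are made up for illustration.

```python
import re


def is_valid_pattern(pattern: str) -> bool:
    """Return True if the regex compiles, False if the engine rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A space inside the group name is rejected by the regex engine.
bad = r"Contact person\s:\s(?P<Contact person>.*)"
# The same group renamed without a space compiles fine.
good = r"Contact\sperson:\s(?P<ContactPerson>.*)"

print(is_valid_pattern(bad))
print(is_valid_pattern(good))

# Hypothetical sample line; the real data is only visible in the screenshot.
match = re.search(good, "Contact person: John Smith\n")
print(match.group("ContactPerson"))
```

The label text in the pattern can still contain spaces; only the name inside the angle brackets is restricted.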

Micheal Charron
RBC
Toronto, Ontario
Canada