<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Karan, in Product Forum</title>
    <link>https://community.blueprism.com/t5/Product-Forum/Extracting-first-sentence-from-a-paragraph/m-p/90363#M40593</link>
    <description>Karan,
Please understand that there is no simple solution here that would give 100% accurate results. Defining what makes up a sentence as a set of rules that software can follow without any cognitive understanding is pretty hard. There are cognitive tools available that can provide some syntactic analysis of a document, but they might be overkill for what you are looking to achieve, especially if you want a pure Blue Prism solution.
As Ivan has demonstrated, you can get pretty close by looking for the first period, question mark, or exclamation mark followed by whitespace.
Regex:  /^(.*?)[.?!]\s/
In Blue Prism you could keep it simple with the expression Trim(Left([text], InStr([text], ". "))). However, given the use case in your post, this would yield:
"The Japanese loan will be available at 0.1% interest rate on Oct."
You would also need to consider the other characters that can mark the end of a sentence.
Compare the integer outputs of InStr([text],"! "), InStr([text],". ") and InStr([text],"? ") to find which expression returns the lowest value (ignoring terminators that do not occur), then use that position in the full expression that manipulates the text.
You also need to consider the case where [text] is only one sentence. You would need a decision stage to check whether there is more than one period (e.g. InStr([text],".")&amp;gt;1). You then also need to handle the case where there are no periods in the data item at all.
Even checking whether the first character after ". " is lowercase, which would mean it is still part of the first sentence, won't be accurate because, again, in your use case that character is numeric, which doesn't indicate whether it is still part of the first sentence or the start of a separate one.
Tom</description>
    <pubDate>Tue, 19 Sep 2017 19:37:00 GMT</pubDate>
    <dc:creator>TomBlackburn1</dc:creator>
    <dc:date>2017-09-19T19:37:00Z</dc:date>
    <item>
      <title>Extracting first sentence from a paragraph</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Extracting-first-sentence-from-a-paragraph/m-p/90361#M40591</link>
      <description>Hi,

Is there a way to extract the first sentence from a paragraph? Can regex be used here, and if so, how?

For example, the paragraph below has two sentences, and I need the first one:

The Japanese loan will be available at 0.1% interest rate on Oct. 25 and India will be able to repay this in 50 years. Repayment will begin 15 years after the loan is received.

My desired output: The Japanese loan will be available at 0.1% interest rate on Oct. 25 and India will be able to repay this in 50 years.

How can I do that?

Regards
Karan</description>
      <pubDate>Mon, 11 Sep 2017 10:47:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Extracting-first-sentence-from-a-paragraph/m-p/90361#M40591</guid>
      <dc:creator>KaranJuneja</dc:creator>
      <dc:date>2017-09-11T10:47:00Z</dc:date>
    </item>
    <item>
      <title>The easiest option would be.</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Extracting-first-sentence-from-a-paragraph/m-p/90362#M40592</link>
      <description>The easiest option would be:
InStr([text], ". ") - This outputs a [character number], the position at which the first sentence ends;
Left([text], [character number]) - This extracts the first sentence as text (preferably into a [new sentence data item]);
Replace([text], [new sentence data item], "") - This removes the first sentence from [text] so you can move on to the next one, if required.</description>
      <pubDate>Mon, 11 Sep 2017 13:50:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Extracting-first-sentence-from-a-paragraph/m-p/90362#M40592</guid>
      <dc:creator>ivan.gordeyev</dc:creator>
      <dc:date>2017-09-11T13:50:00Z</dc:date>
    </item>
    <item>
      <title>Karan,</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Extracting-first-sentence-from-a-paragraph/m-p/90363#M40593</link>
      <description>Karan,
Please understand that there is no simple solution here that would give 100% accurate results. Defining what makes up a sentence as a set of rules that software can follow without any cognitive understanding is pretty hard. There are cognitive tools available that can provide some syntactic analysis of a document, but they might be overkill for what you are looking to achieve, especially if you want a pure Blue Prism solution.
As Ivan has demonstrated, you can get pretty close by looking for the first period, question mark, or exclamation mark followed by whitespace.
Regex:  /^(.*?)[.?!]\s/
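As a rough illustration outside Blue Prism, here is a minimal Python sketch of that regex applied to the paragraph from the original post. The second pattern, which keeps the terminator inside the capture group, is my own adjustment and not part of the regex above; the helper name first_match is hypothetical.

```python
import re

TEXT = (
    "The Japanese loan will be available at 0.1% interest rate on Oct. 25 "
    "and India will be able to repay this in 50 years. "
    "Repayment will begin 15 years after the loan is received."
)

# Pattern from the post: lazily capture everything before the first
# terminator (. ? !) that is followed by whitespace.
ORIGINAL = re.compile(r"^(.*?)[.?!]\s")

# Assumed variant: same idea, but the terminator is kept in the capture.
WITH_TERMINATOR = re.compile(r"^(.*?[.?!])\s")


def first_match(pattern, text):
    """Return capture group 1, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


print(first_match(ORIGINAL, TEXT))
print(first_match(WITH_TERMINATOR, TEXT))
```

The period in 0.1% is skipped because it is not followed by whitespace, so the first candidate either pattern finds is the one after the abbreviation.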
In Blue Prism you could keep it simple with the expression Trim(Left([text], InStr([text], ". "))). However, given the use case in your post, this would yield:
"The Japanese loan will be available at 0.1% interest rate on Oct."
You would also need to consider the other characters that can mark the end of a sentence.
Compare the integer outputs of InStr([text],"! "), InStr([text],". ") and InStr([text],"? ") to find which expression returns the lowest value (ignoring terminators that do not occur), then use that position in the full expression that manipulates the text.
You also need to consider the case where [text] is only one sentence. You would need a decision stage to check whether there is more than one period (e.g. InStr([text],".")&amp;gt;1). You then also need to handle the case where there are no periods in the data item at all.
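The checks above could be combined roughly as follows. This is a Python sketch rather than Blue Prism expressions: str.find stands in for InStr here, but it is 0-based and returns -1 when nothing is found, so the comparisons would need adapting in a real process. The function name first_sentence is hypothetical.

```python
TERMINATORS = (". ", "! ", "? ")


def first_sentence(text: str) -> str:
    """Cut the text at the earliest terminator that is followed by a space."""
    # Position of each terminator, or -1 when it does not occur.
    positions = [text.find(terminator) for terminator in TERMINATORS]
    found = [position for position in positions if position >= 0]

    # No terminator followed by a space: treat the whole text as a
    # single sentence (this also covers text with no periods at all).
    if not found:
        return text.strip()

    # Keep everything up to and including the earliest terminator.
    cut = min(found)
    return text[:cut + 1].strip()


print(first_sentence("Really? Yes. No!"))
print(first_sentence("Just one sentence."))
```

On the paragraph from the original post this still stops after the abbreviation, since that is where the earliest period followed by a space sits.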
Even checking whether the first character after ". " is lowercase, which would mean it is still part of the first sentence, won't be accurate because, again, in your use case that character is numeric, which doesn't indicate whether it is still part of the first sentence or the start of a separate one.
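To make that limitation concrete, here is a small Python sketch of the lowercase check. It is only an illustration of the heuristic, not a recommended solution, and the function name first_sentence_heuristic is hypothetical.

```python
import re

SAMPLE = (
    "The Japanese loan will be available at 0.1% interest rate on Oct. 25 "
    "and India will be able to repay this in 50 years. "
    "Repayment will begin 15 years after the loan is received."
)


def first_sentence_heuristic(text: str) -> str:
    """Skip terminators that are followed by a lowercase character."""
    # Each candidate is a terminator followed by at least one whitespace.
    for match in re.finditer(r"[.?!]\s+", text):
        next_char = text[match.end():match.end() + 1]
        if next_char.islower():
            # Lowercase continuation: assume the sentence is not over yet.
            continue
        # Keep everything up to and including the terminator.
        return text[:match.start() + 1]
    # No usable candidate: return the whole text.
    return text.strip()


# The abbreviation is followed by a lowercase word, so it is skipped.
print(first_sentence_heuristic("It costs approx. five dollars. Next one."))

# The abbreviation is followed by a digit, so the check cannot tell.
print(first_sentence_heuristic(SAMPLE))
```

The first call returns the full first sentence, while the second still cuts at the abbreviation because a digit is neither lowercase nor a reliable sign of a new sentence.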
Tom</description>
      <pubDate>Tue, 19 Sep 2017 19:37:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Extracting-first-sentence-from-a-paragraph/m-p/90363#M40593</guid>
      <dc:creator>TomBlackburn1</dc:creator>
      <dc:date>2017-09-19T19:37:00Z</dc:date>
    </item>
  </channel>
</rss>

