Import CSV with delimitator ";"

SAINNATHSURAY · ‎28-03-22

Hi everyone,

I have a downloaded CSV file SAP ,file can be opened in excel but columns clubbed ";" So I need to diliminate the columns

1)I have tried to change code in existing object "Import CSV"
excel vbo action and change delimiter in the code stage to ";"
.TextFileTabDelimiter = False
.TextFileSemicolonDelimiter = False < change to True
.TextFileCommaDelimiter = True < change to False
.TextFileSpaceDelimiter = False

But I am getting an error "Invalid index. (Exception from HRESULT: 0x8002000B (DISP_E_BADINDEX)) "

2.I have tried another way
Read all text from file" is in utility-file management
"get csv as collection" is in utility-strings

unfortunately my CSV have duplicate columns so it is throwing error in this methode

------------------------------
SAINNATH SURAY
Business process Lead
TATA CONSULTANCY SERVICE
Asia/Kolkata
------------------------------

PabloSarabia · ‎28-03-22

Hi @SAINNATHSURAY

What exactly do you want to do? "translate" the CSV file into a Excel file? Or get it as a collection to work with?

I guess the problem is with the duplicate column names.

See you in the community, bye 🙂

------------------------------
Pablo Sarabia
Architect
Altamira Assets Management
Madrid
------------------------------

PvD_SE · ‎28-03-22

Dear folks @ BluePrism:

1. This issue was already handled last week in an old thread from 2021 that was not closed. Here it is again, now as a duplicate in a new thread. Can you do something about this?

2. There seem to be a lotta questions about getting CSV data into a collection within a process. This all works fine and dandy as long as the robot has region settings matching the divider in the incoming file. If not, the 'Get CSV Text As Collection' does not work as expected. While some have the possibility to require the sender of the CSV to use the preferred settings, others will have to work with whatever has been sent. With the CSV already in use in 1983 and being short for Comma Separated Values (wikipedia), still, in many a case, a semicolon or perhaps other dividers are used instead of a comma. Wouldn't it be nice to adjust the BP VBO so that it accepts whatever divider you specify? And no, I wouldn't prefer a DX solution as not everybody has the possibility to download, implement and use these.

3. While you're at it: duplicate headers seem to also be a case of concern to more than a few. Surely that could be handled in a nicer way that exception?

------------------------------
Happy coding!
Paul
Sweden
------------------------------

Happy coding!
Paul, Sweden
(By all means, do not mark this as the best answer!)

ewilson · ‎28-03-22

I've added an item to our backlog to look at including support for a configurable delimiter to the MS Excel VBOs Import CSV action. However, you'd still have issues with the your duplicate column names.

Cheers,

------------------------------
Eric Wilson
Director, Integrations and Enablement
Blue Prism Digital Exchange
------------------------------

PvD_SE · ‎28-03-22

<thumbsup>Backlog addition: Great, thanks! </thumbsup>

Perhaps duplicate headers could automatically be provided with some sort of random suffix? Same goes for headers with a . (or other illegal chars), where the illegal char could either be replaced or removed? Although we for the header issue already today have a workaround to ignore headers.

------------------------------
Happy coding!
Paul
Sweden
------------------------------

Happy coding!
Paul, Sweden
(By all means, do not mark this as the best answer!)

stepher · ‎28-03-22

Hello, @SAINNATHSURAY,

So, a few things come to mind. The first, and most direct is that @AmiBarrett posted some custom code a [long?] while back that allows Text-to-Collections using a delimiter other than a comma (although, I suppose you could use it for true comma-delimited if you wanted). We have incorporated that code, or a slight variant, into a few processes within our organization. I tried to find the thread to connect it directly here, but was not successful. If your source file is consistently semi-colons, then that would be my recommendation.

If that does not seem feesible, I have another solution, which is tedious but works:

Bring the text into a source collection with NO valid delimiters (so each line is whole to itself)
Loop through the line of the source collection
1. 'InStr' to find your next delimiter location
  1. Take the value to the left of your new found delimiter and pass that to your target collection
  2. Replace the same value, and the delimiter you just used, in the orginal value
2. Repeat to the end of the line
Repeat to the end of collection

This is certainly not rocket science, by any stretch, but, in my case, I had a text file which had three separate delimiters in the line--I just needed to keep track of where I was in the line to determine which value to look for.

The second part, about the duplicate headers can be easily addressed if your source file is consistent in its data presentation. In this case, I would predefine the columns (names and data types) in the target collection, bring in the data as if it had no headers, and then delete the first line (AKA, 'The Headers') before moving on. I have done this using a very explicitly defined collection, and I have also looped through a reference file to add the columns to the collection to create the framework. The 'advantage' of the reference file is that if your source column changes/evolves, you can quickly make the changes in the reference file.

The last part, is that would be nice to have a "blessed" Text-to-Collections action that allows for a definable delimiter.

Good Luck,
Red

------------------------------
Robert "Red" Stephens
Application Developer, RPA
Sutter Health
Sacramento, CA
------------------------------

Robert "Red" Stephens Application Developer, RPA Sutter Health Sacramento, CA

AmiBarrett · ‎28-03-22

I'm having trouble finding the post as well. I've got a couple VBOs posted to GitHub that should cover this though.

If you've got a CSV file, here's my Extended Collections object:
https://github.com/AmaliaKalio/Nephthis-BP-VBO---Extended-Collections-Tools

If it's text in a pre-existing data item, try Internal CSV:
https://github.com/AmaliaKalio/Nephthis-BP-VBO---Internal-CSV

------------------------------
Ami Barrett
RPA Blue Prism SME
Karsun Solutions
Plano TX
------------------------------

AmiBarrett · ‎28-03-22

Duplicate headers can be addressed one of two ways.

Import the CSV where headers is set to 'False' - this will generate unique column names (F1, F2, F3, etc). If you need the actual headers to be the column names, you can modify the data in the appropriate cell and use 'Set Column Names from First Row'
Alternatively, you can load the CSV into a text data item and replace the text where you know it's going to be an issue. For example, if your header row looked like "F1;F2;F3;F1;F5", you could do a find and replace for the second occurrence of "F1;" and set it to "F4;". It should import fine from there.
1. To get this to work in Blue Prism, you'd need to find the position of the first occurrence, trim the characters leading up to that InStr position + 2, then use InStr on the new text data to find the first full occurrence within the resulting string. (This should be the second iteration within the original.)
2. Or if you're lucky enough for it to be a duplicate of the very first column, you could search for ";F1;" and replace that, since the first entry wouldn't have a leading delimiter.

------------------------------
Ami Barrett
Solution Architect
Karsun Solutions
Plano TX
------------------------------

ewilson · ‎29-03-22

@SAINNATHSURAY,

FYI - We just posted version 6.10.3 of the MS Excel VBO to the Digital Exchange. The Import CSV action now support custom delimiters. However, duplicate header titles have not been addressed yet.

Cheers,

------------------------------
Eric Wilson
Director, Integrations and Enablement
Blue Prism Digital Exchange
------------------------------

SS&C Blue Prism Community

Import CSV with delimitator ";"