cancel
Showing results for 
Search instead for 
Did you mean: 

Using of CSV Files (Delimitter and Unicode)

StefanSchnell1
Level 7
Hello Community,
yesterday I tried to process a CSV file. Therefore I use Utility - File Management - Get CSV Text As Collection. It works well so far, but two points strucked me:

  1. I don't find a way to switch the delimitter. As far as I can see it is only the comma possible. Is there any other way to change the delimitter than swapping characters in the text? My problem is, when I download data from SAP that contains a comma, I choose another delimitter, e.g. tilde. Is there a way to process this data directly without transforming it first. I have already searched in the forum but unfortunately found no approach.

  2. To prepare the CSV I use a text editor which stores the file in UTF8 format with a BOM. As far as I can see it seems not possible to process the BOM with Get CSV Text As Collection. Can really only ASCII/ANSI files be processed? Or is it also possible to use Unicode directly?

Thanks for answers, hints and tips.
Best regards
Stefan

------------------------------
Stefan Schnell
Senior Systems Engineer at BWI GmbH
------------------------------
1 BEST ANSWER

Helpful Answers

MindaugasBresku
Level 4

Hi,

1. two possible ways:
a.
change registry HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text
"Format" = "TabDelimited"
or
"Format" = "Delimited(~)"

b.
have a "Schema.ini" file in the source folder with the data:
[your source directory]
Format=Delimited(~)

See for more details on schema.ini files there:
https://docs.microsoft.com/en-us/sql/odbc/microsoft/schema-ini-file-text-file-driver?redirectedfrom=MSDN&view=sql-server-ver15


2. try to add "CharacterSet=65001;" to the connection string in the code (make a copy of action "Get CSV Text As Collection" )
instead of
Dim cn As New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" & Folder & ";Extended Properties=""Text;HDR=" & HDRString & ";FMT=Delimited;""")
use
Dim cn As New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" & Folder & ";Extended Properties=""Text;HDR=" & HDRString & ";FMT=Delimited;CharacterSet=65001;""")
see if data are correctly taken as UTF-8




------------------------------
Mindaugas Breskus
Software engineer
Swedbank
Europe/Vilnius
------------------------------

View answer in original post

2 REPLIES 2

MindaugasBresku
Level 4

Hi,

1. two possible ways:
a.
change registry HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text
"Format" = "TabDelimited"
or
"Format" = "Delimited(~)"

b.
have a "Schema.ini" file in the source folder with the data:
[your source directory]
Format=Delimited(~)

See for more details on schema.ini files there:
https://docs.microsoft.com/en-us/sql/odbc/microsoft/schema-ini-file-text-file-driver?redirectedfrom=MSDN&view=sql-server-ver15


2. try to add "CharacterSet=65001;" to the connection string in the code (make a copy of action "Get CSV Text As Collection" )
instead of
Dim cn As New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" & Folder & ";Extended Properties=""Text;HDR=" & HDRString & ";FMT=Delimited;""")
use
Dim cn As New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" & Folder & ";Extended Properties=""Text;HDR=" & HDRString & ";FMT=Delimited;CharacterSet=65001;""")
see if data are correctly taken as UTF-8




------------------------------
Mindaugas Breskus
Software engineer
Swedbank
Europe/Vilnius
------------------------------

@Mindaugas Breskus

Hello Mindaugas,
thank you very much for your interesting answers.
I will try that.
Best regards
Stefan

------------------------------
Stefan Schnell
Senior Systems Engineer at BWI GmbH
------------------------------