cancel
Showing results for 
Search instead for 
Did you mean: 

Merge two .dat files

KirtishTrivedi
Level 5
Hi Community,

I am trying to  merge two .dat files after comparing that the  first row has the same columns. Syntax for the .dat file is as below(when I open with notepad)

Project ID  Employee ID      Project Type
XXXX         xxxxxxxxxxx       xx

1 ) First I need to check if the column matches in both files
2) After that I need to append the content of second file to first file after removing the header

I thought of renaming .dat with .csv file,Append the records,save and rename back to .dat. but there were two problems :- 

1 ) Headers are in japanese. When i change extension to .csv , japanese characters are getting replaced with some other weird character.
2) When I append data to csv and rename back to .dat, " is added before the line and after the line as below

"Project ID  Employee ID      Project Type"
"XXXX         xxxxxxxxxxx       xx"

Could someone please suggest how to proceed ? 

------------------------------
Kirtish Trivedi
------------------------------
1 BEST ANSWER

Best Answers

AndreyKudinov
Level 10
Why not just check headers, remove header from 2nd file and append the rest to the first file?

If you don't want to use code stages that would be like:
1) Load as text files in file1, file2 data items
2) Get header from file1 (up to first newline)
3) Check that it exists in file2
4) Replace header+newline with empty string "" in file2
5) append file2 to file1
6) write result

If files are huge, you can do it with code stage without loading everything in memory.
Also in nix I'd just use tail -n +2 file2.dat >> file1.dat, there should be similar way to do it in powershell too.

------------------------------
Andrey Kudinov
Project Manager
MobileTelesystems PJSC
Europe/Moscow
------------------------------

View answer in original post

3 REPLIES 3

AndreyKudinov
Level 10
Why not just check headers, remove header from 2nd file and append the rest to the first file?

If you don't want to use code stages that would be like:
1) Load as text files in file1, file2 data items
2) Get header from file1 (up to first newline)
3) Check that it exists in file2
4) Replace header+newline with empty string "" in file2
5) append file2 to file1
6) write result

If files are huge, you can do it with code stage without loading everything in memory.
Also in nix I'd just use tail -n +2 file2.dat >> file1.dat, there should be similar way to do it in powershell too.

------------------------------
Andrey Kudinov
Project Manager
MobileTelesystems PJSC
Europe/Moscow
------------------------------

Hi @AndreyKudinov,

Thank you for your suggestion.
As per suggestion 2) Get header from file1 (up to first newline)

Could you please assist how to get just the first line from text ? Text file contains almost 20k records hence is there any way I can just access first line for header comparison and NOT the entire content ?​

Update : I was able to do it with help of your suggestion. I found position of first newline using InSTR function and then extracted header using MID function.

This resolved my overall issue.
Thank you for suggestion.

------------------------------
Kirtish Trivedi
------------------------------

Chr(10) is a LF (newline) symbol in a file, this should work:
Left([file1],InStr([file1], Chr(10)))

Store it in [header], then check condition (same header in file2?)
InStr([file2],[header])>0

Yes - remove it from file2 and add rest to file1: [file1] & Replace([file2],[header],"")

Write result to a new file

p.s. That is assuming headers are exactly the same ofc, but I assume its true in your case. Otherwise you'd need to compare them somehow, rest of the logic should not change much.
------------------------------
Andrey Kudinov
Project Manager
MobileTelesystems PJSC
Europe/Moscow
------------------------------