How to compare 2 word documents?

MartaGelea
Level 2

Hello!

I want to compare two Word documents and check whether there are differences between them, using C# or VB.NET.

Could you please give me some suggestions?

Thanks

1 REPLY 1

NicholasZejdlik
Level 9
It depends on what kind of differences you're looking for. Word files can be a bit ugly because of the markup and styling going on behind the scenes. If you only need to know whether the two files are byte-for-byte identical, you could do a byte-level comparison. Keep in mind that two Word files can differ at the byte level even when the text in them is the same, because of differences in the markup.

If you want to compare the text within the files, you could pull the plain text out either via the MS Word VBO or, if you want to do it in your own code, check this post on StackOverflow. If you're looking to compare styling as well as text, you could load the Word doc as an XmlDocument like in the StackOverflow post, but instead of looking at the InnerText property of the nodes, you can look at their attributes and child elements, which carry the formatting.
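For the text comparison, here's a rough sketch of the XmlDocument idea. It is not the code from the StackOverflow post, just one way it could look. It assumes both files are .docx (Open XML) files, which are zip archives whose main body is stored in word/document.xml; older .doc files won't work this way. The names ExtractDocxText and DocxTextMatches are made up for this example. On .NET Framework you would likely need references to System.IO.Compression and System.IO.Compression.FileSystem.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Compression
Imports System.Xml

Module DocxTextCompare
    ' Hypothetical helper: returns all text in the main document part of a .docx.
    ' Assumes the file is an Open XML package containing word/document.xml.
    Function ExtractDocxText(docxPath As String) As String
        Using archive As ZipArchive = ZipFile.OpenRead(docxPath)
            Dim entry As ZipArchiveEntry = archive.GetEntry("word/document.xml")
            If entry Is Nothing Then
                Throw New InvalidDataException("word/document.xml not found in " & docxPath)
            End If
            Using partStream As Stream = entry.Open()
                Dim doc As New XmlDocument()
                doc.Load(partStream)
                ' InnerText concatenates every text node and ignores the markup
                Return doc.DocumentElement.InnerText
            End Using
        End Using
    End Function

    ' Hypothetical helper: True when both documents contain the same text.
    Function DocxTextMatches(path1 As String, path2 As String) As Boolean
        Return String.Equals(ExtractDocxText(path1), ExtractDocxText(path2), StringComparison.Ordinal)
    End Function
End Module
```

Note that InnerText joins the text without any paragraph separators, so two documents that split the same words across paragraphs differently would still count as a match. If that matters, you'd need to walk the paragraph nodes yourself.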

If you want a byte-level comparison of the files, here's the code that I use:
' Inputs: [File 1], [File 2]
' Outputs: [Files Match]

' FileInfo.Length is a Long; storing it in an Integer overflows for files over 2 GB
Dim File1Size As Long = New FileInfo(File_1).Length
Dim File2Size As Long = New FileInfo(File_2).Length

If File1Size <> File2Size Then
	Files_Match = False
	Return
End If

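' Open read-only so the comparison also works on read-only files
Using FS1 As New FileStream(File_1, FileMode.Open, FileAccess.Read)
	Using FS2 As New FileStream(File_2, FileMode.Open, FileAccess.Read)
		' VB array bounds are inclusive, so (4095) allocates exactly 4096 bytes
		Dim File1Buffer(4095) As Byte
		Dim File2Buffer(4095) As Byte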
		
		Dim File1Bytes As Integer = 0
		Dim File2Bytes As Integer = 0
	
		Do
			File1Bytes = FS1.Read(File1Buffer, 0, 4096)
			File2Bytes = FS2.Read(File2Buffer, 0, 4096)
			For Index As Integer = 0 To File1Bytes - 1
				If File1Buffer(Index) <> File2Buffer(Index) Then
					Files_Match = False
					Return
				End If
			Next Index
		Loop While File1Bytes <> 0
	End Using
End Using

Files_Match = True