<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to compare 2 word documents? in Product Forum</title>
    <link>https://community.blueprism.com/t5/Product-Forum/How-to-compare-2-word-documents/m-p/99196#M46787</link>
    <description>It depends on what kind of differences you're looking for. Word files can be messy to compare because of the markup and styling stored behind the scenes. If you only need to know whether the two files are byte-for-byte identical, you could do a byte-level comparison. Keep in mind that two Word files can differ at the byte level even when the text within them is the same, due to differences in the markup. If you want to compare the text within the files, you could pull the plain text out of each file either via the MS Word VBO or, if you want to do it in your own code, check &lt;A href="https://stackoverflow.com/questions/1011234/how-to-extract-text-from-ms-office-documents-in-c-sharp" target="_blank" rel="noopener"&gt;this post&lt;/A&gt; on Stack Overflow. If you're looking to compare styling as well as text, you could load the Word document as an XmlDocument like in the Stack Overflow post, but instead of looking at the InnerText property of the nodes, you can inspect the nodes' other attributes and elements.&lt;BR /&gt;&lt;BR /&gt;If you want a byte-level comparison of the files, here's the code that I use:&lt;BR /&gt;
&lt;PRE class="language-vbnet"&gt;&lt;CODE&gt;' Inputs: [File 1], [File 2]
' Outputs: [Files Match]

' FileInfo.Length is a Long; an Integer would overflow for files over 2 GB
Dim File1Size As Long = New FileInfo(File_1).Length
Dim File2Size As Long = New FileInfo(File_2).Length

If File1Size &amp;lt;&amp;gt; File2Size Then
	Files_Match = False
	Return
End If

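' Open both files read-only; the comparison never writes to them
Using FS1 As New FileStream(File_1, FileMode.Open, FileAccess.Read)
	Using FS2 As New FileStream(File_2, FileMode.Open, FileAccess.Read)
		' VB array declarations take the upper bound, so (4095) gives 4096 bytes
		Dim File1Buffer(4095) As Byte
		Dim File2Buffer(4095) As Byte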
		
		Dim File1Bytes As Integer = 0
		Dim File2Bytes As Integer = 0
	
		Do
			File1Bytes = FS1.Read(File1Buffer, 0, 4096)
			File2Bytes = FS2.Read(File2Buffer, 0, 4096)
			For Index As Integer = 0 To File1Bytes - 1
				If File1Buffer(Index) &amp;lt;&amp;gt; File2Buffer(Index) Then
					Files_Match = False
					Return
				End If
			Next Index
		Loop While File1Bytes &amp;lt;&amp;gt; 0
	End Using
End Using

Files_Match = True&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Mon, 29 Nov 2021 13:57:38 GMT</pubDate>
    <dc:creator>NicholasZejdlik</dc:creator>
    <dc:date>2021-11-29T13:57:38Z</dc:date>
    <item>
      <title>How to compare 2 word documents?</title>
      <link>https://community.blueprism.com/t5/Product-Forum/How-to-compare-2-word-documents/m-p/99195#M46786</link>
      <description>&lt;P&gt;Hello!&lt;/P&gt;
&lt;P&gt;I want to compare 2 Word documents and check whether there are differences between them, using C# or VB.NET.&lt;/P&gt;
&lt;P&gt;Could you please help me with some suggestions?&lt;BR /&gt;&lt;BR /&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Mon, 29 Nov 2021 11:54:30 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/How-to-compare-2-word-documents/m-p/99195#M46786</guid>
      <dc:creator>MartaGelea</dc:creator>
      <dc:date>2021-11-29T11:54:30Z</dc:date>
    </item>
    <item>
      <title>Re: How to compare 2 word documents?</title>
      <link>https://community.blueprism.com/t5/Product-Forum/How-to-compare-2-word-documents/m-p/99196#M46787</link>
      <description>It depends on what kind of differences you're looking for. Word files can be messy to compare because of the markup and styling stored behind the scenes. If you only need to know whether the two files are byte-for-byte identical, you could do a byte-level comparison. Keep in mind that two Word files can differ at the byte level even when the text within them is the same, due to differences in the markup. If you want to compare the text within the files, you could pull the plain text out of each file either via the MS Word VBO or, if you want to do it in your own code, check &lt;A href="https://stackoverflow.com/questions/1011234/how-to-extract-text-from-ms-office-documents-in-c-sharp" target="_blank" rel="noopener"&gt;this post&lt;/A&gt; on Stack Overflow. If you're looking to compare styling as well as text, you could load the Word document as an XmlDocument like in the Stack Overflow post, but instead of looking at the InnerText property of the nodes, you can inspect the nodes' other attributes and elements.&lt;BR /&gt;&lt;BR /&gt;If you want a byte-level comparison of the files, here's the code that I use:&lt;BR /&gt;
&lt;PRE class="language-vbnet"&gt;&lt;CODE&gt;' Inputs: [File 1], [File 2]
' Outputs: [Files Match]

' FileInfo.Length is a Long; an Integer would overflow for files over 2 GB
Dim File1Size As Long = New FileInfo(File_1).Length
Dim File2Size As Long = New FileInfo(File_2).Length

If File1Size &amp;lt;&amp;gt; File2Size Then
	Files_Match = False
	Return
End If

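' Open both files read-only; the comparison never writes to them
Using FS1 As New FileStream(File_1, FileMode.Open, FileAccess.Read)
	Using FS2 As New FileStream(File_2, FileMode.Open, FileAccess.Read)
		' VB array declarations take the upper bound, so (4095) gives 4096 bytes
		Dim File1Buffer(4095) As Byte
		Dim File2Buffer(4095) As Byte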
		
		Dim File1Bytes As Integer = 0
		Dim File2Bytes As Integer = 0
	
		Do
			File1Bytes = FS1.Read(File1Buffer, 0, 4096)
			File2Bytes = FS2.Read(File2Buffer, 0, 4096)
			For Index As Integer = 0 To File1Bytes - 1
				If File1Buffer(Index) &amp;lt;&amp;gt; File2Buffer(Index) Then
					Files_Match = False
					Return
				End If
			Next Index
		Loop While File1Bytes &amp;lt;&amp;gt; 0
	End Using
End Using

Files_Match = True&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 29 Nov 2021 13:57:38 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/How-to-compare-2-word-documents/m-p/99196#M46787</guid>
      <dc:creator>NicholasZejdlik</dc:creator>
      <dc:date>2021-11-29T13:57:38Z</dc:date>
    </item>
  </channel>
</rss>

