file compare application

ChrisGreaves
PlutoniumLounger
Posts: 15587
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Post by ChrisGreaves »

I have two text files, “T_20230606.EFU” and “Y_20230606.EFU”.
The first file holds 259,526 lines of text, each line being the drive, path, and filename of a file.
The second file holds 256,901 lines of text, each line being the drive, path, and filename of a file.
The first file represents a data partition T: while the second file represents a backup drive Y: of that data partition.

The obvious question is: why the difference of 2,625 items between the two?

As I sit down to dash off a few lines of VBA code to give me an idea of an answer, I wonder if there is a quick-and-easy file-comparison application out there?

An obvious preamble for me is to remove the drive letter (Y or T) from the start of each line, but apart from that it ought to be a straightforward operation.
Edited: the two files are close in size, but at around 250,000 lines of data each, time and space constraints suggest chipping away from the front or end of the files.
The Y-drive is supposed to be a mirror image of the T-drive, so the bulk of the lines should match, and I anticipate a small number of large clumps of differences.
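
A minimal sketch of the comparison described above, written in Python rather than the VBA the post has in mind. It assumes each line is a bare path such as T:\Docs\a.txt (any extra columns in the EFU export would have to be split off first). The function names strip_drive, compare_listings and clumps are made up for illustration. Paths are lower-cased with casefold() on the assumption that the drives are case-insensitive, as Windows volumes normally are.

```python
from collections import Counter


def strip_drive(line):
    """Drop a leading drive letter and colon (e.g. 'T:') if present."""
    line = line.strip()
    if len(line) >= 2 and line[0].isalpha() and line[1] == ":":
        return line[2:]
    return line


def compare_listings(t_lines, y_lines):
    """Return (only_in_t, only_in_y) as sorted lists of drive-less paths."""
    # casefold() lower-cases the paths so that case differences are ignored
    t_paths = {strip_drive(l).casefold() for l in t_lines if l.strip()}
    y_paths = {strip_drive(l).casefold() for l in y_lines if l.strip()}
    return sorted(t_paths - y_paths), sorted(y_paths - t_paths)


def clumps(paths):
    """Count paths per top-level folder, largest group first."""
    counts = Counter()
    for p in paths:
        parts = p.lstrip("\\").split("\\")
        counts[parts[0] if len(parts) > 1 else "(root)"] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Tiny made-up sample; the real input would be the lines of the two files.
    t = ["T:\\Docs\\a.txt", "T:\\Docs\\b.txt", "T:\\Pics\\c.jpg"]
    y = ["Y:\\Docs\\a.txt", "Y:\\Pics\\c.jpg"]
    only_t, only_y = compare_listings(t, y)
    print(only_t, only_y, clumps(only_t))
```

Sets make the order of the lines irrelevant, so no chipping away from either end is needed, and a quarter of a million short strings per file should fit comfortably in memory. The clumps() summary is there because the differences are expected to fall into a few large groups.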

Thanks, Chris
There's nothing heavier than an empty water bottle

HansV
Administrator
Posts: 78416
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: file compare application

Post by HansV »

Do you use Notepad++? If so, you can activate the Compare plugin (Plugins > Plugins Admin...)

Otherwise: WinMerge
Best wishes,
Hans

ChrisGreaves
PlutoniumLounger
Posts: 15587
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: file compare application

Post by ChrisGreaves »

HansV wrote:
06 Jun 2023, 20:11
Do you use Notepad++? If so, you can activate the Compare plugin (Plugins > Plugins Admin...)
Otherwise: WinMerge
Thanks for this response, Hans.
No, I don't use Notepad++, but can give it a try.
I saw WinMerge lauded on a geeks page, so I will try that first.

I may have to pre-process my data to remove the two leading characters of each line, because they identify the two different drives "T" and "Y".
Thanks again.
Chris

JoeP
SilverLounger
Posts: 2067
Joined: 25 Jan 2010, 02:12

Re: file compare application

Post by JoeP »

WinMerge, which Hans mentioned, works well for that. We've used it for years.
Joe

robertocm
Lounger
Posts: 43
Joined: 07 Jun 2023, 15:34

Re: file compare application

Post by robertocm »

Try this:
ExamDiff prestosoft com

I found it via Jan Karel Pieterse:
jkp-ads com
download.asp

ChrisGreaves
PlutoniumLounger
Posts: 15587
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: file compare application

Post by ChrisGreaves »

ChrisGreaves wrote:
06 Jun 2023, 16:57
I have two text files, “T_20230606.EFU” and “Y_20230606.EFU”.
The first file holds 259,526 lines of text, each line being the drive, path, and filename of a file.
The second file holds 256,901 lines of text, each line being the drive, path, and filename of a file.
The first file represents a data partition T: while the second file represents a backup drive Y: of that data partition.
My two text files are created by Everything.exe.
I opened them in WinMerge 2.16.30.0 (x86) and within three seconds felt that the files were just too dissimilar. This impression may be skewed by the fact that the first group of files listed from my backup drive contains several spurious items (Recycle Bin), and I should/could purge items like that as being irrelevant.
Untitled3.png
The double-circled hurdle is that the backup drive stores a copy of my T: data partition under a folder that is the laptop %computername% "HP15_077".
The second hurdle is that Everything, of course, uses the drive letter (single-circled), so a simple literal comparison is bound to fail.

I knew that these two hurdles would appear, which is why I contemplated a simple pre-processing step (to remove the drive letters, and to remove the "HP15_077" data).
A five-minute perusal of the help files did not suggest a means of pre-processing, or even "ignore the first two characters of each record in the files".
I note that the first record (column headers!) came through unscathed.
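
The pre-processing contemplated above could be sketched in Python along these lines. Everything about the layout is an assumption: that backup entries look like Y:\HP15_077\Docs\a.txt where the original is T:\Docs\a.txt, that the first record is a header, that each remaining line is a bare path, and that the spurious Recycle Bin entries sit under a top-level $RECYCLE.BIN folder. The names normalize and load_paths are hypothetical.

```python
def normalize(line, backup_root=""):
    """Reduce one listing line to a comparable, lower-cased relative path."""
    path = line.strip()
    # Hurdle 1: the drive letter ("T:" or "Y:") differs between the listings.
    if len(path) >= 2 and path[0].isalpha() and path[1] == ":":
        path = path[2:]
    # Hurdle 2: the backup keeps the data under an extra folder (assumed layout).
    if backup_root and path.casefold().startswith(backup_root.casefold() + "\\"):
        path = path[len(backup_root):]
    return path.casefold()


def load_paths(lines, backup_root="", skip_header=True):
    """Build a set of normalized paths, ignoring header, blanks and Recycle Bin."""
    it = iter(lines)
    if skip_header:
        next(it, None)  # the first record holds column headers
    paths = set()
    for line in it:
        if not line.strip():
            continue
        p = normalize(line, backup_root)
        if p.startswith("\\$recycle.bin"):  # assumed location of spurious items
            continue
        paths.add(p)
    return paths


if __name__ == "__main__":
    # Made-up sample lines standing in for the two EFU files.
    t_lines = ["Filename", "T:\\Docs\\a.txt", "T:\\Docs\\b.txt"]
    y_lines = ["Filename", "Y:\\$RECYCLE.BIN\\x.tmp", "Y:\\HP15_077\\Docs\\a.txt"]
    t = load_paths(t_lines)
    y = load_paths(y_lines, backup_root="\\HP15_077")
    print(sorted(t - y))  # on T: but missing from the backup
    print(sorted(y - t))  # on the backup but not on T:
```

Doing the clean-up in a separate step like this means the cleaned lists could also be written back out and fed to WinMerge or another compare tool, if a visual diff is still wanted.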

Rather than master this application, or Notepad++, or write my own Compare/VBA, I suspect that I would be better served by eliminating any obvious candidates for the "the difference of 2,625 items between the two files".

Please note that I am not questioning Everything.exe's creation of these two file lists; I have confidence in Everything.exe. My concern is that a RoboCopy Mirror-image of the data partition produces that discrepancy of 2,625 items.
Thanks
Chris

jonwallace
5StarLounger
Posts: 1118
Joined: 26 Jan 2010, 11:32
Location: "What a mighty long bridge to such a mighty little old town"

Re: file compare application

Post by jonwallace »

If you feel like learning something new, then try AWK.

When I was employed and had _loads_ of text files to preprocess (similar to your problem) I spent an hour or so fiddling about with awk on the command line. Simple processing turned out to be easy, and more complex processing was harder, but more fun than working.

Help can be found everywhere on the web, but here or here are useful. The language is OS neutral, but if you're doing filepaths, don't forget to convert Linux instructions to Windows ones.

HTH
John

“Always trust a microbiologist because they have the best chance of predicting when the world will end”
― Teddie O. Rahube

John Gray
PlatinumLounger
Posts: 5406
Joined: 24 Jan 2010, 08:33
Location: A cathedral city in England

Re: file compare application

Post by John Gray »

Without wanting to hijack Chris' thread :evilgrin: , would any of the solutions mentioned above be useful to compare two Word documents, each containing [virtually the same] multicolumn table?
I fear I may have inadvertently updated the wrong version....

Of course, I could convert-table-to-text for both of them, then compare the two as text files, maybe?
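
If the two tables were exported as text, the comparison could be sketched with Python's standard difflib module. This is only an illustration: it assumes each table row ends up as one tab-separated line, and the function name diff_text and the file labels are made up.

```python
import difflib


def diff_text(old_lines, new_lines, old_name="old.txt", new_name="new.txt"):
    """Return a unified diff of two lists of lines (no trailing newlines)."""
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
        )
    )


if __name__ == "__main__":
    # Made-up rows standing in for the two converted tables.
    old = ["Name\tQty", "Apples\t3", "Pears\t5"]
    new = ["Name\tQty", "Apples\t4", "Pears\t5"]
    for line in diff_text(old, new):
        print(line)
```

Lines starting with "-" exist only in the first version and lines starting with "+" only in the second, which should be enough to see which copy received the update.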
John Gray

"(or one of the team)" - how your appointment letter indicates you won't be seeing the Consultant...

HansV
Administrator
Posts: 78416
Joined: 16 Jan 2010, 00:14
Status: Microsoft MVP
Location: Wageningen, The Netherlands

Re: file compare application

Post by HansV »

Word has a built-in Compare feature. You'll find it on the Review tab of the ribbon.

Best wishes,
Hans

ChrisGreaves
PlutoniumLounger
Posts: 15587
Joined: 24 Jan 2010, 23:23
Location: brings.slot.perky

Re: file compare application

Post by ChrisGreaves »

jonwallace wrote:
14 Jun 2023, 21:31
If you feel like learning something new, then try AWK.
Thanks for this link, Jon. I tracked down the documentation and found 11.2.6 Printing Nonduplicated Lines of Text which at first glance came closest to answering my question "Why do my two EFU files from Everything differ by 2,625 items (out of 500,000)?" If I subtract the set of non-duplicated from the set of duplicated, I may spot a pattern in the set of differences.
That said, right now I am reluctant to wade into a new installation and a new language. Over the years I have developed several text-handling applications and hundreds of text-handling procedures.
However when the promised quiet moment comes along, I shall report back here!
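
For what it is worth, the duplicated versus non-duplicated split mentioned above can be sketched in a few lines of Python. This is a rough analogue of the idea, not a port of the awk program in the documentation. It assumes the paths have already been stripped of drive letters and any backup folder prefix, and that neither listing repeats a path internally. The name split_by_duplication is made up.

```python
from collections import Counter


def split_by_duplication(t_paths, y_paths):
    """Split the combined listings into paths seen twice and paths seen once."""
    counts = Counter(t_paths)
    counts.update(y_paths)
    duplicated = sorted(p for p, n in counts.items() if n > 1)  # on both drives
    single = sorted(p for p, n in counts.items() if n == 1)     # on one drive only
    return duplicated, single


if __name__ == "__main__":
    # Made-up, already-normalized sample paths.
    t = ["\\a", "\\b", "\\c"]
    y = ["\\a", "\\c", "\\d", "\\e"]
    duplicated, single = split_by_duplication(t, y)
    print(duplicated)
    print(single)
```

One thing the split makes visible: the line counts only give the net difference. 259,526 minus 256,901 is 2,625, but that figure equals (items only on T:) minus (items only on Y:), so the number of unmatched paths may well be larger than 2,625.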
Thanks again, Chris

John Gray
PlatinumLounger
Posts: 5406
Joined: 24 Jan 2010, 08:33
Location: A cathedral city in England

Re: file compare application

Post by John Gray »

HansV wrote:
15 Jun 2023, 18:02
Word has a built-in Compare feature. You'll find it on the Review tab of the ribbon.
Thank you! I have never even heard of this feature before! It works fairly well, once you get the hang of it - and you have an enormous screen!
John Gray

"(or one of the team)" - how your appointment letter indicates you won't be seeing the Consultant...