Compare two files line-by-line
by greyfell » Wed, 04 Feb 2004 12:18:36 GMT
I'm not sure if there is an easy way to do this, but I thought it was
worth asking. I have two files that contain filenames and sum data. I
would like to compare the files line by line and write the filenames
whose sum data differs to a third file. I have no idea how I
can read two files in parallel for comparison, though. Shell scripts
(mine, anyway) are very linear -- they don't exactly lend themselves
to multitasking. Anyone have any pointers?
Re: Compare two files line-by-line
by Barry Margolin » Wed, 04 Feb 2004 12:44:07 GMT
In article < XXXX@XXXXX.COM >,
exec 3<file1 4<file2
After this, you can use:
read line <&3
to read from file1, and:
read line <&4
to read from file2
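A minimal sketch of that technique, assuming each line holds "filename checksum" and both files list the same names in the same order (the paths and data below are made up, not from the original post):

```shell
# Sketch only: both lists are assumed to hold "filename checksum" lines,
# with the same filenames in the same order. Sample data is invented.
printf 'a.txt 111\nb.txt 222\nc.txt 333\n' > /tmp/list1
printf 'a.txt 111\nb.txt 999\nc.txt 333\n' > /tmp/list2

exec 3</tmp/list1 4</tmp/list2
: > /tmp/changed.list
while read name1 sum1 <&3 && read name2 sum2 <&4
do
    if [ "$sum1" != "$sum2" ]
    then
        echo "$name1" >> /tmp/changed.list    # sums differ: log the name
    fi
done
exec 3<&- 4<&-    # close both descriptors
cat /tmp/changed.list
```

With the sample data above, /tmp/changed.list ends up holding just b.txt.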
--
Barry Margolin, XXXX@XXXXX.COM
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Re: Compare two files line-by-line
by greyfell » Wed, 04 Feb 2004 13:07:11 GMT
On Tue, 03 Feb 2004 03:44:07 GMT, Barry Margolin < XXXX@XXXXX.COM >
I've never dealt with exec, but the man page makes it sound pretty
simple. I was imagining something using paste to merge the files,
line-for-line, then use awk to compare $2 to $4. Unfortunately, both
your way and mine leave me with a problem I just ran across: this will
only work if the two lists contain the same files. If a file is
present in one list but not the other, it throws everything off
from that point forward.
Anything I can do to make the compare smarter? Basically, if a file is
in one list but not the other, log that as a failed checksum. If this
were Java, I would create a recursive loop to do the compares and
break when either a match was found or the end of the file was
reached, but I don't know if that is the way to do it in a shell
script. The first list would always be the master, so I would *only*
care if a file was in list 1 but not list 2, not vice versa. Is a
recursive loop my best option?
Re: Compare two files line-by-line
by greyfell » Wed, 04 Feb 2004 13:44:37 GMT
Okay, I've got something that seems to work. Again, though, I'm not
certain it is the best way to go about it. This is just outputting
results to the screen for testing. I'll eventually make it output to a
file, then tar the needed files, transmit them, and untar them.
exec 3</tmp/master.list
while read line <&3
do
    MATCH=0
    exec 4</tmp/compare.list
    while read LINE <&4
    do
        if [ "$line" = "$LINE" ]
        then
            echo "$line matches $LINE."
            MATCH=1
            break
        fi
    done
    if [ "$MATCH" -ne 1 ]
    then
        echo "$line has no match."
    fi
done
Am I going in totally the wrong direction here? This seems to work,
but I have a tendency to do things the long way in shell scripts.
Thanks for the feedback.
Re: Compare two files line-by-line
by Barry Margolin » Wed, 04 Feb 2004 14:51:23 GMT
In article < XXXX@XXXXX.COM >,
Maybe you can make use of "join" or "diff".
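For the "smarter" compare, one sketch along these lines: sort both lists and use comm (join's whole-line cousin), so a master entry with a changed sum or with no counterpart at all shows up the same way. The sample data is invented:

```shell
# Sketch: sort both lists, then keep master lines with no exact match
# in the compare list (changed sums and missing files alike).
printf 'a.txt 111\nb.txt 222\nc.txt 333\n' > /tmp/master.list
printf 'a.txt 111\nb.txt 999\n'            > /tmp/compare.list

sort /tmp/master.list  > /tmp/master.sorted
sort /tmp/compare.list > /tmp/compare.sorted
# -2 drops lines unique to the compare list, -3 drops common lines,
# leaving only master lines that failed the check.
comm -23 /tmp/master.sorted /tmp/compare.sorted > /tmp/failed.list
cat /tmp/failed.list
```

Since the master list is authoritative, comm -23 gives exactly the "in list 1 but not list 2" direction and ignores the reverse.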
--
Barry Margolin, XXXX@XXXXX.COM
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
Re: Compare two files line-by-line
by William Park » Wed, 04 Feb 2004 16:09:31 GMT
man diff
--
William Park, Open Geometry Consulting, < XXXX@XXXXX.COM >
Linux solution for data management and processing.
Re: Compare two files line-by-line
by Julius_72 » Wed, 04 Feb 2004 17:37:06 GMT
> > Anything I can do to make the compare smarter? Basically, if a file is
Or use "sdiff" and grep for the characters ">", "<" and "|". In my own
script I use sdiff to compare two files that contain numeric data and
report the differences that occur in each pair of files.
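A rough sketch of that approach (file contents invented; with -s, sdiff prints only the differing lines, so the grep step can even be skipped):

```shell
# Sketch: sdiff -s suppresses identical lines; what remains is marked
# with "|" (changed), "<" (left only) or ">" (right only) in the gutter.
printf 'a.txt 111\nb.txt 222\n' > /tmp/left.list
printf 'a.txt 111\nb.txt 999\n' > /tmp/right.list
# sdiff exits nonzero when the files differ, hence the "|| true"
sdiff -s /tmp/left.list /tmp/right.list > /tmp/sdiff.out || true
cat /tmp/sdiff.out
```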
Jul
Similar Threads:
1.Comparing certain lines from two files
Given two files A and B, I want to compare lines between line numbers
a1 and a2 in A to lines between line numbers b1 and b2 in B. It will
be good if I am able to perform some reg-ex type substitution on the
lines before doing the actual comparison. How can this be done in a
step or two (preferably without writing a script and creating any
temporary files)? It is okay in case we have to use some perl command
as part of the command line. Thanks.
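One sketch of an answer, assuming bash and made-up ranges (lines 2-3 of A against lines 3-4 of B) with a made-up substitution; process substitution avoids explicit temporary files:

```shell
# Sketch (bash): extract each line range with sed, apply a sample
# substitution to one side, and diff the results without temp files.
# File names, ranges and the foo->FOO rewrite are all invented.
printf 'x\nfoo 1\nfoo 2\nx\n' > /tmp/A
printf 'y\ny\nFOO 1\nFOO 2\n' > /tmp/B
diff <(sed -n '2,3p' /tmp/A | sed 's/foo/FOO/') \
     <(sed -n '3,4p' /tmp/B) \
  && echo "ranges match" > /tmp/rangecheck
cat /tmp/rangecheck
```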
2.Comparing two text files with non-adjacent lines for unique entries
I am trying to find an easy and fast way to compare two files, each
with several thousand lines -- only one column -- and spit out what is
unique to only one of the files.
So, compare file A and file B, and only lines that are unique to file A
are spit out to a new file. comm and diff / sort and uniq do not
work because in this case the two files will have non-adjacent lines.
Any help is GREATLY appreciated. Thank you in advance!
-TT
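A sketch that sidesteps the adjacency problem entirely (sample data invented): grep -F -x -v -f treats every line of B as a fixed, whole-line pattern, so line order never matters.

```shell
# Sketch: print lines of A that match no whole line of B, in A's order.
printf 'apple\nbanana\ncherry\n' > /tmp/fileA
printf 'cherry\napple\n'         > /tmp/fileB
grep -Fxvf /tmp/fileB /tmp/fileA > /tmp/onlyA
cat /tmp/onlyA
```

For a few thousand lines this is plenty fast; sorting both files and using comm -23 also works once adjacency is restored by the sort.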
3.merge two files line by line...
I have 2 files:
file1:
x y z
a b c
file 2:
12:00
12:01
12:02
I want to merge these files such that the output looks like:
x y z
a b c
12:00
x y z
d c f
12:01
x y z
t h y
12:02
Is this easily possible?
Thanks in advance
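One reading of the request (assuming file1 really holds one two-line group per file2 line; the listing above looks truncated), sketched with awk:

```shell
# Sketch: after every second line of file1, emit the pair followed by
# the next line of file2. Input data reconstructed from the example.
printf 'x y z\na b c\nx y z\nd c f\nx y z\nt h y\n' > /tmp/file1
printf '12:00\n12:01\n12:02\n' > /tmp/file2
awk 'NR % 2 == 0 { print prev; print; getline t < "/tmp/file2"; print t; next }
     { prev = $0 }' /tmp/file1 > /tmp/merged
cat /tmp/merged
```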
4.reading a file line by line and using the line in another script
Hi, I am reading the file line by line and assigning each line to a
variable; this variable is used in another script called deploy_3.ksh.
I am able to do that just for the first line. My file consists of three
lines;
for the other two lines an error occurs:
Found line: XXXX@XXXXX.COM :/home/baja/amireddy
I00020133: 'test.txt' copied into the directory:'/home/baja/amireddy'
on the machine 'baja.iso.ksu.edu'
E03020134: Warning: No xauth data; using fake authentication data for
X11 forwarding.
E03020134: Warning: No xauth data; using fake authentication data for
X11 forwarding.
I00020133:
=====================================================================
while read line
do
echo "Found line: $line"
dest_dir=&line
## We call deploy3 to copy the file to the
## destination directory found in the file
. ${HARVEST_SCRIPTS}/deploy_3.ksh "${environment}" "${state}" "${viewpath}" "${item}"
done< $dest_dir1
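Two things usually cause exactly this symptom. First, "dest_dir=&line" should be "dest_dir=$line". Second, if the sourced script runs ssh (the xauth warnings suggest it does), ssh reads the loop's remaining stdin lines; redirecting the inner command's stdin from /dev/null prevents that. A sketch with a stand-in for the deploy_3.ksh call:

```shell
# Sketch of the usual fix: assign with $line, and keep the inner
# command (here a stand-in for the deploy_3.ksh call) away from the
# loop's stdin so it cannot swallow the remaining lines.
printf 'one\ntwo\nthree\n' > /tmp/dest.list
: > /tmp/deployed
while read line
do
    dest_dir=$line
    # the real script would be: . ${HARVEST_SCRIPTS}/deploy_3.ksh ...
    echo "deploying to $dest_dir" >> /tmp/deployed < /dev/null
done < /tmp/dest.list
cat /tmp/deployed
```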
5.get all lines between two lines
I have a huge file (more than 8 million lines), it is a log file and
each message body starts with timestamp information. Sample:
2008-09-19-06.05.40.704851-240 I516287092C387 LEVEL: Severe
PID : 413900 TID : 1 PROC : db2agent (DB) 0
...
2008-09-20-06.05.40.704851-240 I516287092C387 LEVEL: Severe
...
I would like to get details between two timestamps, like I need all
lines saved to a new file that are between 2008-09-19-06.05.40.704851
and 2008-09-20-06.05.40.704851.
could someone point me to right direction on how I should approach
this? Thanks!
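A sketch of the usual awk answer (log content reduced to a toy example; in the real file the patterns would be the two full timestamps): an awk range /start/,/end/ prints everything from the first match through the second, and it streams the 8-million-line file without loading it into memory.

```shell
# Sketch: print every line from the first timestamp through the second.
# Toy data; the real patterns would be the full stamps, with the dots
# escaped, e.g. /^2008-09-19-06\.05\.40\.704851/
printf '2008-09-19 first\nmiddle 1\nmiddle 2\n2008-09-20 last\nafter\n' > /tmp/big.log
awk '/^2008-09-19/,/^2008-09-20/' /tmp/big.log > /tmp/slice
cat /tmp/slice
```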
6. Generate two lines code for each line
7. write lines of one file to new file dependent on first field of line
8. write lines of one file to new file dependent on first field of line