Python script to compare two text files

Posted on 17th September 2015

This is a simple Python script that compares two text files line by line and prints only the lines that differ.

Program Analysis

The program asks the user to enter the names of the two files to compare. It then opens both files in read-only mode, reads one line at a time from each, and compares the lines after stripping any trailing whitespace, so newline characters and spaces at the end of a line are ignored.
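
To make the effect of stripping concrete, here is a small illustration (the sample strings are made up for this example). Calling rstrip() with no arguments removes trailing whitespace, including the newline, but leaves leading whitespace alone:

```python
# Two lines that differ only in trailing spaces and newline style
a = "tutorials and howto articles   \n"
b = "tutorials and howto articles\r\n"

# After rstrip() they compare as equal
print(a.rstrip() == b.rstrip())   # True

# Leading whitespace is not removed, so these still differ
c = "  tutorials and howto articles\n"
print(a.rstrip() == c.rstrip())   # False
```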

If the lines differ, they are printed along with the line number and a prefix of '>' (first file) or '<' (second file) to indicate which file each line came from.

An additional check determines whether a line exists in only one of the files, in which case it is marked with a + sign in the output.

Program Source

# Ask the user to enter the names of files to compare
fname1 = input("Enter the first filename: ")
fname2 = input("Enter the second filename: ")

# Open file for reading in text mode (default mode)
f1 = open(fname1)
f2 = open(fname2)

# Print confirmation
print("-----------------------------------")
print("Comparing files ", " > " + fname1, " < " + fname2, sep='\n')
print("-----------------------------------")

# Read the first line from the files
f1_line = f1.readline()
f2_line = f2.readline()

# Initialize counter for line number
line_no = 1

# Loop if either file1 or file2 has not reached EOF
while f1_line != '' or f2_line != '':

    # Strip trailing whitespace (including the newline)
    f1_line = f1_line.rstrip()
    f2_line = f2_line.rstrip()

    # Compare the lines from both files
    if f1_line != f2_line:

        # If the line does not exist in file2, mark the output with a + sign
        if f2_line == '' and f1_line != '':
            print(">+", "Line-%d" % line_no, f1_line)
        # Otherwise output the line from file1 and mark it with a > sign
        elif f1_line != '':
            print(">", "Line-%d" % line_no, f1_line)

        # If the line does not exist in file1, mark the output with a + sign
        if f1_line == '' and f2_line != '':
            print("<+", "Line-%d" % line_no, f2_line)
        # Otherwise output the line from file2 and mark it with a < sign
        elif f2_line != '':
            print("<", "Line-%d" % line_no, f2_line)

        # Print a blank line
        print()

    # Read the next line from each file
    f1_line = f1.readline()
    f2_line = f2.readline()

    # Increment line counter
    line_no += 1

# Close the files
f1.close()
f2.close()
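
One limitation of the script above is that a blank line and the end of a file both look like an empty string after stripping, so a blank line in one file is reported as if the line were missing there. The following is a minimal sketch of one way around that, not a replacement for the script. It uses itertools.zip_longest so that a missing line shows up as None, and with blocks so the files are closed automatically. The function name compare_files and the choice to return a list of strings are assumptions made for this example.

```python
from itertools import zip_longest


def compare_files(fname1, fname2):
    """Return a list of report lines for the lines that differ."""
    report = []
    # 'with' closes both files even if an error occurs
    with open(fname1) as f1, open(fname2) as f2:
        # zip_longest pads the shorter file with None, so end-of-file
        # can be told apart from a genuinely blank line
        for line_no, (l1, l2) in enumerate(zip_longest(f1, f2), start=1):
            s1 = None if l1 is None else l1.rstrip()
            s2 = None if l2 is None else l2.rstrip()
            if s1 == s2:
                continue
            if s1 is not None:
                # '+' marks a line that has no counterpart in file2
                mark = ">+" if s2 is None else ">"
                report.append("%s Line-%d %s" % (mark, line_no, s1))
            if s2 is not None:
                # '+' marks a line that has no counterpart in file1
                mark = "<+" if s1 is None else "<"
                report.append("%s Line-%d %s" % (mark, line_no, s2))
    return report
```

Calling compare_files("file1.txt", "file2.txt") and printing each returned entry should give the same markers as the script, with blank lines treated as real lines rather than as missing ones.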

Program Output

Consider two text files file1.txt and file2.txt with the following contents

file1.txt

opentechguides website contains
tutorials and howto articles
on topics such as Linux
Windows, databases etc.,

file2.txt

opentechguides website contains
tutorials and howto articles
on topics such as Linux
Windows, databases , networking
programming and web development.

If you compare these two files using our Python script, the output will be as shown below:


 Enter the first filename: file1.txt
 Enter the second filename: file2.txt
 -----------------------------------
 Comparing files 
  > file1.txt
  < file2.txt
 -----------------------------------
 > Line-4 Windows, databases etc.,
 < Line-4 Windows, databases , networking

 <+ Line-5 programming and web development.


Comments

Liam | September 27, 2018 1:53 PM |

I have followed you closely here, though I had to change the open call to f1 = open(fname1, encoding = "utf8"). Now that I've added that, it runs smoothly, except there is no output. I don't know why. Any help?

Bilal | September 27, 2018 4:50 PM |

Maybe there is no difference between the files? Output is printed only when the lines are different.

Shashi | May 9, 2018 7:01 PM |

Can someone help? How can I save the comparison output from the console to a text file in a folder?

gnanasekar | April 29, 2016 9:52 AM |

When I ran the above script as sekar.py, it asked for the file name. When I entered file1.txt, I got the error below:

Enter the first filename: file1.txt
Traceback (most recent call last):
  File "C:\Python27\Scripts\PY\sekar.py", line 2, in <module>
    fname1 = input("Enter the first filename: ")
  File "<string>", line 1, in <module>
NameError: name 'f

Sayan | March 30, 2018 5:21 AM |

If your filename is test1.txt, then with fname1 = input("Enter the first filename: ") Python receives the input as fname1 = test1.txt, whereas it needs fname1 = "test1.txt". I hope you got my point.
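
For context on the explanation above: in Python 2, input() evaluates what the user types as a Python expression, so an unquoted file1.txt is looked up as a name and fails, while raw_input() returns the typed text as a string. In Python 3, input() behaves like the old raw_input(). A small sketch of one way to get the same behavior on both versions (the name read_text is made up for this example):

```python
# Pick the function that returns the typed text as a plain string
try:
    read_text = raw_input   # Python 2: raw_input() does not evaluate the text
except NameError:
    read_text = input       # Python 3: input() already returns a string

# Usage in the script would then be, for example:
# fname1 = read_text("Enter the first filename: ")
```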

Sayan | March 30, 2018 5:18 AM |

My objective is similar, but I need some arranged output. Say "file1.txt" contains a b c and "file2.txt" contains b a c. Then while executing the script I should get: File Diff Status : Success. If some difference is found, it should output: File Diff Status : Failed [ check diff.txt ]. diff.txt should have the line number, content and file name that differ. Can you help here?

Swill | October 27, 2017 9:59 AM |

Really cool, and almost what I need. But I really need it to look for and note the lines in File1 that are not ANYWHERE in File2, or vice versa. I know this will likely take longer to process, but that is OK. This script notes the lines in File1 that are not on the exact same line number in File2, or vice versa.

dave | March 7, 2018 3:44 PM |

Remember that each line in File-1 must be compared against each line in File-2 and vice versa; otherwise you're not truly comparing each file's contents against the other.
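
The kind of comparison described in the two comments above, where a line counts as missing only if it appears nowhere in the other file, could be sketched with a set. This is a different technique from the line-by-line script in the article: it ignores line order and repeated lines. The function name lines_only_in_first is an assumption for this example.

```python
def lines_only_in_first(fname1, fname2):
    """Return (line_no, text) for lines of fname1 found nowhere in fname2."""
    with open(fname2) as f2:
        # A set gives fast membership tests for every line of the second file
        seen = {line.rstrip() for line in f2}
    missing = []
    with open(fname1) as f1:
        for line_no, line in enumerate(f1, start=1):
            text = line.rstrip()
            if text not in seen:
                missing.append((line_no, text))
    return missing
```

To check both directions, call it a second time with the two filenames swapped.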

october | April 24, 2017 7:49 AM |

I am learning something here and am not a robot. If I am switching too fast, it is because I am searching for something.