Python script to find files that contain a text string

Source: www.opentechguides.com | Date Published: 2015-09-23 10:56:17

This Python program searches for a specified text string in all files of a directory. The user enters the directory path and the text to search for. Optionally, entering a file type restricts the search to a subset of the files in the directory. The program outputs the name of each file that contains the search string, along with the line number and column number where the text appears in the file.

Program Flow

The program uses the os module to get the files from a directory.

The user is prompted to enter a directory path, a file type and a text string to search for. If the directory path does not end with a directory separator, which is a forward slash (/) on Linux and either a backslash (\) or a forward slash (/) on Windows, a forward slash is appended to the search path. The resulting path is then validated. If the path is invalid or does not exist, the search path is set to the current directory.
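The path handling described above can be sketched as a small helper. The function name `normalize_search_path` is hypothetical and not part of the original program. This sketch also assumes that an empty input should fall back to the current directory, and it uses `os.path.isdir` rather than `os.path.exists` so that only real directories are accepted.

```python
import os


def normalize_search_path(search_path):
    """Return a directory path ending in a separator, or "./" if invalid.

    Hypothetical helper: the name is an assumption, not from the original program.
    """
    # Append a separator only when the path is non-empty and lacks one
    if search_path and not search_path.endswith(("/", "\\")):
        search_path = search_path + "/"

    # Fall back to the current directory when the path is not an existing directory
    if not os.path.isdir(search_path):
        search_path = "./"

    return search_path
```

Returning "./" rather than "." keeps the result safe to concatenate directly with a filename.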

In the next step, each file in the directory is checked to see whether it matches the file type. The file type can be any text file format such as .txt, .log, .ini or .conf. If the user enters a file type, for example .ini, the program checks whether the filename ends with the extension .ini. If no file type is entered, the program searches all files in the directory.
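A minimal sketch of the file type filter follows. The helper name `matches_file_type` and the sample filenames are made up for illustration. It shows why an empty file type selects every file: `str.endswith("")` is true for any string.

```python
def matches_file_type(fname, file_type):
    """Return True if fname ends with file_type (an empty file_type matches all)."""
    return fname.endswith(file_type)


# Hypothetical filenames, used only to demonstrate the filter
names = ["app.log", "settings.ini", "notes.txt", "error.log"]

# Only the .log files pass the filter
print([n for n in names if matches_file_type(n, ".log")])

# An empty file type lets every name through
print([n for n in names if matches_file_type(n, "")])
```

Note that the comparison is case-sensitive, so a file named APP.LOG would not match .log. Entering the type without the leading dot (log) would also match names such as catalog.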

The files that match the file type are opened and each line is read in a loop. The find() method is called to check whether the search string is present in a line. If it is, find() returns the zero-based index of the first occurrence of the string in that line. The filename, line number, index and the whole line that contains the string are printed to the console.
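The per-line search can be sketched on its own, without any file handling. The generator name `find_in_lines` and the sample lines are assumptions for illustration, and `enumerate` is used here in place of a manual line counter.

```python
def find_in_lines(lines, search_str):
    """Yield (line_no, index, line) for every line containing search_str."""
    for line_no, line in enumerate(lines, start=1):
        index = line.find(search_str)  # -1 when search_str is absent
        if index != -1:
            yield line_no, index, line


# Hypothetical log lines, used only to demonstrate the search
sample = [
    "service started\n",
    "22/09/2015 08:56 Unknown error occured.\n",
]

for line_no, index, line in find_in_lines(sample, "error"):
    # prints: 2 25 22/09/2015 08:56 Unknown error occured.
    print(line_no, index, line, end="")
```

Because find() stops at the first match, a line that contains the search string several times is reported only once.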

Program Source

# Import the os module
import os

# Ask the user for the directory, file type and search string
search_path = input("Enter directory path to search : ")
file_type = input("File Type : ")
search_str = input("Enter the search string : ")

# Append a directory separator if the path is non-empty and lacks one
if search_path and not search_path.endswith(("/", "\\")):
    search_path = search_path + "/"

# If path does not exist, set search path to current directory.
# The trailing separator keeps the concatenation below valid.
if not os.path.exists(search_path):
    search_path = "./"

# Repeat for each entry in the directory
for fname in os.listdir(path=search_path):
    file_path = search_path + fname

    # Apply file type filter and skip anything that is not a regular file
    if fname.endswith(file_type) and os.path.isfile(file_path):

        # Open file for reading; the with block closes it automatically
        with open(file_path) as fo:

            # Read the first line from the file
            line = fo.readline()

            # Initialize counter for line number
            line_no = 1

            # Loop until EOF
            while line != '':
                # Search for string in line
                index = line.find(search_str)
                if index != -1:
                    print(fname, "[", line_no, ",", index, "] ", line, sep="")

                # Read next line
                line = fo.readline()

                # Increment line counter
                line_no += 1

Sample Program Output

Run the program to check which log files in the directory c:\samples contain the word "error".


Enter directory path to search : c:\samples
File Type : .log
Enter the search string : error
LogFile-20150922.log[5,25] 22/09/2015 08:56 Unknown error occured.
