Python Script for Find and Replace

Overview

This page presents a Python script that finds and replaces text in every .txt file in a directory, using find/replace pairs read from a spreadsheet. The script also logs its progress, reads its paths from a configuration file, backs up each file before changing it, and catches errors so that one bad file does not stop the run.

Python Script for Find and Replace

# Import necessary libraries
import pandas as pd
import os
import logging
import shutil
import configparser

# Setup logging to track the script's progress and issues
def setup_logging():
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s',
        handlers=[
            logging.FileHandler("find_and_replace.log"),
            logging.StreamHandler()
        ]
    )

# Load configuration from a config file
def load_configuration(config_file='config.ini'):
    config = configparser.ConfigParser()
    config.read(config_file)
    return config

# Create a backup of the original file
def backup_file(file_path):
    backup_path = f"{file_path}.bak"
    shutil.copyfile(file_path, backup_path)
    logging.info(f"Backup created: {backup_path}")

# Perform find and replace operations on a file.
# Returns True on success and False if the file could not be processed.
def find_and_replace_in_file(file_path, find_replace_dict):
    try:
        backup_file(file_path)

        with open(file_path, 'r', encoding='utf-8') as file:
            file_contents = file.read()

        # Pairs are applied in spreadsheet order, so a later pair can match
        # text produced by an earlier one
        for find_text, replace_text in find_replace_dict.items():
            file_contents = file_contents.replace(find_text, replace_text)

        with open(file_path, 'w', encoding='utf-8') as file:
            file.write(file_contents)

        logging.info(f"Processed file: {file_path}")
        return True

    except Exception as e:
        logging.error(f"Error processing file {file_path}: {e}")
        return False

# Main function to execute the script
def main():
    setup_logging()

    config = load_configuration()
    spreadsheet_path = config['DEFAULT']['SpreadsheetPath']
    txt_files_directory = config['DEFAULT']['TxtFilesDirectory']

    try:
        # The spreadsheet must contain 'Find' and 'Replace' columns
        df = pd.read_excel(spreadsheet_path)
        # Skip rows with an empty Find cell; an empty Replace cell deletes the match.
        # Values are converted to str because str.replace() rejects numbers and NaN.
        df = df.dropna(subset=['Find'])
        find_replace_dict = dict(zip(df['Find'].astype(str),
                                     df['Replace'].fillna('').astype(str)))

        processed_files = 0
        for filename in os.listdir(txt_files_directory):
            if filename.endswith('.txt'):
                file_path = os.path.join(txt_files_directory, filename)
                # Count only the files that were rewritten successfully
                if find_and_replace_in_file(file_path, find_replace_dict):
                    processed_files += 1

        logging.info(f"Processing completed. Total files processed: {processed_files}")

    except FileNotFoundError as e:
        # Raised for a missing spreadsheet and also for a missing .txt directory
        logging.error(f"File or directory not found: {e}. Please check the paths in the configuration file.")
    except Exception as e:
        logging.error(f"Unexpected error: {e}")

# Entry point of the script
if __name__ == "__main__":
    main()
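
One behaviour of the loop in find_and_replace_in_file is worth illustrating: the pairs are applied one after another, so text produced by an earlier pair can be matched again by a later one. The sketch below contrasts that with a single-pass alternative built on re.sub. It is an illustration, not part of the original script; the helper name replace_single_pass and the sample mapping are made up for the example.

```python
import re

def replace_sequential(text, find_replace):
    """Apply each pair in turn, as the script's loop does."""
    for find_text, replace_text in find_replace.items():
        text = text.replace(find_text, replace_text)
    return text

def replace_single_pass(text, find_replace):
    """Replace every search term in one scan of the original text.

    Hypothetical helper: terms are escaped so they match literally, and
    longer terms are tried first so 'catalog' wins over 'cat'.
    """
    # Ignore empty search terms, which would otherwise match everywhere
    terms = sorted((term for term in find_replace if term), key=len, reverse=True)
    if not terms:
        return text
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    return pattern.sub(lambda match: find_replace[match.group(0)], text)

# Made-up mapping that swaps two words
swap = {"cat": "dog", "dog": "cat"}
sentence = "cat chases dog"

print(replace_sequential(sentence, swap))   # cat chases cat
print(replace_single_pass(sentence, swap))  # dog chases cat
```

If the spreadsheet never contains a Replace value that is also a Find value, both approaches give the same result and the simpler loop is sufficient.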
            

Extensions and Improvements

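  • Logging: Writes timestamped messages to both find_and_replace.log and the console, so each run leaves a record of which files were backed up, which were processed, and which failed.
  • Backup: Copies each .txt file to a .bak file next to it (for example, notes.txt.bak) before any modification. Running the script again overwrites the previous backup.
  • Configuration File (config.ini): The spreadsheet path and the .txt directory are read from the [DEFAULT] section of config.ini, so paths can be changed without editing the script. For example: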
[DEFAULT]
SpreadsheetPath = path/to/your/spreadsheet.xlsx
TxtFilesDirectory = path/to/your/txt/files
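
Because ConfigParser.read() silently skips a file it cannot open, a missing or incomplete config.ini only shows up later as a KeyError. A minimal sketch of an up-front check follows, assuming the two keys shown above are the only required ones. The function name load_paths is hypothetical, and the sketch parses a string with read_string() so that it runs without a file on disk; the script itself would call read() on config.ini.

```python
import configparser

REQUIRED_KEYS = ("SpreadsheetPath", "TxtFilesDirectory")

def load_paths(config_text):
    """Parse INI text and return the two paths, or raise ValueError."""
    config = configparser.ConfigParser()
    config.read_string(config_text)

    # The DEFAULT section always exists, even when the text defines no keys
    defaults = config["DEFAULT"]
    missing = [key for key in REQUIRED_KEYS
               if not defaults.get(key, fallback="").strip()]
    if missing:
        raise ValueError(f"config.ini is missing: {', '.join(missing)}")

    return defaults["SpreadsheetPath"], defaults["TxtFilesDirectory"]

sample = """
[DEFAULT]
SpreadsheetPath = path/to/your/spreadsheet.xlsx
TxtFilesDirectory = path/to/your/txt/files
"""

print(load_paths(sample))
```

Failing early with a message that names the missing key is usually easier to act on than a bare KeyError from deep inside main().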
            

Considerations

  • Environment: Install pandas and openpyxl before running the script (for example, pip install pandas openpyxl). pandas relies on openpyxl to read .xlsx files.
  • Testing: Run the script on copies of your data first, with spreadsheets and .txt files that cover edge cases such as empty cells, search terms that overlap, and files that are not UTF-8 encoded.
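
One low-risk way to test is to run the replacement step against throwaway files in a temporary directory. The sketch below assumes a simplified stand-in for the script's per-file step (backup and logging are left out); the names apply_replacements and run_checks and the sample file contents are made up for illustration.

```python
import tempfile
from pathlib import Path

def apply_replacements(file_path, find_replace):
    """Stand-in for the script's per-file step (no backup, no logging)."""
    path = Path(file_path)
    text = path.read_text(encoding="utf-8")
    for find_text, replace_text in find_replace.items():
        text = text.replace(find_text, replace_text)
    path.write_text(text, encoding="utf-8")

def run_checks():
    """Create sample files, run the replacement, and verify the results."""
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.txt"
        untouched = Path(tmp) / "untouched.txt"
        sample.write_text("old value, another old value\n", encoding="utf-8")
        untouched.write_text("nothing to change here\n", encoding="utf-8")

        for path in (sample, untouched):
            apply_replacements(path, {"old": "new"})

        # Every occurrence is replaced; files without a match stay identical
        assert sample.read_text(encoding="utf-8") == "new value, another new value\n"
        assert untouched.read_text(encoding="utf-8") == "nothing to change here\n"
    return True

print(run_checks())
```

The temporary directory is deleted when the with block ends, so the checks never touch real data.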
