Python Script for Find and Replace

Overview

This page presents a Python script that finds and replaces text in .txt files, using find/replace pairs read from a spreadsheet with columns named Find and Replace. The script also handles errors, logs its progress, reads its paths from a configuration file, and backs up each file before changing it.

Python Script for Find and Replace

# Import necessary libraries
import pandas as pd
import os
import logging
import shutil
import configparser

# Setup logging to track the script's progress and issues
def setup_logging():
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(levelname)s - %(message)s',
        handlers=[
            logging.FileHandler("find_and_replace.log"),
            logging.StreamHandler()
        ]
    )

# Load configuration from a config file
def load_configuration(config_file='config.ini'):
    config = configparser.ConfigParser()
    config.read(config_file)
    return config

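# Create a backup of the original file next to it (e.g. notes.txt.bak).
# Note: running the script again overwrites an existing .bak file.
def backup_file(file_path):
    backup_path = f"{file_path}.bak"
    shutil.copyfile(file_path, backup_path)
    logging.info(f"Backup created: {backup_path}")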

# Perform find and replace operations on a file.
# Returns True on success and False if the file could not be processed.
def find_and_replace_in_file(file_path, find_replace_dict):
    try:
        backup_file(file_path)

        with open(file_path, 'r', encoding='utf-8') as file:
            file_contents = file.read()

        # Pairs are applied in spreadsheet order, so a later pair can
        # match text produced by an earlier one.
        for find_text, replace_text in find_replace_dict.items():
            file_contents = file_contents.replace(find_text, replace_text)

        with open(file_path, 'w', encoding='utf-8') as file:
            file.write(file_contents)

        logging.info(f"Processed file: {file_path}")
        return True

    except Exception as e:
        logging.error(f"Error processing file {file_path}: {e}")
        return False

# Main function to execute the script
def main():
    setup_logging()

    try:
        config = load_configuration()
        spreadsheet_path = config['DEFAULT']['SpreadsheetPath']
        txt_files_directory = config['DEFAULT']['TxtFilesDirectory']

        # Read cells as text, skip rows with no 'Find' value, and treat an
        # empty 'Replace' cell as an empty string instead of NaN.
        df = pd.read_excel(spreadsheet_path, dtype=str)
        df = df.dropna(subset=['Find'])
        df['Replace'] = df['Replace'].fillna('')
        find_replace_dict = dict(zip(df['Find'], df['Replace']))

        processed_files = 0
        for filename in os.listdir(txt_files_directory):
            if filename.endswith('.txt'):
                file_path = os.path.join(txt_files_directory, filename)
                # Count only the files that were actually rewritten
                if find_and_replace_in_file(file_path, find_replace_dict):
                    processed_files += 1

        logging.info(f"Processing completed. Total files processed: {processed_files}")

    except FileNotFoundError:
        logging.error("Spreadsheet or .txt directory not found. Please check the paths in the configuration file.")
    except KeyError as e:
        logging.error(f"Missing configuration key or spreadsheet column: {e}")
    except Exception as e:
        logging.error(f"Unexpected error: {e}")

# Entry point of the script
if __name__ == "__main__":
    main()
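
Because the loop in find_and_replace_in_file applies the pairs one after another, a later pair can rewrite text that an earlier pair produced. If that is not the behavior you want, one option is to make a single pass over the text with a regular expression. The sketch below is an alternative, not part of the script above; the function name replace_single_pass is hypothetical, and it assumes every search term is a non-empty string.

```python
import re

def replace_single_pass(text, find_replace_dict):
    """Replace every key of find_replace_dict in one pass over text."""
    if not find_replace_dict:
        return text
    # Try longer terms first so "catalog" wins over "cat" at the same position.
    terms = sorted(find_replace_dict, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    # Each match is looked up once; replaced text is never scanned again.
    return pattern.sub(lambda match: find_replace_dict[match.group(0)], text)

pairs = {"cat": "dog", "dog": "bird"}

# Sequential replacement, as in the script: "cat" becomes "dog", then "bird".
sequential = "cat and dog"
for find_text, replace_text in pairs.items():
    sequential = sequential.replace(find_text, replace_text)

print(sequential)                                 # bird and bird
print(replace_single_pass("cat and dog", pairs))  # dog and bird
```

Which behavior is right depends on the spreadsheet: chained replacements are sometimes intentional, so treat this as a choice rather than a fix.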
            

Extensions and Improvements

  • Logging: Writes timestamped messages to find_and_replace.log and to the console, so every run leaves a record of which files were processed and which failed.
  • Backup: Copies each .txt file to a .bak file alongside it before making any changes, so the original can be restored.
  • Configuration File (config.ini): Reads the spreadsheet path and the .txt directory from a configuration file, so the paths can change without editing the script. For example:
[DEFAULT]
SpreadsheetPath = path/to/your/spreadsheet.xlsx
TxtFilesDirectory = path/to/your/txt/files
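
One thing to be aware of: configparser's read() silently skips files it cannot find, so a missing config.ini only shows up later as a KeyError. A minimal sketch of an up-front check follows; the function name load_paths and the REQUIRED_KEYS tuple are hypothetical and not part of the script above.

```python
import configparser
import os
import tempfile

REQUIRED_KEYS = ("SpreadsheetPath", "TxtFilesDirectory")

def load_paths(config_file="config.ini"):
    """Return the required settings from [DEFAULT], or raise a clear error."""
    config = configparser.ConfigParser()
    # read() returns the list of files it parsed; a missing file is skipped silently.
    if not config.read(config_file):
        raise FileNotFoundError(f"Configuration file not found: {config_file}")
    defaults = config["DEFAULT"]
    missing = [key for key in REQUIRED_KEYS if key not in defaults]
    if missing:
        raise KeyError(f"Missing keys in [DEFAULT]: {', '.join(missing)}")
    return {key: defaults[key] for key in REQUIRED_KEYS}

# Usage: write a throwaway config file and load it.
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.ini")
    with open(config_path, "w", encoding="utf-8") as handle:
        handle.write("[DEFAULT]\nSpreadsheetPath = pairs.xlsx\nTxtFilesDirectory = texts\n")
    paths = load_paths(config_path)

print(paths)
```

The file names pairs.xlsx and texts are placeholders used only for this demonstration.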
            
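
The backup step writes to the same .bak name on every run, so a second run replaces the first backup with text that has already been modified. If you need to keep the untouched original, one option is to give each backup a timestamp. This is a sketch under that assumption; backup_file_timestamped is a hypothetical name, not a function from the script above.

```python
import os
import shutil
import tempfile
from datetime import datetime

def backup_file_timestamped(file_path):
    """Copy file_path to '<file_path>.<timestamp>.bak' and return the new path."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    backup_path = f"{file_path}.{stamp}.bak"
    # copy2 also preserves timestamps and permission bits where the platform allows.
    shutil.copy2(file_path, backup_path)
    return backup_path

# Usage: back up a throwaway file and read the backup.
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "notes.txt")
    with open(original, "w", encoding="utf-8") as handle:
        handle.write("first version")
    backup_path = backup_file_timestamped(original)
    with open(backup_path, encoding="utf-8") as handle:
        backup_text = handle.read()

print(backup_text)  # first version
```

The backup name still ends in .bak rather than .txt, so the script's .txt filter would not pick the backups up as input files.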

Considerations

  • Environment: Install the required libraries before running the script: pandas, plus openpyxl, which pandas uses to read .xlsx files.
  • Testing: Test with a range of spreadsheets and .txt files, including empty cells, overlapping search terms, and files that are not UTF-8 encoded, to confirm the script behaves as expected.
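
One way to test without touching real data is to run the replacement inside a temporary directory and check both the rewritten file and its backup. The sketch below uses apply_pairs, a simplified stand-in for find_and_replace_in_file with the logging removed; both apply_pairs and read_text are hypothetical helpers written for this illustration.

```python
import os
import shutil
import tempfile

def apply_pairs(file_path, find_replace_dict):
    """Simplified stand-in for find_and_replace_in_file: back up, then replace."""
    shutil.copyfile(file_path, f"{file_path}.bak")
    with open(file_path, "r", encoding="utf-8") as handle:
        contents = handle.read()
    for find_text, replace_text in find_replace_dict.items():
        contents = contents.replace(find_text, replace_text)
    with open(file_path, "w", encoding="utf-8") as handle:
        handle.write(contents)

def read_text(file_path):
    """Return the full contents of a UTF-8 text file."""
    with open(file_path, "r", encoding="utf-8") as handle:
        return handle.read()

# Create a sample file, run the replacement, and inspect the results.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("colour and flavour")
    apply_pairs(sample, {"colour": "color", "flavour": "flavor"})
    updated = read_text(sample)
    backup = read_text(sample + ".bak")

assert updated == "color and flavor"
assert backup == "colour and flavour"
print("replacement and backup behave as expected")
```

The same pattern extends to the edge cases listed above: write a file that exercises the case, run the replacement, and assert on the result.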
