Blog Migration Automation (Redux)

🛠️ Blog Migration Tool (Portfolio Project)

This custom-built tool automates the migration of blog posts from What Is Alex Thinking to my WordPress site at Payne Enterprises. It uses Selenium with ChromeDriver to scrape posts, Python to process and format them, and the WordPress REST API to publish them. The tool uses my friend’s blog as an example, but it can be retooled for any website. This was a fun project because it merged two things I’m passionate about: making websites and Python programming.

Key Features:

  • 🔎 Scrapes featured blog posts using Selenium
  • 🧠 Prevents duplicates with an SQLite tracking database
  • 🎨 Formats content using Bootstrap 5.3 and post wrappers
  • 📤 Uploads each post via the WordPress REST API
  • 🖼️ Downloads and optimizes featured images
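
The duplicate check in the feature list can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the project's actual code: the table name `migrated_posts`, its columns, and the explicit `conn` argument are assumptions made so the example is self-contained.

```python
import sqlite3


def initialize_db(conn):
    """Create the tracking table if it does not exist yet (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS migrated_posts ("
        "url TEXT PRIMARY KEY, "
        "migrated_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.commit()


def is_post_migrated(conn, url):
    """Return True if this URL has already been recorded as migrated."""
    row = conn.execute(
        "SELECT 1 FROM migrated_posts WHERE url = ?", (url,)
    ).fetchone()
    return row is not None


def mark_post_as_migrated(conn, url):
    """Record the URL; INSERT OR IGNORE makes repeated calls harmless."""
    conn.execute(
        "INSERT OR IGNORE INTO migrated_posts (url) VALUES (?)", (url,)
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo only
    initialize_db(conn)
    url = "https://example.com/some-post"  # placeholder URL
    print(is_post_migrated(conn, url))
    mark_post_as_migrated(conn, url)
    print(is_post_migrated(conn, url))
```

Using the post URL as the primary key means the database itself enforces uniqueness, so a second run of the tool can skip anything it has already uploaded.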

Highlighted Code: main.py

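import logging
import os

# Helper functions (initialize_db, scrape_homepage, and so on) and TempStorage
# come from elsewhere in the project and are not shown here.


def run_migration():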
    """Coordinates the migration process step-by-step."""

    # Remove temporary database if it exists from a previous run
    if os.path.exists("temp.db"):
        os.remove("temp.db")

    # Ensure the main database for tracking migrations is initialized
    initialize_db()

    # Scrape the homepage for available blog posts
    posts = scrape_homepage()
    if not posts:
        logging.warning("No posts found on homepage.")
        return

    # Temporary storage during current run
    temp_db = TempStorage()

    for post in posts:
        if is_post_migrated(post["url"]):
            logging.info(f"Skipping (already migrated): {post['title']}")
            continue

        logging.info(f"Scraping: {post['title']}")
        full_post = scrape_post_content(post["url"])
        if not full_post:
            logging.error(f"Failed to scrape: {post['url']}")
            continue

        temp_db.save_post(full_post)

        # Format post into structure suitable for WordPress
        formatted = format_post_content(full_post)

        # Attempt to upload and mark as migrated if successful
        if upload_post(formatted):
            mark_post_as_migrated(post["url"])
            logging.info(f"Uploaded: {post['title']}")
        else:
            logging.error(f"Failed to upload: {post['title']}")

    # Clean up temporary storage
    temp_db.close()
    if os.path.exists("temp.db"):
        os.remove("temp.db")
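
The format and upload steps called in run_migration might look roughly like the sketch below. It is only an illustration under stated assumptions: the post dictionary keys (`title`, `html`), the exact Bootstrap wrapper markup, the `draft` status, and the helper name `build_upload_request` are all hypothetical, and authentication is assumed to use a WordPress application password over HTTP Basic auth. The request is built but not sent, so the example runs without a live site.

```python
import base64
import html
import json
import urllib.request


def format_post_content(post):
    """Wrap scraped HTML in a Bootstrap card (hypothetical wrapper markup)."""
    title = html.escape(post["title"])
    content = (
        '<div class="container my-4">\n'
        '  <article class="card">\n'
        '    <div class="card-body">\n'
        f'      <h2 class="card-title">{title}</h2>\n'
        f'      {post["html"]}\n'
        "    </div>\n"
        "  </article>\n"
        "</div>"
    )
    # Fields accepted by the WordPress REST API posts endpoint.
    return {"title": post["title"], "content": content, "status": "draft"}


def build_upload_request(formatted, site_url, username, app_password):
    """Build (but do not send) a POST request to /wp-json/wp/v2/posts."""
    credentials = f"{username}:{app_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        url=f"{site_url.rstrip('/')}/wp-json/wp/v2/posts",
        data=json.dumps(formatted).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    formatted = format_post_content(
        {"title": "Hello & welcome", "html": "<p>First paragraph.</p>"}
    )
    request = build_upload_request(
        formatted, "https://example.com", "editor", "app-password-here"
    )
    print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would perform the upload; WordPress responds with status 201 when the post is created. Keeping request construction separate from sending makes the formatting and payload easy to check without touching the site.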

Technologies: Python 3.7, Selenium, BeautifulSoup, SQLite, WordPress REST API, Bootstrap 5.3

The codebase is modular: scraping, duplicate tracking, formatting, and uploading each live in their own function, so features like image gallery support or multi-blog ingestion can be added without reworking the main loop.


📁 GitHub Repository
