Automating Your Linux System with Python and Cron

Automation is one of the most powerful features of Linux systems. By combining Python's versatility with cron's scheduling capabilities, you can automate virtually any repetitive task on your system. This guide will show you how to create robust automated workflows that run reliably in the background.

Why Python and Cron?

Python provides:

- Easy-to-read syntax for writing automation scripts
- Extensive libraries for system operations, web requests, file handling, and more
- Cross-platform compatibility
- Excellent error handling capabilities

Cron provides:

- Reliable task scheduling at specified intervals
- Runs tasks in the background without user interaction
- Built into most Linux distributions
- Minimal resource overhead

Together, they form a powerful automation toolkit for system administrators, developers, and power users.

Understanding Cron

Cron Basics

Cron is a time-based job scheduler in Unix-like operating systems. It reads a configuration file called a "crontab" (cron table) that contains commands and their execution schedules.

Crontab Syntax

A crontab entry has five time fields followed by the command to execute:

* * * * * command_to_execute
│ │ │ │ │
│ │ │ │ └── Day of week (0-7, Sunday is 0 or 7)
│ │ │ └──── Month (1-12)
│ │ └────── Day of month (1-31)
│ └──────── Hour (0-23)
└────────── Minute (0-59)

Common Cron Patterns

# Every minute
* * * * *

# Every hour at minute 0
0 * * * *

# Every day at 2:30 AM
30 2 * * *

# Every Monday at 9:00 AM
0 9 * * 1

# First day of every month at midnight
0 0 1 * *

# Every 15 minutes
*/15 * * * *

# Every 6 hours
0 */6 * * *
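
Fields also accept comma-separated lists and ranges, which can be combined with step values:

# Weekdays at 9:00 AM and 5:00 PM
0 9,17 * * 1-5

# Every 10 minutes during business hours, Monday through Friday
*/10 9-17 * * 1-5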

Setting Up Your First Automated Task

Step 1: Create a Python Script

Let's start with a simple disk space monitoring script:

#!/usr/bin/env python3
import shutil
import smtplib
from email.mime.text import MIMEText
from datetime import datetime

def check_disk_space(path="/", threshold=80):
    """Check if disk usage exceeds threshold percentage"""
    usage = shutil.disk_usage(path)
    percent_used = (usage.used / usage.total) * 100

    return percent_used > threshold, percent_used

def send_alert(message):
    """Send email alert (configure with your SMTP settings)"""
    # This is a placeholder - configure with your email settings
    print(f"ALERT: {message}")
    # Uncomment and configure for actual email alerts:
    # msg = MIMEText(message)
    # msg['Subject'] = 'Disk Space Alert'
    # msg['From'] = 'your_email@example.com'
    # msg['To'] = 'recipient@example.com'
    # with smtplib.SMTP('smtp.example.com', 587) as server:
    #     server.starttls()
    #     server.login('your_email@example.com', 'your_password')
    #     server.send_message(msg)

def main():
    is_critical, usage = check_disk_space()
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    if is_critical:
        message = f"[{timestamp}] WARNING: Disk usage is at {usage:.2f}%"
        send_alert(message)
        # Log to file (note: writing under /var/log usually requires root;
        # point this at a user-writable path if the job runs unprivileged)
        with open('/var/log/disk_monitor.log', 'a') as f:
            f.write(message + '\n')
    else:
        print(f"[{timestamp}] Disk usage OK: {usage:.2f}%")

if __name__ == "__main__":
    main()

Step 2: Make the Script Executable

chmod +x /home/username/scripts/disk_monitor.py
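
With the shebang line and executable bit in place, the script can also be invoked directly, without naming the interpreter:

/home/username/scripts/disk_monitor.py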

Step 3: Test the Script

Run it manually first to ensure it works:

python3 /home/username/scripts/disk_monitor.py

Step 4: Add to Crontab

Open your crontab for editing:

crontab -e

Add an entry to run every hour:

0 * * * * /usr/bin/python3 /home/username/scripts/disk_monitor.py >> /home/username/logs/disk_monitor.log 2>&1

The >> file.log 2>&1 suffix appends both standard output and standard error to the log file, so every run leaves a record you can inspect later.

Real-World Automation Examples

1. Automated Backup Script

#!/usr/bin/env python3
import os
import tarfile
import time
from datetime import datetime

def create_backup(source_dir, backup_dir):
    """Create compressed backup of specified directory"""
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_name = f"backup_{timestamp}.tar.gz"
    backup_path = os.path.join(backup_dir, backup_name)

    # Create backup directory if it doesn't exist
    os.makedirs(backup_dir, exist_ok=True)

    # Create compressed archive
    with tarfile.open(backup_path, "w:gz") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))

    print(f"Backup created: {backup_path}")

    # Clean up old backups (keep last 7 days)
    cleanup_old_backups(backup_dir, days=7)

def cleanup_old_backups(backup_dir, days=7):
    """Remove backups older than specified days"""
    now = time.time()
    cutoff = now - (days * 86400)

    for filename in os.listdir(backup_dir):
        filepath = os.path.join(backup_dir, filename)
        if os.path.isfile(filepath):
            if os.path.getmtime(filepath) < cutoff:
                os.remove(filepath)
                print(f"Removed old backup: {filename}")

if __name__ == "__main__":
    create_backup("/home/username/important_data", "/home/username/backups")

Crontab entry (daily at 2 AM):

0 2 * * * /usr/bin/python3 /home/username/scripts/backup.py

2. System Health Monitor

#!/usr/bin/env python3
import os
import psutil
import json
from datetime import datetime

def collect_system_metrics():
    """Gather system performance metrics"""
    metrics = {
        'timestamp': datetime.now().isoformat(),
        'cpu_percent': psutil.cpu_percent(interval=1),
        'memory_percent': psutil.virtual_memory().percent,
        'disk_usage': psutil.disk_usage('/').percent,
        'network_sent': psutil.net_io_counters().bytes_sent,
        'network_recv': psutil.net_io_counters().bytes_recv,
    }
    return metrics

def save_metrics(metrics, filepath='/var/log/system_metrics.json'):
    """Append metrics to JSON log file"""
    try:
        with open(filepath, 'a') as f:
            json.dump(metrics, f)
            f.write('\n')
    except PermissionError:
        # Fallback to user directory if /var/log is not writable
        filepath = os.path.expanduser('~/system_metrics.json')
        with open(filepath, 'a') as f:
            json.dump(metrics, f)
            f.write('\n')

if __name__ == "__main__":
    metrics = collect_system_metrics()
    save_metrics(metrics)

    # Alert if critical thresholds exceeded
    if metrics['cpu_percent'] > 90:
        print(f"WARNING: High CPU usage: {metrics['cpu_percent']}%")
    if metrics['memory_percent'] > 90:
        print(f"WARNING: High memory usage: {metrics['memory_percent']}%")

Crontab entry (every 5 minutes):

*/5 * * * * /usr/bin/python3 /home/username/scripts/health_monitor.py
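
Note that psutil is a third-party package, not part of the standard library. Install it for the same interpreter that cron will invoke (or use a virtual environment, as described under Best Practices below):

/usr/bin/python3 -m pip install psutil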

3. Web Scraper for Price Monitoring

#!/usr/bin/env python3
import requests
from bs4 import BeautifulSoup
import json
from datetime import datetime

def check_price(url, css_selector):
    """Scrape price from a website"""
    try:
        headers = {'User-Agent': 'Mozilla/5.0'}
        response = requests.get(url, headers=headers, timeout=10)
        soup = BeautifulSoup(response.content, 'html.parser')

        price_element = soup.select_one(css_selector)
        if price_element:
            price_text = price_element.text.strip()
            # Extract numeric value
            price = float(''.join(filter(lambda x: x.isdigit() or x == '.', price_text)))
            return price
        return None
    except Exception as e:
        print(f"Error fetching price: {e}")
        return None

def log_price(product_name, price, threshold=None):
    """Log price and send alert if below threshold"""
    timestamp = datetime.now().isoformat()
    data = {
        'timestamp': timestamp,
        'product': product_name,
        'price': price
    }

    # Log to file
    with open('price_history.json', 'a') as f:
        json.dump(data, f)
        f.write('\n')

    # Check threshold
    if threshold and price and price < threshold:
        print(f"PRICE ALERT: {product_name} is now ${price} (below ${threshold})")
        # Send notification (implement your preferred method)

if __name__ == "__main__":
    # Example: Monitor a product price
    url = "https://example.com/product"
    price = check_price(url, ".price-class")
    log_price("Example Product", price, threshold=50.0)

Crontab entry (every 6 hours):

0 */6 * * * cd /home/username/scripts && /usr/bin/python3 price_monitor.py

4. Log Rotation and Cleanup

#!/usr/bin/env python3
import os
import gzip
import shutil
from datetime import datetime, timedelta

def rotate_logs(log_dir, max_age_days=30):
    """Compress old logs and remove very old ones"""
    cutoff_date = datetime.now() - timedelta(days=max_age_days)
    compress_date = datetime.now() - timedelta(days=7)

    for filename in os.listdir(log_dir):
        filepath = os.path.join(log_dir, filename)

        if not os.path.isfile(filepath):
            continue

        file_mtime = datetime.fromtimestamp(os.path.getmtime(filepath))

        # Delete very old files
        if file_mtime < cutoff_date:
            os.remove(filepath)
            print(f"Deleted old log: {filename}")

        # Compress week-old logs
        elif file_mtime < compress_date and not filename.endswith('.gz'):
            compressed_path = filepath + '.gz'
            with open(filepath, 'rb') as f_in:
                with gzip.open(compressed_path, 'wb') as f_out:
                    shutil.copyfileobj(f_in, f_out)
            os.remove(filepath)
            print(f"Compressed log: {filename}")

if __name__ == "__main__":
    rotate_logs('/var/log/myapp')

Crontab entry (daily at midnight):

0 0 * * * /usr/bin/python3 /home/username/scripts/log_rotation.py

Best Practices

1. Use Absolute Paths

Always use absolute paths in your scripts and crontab entries:

# Good
with open('/home/username/data/output.txt', 'w') as f:
    f.write(data)

# Bad (relative path may not work in cron)
with open('output.txt', 'w') as f:
    f.write(data)
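
A useful middle ground is to build absolute paths at runtime from the script's own location. This sketch assumes the output file should live alongside the script:

from pathlib import Path

SCRIPT_DIR = Path(__file__).resolve().parent

# Resolves to an absolute path no matter where cron starts the process
with open(SCRIPT_DIR / 'output.txt', 'w') as f:
    f.write(data)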

2. Set Up Proper Logging

Create comprehensive logs for debugging:

import logging

logging.basicConfig(
    filename='/home/username/logs/script.log',
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)

logging.info("Script started")
logging.error("An error occurred")
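
For scripts that run frequently, the standard library can also rotate the log file itself so it never grows without bound. A minimal sketch using RotatingFileHandler:

import logging
from logging.handlers import RotatingFileHandler

# Keep up to five 1 MB log files before discarding the oldest
handler = RotatingFileHandler(
    '/home/username/logs/script.log', maxBytes=1_000_000, backupCount=5
)
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[handler]
)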

3. Handle Errors Gracefully

import logging
import sys

def main():
    try:
        # Your automation logic
        perform_task()
    except Exception as e:
        logging.error(f"Script failed: {e}")
        send_error_notification(str(e))  # your own alerting helper
        sys.exit(1)

4. Use Virtual Environments

For scripts with dependencies:

# In crontab
0 * * * * /home/username/venv/bin/python /home/username/scripts/script.py
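
To set up the environment in the first place (the package names here are just examples matching the scripts above):

python3 -m venv /home/username/venv
/home/username/venv/bin/pip install requests beautifulsoup4 psutil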

5. Add Shebang and Make Executable

#!/usr/bin/env python3
# Your script here

Then make it executable:

chmod +x script.py

6. Test Before Scheduling

Always test your scripts manually before adding them to cron:

# Test with same environment as cron
env -i /usr/bin/python3 /path/to/script.py

7. Set Environment Variables

Cron has a minimal environment. Add necessary variables in your crontab:

SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
PYTHONPATH=/home/username/lib

0 * * * * /usr/bin/python3 /home/username/scripts/script.py
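
Cron will also email any output a job produces if the MAILTO variable is set (and a local mail agent is configured); set it to an empty string to silence mail entirely. Add it alongside the variables above:

MAILTO=admin@example.com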

Managing Crontab

Useful Crontab Commands

# Edit crontab
crontab -e

# List current crontab
crontab -l

# Remove all cron jobs
crontab -r

# Edit another user's crontab (requires root)
sudo crontab -e -u username

System-Wide Cron Jobs

Place scripts in these directories for predefined intervals:

/etc/cron.daily/      # Daily
/etc/cron.hourly/     # Hourly
/etc/cron.weekly/     # Weekly
/etc/cron.monthly/    # Monthly
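
Scripts placed in these directories must be executable, and on Debian-based systems run-parts skips filenames containing dots, so drop the .py extension:

sudo cp disk_monitor.py /etc/cron.daily/disk_monitor
sudo chmod +x /etc/cron.daily/disk_monitor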

Debugging Cron Jobs

Check Cron Service Status

sudo systemctl status cron     # Debian/Ubuntu
sudo systemctl status crond    # RHEL/CentOS

View Cron Logs

grep CRON /var/log/syslog      # Debian/Ubuntu
tail -f /var/log/cron          # RHEL/CentOS
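
On systemd-based distributions, journalctl shows the same information regardless of where the log files live:

journalctl -u cron.service     # Debian/Ubuntu
journalctl -u crond.service    # RHEL/CentOS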

Common Issues and Solutions

Script doesn't run:

- Check file permissions
- Verify the shebang line
- Ensure the executable flag is set

Script runs but fails:

- Check script logs
- Verify all paths are absolute
- Test with a minimal environment: env -i /usr/bin/python3 /path/to/script.py

No output in logs:

- Add explicit redirects: >> /path/to/log 2>&1
- Check log file permissions
- Verify the log directory exists

Advanced Techniques

Using Flock for Single Instance

Prevent multiple instances from running simultaneously:

import fcntl
import sys

def acquire_lock(lockfile):
    """Ensure only one instance runs"""
    fp = open(lockfile, 'w')
    try:
        fcntl.lockf(fp, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except IOError:
        print("Another instance is running")
        sys.exit(1)
    return fp

# Usage: keep a reference to the returned file object;
# the lock is released when it is closed or the process exits
lock = acquire_lock('/tmp/myscript.lock')
# Your script logic here
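
The same protection is available from the flock utility (part of util-linux) directly in the crontab entry, with no changes to the script; -n makes the job exit immediately if the lock is already held:

*/5 * * * * /usr/bin/flock -n /tmp/myscript.lock /usr/bin/python3 /home/username/scripts/script.py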

Email Notifications on Failure

def send_failure_notification(error_message):
    """Send email when script fails"""
    import smtplib
    from email.mime.text import MIMEText

    msg = MIMEText(f"Script failed with error:\n\n{error_message}")
    msg['Subject'] = 'Automation Script Failed'
    msg['From'] = 'automation@yourdomain.com'
    msg['To'] = 'admin@yourdomain.com'

    # Configure with your SMTP settings
    # with smtplib.SMTP('localhost') as server:
    #     server.send_message(msg)

Using Systemd Timers (Modern Alternative)

For more complex scheduling needs, consider systemd timers:

Create a service file (/etc/systemd/system/mybackup.service):

[Unit]
Description=Automated Backup Script

[Service]
Type=oneshot
ExecStart=/usr/bin/python3 /home/username/scripts/backup.py
User=username

Create a timer file (/etc/systemd/system/mybackup.timer):

[Unit]
Description=Run backup daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target

Enable and start:

sudo systemctl enable mybackup.timer
sudo systemctl start mybackup.timer
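
To confirm the timer is registered and see when it will next fire:

systemctl list-timers mybackup.timer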

Conclusion

Combining Python's powerful scripting capabilities with cron's reliable scheduling creates a robust automation framework for Linux systems. Whether you're monitoring system health, backing up data, scraping websites, or managing logs, this combination provides the flexibility and reliability needed for production environments.

Start small with simple scripts, test thoroughly, implement proper logging, and gradually build more sophisticated automations as you become comfortable with the workflow. Your automated Linux system will save you countless hours while running smoothly in the background.

Happy automating!
