Excalibur's Sheath

BASH: File Handling and Text Processing

Dec 28, 2025 By: Jordan McGilvray
Tags: bash, linux, homelab, scripting, automation, files, text-processing, logs, backups

Bash Scripting for Homelab Automation: Part 3 of 3

Linux automation is only as strong as your ability to manipulate files and text. Last week, we explored Bash logic, loops, and conditionals, learning how to write scripts that branch intelligently, iterate over data, and make decisions without human intervention. Those building blocks form the foundation of reliable automation. If loops and conditionals allow your scripts to think, then files and text streams allow them to act and communicate with the system.

Bridging into this week, we shift focus from control flow to data management. Scripts live in a dynamic environment: logs grow, configuration files change, and command outputs need parsing. To automate effectively, you must be able to read, write, move, and transform these resources. Understanding Bash as a language of data streams is the next step in taking your homelab from reactive to proactive.

This week, we explore practical file handling and text processing techniques. You will learn how to navigate the filesystem confidently, create and remove files safely, and copy or move data with precision. These are not abstract exercises — they are the real actions that power backup routines, log rotation, and configuration management. By the end of this article, interacting with files will feel as natural as issuing a simple command.

We will also introduce a central Linux philosophy: everything is a file. Devices, processes, and system state can all be manipulated using the same commands you already know. Once this perspective clicks, text processing tools like grep, sed, and awk become powerful levers. Streams, redirection, and pipelines stop feeling like syntax and start feeling like a language of flow. This is the lens through which Bash scripts become scalable, readable, and robust.

Why File Handling Matters

Every significant automation task interacts with files:

  • Logs accumulate continuously.
  • Configuration files need inspection, modification, or backup.
  • Backups and snapshots must be rotated.
  • Command outputs must be filtered, summarized, or transformed.

File handling is the connective tissue of automation. Without it, scripts are limited to ephemeral actions; with it, they can persist, report, and orchestrate tasks reliably. Linux presents files consistently and predictably, which is why mastering file operations is essential for any homelab automation workflow.

Automation also benefits from structured thinking about roles and projects in your environment. Understanding which scripts touch critical paths or manage key services can prevent unintended interference.

Working with Files and Directories

Before tackling streams or text, you need confidence in basic file operations.

Listing and Navigating

ls -lah

Flags explained:

  • -l: long listing with permissions, owner, size, and timestamps
  • -a: includes hidden entries (names beginning with a dot, plus the . and .. directory entries)
  • -h: human-readable sizes

Consistency in scripts prevents surprises. Hard-coded assumptions about filenames or paths are a frequent source of errors. You can also integrate your system monitoring scripts to check directory states dynamically.
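
As a minimal sketch of avoiding hard-coded paths (the directory and variable names here are placeholders), a script can take the directory as an argument and verify it exists before acting on it:

#!/bin/bash
WATCH_DIR="${1:-/var/log}"    # first argument, defaulting to /var/log

if [ -d "$WATCH_DIR" ]; then
  ls -lah "$WATCH_DIR"
else
  echo "Directory $WATCH_DIR does not exist" >&2
  exit 1
fi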

Creating and Removing Files

touch file.txt
mkdir backups

Removing files carefully:

rm file.txt
rm -r old_logs/

For visibility during testing, add -v so every removal is printed (and -i if you want a prompt before each deletion):

rm -rv test_directory/

Even small operations can have significant consequences when automated, so explicit verbosity is your friend.
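
One habit that pays off while developing destructive scripts is a dry-run switch: print what would be removed until you trust the logic. A minimal sketch (the old_logs_* pattern is a placeholder):

DRY_RUN=true

for dir in old_logs_*/; do
  [ -d "$dir" ] || continue            # skip if the glob matched nothing
  if [ "$DRY_RUN" = true ]; then
    echo "Would remove: $dir"
  else
    rm -rv "$dir"
  fi
done

Flip DRY_RUN to false only after the echoed list matches your expectations.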

Copying and Moving Files

cp source.txt destination.txt
mv oldname.txt newname.txt

For directories:

cp -r source_dir/ backup_dir/

Adding the -v flag in scripts during development gives visibility into operations, reducing debugging headaches later. Pair this with cron scheduling to automate routine copy or move tasks.
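
As a sketch of how this looks in practice (the paths are placeholders), a small script can copy a configuration directory into a dated backup folder:

#!/bin/bash
SRC="/etc/myapp"                      # hypothetical source directory
DEST="/backups/myapp-$(date +%F)"     # e.g. /backups/myapp-2025-12-28

mkdir -p "$(dirname "$DEST")"
cp -rv "$SRC" "$DEST"

Scheduled from cron, an entry such as 0 2 * * * /home/user/scripts/backup_copy.sh (script path assumed) would run it nightly at 02:00.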

Everything Is a File

Linux treats almost everything as a file — but not in the sense of “document stored on disk.” A file is a generic interface for reading and writing data. Devices, processes, and system information can all be accessed and manipulated as streams of bytes.

Devices

Look in /dev:

ls /dev

Disks (/dev/sda), terminals (/dev/tty), and special sinks like /dev/null are all files. You can write to them:

echo "hello" > /dev/null

This discards the output but demonstrates the uniform interface. No special API is needed — your standard tools suffice.
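
Reading works the same way. Pulling a few random bytes from the kernel's random device, for example, uses the same everyday tools you would point at an ordinary file:

head -c 16 /dev/urandom | od -An -tx1

Neither head nor od knows or cares that the "file" is a device; the interface is identical.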

System State

The /proc filesystem exposes kernel and process information as files:

cat /proc/cpuinfo
cat /proc/meminfo

These files are generated dynamically. Reading them gives insight into the current system state using familiar commands.
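
Because these are plain text streams, the text-processing tools introduced below work on them directly. A quick sketch: pulling the available-memory figure out of /proc/meminfo with awk:

awk '/^MemAvailable:/ {print $2, $3}' /proc/meminfo

On a typical system this prints a value such as 16234567 kB, ready to feed into a monitoring script.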

Why It Matters

Once you see devices and system state as files, pipelines and text processing make sense. You can filter errors from logs, extract network information, or inspect processes without learning specialized APIs. Bash scripts become composable flows rather than rigid sequences of commands.

Reading Files and Streams

Displaying Content

cat file.txt
less file.txt

less allows safe scrolling of large files without flooding the terminal.

Reading Line by Line

while read -r line; do
  echo "$line"
done < file.txt

The -r prevents misinterpretation of backslashes, keeping scripts predictable when handling arbitrary text.
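
In real scripts the loop body usually does more than echo. A minimal sketch that reads hostnames from a file (hosts.txt is a placeholder) and checks whether each responds to ping:

while IFS= read -r host; do
  [ -n "$host" ] || continue   # skip blank lines
  # -c 1 sends a single packet; -W 2 waits at most 2 seconds (Linux iputils ping)
  if ping -c 1 -W 2 "$host" > /dev/null 2>&1; then
    echo "$host is up"
  else
    echo "$host is unreachable"
  fi
done < hosts.txt

Setting IFS= alongside -r also preserves leading and trailing whitespace in each line.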

Text Processing: grep, sed, and awk

These core utilities turn raw data into actionable information.

grep — Filtering

grep "error" logfile.txt
grep -i "error" logfile.txt
grep -r "ssh" /etc
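
A few more flags show up constantly in scripts:

grep -c "error" logfile.txt    # count matching lines instead of printing them
grep -n "error" logfile.txt    # prefix each match with its line number
grep -v "debug" logfile.txt    # invert the match: everything except "debug" lines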

sed — Stream Editing

Replace text on the fly:

sed 's/old/new/g' file.txt
sed -i 's/old/new/g' file.txt

To preview changes, run the command without -i first and inspect the output; once it looks right, add -i (or -i.bak to keep a backup copy) to edit the file in place.

awk — Structured Text

awk '{print $1}' file.txt
awk '{print $1, $3}' file.txt

Ideal for parsing structured logs, command outputs, or tabular data. Pairing awk with pipes creates compact, powerful processing chains.
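
As a sketch of such a chain (assuming GNU df's default column layout, with an arbitrary 80% threshold), the pipeline below reports filesystems that are running low on space:

df -h | awk 'NR > 1 && $5+0 > 80 {print $6 " is at " $5}'

awk skips the header line, converts the Use% column to a number, and prints the mount point for anything above the cutoff.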

Input, Output, and Redirection

Redirecting Output

command > output.txt      # overwrite output.txt with standard output
command >> output.txt     # append standard output to output.txt
command 2> error.log      # send standard error to error.log
command > all.log 2>&1    # send both stdout and stderr to all.log

Pipes

ps aux | grep root

Each command performs one task. Pipes turn them into flows. The uniform file interface makes this possible.
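
A slightly longer flow makes the point: counting how many processes each user is running, sorted from most to fewest:

ps aux | awk 'NR > 1 {print $1}' | sort | uniq -c | sort -rn

Four small tools, none of which knows about the others, combine into a report.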

Practical Example: Log Scanner

#!/bin/bash
LOG="/var/log/syslog"
grep -i "error" "$LOG" | awk '{print $1, $2, $3, $5}'

This extracts the timestamp fields and the reporting process for each error line (the field positions assume the traditional syslog timestamp format). Small, readable, and effective.

Practical Example: Log Rotation

Automated systems accumulate logs continuously. Rotating them prevents disk bloat and keeps historical data manageable. Here’s a real-world example: a Synchronet BBS log rotation script that maintains the last N rotated logs, optionally using copytruncate for logs that must stay open.

#!/bin/bash
#======================================================================
# Synchronet BBS Log Rotation Script
# Keeps last N rotated logs with numeric suffixes (.1, .2, ...)
# Supports copytruncate for selected logs.
#======================================================================
LOGDIR="/opt/sbbs/sbbs/logs"
LOGDAYS=5
# Logs to rotate with copytruncate instead of mv
COPYTRUNCATE_LOGS=("hack-ips.log")
#-------------------------
# Rotate a single log file
#-------------------------
rotate_log() {
    local log="$1"
    local base=$(basename "$log" .log)
    # Shift older logs up the numeric suffix ladder
    for ((i=LOGDAYS; i>=1; i--)); do
        if [ -f "$LOGDIR/$base.log.$i" ]; then
            if [ $i -eq $LOGDAYS ]; then
                # Remove oldest log
                rm -f "$LOGDIR/$base.log.$i"
            else
                # Move log up one level
                mv "$LOGDIR/$base.log.$i" "$LOGDIR/$base.log.$((i+1))"
            fi
        fi
    done
    # Rotate current log
    if [ -s "$log" ]; then
        if [[ " ${COPYTRUNCATE_LOGS[*]} " =~ " $base.log " ]]; then
            # Copy content and truncate original
            cp "$log" "$LOGDIR/$base.log.1"
            : > "$log"  # truncate the original file
            echo "CopyTruncated $log -> $base.log.1"
        else
            mv "$log" "$LOGDIR/$base.log.1"
            echo "Rotated $log -> $base.log.1"
        fi
    fi
}
#-------------------------
# Main Execution
#-------------------------
for log in "$LOGDIR"/*.log; do
    [ -e "$log" ] || continue
    rotate_log "$log"
done
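
To run the rotation automatically, a nightly cron entry works well; the script path and schedule below are assumptions to adapt to your own install:

0 3 * * * /opt/sbbs/scripts/rotate_logs.sh >> /var/log/sbbs-rotate.log 2>&1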

Combining Operations: Backup and Rotation

Beyond individual logs, complete automation often involves structured backups. This Synchronet combined backup script demonstrates creating backups, rotating previous archives, checking free disk space, and logging the entire process. Each step uses file handling commands you’ve learned, along with conditional checks to avoid accidental data loss. Many of the patterns here are inspired by the Bash automation techniques from last month’s series, showing how concepts scale from learning scripts to full system orchestration.

[...full backup script as before...]

Safety and Sanity

Automation scales intent — good or bad.

Before acting on files:

  • Test with echo or in a sandbox
  • Avoid running destructive commands as root
  • Validate assumptions before overwriting or deleting

A single misstep can propagate across the system if scripts are untested.
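
A guard as simple as the sketch below (the variable and path are placeholders) catches the classic empty-variable mistake, where an unset path turns a targeted delete into something far broader:

TARGET_DIR="/srv/backups/old"    # hypothetical cleanup target

if [ -z "$TARGET_DIR" ] || [ ! -d "$TARGET_DIR" ]; then
  echo "Refusing to delete: TARGET_DIR is empty or not a directory" >&2
  exit 1
fi

rm -rv "$TARGET_DIR"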

From Files to Flow

File handling in Bash is both the muscle and the lens of automation. Once everything is understood as a stream, redirection and pipelines become predictable levers. Scripts evolve from sequences into orchestrations, combining reading, transforming, and writing into flows that react to changing system state.

The “everything is a file” principle is the key insight: a consistent interface for devices, processes, and data. Once internalized, you’ll approach new automation challenges with a clear mental model rather than guessing.

Summary

This week’s exploration demonstrates how file operations form the foundation of effective automation. Mastering navigation, creation, and removal of files and directories gives scripts the reliability and predictability necessary for homelab maintenance. Without these basics, even the most elegant logic and loops cannot operate safely.

Understanding that devices, processes, and system state behave like files allows reuse of familiar commands across the system. Once internalized, the “everything is a file” principle makes scripts composable, readable, and versatile, turning Linux into a coherent environment rather than a collection of isolated commands.

Tools such as grep, sed, and awk transform raw text into structured information, enabling scripts to make decisions and react dynamically. Combined with pipes and redirection, these utilities create concise flows that allow disparate parts of the system to communicate seamlessly, amplifying the power of small, predictable actions.

Safe scripting practices remain essential. Testing operations, validating assumptions, and avoiding destructive commands prevent small errors from escalating. With these skills, your Bash scripts evolve from simple command sequences into a capable orchestration layer, ready to integrate scheduling, monitoring, and automation in the homelab.
