Troubleshooting a Crontab Job

In yesterday’s post I created a crontab that saves a backup copy of my Obsidian vault to my server every night. This morning, however, I was met with this:

Can you spot the issue?

The column between the username and the date is the size of the file in bytes. Pretty odd that yesterday's notes would be ~5 MB while today's are only 45 bytes…

Notice also that the owner of today’s file is root, not claytone.

So, looks like we have some work to do.

Solving the ownership issue is simple. When I created the crontab yesterday, I used sudo crontab -e as instructed by Stack Overflow, which creates a crontab for root rather than for my user. The fix was easy enough: I ran sudo crontab -e again and commented out yesterday's line, then set up a new job with crontab -e (no sudo) so that claytone would be creating all of the backups.
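In crontab form, the change amounts to something like this (the job line itself is the one from yesterday's post):

# In root's crontab (sudo crontab -e), disable the old entry:
# 0 3 * * * /home/claytone/backups/backup.sh claytones-brain

# In claytone's crontab (crontab -e), add the same entry back, no sudo this time:
0 3 * * * /home/claytone/backups/backup.sh claytones-brain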

Troubleshooting the rest was a bit trickier.

When cron jobs run, their output is mailed to the owning user's mailbox on the server itself. However, I didn't even have a mail server set up, so the output was being discarded. I discovered this using cat /var/log/syslog | grep CRON | less.

Oddly enough, I couldn’t even redirect the output of the cron job to a log file. It had to go through mail.

I did some Googling and installed postfix as a mail transfer agent and mailutils as a mail reader. I was then able to view my cron job output by checking my inbox with the mail command.
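For the record, assuming an apt-based distro like Ubuntu or Debian, that boils down to something like:

# Assuming an apt-based system; postfix asks a few setup questions during install
sudo apt install postfix mailutils

# Cron output now lands in the local mailbox; read it with:
mail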

Aha! When I moved the gdrive binary to /usr/local/bin, I didn't realize that cron jobs run with a minimal PATH and wouldn't be able to find it there. I updated the script to use the full path to the gdrive command:

#!/bin/bash
set -x

if [ "$#" -ne 1 ]; then
    echo "Invalid args. Please pass a file name to download."
    exit 1
fi

# Cron runs with a minimal PATH, so spell out where gdrive lives
gdrive_loc='/usr/local/bin'
file_name=$1
file_loc='/home/claytone/backups/'$file_name
sed_expression='s/\(.*\)\s.*'$file_name'.*/\1/'
# Get file ID from searching gdrive
# TODO: if file not found, quit
file_id=$($gdrive_loc/gdrive list --query "name contains '$file_name'" | grep "$file_name" | sed -e "$sed_expression" )

# Download and compress file
$gdrive_loc/gdrive download --path $file_loc --recursive -f $file_id
tar -czf $file_loc/$file_name-$(date +'%Y-%m-%d').tar.gz $file_loc/$file_name

# Remove non-compressed version
rm -r $file_loc/$file_name

# If there are more than 10 days of history, remove the oldest file in the dir
if [ "$(ls -1 $file_loc | wc -l)" -gt 10 ]; then
    oldest_file="$file_loc/$(ls -t $file_loc | tail -1)"
    echo "Removing $oldest_file"
    rm -r $oldest_file
fi

This time I tested the job by setting it to run a couple of minutes after I saved the crontab. And presto, now it appears to work!
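If you want to try the same trick, a throwaway entry a couple of minutes in the future does the job (the times here are just an example):

# Temporary test entry: if it's 14:32 now, schedule the job for 14:35,
# watch it run, then change the schedule back to the real time
35 14 * * * /home/claytone/backups/backup.sh claytones-brain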

Happy debugging, y’all.

Quick Backup Automation With Bash

The Problem

I recently started using Obsidian to bring some order to my large collection of poorly organized notes. Since I use multiple devices, I need to keep my vault in sync at all times. While you can pay for Obsidian's proprietary cloud sync service, I've opted to create my own.

I wanted to create a backup system on Metis (my home server) in case I ever overwrite my Obsidian vault on Google Drive. This kind of thing has happened to me before: an encrypted password vault once dropped a bunch of keys (rather than merging changes) when I made an edit from a different device.

This will be a one-way sync that pulls the vault down from Google Drive every night and archives it, leaving a copy that can't be overwritten by the sync service.

I also want it to be generalized, so that I can make continuous backups of other files that frequently change on my Google Drive.

Setup

The first step is installing the gdrive utility, which lets me interact with Google Drive from the command line.

I grabbed the gdrive_2.1.1_linux_386.tar.gz archive from the project's Releases page, unpacked it, and moved the binary to /usr/local/bin so it's available everywhere. Don't forget to reload your shell session afterwards.

claytone@metis:~$ wget https://github.com/prasmussen/gdrive/releases/download/2.1.1/gdrive_2.1.1_linux_386.tar.gz
claytone@metis:~$ tar -xvf gdrive_2.1.1_linux_386.tar.gz
claytone@metis:~$ sudo mv gdrive /usr/local/bin
claytone@metis:~$ source ~/.bashrc
claytone@metis:~$ gdrive list

Scripting

Setting up a bash script to pull the file from Drive is pretty easy.

However, I don’t want to keep tons of unnecessary copies over time. Ideally if I really mess up the sync, I should notice within 10 days. So, I’ll just drop the oldest file every day after the first 10 days.

Note that there is a bug in this script! Check out this post for the corrected version!

#!/bin/bash
set -x

if [ "$#" -ne 1 ]; then
    echo "Invalid args. Please pass a file name to download."
    exit 1
fi

file_name=$1
file_loc='/home/claytone/backups/'$file_name
sed_expression='s/\(.*\)\s.*'$file_name'.*/\1/'
# Get file ID from searching gdrive
# TODO: if file not found, quit
file_id=$(gdrive list --query "name contains '$file_name'" | grep "$file_name" | sed -e "$sed_expression" )

# Download and compress file
gdrive download --path $file_loc --recursive -f $file_id
tar -czf $file_loc/$file_name-$(date +'%Y-%m-%d').tar.gz $file_loc/$file_name

# Remove non-compressed version
rm -r $file_loc/$file_name

# If there are more than 10 days of history, remove the oldest file in the dir
if [ "$(ls -1 $file_loc | wc -l)" -gt 10 ]; then
    oldest_file="$file_loc/$(ls -t $file_loc | tail -1)"
    echo "Removing $oldest_file"
    rm -r $oldest_file
fi
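
To sanity-check it, I can run the script by hand before handing it over to cron (the path and vault name below match my setup; yours will differ):

# Make the script executable and run it once manually
chmod +x /home/claytone/backups/backup.sh
/home/claytone/backups/backup.sh claytones-brain

# If it works, the compressed copy ends up at:
#   /home/claytone/backups/claytones-brain/claytones-brain-YYYY-MM-DD.tar.gz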

It appears to work!

Automation

A crontab job can be set to repeat at a certain time. In my case, I want the backups to trigger at 3 AM every morning. Run

sudo crontab -e 

Note that using sudo will cause files created by your job to be owned by root! Don’t use sudo if you don’t want that!

Then add:

0 3 * * * /home/claytone/backups/backup.sh claytones-brain

This will invoke the executable backup.sh script at 3 AM every day.
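
For reference, the five time fields break down like this:

# minute  hour  day-of-month  month  day-of-week  command
  0       3     *             *      *            /home/claytone/backups/backup.sh claytones-brain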

Future work

  • Error checking/handling
  • What if a query returns multiple file names?
  • What if a query returns nothing? (sketched below)
  • Help page
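
For the empty-query case, the fix will probably be a few lines dropped in right after the file_id lookup; a rough, untested sketch:

# Bail out early if the gdrive query didn't match anything
if [ -z "$file_id" ]; then
    echo "No file matching '$file_name' found on Google Drive." >&2
    exit 1
fi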