Monitoring Directory Changes with Git to Build Lume


Hello, it's Munou.
Quite some time ago, I used inotifywait to rebuild LumeCMS, a static site generator (SSG), only when there were changes.
That setup built the site automatically, but it was quite unstable, which bothered me: it also fired when vim created swap files, the process would die unexpectedly, and builds ran when they didn't need to. So I thought, "If I manage it with git, I can run the build flow only when there are changes, and I can revert if something goes wrong, which would be good." With a rough shell script in mind, I decided to give it a try.

Something like this

Here's the code up front, to keep things short.
It's also published on GitHub.

#!/bin/bash

LUME_DIR="/your/lume/dir"
SRC_DIR="$LUME_DIR/src"
BUILD_DIR="site"
WEBPSH="/your/webp/convert/path"
COMMIT_COMMENT="$(echo "Memory" && free -h | head -2 | awk '{print $(NF-5)"," $(NF-4)"," $(NF-3)}' | column -t -s ",")"

export DENO_INSTALL="/home/$USER/.deno"
export PATH="$DENO_INSTALL/bin:$PATH"

cd "$SRC_DIR" || exit
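# Initialize the repository only if it doesn't exist yet (a test avoids listing .git on every run)
[ -d "$SRC_DIR/.git" ] || git init || exit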
git add . || exit

# git commit exits non-zero when there is nothing to commit, so build only on success
if git commit -m "$COMMIT_COMMENT"; then
  "$WEBPSH"
  cd "$LUME_DIR" || exit
  # deno task lume --dest=$BUILD_DIR
  deno task lume --dest="$BUILD_DIR" > /dev/null 2>&1
else
  exit 1
fi

Anyway, I wanted to implement it with the shortest possible code, so this is what I came up with.
Well, the COMMIT_COMMENT is a bit messy, but it's just for fun...

Initially, I was recording the time with date, but then I realized git log already shows that, so it isn't really necessary.
(Maybe I could even put some random ASCII art in there...)

The script shown above is executed every 5 minutes via cron.
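
For reference, a crontab entry for a 5-minute schedule would look roughly like this. The script path is a placeholder made up for illustration, so substitute wherever the script is actually saved.

```text
# min  hour  day-of-month  month  day-of-week  command
*/5    *     *             *      *            /path/to/lume-git-build.sh
```

It can be registered with crontab -e for the user that owns the Lume directory.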

Examining Git's Return Values

First, it turns out that the return value of git add . doesn't tell you whether anything changed. For example, here is the return value when git init hasn't been run yet:

alleycat:[haturatu]:~/git/gittest$ ls -la
合計 8
drwxr-xr-x  2 haturatu haturatu 4096 10月 14 01:08 .
drwxr-xr-x 72 haturatu haturatu 4096 10月 14 01:08 ..
alleycat:[haturatu]:~/git/gittest$ git add .
fatal: not a git repository (or any of the parent directories): .git
alleycat:[haturatu]:~/git/gittest$ echo $?
128

It's 128.
Now let's run git add . after git init, both before and after creating a file.

alleycat:[haturatu]:~/git/gittest$ ls -la
合計 12
drwxr-xr-x  3 haturatu haturatu 4096 10月 14 01:11 .
drwxr-xr-x 72 haturatu haturatu 4096 10月 14 01:08 ..
drwxr-xr-x  7 haturatu haturatu 4096 10月 14 01:11 .git
alleycat:[haturatu]:~/git/gittest$ git add .
alleycat:[haturatu]:~/git/gittest$ echo $?
0
alleycat:[haturatu]:~/git/gittest$ touch test
alleycat:[haturatu]:~/git/gittest$ ls -la
合計 12
drwxr-xr-x  3 haturatu haturatu 4096 10月 14 01:11 .
drwxr-xr-x 72 haturatu haturatu 4096 10月 14 01:08 ..
drwxr-xr-x  7 haturatu haturatu 4096 10月 14 01:11 .git
-rw-r--r--  1 haturatu haturatu    0 10月 14 01:11 test
alleycat:[haturatu]:~/git/gittest$ git add .
alleycat:[haturatu]:~/git/gittest$ echo $?
0

So git add . returns 0 whether or not anything changed, which means it can't be used to detect changes.
Let's look at the return value when committing instead.

alleycat:[haturatu]:~/git/gittest$ ls -la
合計 12
drwxr-xr-x  3 haturatu haturatu 4096 10月 14 01:11 .
drwxr-xr-x 72 haturatu haturatu 4096 10月 14 01:08 ..
drwxr-xr-x  7 haturatu haturatu 4096 10月 14 01:11 .git
-rw-r--r--  1 haturatu haturatu    0 10月 14 01:11 test
alleycat:[haturatu]:~/git/gittest$ git commit -m "nyaa"
[master (root-commit) 60bed93] nyaa
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 test
alleycat:[haturatu]:~/git/gittest$ echo $?
0
alleycat:[haturatu]:~/git/gittest$ git commit -m "nyan nyan"
On branch master
nothing to commit, working tree clean
alleycat:[haturatu]:~/git/gittest$ echo $?
1

It seems that 1 is returned when there is nothing to commit.
So we can branch on this return value, and the rest can be handled as needed.
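
The idea above can be wrapped in a small function. This is only a minimal sketch: the name commit_if_changed, the commit message, and the user.name / user.email values are all made up for illustration, and the identity options are there only so it runs on a machine without a global Git config.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script).
# Stages everything under "$1" and tries to commit it.
# Exit status: 0 = a new commit was created,
#              non-zero = nothing to commit, or git failed.
commit_if_changed() {
  local dir="$1"
  # Create the repository on first use.
  [ -d "$dir/.git" ] || git -C "$dir" init -q || return 2
  git -C "$dir" add . || return 2
  # Placeholder identity so the commit works without a global Git config.
  git -C "$dir" -c user.name="watcher" -c user.email="watcher@example.invalid" \
    commit -q -m "auto commit" > /dev/null 2>&1
}

# Usage (path is a placeholder):
#   if commit_if_changed "/your/lume/dir/src"; then
#     echo "changed: run the build"
#   else
#     echo "no changes"
#   fi
```

Note that git commit also exits non-zero for other failures, so a non-zero status here means "don't build" rather than strictly "no changes".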

Surprisingly Deep Change Monitoring

So, this method of change management seems quite useful for other purposes as well.
However, what I've been exploring is how to easily and lightly monitor directory changes recursively, which seems simple but isn't.
find is a very powerful command, so running it just for change monitoring feels like overkill (I wonder if Git runs something like it internally?).
If you have to write your own change monitoring code, you also need to include logic for excluded directories. In this regard, Git's change monitoring is superior because you can effectively exclude unnecessary directories from monitoring by listing them in .gitignore.

In this respect, Git is excellent.
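
As an illustration, ignore rules along these lines would keep vim swap files out of the commits, so they no longer trigger a build. The function name, the patterns, and the _cache/ directory are assumptions chosen for the example, not taken from the actual repository.

```shell
#!/bin/bash
# Hypothetical helper: write a starter .gitignore into the watched directory "$1".
write_ignore_rules() {
  cat > "$1/.gitignore" <<'EOF'
# Vim swap files
*.swp
*.swo
# Hypothetical cache directory that should not trigger a build
_cache/
EOF
}

# Usage (path is a placeholder):
#   write_ignore_rules "/your/lume/dir/src"
```

After that, git add . skips anything matching those patterns, so a commit is only created for files that matter.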
However, Git keeps a copy of every committed version under .git, so it might not be suitable for monitoring directories that contain very large files. In that case, one could hash each file, save the hashes to a text file in /tmp, diff that against the previous run, and start the change flow when diff's return value indicates a difference.
Something like that might be possible.
However, if I were to add code to exclude files, sort before diffing, and so on, it would likely become quite complex, so I'd rather not do it.
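
For reference, the bare hash-and-diff idea might look something like the sketch below. It assumes GNU find, sort, xargs, and sha256sum; the function name and the state file location are made up for illustration, and the only exclusion it handles is .git.

```shell
#!/bin/bash
# Hypothetical helper: compare the files under "$1" with the snapshot in "$2".
# Exit status: 0 = changed (snapshot refreshed), 1 = unchanged, 2 = setup error.
hash_snapshot_changed() {
  local dir="$1" state="$2" new
  new="$(mktemp)" || return 2
  # Hash every regular file except those under .git; sort for a stable order.
  ( cd "$dir" && find . -path ./.git -prune -o -type f -print0 \
      | sort -z | xargs -0 -r sha256sum ) > "$new"
  # diff -q exits 0 when both hash lists are identical.
  if [ -f "$state" ] && diff -q "$state" "$new" > /dev/null; then
    rm -f "$new"
    return 1
  fi
  mv "$new" "$state"
  return 0
}

# Usage (paths are placeholders):
#   hash_snapshot_changed "/your/lume/dir/src" "/tmp/lume-src.sha256" && echo "changed"
```

The first run always reports a change because no snapshot exists yet, and the state file has to live outside the watched directory.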

A thought while writing...

Instead of running find, I could just output ls -laR to /tmp and take the diff.
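
A rough sketch of that listing-based variant follows. The function name and state file are hypothetical, and it uses ls -lAR rather than ls -laR as an assumption on my part: with -a, the ".." entry of the top directory carries the parent's timestamp, which changes whenever anything next to the watched directory changes.

```shell
#!/bin/bash
# Hypothetical helper: compare a recursive listing of "$1" with the one saved in "$2".
# Exit status: 0 = changed (listing refreshed), 1 = unchanged, 2 = setup error.
listing_changed() {
  local dir="$1" state="$2" new
  new="$(mktemp)" || return 2
  # -A leaves out "." and "..", whose timestamps would add noise.
  ( cd "$dir" && ls -lAR . ) > "$new" 2>/dev/null
  if [ -f "$state" ] && diff -q "$state" "$new" > /dev/null; then
    rm -f "$new"
    return 1
  fi
  mv "$new" "$state"
  return 0
}

# Usage (paths are placeholders):
#   listing_changed "/your/lume/dir/src" "/tmp/lume-src.listing" && echo "changed"
```

Since the listing only contains names, sizes, and modification times, an edit that keeps the same size within the same displayed minute would go unnoticed, which is one thing hashing the contents avoids.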
But why does Git feel lighter...? If I start thinking about that, I'll have to start dissecting Git, so I'll stop here for today.

Until next time, thank you.
