jasonwryan.com

Miscellaneous ephemera…

Playing with overlayfs

Around this time last year, I posted about setting up a udev rule to run a script when I plugged my USB drive containing all of my music into one of my laptops; the script, a couple of lines of Bash, removes all pre-existing symlinks in $HOME/Music and repopulates the directory with an updated set. Almost. The one flaw, an irritant of variable intensity depending on what I felt like listening to at any given time, is that the symlinks aren’t written for directories that already exist on the target filesystem.

In order that I am able to play some music if I forget the USB drive, each of the laptops has a subset of albums on it, depending on the size of their respective hard drives. If I add a new album to the USB drive, then that change won’t get written to either of the laptops when the drive is plugged in. Not entirely satisfactory. I had tinkered around with globbing, or with having find(1) scan deeper into the tree, or even a loop to check for the presence of directories in an array…

It just got too hard. My rudimentary scripting skills and the spectre of recursion, I am sorry to admit, conspired to undermine my resolve. So, rather than concede unconditional surrender, I asked for help. As is almost always the case in these situations, this proved to be a particularly wise move; the response I received was neither what I expected, nor was it anything I was even remotely familiar with: so in addition to an excellent solution (one far better suited to what I was trying to achieve), I learned something new.

The first comment on my question proved singularly insightful.

Care to use union mounts, for example via overlayfs?

A union mount, something I was until now blissfully unaware of, is, according to Wikipedia,

a mount that allows several filesystems to be mounted at one time, appearing to be one filesystem.

https://en.wikipedia.org/wiki/Union_filesystem

Union mounting has a long and storied history on Unix, beginning in 1993 with the Inheriting File System (IFS). The genealogy of these mounts has been well covered in this 2010 LWN article by Valerie Aurora. However, it is only in the current kernel, 3.18, that a union mount has been accepted into the kernel tree.

After reading the documentation for overlayfs, it seemed this was exactly what I was looking for. Essentially, an overlay mount would allow me to “merge” the underlying tree (the Music directory on the USB drive) with an “upper” one, $HOME/Music on the laptop, completely seamlessly.

Then whenever a lookup is requested in such a merged directory, the lookup is performed in each actual directory and the combined result is cached in the dentry belonging to the overlay filesystem.

Kernel docs

It was then just a matter of adapting my script to use overlayfs, which was trivial:

#!/usr/bin/env bash
# union mount Music when Apollo plugged in

low=/media/Apollo/Music
upp=/home/jason/Music
wod=/home/jason/.local/tmp
export DISPLAY=:0
export XAUTHORITY=/home/jason/.Xauthority

# overlayfs mount
mount -t overlay -o lowerdir="$low",upperdir="$upp",workdir="$wod" overlay "$upp"
status1=$?

mpc update &>/dev/null
status2=$?

if [[ "$status1" -eq 0 && "$status2" -eq 0 ]]; then
    printf "^fg(#BF85CC)%s\n" "Music directory updated" | dzen2 -p 3
fi

And now, when I plug in the USB drive, the contents of the drive are merged with my local music directory, and I can access whichever album I feel inclined to listen to. I can also copy files across to the local machines, knowing if I update the portable drive, it will no longer mean I have to forego listening to any newer additions by that artist in the future (without manually intervening, anyway).

Overall, this is a lightweight union mount. There is neither a lot of functionality, nor complexity. As the commit note makes clear, this “simplifies the implementation and allows native performance in these cases.” Just note the warning about attempting to write to a mounted underlying filesystem, where the behaviour is described as “undefined”.
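One loose end the script leaves open is tearing the overlay down again when the drive is unplugged. A hypothetical companion udev rule might look something like the following; the UUID, rule filename and script path are placeholders, not taken from the original post:

```
# /etc/udev/rules.d/99-apollo.rules -- hypothetical example; substitute
# your drive's filesystem UUID and the path to the mount script
ACTION=="add", SUBSYSTEM=="block", ENV{ID_FS_UUID}=="XXXX-XXXX", RUN+="/usr/local/bin/apollo-mount"
ACTION=="remove", SUBSYSTEM=="block", ENV{ID_FS_UUID}=="XXXX-XXXX", RUN+="/bin/umount -l /home/jason/Music"
```

A lazy unmount (-l) is used on removal because the block device itself has already gone away by the time the rule fires.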

Notes

Creative Commons image, multilayered jello by Frank Farm.

Pruning Tarsnap Archives

I started using Tarsnap about three years ago and I have been nothing but impressed with it since. It is simple to use, extremely cost effective and, more than once, it has saved me from myself; making it easy to retrieve copies of files that I have inadvertently overwritten or otherwise done stupid things with1. When I first posted about it, I included a simple wrapper script, which has held up pretty well over that time.
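The wrapper itself is in that earlier post; the detail that matters for what follows is the archive naming scheme, host-ddmmyy_hh:mm, which the pruning script relies on for sorting. Below is a minimal sketch of how such a wrapper might construct the name; the function name is mine, and the tarsnap call is guarded so the sketch is harmless on a machine without tarsnap installed:

```shell
#!/usr/bin/env bash
# Build an archive name in the host-ddmmyy_hh:mm format that the
# pruning script later sorts on.
make_archive_name() {
  printf '%s-%s' "$(uname -n)" "$(date +%d%m%y_%H:%M)"
}

name=$(make_archive_name)

# Only run the backup when tarsnap is actually available
if command -v tarsnap >/dev/null 2>&1; then
  tarsnap -c -f "$name" "$HOME"
fi

printf '%s\n' "$name"
```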

What became apparent over the last couple of months, as I began to consciously make more regular backups, was that pruning the archives was a relatively tedious business. Because Tarsnap de-duplicates data, there is not much mileage in keeping older archives around: if you do have to retrieve a file, you don’t want to search through a large number of archives to find it. So there is a balance between making use of Tarsnap’s efficient de-duplication and not creating a rod for your own back when your use case is occasionally retrieving single files, or small groups of them, rather than large dumps.

I have settled on keeping five to seven archives, depending on the frequency of my backups, which is somewhere around two to three times a week. Pruning these archives was becoming tedious, so I wrote a simple script to make it less onerous. Essentially, it writes a list of all the archives to a tmpfile, runs sort(1) to order them from oldest to newest, and then deletes the oldest minus whatever the number to keep is set to.
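Getting sort(1) to order names of the form host-ddmmyy_hh:mm chronologically is the only fiddly part, as the year, month and day columns have to be compared in that order. The keys can be checked in isolation; the sample names below are invented, and the character offsets assume a fixed-length host prefix:

```shell
#!/usr/bin/env bash
# Sort archive names of the form host-ddmmyy_hh:mm from oldest to
# newest. With a five-character "host-" prefix the day occupies
# characters 6-7, the month 8-9 and the year 10-11.
names=$'host-010215_10:00\nhost-250114_09:30\nhost-150115_12:00'

sorted=$(sort -k 1.10,1.11 -k 1.8,1.9 -k 1.6,1.7 <<< "$names")

printf '%s\n' "$sorted"
```

which prints the 2014 archive first and the February 2015 one last.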

The bulk of the code is simple enough:

snapclean
# generate list
tarsnap --list-archives > "$tmpfile"

# sort by ascending date; name format is: host-ddmmyy_hh:mm
{
  rm "$tmpfile" && sort -k 1.10,1.11 -k 1.8,1.9 -k 1.6,1.7 > "$tmpfile"
} < "$tmpfile"

# populate the list
mapfile -t archives < "$tmpfile"

# print the full list
printf "%s\n%s\n" "${cyn}Current archives${end}:" "${archives[@]#*-}"

# identify oldest archives; guard against keep exceeding the total
remove=$(( ${#archives[@]} - keep ))
(( remove > 0 )) || remove=0
mapfile -t targets < <(head -n "$remove" "$tmpfile")

# if there is at least one to remove
if (( ${#targets[@]} >= 1 )); then
  printf "%s\n" "${red}Archives to delete${end}:"
  printf "%s\n" "${targets[@]#*-}"

  read -rp "Proceed with deletion? [${red}Y${end}/N] " YN

  if [[ $YN == Y ]]; then
    for archive in "${targets[@]}"; do
      tarsnap -d --no-print-stats -f "$archive"
    done && printf "%s\n" "${yel}Archives successfully deleted...${end}"

    printf "\n%s\n" "${cyn}Remaining archives:${end}"
    tarsnap --list-archives
  else
    printf "%s\n" "${yel}Operation aborted${end}"
  fi
else
  printf "%s\n" "Nothing to do"
  exit 0
fi

You can see the rest of the script in my bitbucket repo. It even comes with colour.

Once every couple of weeks, I run the script, review the archives marked for deletion and then blow them away. Easy. If you aren’t using Tarsnap, you really should check it out; it is an excellent service and—for the almost ridiculously small investment—you get rock solid, encrypted peace of mind. Why would you not do that?

Coda

This is the one hundredth post on this blog: a milestone that I never envisaged getting anywhere near. Looking back through the posts, nearly 60,000 words worth, there are a couple there that continue to draw traffic and are obviously seen at some level as helpful. There are also quite a few that qualify as “filler”, but blogging is a discipline like any other and sometimes you just have to push something up to keep the rhythm going. In any event, this is a roundabout way of saying that, for a variety of reasons both personal and professional, I am no longer able to fulfil my own expectations of regularly pushing these posts out.

I will endeavour to, from time to time when I find something that I genuinely think is worth sharing, make an effort to write about it, but I can’t see that happening all that often. I’d like to thank all the people that have read these posts; especially those of you that have commented. With every post, I always looked forward to people telling me where I got something wrong or how I could have approached a problem differently or more effectively2; I learned a lot from these pointers and I am grateful to the people that were generous enough to share them.

Notes

  1. The frequency with which this happens is, admittedly, low; but not low enough to confidently abandon a service like this…
  2. Leaving a complimentary note is just as welcome, don’t get me wrong…

Multi-arch Packages in AUR

One of the easiest ways to contribute to Arch is to maintain a package, or packages, in the AUR; the repository of user contributed PKGBUILDs that extends the range of packages available for Arch by some magnitude. Given that PKGBUILDs are just shell scripts, the barrier to entry is relatively low, and investing the small amount of effort required to clear that barrier will not only give you a much better understanding of how packaging works in Arch, but will scratch your own itch for a particular package and hopefully assuage someone else’s similar desire at the same time.

Now that I have a Raspberry Pi1, I am naturally much more interested in packages that can be built for the ARMv6 architecture; especially those that are available in the AUR. It is worth a brief digression to note that Arch Linux ARM is an entirely separate distribution and, while they share features with Arch, support for each is restricted to their respective communities. It is with this consideration in mind that I had begun to think about multi-arch support in PKGBUILDs, particularly in the packages that I maintain in the AUR.

I have previously posted about using Syncthing across my network, including on a Pi as one of the nodes. As the Syncthing developer pushes out a release at least weekly, I have been maintaining my own PKGBUILD and, after Syncthing was pulled into [community], I uploaded it to the AUR as syncthing-bin.

Syncthing is a cross-platform application, so it runs on a wide range of architectures, including ARM (both v6 and v7). Initially, when I wrote the PKGBUILD, I would run updpkgsums on my x86_64 machine, build the package and then, on the Pi, have to regenerate the integrity checks. This was manageable enough for my own use across two architectures, but wasn’t really going to work for people using other architectures (especially if they are using AUR helpers).

Naturally enough, this started me thinking about how I could more effectively manage the process of updating the PKGBUILD for each new release, and have it work across the four architectures—without having to manually copy and paste, or anything similarly tedious. Managing multiple architectures in the PKGBUILD itself is not particularly problematic; a case statement is sufficient:

PKGBUILD
case "$CARCH" in
    armv6h) _pkgarch="armv6"
            sha1sums+=('a94e5d00cec32956eb27bc12dbbc4964b68913f9')
           ;;
    armv7h) _pkgarch="armv7"
            sha1sums+=('9b782abf95668a906bfe76ad5ceb4cda17ec2289')
           ;;
    i686) _pkgarch="386"
          sha1sums+=('b2e1961594a931201799246f5cf61cb1e1700ff9')
           ;;
    x86_64) _pkgarch="amd64"
            sha1sums+=('035730c09ca5383c90fdd9898baf66b90acdef24')
           ;;
esac

The real challenge, for me, was to be able to script the replacement of each of the respective sha1sums, and then to update the PKGBUILD with the new arrays. Each release of Syncthing is accompanied by a text file containing all of the sha1sums, each on its own line in a conveniently ordered format, like so:

sha1sums.txt.asc
b2e1961594a931201799246f5cf61cb1e1700ff9    syncthing-linux-386-v0.9.16.tar.gz
035730c09ca5383c90fdd9898baf66b90acdef24    syncthing-linux-amd64-v0.9.16.tar.gz
d743b64204f0ac7884e4b42d9b1865b2436f5ecb    syncthing-linux-armv5-v0.9.16.tar.gz

This seemed a perfect job for Awk, or more particularly, gawk’s switch statement, and an admittedly rather convoluted printf incantation.

{
    switch ($2) {
      case /armv6/:
        arm6 = $1
        break
      case /armv7/:
        arm7 = $1
        break
      case /linux-386/:
        i386 = $1
        break
      case /linux-amd64/:
        x86 = $1
        break
      }
  }
END {
  printf "case \"$CARCH\" in\n\t"\
         "armv6h) _pkgarch=\"armv6\"\n\t\tsha1sums+=(\047%s\047)\n\t\t;;\n\t"\
         "armv7h) _pkgarch=\"armv7\"\n\t\tsha1sums+=(\047%s\047)\n\t\t;;\n\t"\
         "i686) _pkgarch=\"386\"\n\t\tsha1sums+=(\047%s\047)\n\t\t;;\n\t"\
         "x86_64) _pkgarch=\"amd64\"\n\t\tsha1sums+=(\047%s\047)\n\t\t;;\n"\
         "esac\n",
         arm6, arm7, i386, x86
}

The remaining step was to update the PKGBUILD with the new sha1sums. Fortunately, Dave Reisner had already written the code for this in his updpkgsums utility; I had only to adapt it slightly:

excerpt from updpkgsums
{
  rm "$buildfile"
  exec awk -v newsums="$newsums" '
    /^case/,/^esac$/ {
      if (!w) { print newsums; w++ }
      next
    }
    1
    END { if (!w) print newsums }
  ' > "$buildfile"
} < "$buildfile"
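The load-bearing idiom here is the range pattern: /^case/,/^esac$/ matches every line of the old block, next swallows them, and the replacement is printed once, on the first matching line; the trailing 1 prints every line outside the range unchanged. It can be exercised on a toy input (the fragment and checksums below are invented for illustration):

```shell
#!/usr/bin/env bash
# Exercise the range-pattern replacement on a toy PKGBUILD fragment;
# OLDSUM/NEWSUM and the surrounding lines are invented.
newsums='case "$CARCH" in
    x86_64) sha1sums+=('\''NEWSUM'\'');;
esac'

input='pkgname=demo
case "$CARCH" in
    x86_64) sha1sums+=('\''OLDSUM'\'');;
esac
pkgver=1.0'

result=$(awk -v newsums="$newsums" '
  /^case/,/^esac$/ {
    if (!w) { print newsums; w++ }
    next
  }
  1
' <<< "$input")

printf '%s\n' "$result"
```

The old case block is replaced wholesale with the new one, while the lines either side of it pass through untouched.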

Combining these two tasks means that I have a script that, when run, will download the current Syncthing release’s sha1sums.txt.asc file, extract the relevant sums into the replacement case statement and then write it into the PKGBUILD. I can then run makepkg -ci && mkaurball, upload the new tarball to the AUR and the two other people that are using the PKGBUILD can download it and not have to generate new sums before installing their shiny, new version of Syncthing. You can see the full version of the script in my bitbucket repo.

Notes

  1. See my other posts about the Pi

Creative Commons image of the Mosque at Agra, by yours truly.