After I stopped using a cloud disk and transferred the data to my own devices, the next step is to find a solution for storing the change history of text files. Other file formats aren't considered, since I don't edit them.
A common solution you might think of for such a purpose is tracking the files with Git. It certainly works, but having to deal with Git's database leads to automating many kinds of operations that aren't related to the file changes themselves, and I would like to work with files only. Still, I checked whether there are FUSE-based Git file systems and found a couple of them, RepoFS and gitfs, which track file changes automatically. However, the Git database, branches and a FUSE layer make the solution more complicated than I really need, given that I only want a write-ahead history of changes. An alternative I actually planned to look at is LVM snapshots, since I already have experience creating them and see good potential for reaching the goal with them.
The idea of creating snapshots came to mind because LVM ships with Ubuntu and my laptop disk has been managed by it since the OS installation. Should I jump to snapshot automation right away? No, it's worth spending some time looking for other appropriate tools that could bring a better result. I've heard that ZFS and BTRFS have a snapshot feature as well, so they are two more candidates to consider for storing the history.
The tools are determined; these are the basic criteria for choosing the most suitable one:
Targeting a specific directory to snapshot.
How many snapshots can be made without issues under different automation scenarios; around 360 snapshots a year would be satisfying.
Disk space usage of the snapshots.
File events that can be tracked.
LVM is preinstalled in Ubuntu.
Basically, ZFS has to be compiled from source; installing it from the jonathonf/zfs PPA instead is a faster way.
$ sudo add-apt-repository ppa:jonathonf/zfs
$ sudo apt install zfsutils-linux
BTRFS is developed by the Linux community and is included in the standard Ubuntu package repository.
$ apt install btrfs-progs
First of all, a dedicated logical volume on an HDD is created for each file system and then mounted.
$ pvcreate /dev/mapper/storagebox
$ vgcreate vgstoragebox /dev/mapper/storagebox
ZFS, the monstrous FS that took the longest time to set up of the three.
$ lvcreate --size 2G vgstoragebox --name zfs
$ lvs vgstoragebox
  LV   VG           Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  zfs  vgstoragebox -wi-a----- 2.00g
$ zpool create zpool /dev/vgstoragebox/zfs
$ zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zpool  1.88G   108K  1.87G        -         -     0%     0%  1.00x    ONLINE  -
$ zfs mount zpool
$ zfs create zpool/documents
$ zfs list
NAME              USED  AVAIL     REFER  MOUNTPOINT
zpool             158K  1.75G       24K  /zpool
zpool/documents    24K  1.75G       24K  /zpool/documents
ZFS basically works with pools and datasets. Many operations on them are done through the CLI only; I got various side effects while experimenting with the standard shell file operations. It's very easy to get confused by the difference between the dataset name zpool/documents and its mountpoint /zpool/documents. The pool and dataset are ready and mounted.
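For example (a tiny sketch on top of the dataset created above): ZFS commands expect the dataset name, while ordinary file operations go through the mountpoint path.

$ sudo zfs snapshot zpool/documents@manual-test   # dataset name, no leading slash
$ ls /zpool/documents                             # mountpoint path for regular shell tools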
BTRFS is a truly lightweight FS, which I realized while learning it in comparison to ZFS: there is only the root directory, nothing is hidden, it is designed to reside on a single server, yet it is capable of grouping multiple devices together. The most frequent shell operations I executed are the standard ones. The time it took to set up was drastically short.
$ lvcreate --size 2G vgstoragebox --name btrfs --yes
$ mkfs.btrfs /dev/vgstoragebox/btrfs
btrfs-progs v5.4.1
See http://btrfs.wiki.kernel.org for more information.

Label:              (null)
UUID:               9bb08305-55a7-4bd2-bd8c-f03530a1c5d6
Node size:          16384
Sector size:        4096
Filesystem size:    2.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP             102.38MiB
  System:           DUP               8.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Checksum:           crc32c
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1     2.00GiB  /dev/vgstoragebox/btrfs

$ mkdir /btrfs
$ mount /dev/vgstoragebox/btrfs /btrfs
$ btrfs subvolume create /btrfs/documents
Create subvolume '/btrfs/documents'
Interaction with BTRFS is mostly performed through the btrfs command-line tool.
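A few typical operations, as a small sketch on top of the documents subvolume created above (the snapshot name here is made up): list the subvolumes, take a read-only snapshot, delete it again.

$ sudo btrfs subvolume list /btrfs
$ sudo btrfs subvolume snapshot -r /btrfs/documents /btrfs/documents-manual-test
$ sudo btrfs subvolume delete /btrfs/documents-manual-test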
LVM's mechanics tend to be difficult to understand, as there are two types of snapshots. The one I need is called thin, because its size is computed dynamically by a pool. The standard type requires the snapshot size to be declared explicitly and statically; in other words, you have to know in advance how big your next snapshot will be based on the file changes. I don't think dealing with a size calculation for file changes is worth the time, so the thin type is chosen.
According to the man lvmthin description, there are five abstractions that the thin snapshot mechanic consists of. Each one has its own parameters, and the combinations of those parameters produce different effects. LVM already looks like the most complicated of the candidates, and that's just the snapshotting functionality of the whole system.
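For reference, once the thin pool and the thin volume exist (the setup follows below), taking a single thin snapshot is a one-liner; a sketch assuming the vgstoragebox/lvm volume created in the next listing:

# No --size is needed for a thin snapshot, the pool allocates space on demand.
$ sudo lvcreate --snapshot --name lvm-snap-manual vgstoragebox/lvm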
$ sudo ./snapit-lvm.sh
  Logical volume "lvm-snap-1" created.
  Logical volume "lvm-snap-2" created.
  Logical volume "lvm-snap-3" created.
  Logical volume "lvm-snap-4" created.
...
$ sudo lvchange --activate y -K vgStorage/lvm-snap-20
$ sudo mkdir /lvm-fs/snap-20
$ sudo mount /dev/vgStorage/lvm-snap-20 /lvm-fs/snap-20
$ diff -yr /lvm-fs/lvm-doc.txt /lvm-fs/snap-27/lvm-doc.txt
...
line27                          line27
line28                        <
line29                        <
line30                        <
Many abstractions lead to many operations; here they are.
$ sudo lvcreate --name lvm-pool --size 2G vgstoragebox
  Logical volume "lvm-pool" created.
$ sudo lvcreate --name lvm-pool-meta --size 200m vgstoragebox
  Logical volume "lvm-pool-meta" created.
$ sudo lvs vgstoragebox
  LV            VG           Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  btrfs         vgstoragebox -wi-ao----   2.00g
  home-backup   vgstoragebox -wi-ao---- 300.00g
  lvm-pool      vgstoragebox -wi-a-----   2.00g
  lvm-pool-meta vgstoragebox -wi-a----- 200.00m
  zfs           vgstoragebox -wi-a-----   2.00g
$ sudo lvconvert --type thin-pool --poolmetadata vgstoragebox/lvm-pool-meta vgstoragebox/lvm-pool
  Thin pool volume with chunk size 64.00 KiB can address at most 15.81 TiB of data.
  WARNING: Converting vgstoragebox/lvm-pool and vgstoragebox/lvm-pool-meta to thin pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert vgstoragebox/lvm-pool and vgstoragebox/lvm-pool-meta? [y/n]: y
  Converted vgstoragebox/lvm-pool and vgstoragebox/lvm-pool-meta to thin pool.
$ sudo lvs -a vgstoragebox
  LV               VG           Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  btrfs            vgstoragebox -wi-ao----   2.00g
  home-backup      vgstoragebox -wi-ao---- 300.00g
  lvm-pool         vgstoragebox twi-a-tz--   2.00g             0.00   8.03
  [lvm-pool_tdata] vgstoragebox Twi-ao----   2.00g
  [lvm-pool_tmeta] vgstoragebox ewi-ao---- 200.00m
  [lvol0_pmspare]  vgstoragebox ewi------- 200.00m
  zfs              vgstoragebox -wi-a-----   2.00g
$ sudo lvcreate --name lvm --virtualsize 2G --thinpool lvm-pool vgstoragebox
  Logical volume "lvm" created.
$ sudo mkfs.ext4 /dev/vgstoragebox/lvm
mke2fs 1.45.5 (07-Jan-2020)
Discarding device blocks: done
Creating filesystem with 524288 4k blocks and 131072 inodes
Filesystem UUID: 21e0ea4c-b173-406c-8fb5-420edadc830b
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
$ lsblk
sda                                           8:0    0   1.8T  0 disk
└─luks-4b0ed59d-059f-4f09-b204-1de0e18646ac 253:3    0   1.8T  0 crypt
  ├─vgstoragebox-zfs                        253:5    0     2G  0 lvm
  ├─vgstoragebox-btrfs                      253:6    0     2G  0 lvm   /media/vol/9bb08305-55a7-4bd2-bd8c-f03530a1c5d6
  ├─vgstoragebox-lvm--pool_tmeta            253:7    0   200M  0 lvm
  │ └─vgstoragebox-lvm--pool-tpool          253:9    0     2G  0 lvm
  │   ├─vgstoragebox-lvm--pool              253:10   0     2G  1 lvm
  │   └─vgstoragebox-lvm                    253:11   0     2G  0 lvm   /lvm-fs
  └─vgstoragebox-lvm--pool_tdata            253:8    0     2G  0 lvm
    └─vgstoragebox-lvm--pool-tpool          253:9    0     2G  0 lvm
      ├─vgstoragebox-lvm--pool              253:10   0     2G  1 lvm
      └─vgstoragebox-lvm                    253:11   0     2G  0 lvm   /lvm-fs
The resulting lvm volume tree structure is confusing.
It turned out that keeping the logical volumes and snapshots in a working state is cumbersome. You have to pick a size limit for the volume and the snapshots and then track that it isn't exceeded. But what happens if the limit is exceeded? I don't know how to figure that out. I can already conclude that for the simple case of keeping file history, LVM is overkill.
The time measurement scenario is identical for all the tools: within a loop, a text line is appended to a text document and then a snapshot of the document's directory is taken. The shell scripts for each tool: snapit-zfs.sh, snapit-btrfs.sh, snapit-lvm.sh.
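The real scripts are linked above; roughly, the BTRFS variant boils down to something like this sketch (the paths, the output format and the 2000-iteration count are inferred from the outputs below):

#!/bin/bash
# Append a line, then snapshot the documents subvolume, timing only the snapshot command.
DOC=/btrfs/documents/btrfs-doc.txt
SNAPDIR=/btrfs/snapshots
for i in $(seq 1 2000); do
    echo "line$i" >> "$DOC"
    /usr/bin/time -f 'execution time: %E' \
        btrfs subvolume snapshot /btrfs/documents "$SNAPDIR/documents-snap$i"
done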
$ sudo ./snapit-zfs.sh 2>&1 | tee output-zfs.txt
execution time: 0:00.04
execution time: 0:00.69
execution time: 0:00.03
...
$ sudo zfs list -o space -r
NAME                       AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
zpool                      1.67G  79.9M        0B     24K             0B      79.9M
zpool/documents            1.67G  39.5M     39.5M   40.5K             0B         0B
zpool/documents@snap-1         -  12.5K         -       -              -          -
zpool/documents@snap-2         -  12.5K         -       -              -          -
...
zpool/documents@snap-1998      -  28.5K         -       -              -          -
zpool/documents@snap-1999      -  28.5K         -       -              -          -
zpool/documents@snap-2000      -    12K         -       -              -          -
By default, ZFS doesn't mount the snapshots into the file system tree.
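They are still reachable without an explicit mount, through the hidden .zfs directory inside the dataset's mountpoint:

$ ls /zpool/documents/.zfs/snapshot/
$ cat /zpool/documents/.zfs/snapshot/snap-1/zfs-doc.txt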
Interestingly, the ZFS snapshotting scenario loaded the CPU significantly.
$ sudo btrfs subvolume create /btrfs/snapshots
Create subvolume '/btrfs/snapshots'
$ ./snapit-btrfs.sh 2>&1 | tee output-btrfs.txt
Create a snapshot of '/btrfs/documents' in '/btrfs/snapshots/documents-snap1'
execution time: 0:00.33
Create a snapshot of '/btrfs/documents' in '/btrfs/snapshots/documents-snap2'
execution time: 0:00.00
...
$ tree -vahpfugi /btrfs/snapshots
[drwxr-xr-x root     root       26]  /btrfs/snapshots/documents-snap1
[-rw-r--r-- root     root        6]  /btrfs/snapshots/documents-snap1/btrfs-doc.txt
[drwxr-xr-x root     root       26]  /btrfs/snapshots/documents-snap2
[-rw-r--r-- root     root       12]  /btrfs/snapshots/documents-snap2/btrfs-doc.txt
...
[drwxr-xr-x root     root       26]  /btrfs/snapshots/documents-snap1999
[-rw-r--r-- root     root      16K]  /btrfs/snapshots/documents-snap1999/btrfs-doc.txt
[drwxr-xr-x root     root       26]  /btrfs/snapshots/documents-snap2000
[-rw-r--r-- root     root      16K]  /btrfs/snapshots/documents-snap2000/btrfs-doc.txt
In contrast to ZFS, running the scenario didn't load the CPU.
$ sudo ./snapit-lvm.sh 2>&1 | tee output-lvm.txt
  WARNING: Sum of all thin volume sizes (4.00 GiB) exceeds the size of thin pool vgstoragebox/lvm-pool (2.00 GiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  Logical volume "lvm-snap-1" created.
execution time: 0:00.79
...
  WARNING: Sum of all thin volume sizes (<2.95 TiB) exceeds the size of thin pool vgstoragebox/lvm-pool and the size of whole volume group (<1.82 TiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  Logical volume "lvm-snap-1508" created.
execution time: 0:00.44
  WARNING: Sum of all thin volume sizes (<2.95 TiB) exceeds the size of thin pool vgstoragebox/lvm-pool and the size of whole volume group (<1.82 TiB).
  WARNING: You have not turned on protection against thin pools running out of space.
  WARNING: Set activation/thin_pool_autoextend_threshold below 100 to trigger automatic extension of thin pools before they get full.
  VG vgstoragebox 16059 metadata on /dev/mapper/luks-4b0ed59d-059f-4f09-b204-1de0e18646ac (521716 bytes) exceeds maximum metadata size (521472 bytes)
  Failed to write VG vgstoragebox.
Command exited with non-zero status 5
A few words here: when I was fixing some issues and got to removing the LVM volumes, the process turned out to be very troublesome. It wasn't simply removing snapshots, but tinkering with putting all the technical volumes into a ready state so the space of the target snapshots could be released. At last, when they were ready, the removal went very slowly, around 5 seconds per snapshot.
The WARNING messages in the output are a consequence of the thin pool's complicated mechanics. The last message reports that the maximum metadata size has been reached: only 1508 snapshot volumes could be created with the default settings.
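Judging by that message, the limit is the volume group metadata area on the physical volume rather than the thin pool metadata. A possible way around it, as a sketch I haven't verified, would be to create the physical volume with a larger metadata area in the first place:

# Hypothetical: reserve a bigger metadata area so the VG can describe more volumes/snapshots.
$ sudo pvcreate --metadatasize 8m /dev/mapper/storagebox
$ sudo vgcreate vgstoragebox /dev/mapper/storagebox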
And once again, LVM, both in its setup and in its snapshot workflow, is the most complicated tool.
Let's visualize how fast each FS is. The chart-building script chart.py reads the log files from the previous section, output-zfs.txt, output-btrfs.txt, output-lvm.txt, parses their text and extracts the execution timestamps.
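Essentially, the extraction step amounts to pulling the timings out of the "execution time:" lines; a quick shell equivalent of what the script parses:

$ awk '/execution time/ {print $3}' output-btrfs.txt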
LVM snapshot time grows linearly, while ZFS, although it's not obvious from the chart, grows exponentially. I noticed that by creating an order of magnitude more than the 2000 ZFS snapshots during the experiments and watching the execution get slower with every next snapshot.
BTRFS snapshot time stays constant.
At this point, the details obtained from the experiments are enough to pick the winner, which is BTRFS, but it's worth extracting some additional information, since the file system is going to be used for real and its behavior should be well understood.
ZFS is able to run a diff without mounting a snapshot, but it displays only file attribute changes, not content differences:
$ sudo zfs diff -F zpool/documents@snap-30 zpool/documents@snap-400
M	F	/zpool/documents/zfs-doc.txt
$ sudo mkdir /zpool/documents-snap-1997
$ sudo mount.zfs zpool/documents@snap-1997 /zpool/documents-snap-1997
$ diff --side-by-side /zpool/documents/zfs-doc.txt /zpool/documents-snap-1997/zfs-doc.txt
...
line1996                          line1996
line1997                          line1997
line1998                        <
line1999                        <
line2000                        <
BTRFS has no command to show a file attributes diff. To display a content difference there is the standard diff command. By default, once the root file system is mounted, all the snapshots are visible.
$ diff --side-by-side /btrfs/documents/btrfs-doc.txt /btrfs/snapshots/documents-snap1997/btrfs-doc.txt
...
line1996                          line1996
line1997                          line1997
line1998                        <
line1999                        <
line2000                        <
There is also a hack for extracting a file attributes diff (for example, btrfs subvolume find-new can list the files changed since a given generation).
LVM doesn't operate at the file level, so the standard approach of mounting a snapshot and diffing is used.
$ sudo lvchange --activate y --ignoreactivationskip /dev/vgstoragebox/lvm-snap-1505
$ sudo mount /dev/vgstoragebox/lvm-snap-1505 /lvm-fs/snap-1505
$ diff --side-by-side /lvm-fs/lvm-doc.txt /lvm-fs/snap-1503/lvm-doc.txt
...
line1504                          line1504
line1505                          line1505
line1506                        <
line1507                        <
line1508                        <
Snapshot mounting is a little tricky, like the other functional parts of LVM. Why do I have to enable --ignoreactivationskip at all? It turns out thin snapshots are created with the activation skip flag set by default, so they have to be activated explicitly before mounting.
A brief look at disk usage. It is a little tricky to figure out the exact total consumed space, as the standard CLI commands don't take the file systems' specifics into account, and the df and du commands don't display the actual usage.
$ sudo btrfs filesystem df -h /btrfs
Data, single: total=216.00MiB, used=6.90MiB
System, DUP: total=8.00MiB, used=16.00KiB
Metadata, DUP: total=102.38MiB, used=33.42MiB
GlobalReserve, single: total=3.25MiB, used=0.00B
$ sudo btrfs filesystem show --human-readable /btrfs
Label: none  uuid: 9bb08305-55a7-4bd2-bd8c-f03530a1c5d6
	Total devices 1 FS bytes used 40.34MiB
	devid    1 size 2.00GiB used 436.75MiB path /dev/mapper/vgstoragebox-btrfs

$ sudo zpool list -v zpool
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zpool  1.88G  80.0M  1.80G        -         -    12%     4%  1.00x    ONLINE  -
  zfs  1.88G  80.0M  1.80G        -         -    12%  4.16%      -    ONLINE
$ sudo zfs list
NAME                        USED  AVAIL     REFER  MOUNTPOINT
zpool                      80.0M  1.67G       25K  /zpool
zpool/documents            39.5M  1.67G     40.5K  /zpool/documents
zpool/documents@snap-1     12.5K      -     24.5K  -
zpool/documents@snap-2     12.5K      -     24.5K  -
...
zpool/documents@snap-1999  28.5K      -     40.5K  -
zpool/documents@snap-2000    12K      -     40.5K  -

$ sudo lvs vgstoragebox
  LV          VG           Attr       LSize Pool     Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lvm         vgstoragebox Vwi-aotz-- 2.00g lvm-pool        4.77
  lvm-pool    vgstoragebox twi-aotz-- 2.00g                 28.38  19.88
  lvm-snap-1  vgstoragebox Vwi---tz-k 2.00g lvm-pool lvm
  lvm-snap-10 vgstoragebox Vwi---tz-k 2.00g lvm-pool lvm
...

# Compare the fs commands output with df
$ df -h
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/vgstoragebox-btrfs  2.0G   78M  1.8G   5% /btrfs
/dev/mapper/vgstoragebox-lvm    2.0G   48K  1.8G   1% /lvm-fs
zpool                           1.7G  128K  1.7G   1% /zpool
zpool/documents                 1.7G  128K  1.7G   1% /zpool/documents
It's not obvious which file system uses the least disk space; in this output the BTRFS usage looks the biggest.
The file systems' ability to produce events gets a quick check to make sure automation is implementable, since an event is what will trigger the snapshot command. Two event types, open and close, are enough for the test.
$ inotifywait -m --format '%e => %w%f' -e open -e close \
    /zpool/documents/zfs-doc.txt \
    /btrfs/documents/btrfs-doc.txt \
    /lvm-fs/lvm-doc.txt
Setting up watches.
Watches established.
OPEN => /zpool/documents/zfs-doc.txt
CLOSE_NOWRITE,CLOSE => /zpool/documents/zfs-doc.txt
OPEN => /btrfs/documents/btrfs-doc.txt
CLOSE_NOWRITE,CLOSE => /btrfs/documents/btrfs-doc.txt
OPEN => /lvm-fs/lvm-doc.txt
CLOSE_NOWRITE,CLOSE => /lvm-fs/lvm-doc.txt
At this point everything is fine.
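To close the loop, here is a minimal sketch of how such events could drive the snapshot automation for the chosen BTRFS setup (the paths come from the layout above, the naming scheme is made up):

#!/bin/bash
# Watch the documents subvolume and take a read-only BTRFS snapshot
# whenever a file in it is written and closed.
WATCHED=/btrfs/documents
SNAPDIR=/btrfs/snapshots
i=0
inotifywait -m -e close_write --format '%w%f' "$WATCHED" |
while read -r changed_file; do
    i=$((i + 1))
    echo "change detected: $changed_file"
    btrfs subvolume snapshot -r "$WATCHED" "$SNAPDIR/documents-auto-$i"
done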