Today in #lobsters I was referred to the GNU doco for Shell Functions.
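A shell function is just a named compound command that you can call like any other command and that runs in the current shell. A minimal sketch, assuming bash (the greet name is made up for illustration):

greet () {
    local name="${1:-world}"
    echo "Hello, ${name}"
}

greet            # prints: Hello, world
greet lobsters   # prints: Hello, lobsters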
explainshell.com
I found this great website: explainshell.com. It explains shell commands; check it out.
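For example, pasting in something like this (a made-up pipeline, purely for illustration) gets every command and flag explained inline:

find /var/log -name "*.gz" -mtime +30 -print0 | xargs -0 rm -f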
A great MySQL session
I found this great MySQL session over here. It has a clever approach to creating a big table (repeatedly inserting the table into itself to double its size), and it shows how to invoke shell commands from the `mysql` client.
USE test;
SET GLOBAL innodb_file_per_table=1;
SET GLOBAL autocommit=0;

-- Create an uncompressed table with a million or two rows.
CREATE TABLE big_table AS SELECT * FROM information_schema.columns;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
INSERT INTO big_table SELECT * FROM big_table;
COMMIT;
ALTER TABLE big_table ADD id int unsigned NOT NULL PRIMARY KEY auto_increment;
SHOW CREATE TABLE big_table\G
select count(id) from big_table;

-- Check how much space is needed for the uncompressed table.
\! ls -l data/test/big_table.ibd

CREATE TABLE key_block_size_4 LIKE big_table;
ALTER TABLE key_block_size_4 key_block_size=4 row_format=compressed;
INSERT INTO key_block_size_4 SELECT * FROM big_table;
commit;

-- Check how much space is needed for a compressed table
-- with particular compression settings.
\! ls -l data/test/key_block_size_4.ibd
Removing .local from $HOSTNAME on Mac OS X
The $HOSTNAME shell variable on my MacBook Pro (condor) was “condor.local” but I wanted it to be just “condor”. So I ran:
scutil --set HostName condor
which fixed the problem!
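There are a few related names that scutil manages, so it’s worth checking them all; these are standard macOS scutil queries, though setting a name may need sudo:

scutil --get HostName        # the Unix hostname set above
scutil --get LocalHostName   # the Bonjour name, i.e. the .local one
scutil --get ComputerName    # the friendly name shown in Sharing

Opening a new shell and running echo $HOSTNAME should then show plain condor.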
Shell scripting for archive restoration
I’ve been restoring my archives. Basically I have a bit over 1.3TB of data that I’ve tarballed up and stashed on some disconnected SATA disks, and now that I have a computer with the capacity to hold all that data, I’m resurrecting the file shares from the archived tarballs. You can see my restore script here:
restore.sh
#!/bin/bash
cd "`dirname $0`"
data_path=/var/sata2/data
tar xf $data_path/1999.tar.gz --hard-dereference > output.txt 2>&1
tar xf $data_path/2001.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2002.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2003.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2004.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2005.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2006.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2007.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2008.tar.gz --hard-dereference >> output.txt 2>&1
The restore.sh script creates an output.txt file that lists any errors reported by tar during the restore process. I then have a set of scripts that process this output.txt file, fixing up the two common types of error.
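For reference, the output.txt lines these scripts look for are tar messages of roughly this shape (the paths and dates here are made up):

tar: 1999/docs/report.doc: implausibly old time stamp 1911-01-01 00:00:00
tar: 2005/music/track01.mp3: Cannot hard link to `2004/music/track01.mp3': Too many links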
Fixing dates
The first error is that the date of a file in the archive isn’t a reasonable value. For example, I had files reporting a modification time somewhere back in 1911, long before computers. To fix the files with this problem I run the following scripts:
fix-dates
#!/bin/bash
cd "`dirname $0`";
./bad-date | xargs -0 touch --no-create
bad-date
#!/bin/bash
awk -f bad-date.awk < output.txt | while read line
do
    # note: both -n and \c achieve the same end.
    echo -n -e "$line\0\c"
done
bad-date.awk
{
    if ( /tar: ([^:]*): implausibly old time stamp/ ) {
        split( $0, array, ":" )
        filepath = array[ 2 ]
        sub( / /, "", filepath )
        printf( "%s\n", filepath )
    }
}
Fixing hard links
The second class of error is that a file being extracted from the archive is a hard link to an already existing file, but the hard link cannot be created because the number of links to the target has reached its limit. I think the archives were originally on a ReiserFS file system, and I’m using Ext4 now; Ext4 seems to have limits that ReiserFS didn’t. Anyway, it’s no big deal, because I can just copy the target to the path that failed to link. This creates a duplicate file, but that’s not a great concern; I’ll try to fix up such duplicates with my pcdedupe project.
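As an aside, GNU stat can show how many hard links a file already has, which makes it easy to see when a target is near the limit (the path is just an example):

stat -c '%h' 2004/music/track01.mp3    # number of hard links to this file's inode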
fix-links
#!/bin/bash
cd "`dirname $0`";
# note: -n 1 so that fix-link sees one file:link spec per invocation.
./bad-link | xargs -0 -n 1 ./fix-link
bad-link
#!/bin/bash
awk -f bad-link.awk < output.txt | while read line
do
    # note: both -n and \c achieve the same end.
    echo -n -e "$line\0\c"
done
bad-link.awk
{
    if ( /tar: ([^:]*): Cannot hard link to `([^']*)': Too many links/ ) {
        split( $0, array, ":" )
        linkpath = array[ 2 ]
        sub( / /, "", linkpath )
        filepath = array[ 3 ]
        sub( / Cannot hard link to `/, "", filepath )
        filepath = substr( filepath, 0, length( filepath ) )
        printf( "%s:%s\n", filepath, linkpath )
    }
}
fix-link
#!/bin/bash
cd "`dirname $0`";
spec="$1"
file=`echo "$spec" | sed 's/\([^:]*\):.*/\1/'`
link=`echo "$spec" | sed 's/[^:]*:\(.*\)/\1/'`
#echo "$spec"
#echo Linking "'""$link""'" to "'""$file""'"...
#echo ""
if [ ! -f "$file" ]; then
    echo Missing "'""$file""'"...
    exit 1;
fi
cp "$file" "$link"
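fix-link takes a single spec in the same file:link format that bad-link.awk prints, so it can also be run by hand on one spec (made-up paths):

./fix-link "2004/music/track01.mp3:2005/music/track01.mp3"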
I then checked for anything my scripts had missed with the following:
check-output
#!/bin/bash
cd "`dirname $0`";
cat output.txt | grep -v "Cannot hard link" | grep -v "implausibly old time"