I was having a problem where Visual Studio said that “windows.h” could not be found after I had installed the Windows SDK v7.1. I found this article which said to update the path stored in “HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows\CurrentInstallFolder” in the registry. So I did that, and now everything is working.
Category Archives: Sys Admin
Shell scripting for archive restoration
I’ve been restoring my archives. Basically I have a bit over 1.3TB of data that I’ve tarballed up and stashed on some disconnected SATA disks, and now that I have a computer with the capacity to hold all that data I’m resurrecting the file shares from the archived tarballs. You can see my restore script here:
restore.sh
#!/bin/bash
cd "`dirname $0`"
data_path=/var/sata2/data
tar xf $data_path/1999.tar.gz --hard-dereference > output.txt 2>&1
tar xf $data_path/2001.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2002.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2003.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2004.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2005.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2006.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2007.tar.gz --hard-dereference >> output.txt 2>&1
tar xf $data_path/2008.tar.gz --hard-dereference >> output.txt 2>&1
The restore.sh script creates an output.txt file that lists any errors from tar during the restore process. I then have a set of scripts that process this output.txt file fixing up two types of common errors.
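As an aside, the per-year tar invocations could also be written as a loop. This is just a sketch of an equivalent approach, not what I actually ran; it assumes the same /var/sata2/data layout and one tarball per year:

#!/bin/bash
# sketch only: loop over the year tarballs instead of listing each one
cd "`dirname $0`"
data_path=/var/sata2/data
> output.txt
for year in 1999 2001 2002 2003 2004 2005 2006 2007 2008; do
    tar xf "$data_path/$year.tar.gz" --hard-dereference >> output.txt 2>&1
done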
Fixing dates
The first error is that the date of the file in the archive isn’t a plausible value. For example, I had files reporting a modification time back in 1911, before computers. To fix files with this problem I run the following scripts:
fix-dates
#!/bin/bash
cd "`dirname $0`";
./bad-date | xargs -0 touch --no-create
bad-date
#!/bin/bash
awk -f bad-date.awk < output.txt | while read line
do
    # note: both -n and \c achieve the same end.
    echo -n -e "$line\0\c"
done
bad-date.awk
{
    if ( /tar: ([^:]*): implausibly old time stamp/ ) {
        split( $0, array, ":" )
        filepath = array[ 2 ]
        sub( / /, "", filepath )
        printf( "%s\n", filepath )
    }
}
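For what it’s worth, the same fix could probably be done in a single pipeline. This is just a sketch, and it assumes none of the affected paths contain a colon followed by a space:

# sketch: pull the offending paths out of output.txt and touch them
awk -F': ' '/implausibly old time stamp/ { print $2 }' output.txt |
    tr '\n' '\0' |
    xargs -0 -r touch --no-create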
Fixing hard links
The second class of error that I can receive is that the file being extracted from the archive is a hard link to an already existing file, but the hard link cannot be created because the number of links to the target has reached its limit. I think the file system the archives were originally on was ReiserFS, and I’m using ext4 now; ext4 seems to have limits that ReiserFS didn’t. Anyway, it’s no big deal, because I can just copy the target to the path that failed to link. This creates a duplicate file, but that’s not a great concern. I’ll try to fix up such duplicates with my pcdedupe project.
fix-links
#!/bin/bash
cd "`dirname $0`";
# note: -n 1 because fix-link only handles one spec ($1) per invocation
./bad-link | xargs -0 -n 1 ./fix-link
bad-link
#!/bin/bash
awk -f bad-link.awk < output.txt | while read line
do
    # note: both -n and \c achieve the same end.
    echo -n -e "$line\0\c"
done
bad-link.awk
{
    if ( /tar: ([^:]*): Cannot hard link to `([^']*)': Too many links/ ) {
        split( $0, array, ":" )
        linkpath = array[ 2 ]
        sub( / /, "", linkpath )
        filepath = array[ 3 ]
        sub( / Cannot hard link to `/, "", filepath )
        # strip the trailing ' left over from the tar message
        filepath = substr( filepath, 0, length( filepath ) )
        printf( "%s:%s\n", filepath, linkpath )
    }
}
fix-link
#!/bin/bash
cd "`dirname $0`";
spec="$1"
file=`echo "$spec" | sed 's/\([^:]*\):.*/\1/'`
link=`echo "$spec" | sed 's/[^:]*:\(.*\)/\1/'`
#echo "$spec"
#echo Linking "'""$link""'" to "'""$file""'"...
#echo ""
if [ ! -f "$file" ]; then
    echo Missing "'""$file""'"...
    exit 1;
fi
cp "$file" "$link"
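For reference, fix-link expects a single “target:link” spec on the command line, so it can also be run by hand. The paths below are made up for illustration; stat -c '%h' (GNU coreutils) just shows how many hard links the target already has:

# check how many hard links the existing file already has (example path)
stat -c '%h' share/2004/photos/img_0001.jpg

# copy it to the path that tar could not hard link (example paths)
./fix-link "share/2004/photos/img_0001.jpg:share/2005/photos/img_0001.jpg"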
check-output
I then checked for anything that I’d missed with my scripts with the following:
#!/bin/bash
cd "`dirname $0`";
cat output.txt | grep -v "Cannot hard link" | grep -v "implausibly old time"
tr
I learned about the ‘tr’ Unix command today. It’s for translating text in streams. The particular example was:
echo | tr '\012' '\001'
And I didn’t really understand what that did, but now I do. Basically the ‘echo’ part echoes a newline character, which is octal 012. tr then reads that newline from its input stream and translates octal 012 (newline) to octal 001 (Ctrl+A). So it’s just a way of getting a Ctrl+A character into a stream. If you use Ctrl+A as your regular expression delimiter you’re unlikely to have a collision with anything in the expression itself.
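You can see the result by piping it through od; this is just a quick check, nothing more:

# the output should show a single 001 (Ctrl+A) byte
echo | tr '\012' '\001' | od -c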
Bash internal variables
As I mentioned before I read (most of) Bash Internal Variables which has some great tips and tricks in it. There’s also stuff to learn about Debugging. In fact the whole of the Advanced Bash-Scripting Guide looks like a worthwhile read!
Environment Variables and Secure Programming for Linux
I read the Environment Variables section of Secure Programming for Linux and Unix HOWTO and learned about the IFS environment variable.
I also read CS 15-392 Secure Programming – Environment Variables.
The IFS environment variable is the “internal field separator”, and it is typically space, tab, newline, i.e. the whitespace used to separate fields. So in bash you can unset the IFS variable and it will default to ” \t\n”, or you can set it explicitly to that value. That explains why I found a script that unset the IFS variable — it’s a secure programming practice.
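As a quick illustration of what IFS actually does to word splitting (just a throwaway example):

#!/bin/bash
# split a colon-separated record by setting IFS just for the read
record="root:x:0:0"
IFS=':' read -r user pass uid gid <<< "$record"
echo "user=$user uid=$uid"

# with IFS unset, splitting falls back to the default space/tab/newline
unset IFS
set -- $record
echo "fields when split on whitespace: $#"    # prints 1; the record has no whitespace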
Ext4
Been reading up on Ext4. I have so many duplicate files that when I de-dupe them I bump into hard link limits (EMLINK) in the file system at around 30,000 files.
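A quick way to see which files are getting close to the limit is GNU find’s -links test; the path and the threshold here are just placeholders:

# list files with more than 30000 hard links, link count first (example path)
find /srv/share -type f -links +30000 -printf '%n %p\n' | sort -rn | head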
I’ve set up a new file server
I’ve been having some fun over the last day or two looking over all my old files. I’ve got files that go back as far as 1999 in my archives. I’ve found my old blog database and associated files, so I hope to get that back up again soon, and I found some old code that I’ve been looking for (I don’t want to have to write it again!).
So my new file server has 6TB of storage as 3 x 2TB partitions. I can fit all my data in 1.3TB of space, so I’m planning to have one file share, and then a backup of that onto another partition. I have 10,174,633 files in my archive folder, and many more in my media, download and home folders. I might publish some more stats once du -s has finished processing. :)
I’m running Ubuntu 10.04 LTS Server as my file server. I tried to set up the Desktop version but it wouldn’t play nice with my nVidia graphics card.
Making Subversion/SVN recognize CVS Id and Revision tags
Today I found this article, Making Subversion/SVN recognize CVS Id and Revision tags, which describes how to add support for Id and Revision tags in Subversion.
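As I understand it, the mechanism behind this is the svn:keywords property on each file; the file name below is just an example:

# enable Id and Revision keyword expansion on a file (example file name)
svn propset svn:keywords "Id Revision" main.c
svn commit -m "Enable Id and Revision keywords" main.c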
Google webmaster tools
I decided to have a look at the Google Webmaster Tools. I’ve set up accounts for jsphp.co and www.progclub.org. So far I’m not very impressed. Maybe it’s because the accounts are new, but there is basically no data for any of my sites, so it’s not very useful yet. I guess I’ll just wait and see whether data ends up being loaded.
Non-interactive apt-get install
I was wondering about how to do a non-interactive installation of MySQL using apt-get, because it prompts for the root password. I found my answer in an article — Truly non-interactive / unattended apt-get install.
Basically in addition to passing the -q -y arguments to apt-get, you export an environment variable, like this:
# export DEBIAN_FRONTEND=noninteractive
# apt-get -q -y install mysql-server-5.0
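With the noninteractive frontend the root password prompt is simply skipped, leaving the password blank. If you want to set it during the install you can preseed debconf first; this is just a sketch, and the exact template names can vary between MySQL package versions:

# sketch: preseed the MySQL root password before the unattended install
echo "mysql-server-5.0 mysql-server/root_password password s3cret" | debconf-set-selections
echo "mysql-server-5.0 mysql-server/root_password_again password s3cret" | debconf-set-selections
export DEBIAN_FRONTEND=noninteractive
apt-get -q -y install mysql-server-5.0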