Unix command to format a number of bytes as a human readable value

It took me a while, but I finally figured out how to print a number from a bash script, properly formatted with commas as thousands separators. For those like me who weren’t in the know, the magical incantation is:

  printf "%'d" 123456

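That will format 123456 as 123,456, provided the current locale defines a thousands separator (en_US.UTF-8 does; in the plain C locale the number is printed without commas). Much easier to read when you’re dealing with large numbers such as the byte sizes of multi-gigabyte files.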

So now if I could only find the Unix command that takes a number of bytes and turns it into an approximate value with a GB or MB suffix.

Yo man you are Basic

I’ve been restoring my archives (10,000,000 files from the past 13 years), and while doing so I took a bit of a trip down memory lane. I found some old Visual Basic 6 database programs that I wrote when I was in high school, including a phone book, a diary, a timesheet system, and a few other miscellaneous tools. To get them to run I had to get my compiler out to fix a few data object dependencies (DAO5) and a few hardcoded configuration settings, such as the paths to database files. So I reinstalled my old VB6 development environment, which I’m pleased to say still works (although I’m still running Windows XP). Good fun. :)

Auto-extracting archives

I have a directory structure of archived files, many of which are zipped or tarballed up. So when I search for a file, I can’t be sure that it isn’t hidden inside a compressed file. So I wrote a pair of scripts that run over a directory tree and extract any compressed files they find there. There are a few extra features: for example, if a file is 10 MB or larger, the script asks whether to extract it. I still have a few error-handling features to add, but I’m happy to post this version up now.

extract-archives.sh

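#!/bin/bash
# Exit status extract-file.sh uses when a failed archive is not ignored
# (default 1; GNU xargs aborts the whole run if a command exits with 255).
err=${1:-1}
script_dir=$(dirname "$0")
# search SUFFIX COMMAND: run extract-file.sh once for each matching file
search() {
	find . -iname "*$1" -print0 | xargs -0 -I '{}' "$script_dir/extract-file.sh" "$1" "$2" "$err" '{}'
}
search ".tar.gz" "tar xf"
search ".tgz" "tar xf"
search ".zip" "unzip -q"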

extract-file.sh

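#!/bin/bash
# Usage: extract-file.sh SUFFIX COMMAND ERROR_STATUS PATH
# stdin is the pipe from xargs, so read the prompts from the terminal instead
exec 0< /dev/tty
path="$4"
file_name=$(basename "$path")
dir_name=$(dirname "$path")
# strip the suffix by length, because find matched it case-insensitively
new_name=${file_name:0:${#file_name}-${#1}}
new_path="$dir_name/$new_name"
# skip archives that have already been extracted
[ -e "$new_path" ] && exit 0
echo "$dir_name/$file_name"
file_size=$(stat -c %s "$path")
printf "File is %'d bytes." "$file_size"
# extract anything under 10 MiB (10485760 bytes) without asking
answer=""
if [ "$file_size" -lt 10485760 ]; then answer="y"; fi
while [ "$answer" != "y" ]; do
	read -n 1 -s -p " Extract? " answer
	[ "$answer" = "n" ] && echo && echo && exit 0
done
echo
mkdir "$new_path" || exit 1
pushd "$new_path" > /dev/null || exit 1
# $2 is deliberately unquoted so that "tar xf" splits into command and option
$2 "../$file_name"
status=$?
popd > /dev/null
if [ "$status" != "0" ]; then
	# remove the partial extraction, then ask whether to carry on
	rm -rf "$new_path"
	echo
	answer=""
	while [ "$answer" != "y" ]; do
		read -n 1 -s -p "Do you want to ignore this file? " answer
		[ "$answer" = "n" ] && exit "$3"
	done
fi
echo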

Enabling Windows SDK 7.1 in Visual Studio 2008

I was having a problem where Visual Studio said that “windows.h” could not be found after I had installed the Windows SDK v7.1. I found this article, which said to update the “CurrentInstallFolder” path under “HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SDKs\Windows” in the registry. So I did that, and now everything is working.

Operator overloading in C++

I did my first operator overloading in C++ today. My code ended up being quite different to the example code that I found, but I think I’m happy with it:

#include <cstring>  // memcmp
#include <string>

class FileData {
public:
  FileData( const std::string& path );
  static bool LessThan( const FileData& a, const FileData& b );
  bool operator< ( const FileData& );
  unsigned long size() const { return m_size; }
private:
  unsigned long m_size;
  char m_md5[ 16 ];
};

// Non-member form: takes both operands by const reference, so it also works for const objects.
static bool operator< ( const FileData& a, const FileData& b ) {
  return FileData::LessThan( a, b );
}

// Member form: not declared const, so it cannot be called on a const left operand.
bool FileData::operator< ( const FileData& b ) {
  return FileData::LessThan( *this, b );
}

// Order by size first, then by MD5 digest to break ties.
bool FileData::LessThan ( const FileData& a, const FileData& b ) {
  unsigned long a_size = a.size();
  unsigned long b_size = b.size();
  if ( a_size == b_size ) {
    return memcmp( a.m_md5, b.m_md5, 16 ) < 0;
  }
  return a_size < b_size;
}

I'm not sure when it becomes either useful or necessary to define the operator both as a static and as a member function. I did both because I wasn't sure what was required.
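
On the question of needing both forms: as far as I can tell, one definition is enough for ordinary use. Here is a minimal sketch using a hypothetical `Entry` class as a stand-in for `FileData` (its constructor arguments and the `sorted_sizes` helper are assumptions made up for illustration), with a single non-member `operator<` that takes both operands by const reference:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for FileData: it is built from a size and a
// 16-byte digest instead of a file path (an assumption for this sketch).
class Entry {
public:
  Entry( unsigned long size, const char* digest ) : m_size( size ) {
    // digest must point to at least 16 readable bytes
    std::memcpy( m_md5, digest, sizeof m_md5 );
  }
  unsigned long size() const { return m_size; }

  // The only comparison operator: a non-member (declared as a friend so it
  // can read the private members) taking both operands by const reference.
  friend bool operator< ( const Entry& a, const Entry& b ) {
    if ( a.m_size != b.m_size ) {
      return a.m_size < b.m_size;
    }
    return std::memcmp( a.m_md5, b.m_md5, sizeof a.m_md5 ) < 0;
  }

private:
  unsigned long m_size;
  char m_md5[ 16 ];
};

// Sorts a copy of the entries with std::sort, which relies on operator<,
// and returns the sizes in sorted order.
inline std::vector<unsigned long> sorted_sizes( std::vector<Entry> entries ) {
  std::sort( entries.begin(), entries.end() );
  std::vector<unsigned long> sizes;
  for ( std::size_t i = 0; i < entries.size(); ++i ) {
    sizes.push_back( entries[ i ].size() );
  }
  return sizes;
}
```

With only this one operator, `std::sort` and comparisons between const objects both compile. A const member function, `bool operator< ( const Entry& ) const`, would do the same job; the non-member form mainly matters when the left operand needs an implicit conversion.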

Differences between C++ pointers and C++ references

I was curious about the difference between C++ pointers and C++ references, so I searched and found this, which basically says:

  1. It’s not necessary to initialise pointers at declaration time, but it is necessary to initialise references at declaration time.
  2. You can create an array of pointers, but you can’t create an array of references.
  3. You can assign null to pointers, but you can’t assign null to references.
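
The three points above can be seen in a few lines. This is only an illustrative sketch; the function names `value_or`, `increment`, `sum_through` and `demo` are made up for the example:

```cpp
#include <cstddef>

// A pointer parameter may be null, so it has to be checked before use.
inline int value_or( const int* p, int fallback ) {
  return p != NULL ? *p : fallback;
}

// A reference parameter is always bound to an existing int.
inline void increment( int& r ) {
  ++r;
}

// Adds up the ints that an array of pointers points to.
inline int sum_through( int* items[], std::size_t count ) {
  int total = 0;
  for ( std::size_t i = 0; i < count; ++i ) {
    total += *items[ i ];
  }
  return total;
}

inline int demo() {
  int a = 1;
  int b = 2;

  int* p;                      // (1) a pointer may be declared without an initialiser...
  p = &a;                      //     ...and pointed at something later
  int& r = b;                  // (1) a reference must be bound here; "int& r;" does not compile

  int* both[ 2 ] = { p, &b };  // (2) an array of pointers is fine; "int& refs[ 2 ]" does not compile

  increment( r );              // changes b through the reference, so b is now 3

  p = NULL;                    // (3) a pointer can be set to null; a reference cannot be rebound or nulled
  return sum_through( both, 2 ) + value_or( p, 0 );
}
```

Here `demo()` adds 1 and 3 through the pointer array and then 0 from the null pointer's fallback, so it returns 4.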