For the record:
Category Archives: Database
Common files and directories found on popular Linux systems
Over on https://digitalforensics.ch/linux/ there is a PDF you can download which lists common files and directories you might find in popular Linux systems.
Data Analysis with Python for Excel Users – Full Course
Today I learned a little about Python, Jupyter, and Pandas in this video: Data Analysis with Python for Excel Users – Full Course.
The 6502 Instruction Set Database
This is great: The 6502 Instruction Set Database. It’s a database with details of opcodes for the 6502 processor. You can see the SQL and TSV.
Setting query timeout for MySQL or MariaDB
I happened upon the selectSQLText function in the MediaWiki source code and learned how to set a query timeout in either MySQL or MariaDB. Good to know!
MySQL/MariaDB Identifier Length Limits
On the value of foreign key constraints
Here is something I would like to demo regarding foreign key constraints.
This regards what type of errors foreign key constraints can (and can’t!) save you from.
Occasionally you hear developers say how important foreign key constraints are and how foreign key constraints saved them from some bug. That might all seem well and good but when you think it through if a foreign key constraint saved you from a bug it also might very well just have been luck that saved you and the thing that went wrong might have gone wrong another way and have gone undetected as database corruption and potential accidental data disclosure to an incorrect entity.
To set the scene let’s use some schema/data from what we were talking about earlier:
create table t_skinny_customer ( customer_id bigint not null primary key, customer_name varchar(255) not null, customer_record_created_on timestamp default current_timestamp, customer_record_updated_on timestamp default current_timestamp on update current_timestamp ); create table t_skinny_customer_phone ( customer_phone_id bigint not null primary key, customer_id bigint not null references t_skinny_customer( customer_id ) on delete cascade, phone_number varchar(255) not null, phone_type enum( 'phone', 'tollfree', 'mobile' ) not null default 'phone', phone_type_order tinyint unsigned not null default 0, phone_record_created_on timestamp default current_timestamp, phone_record_updated_on timestamp default current_timestamp on update current_timestamp ); insert into t_skinny_customer ( customer_id, customer_name ) values ( 1, 'John Doe' ), ( 2, 'Jane Doe' ); insert into t_skinny_customer_phone ( customer_phone_id, customer_id, phone_number, phone_type, phone_type_order ) values ( 11, 1, '123-456-7890', 'phone', 1 ), ( 21, 2, '123-456-7894', 'phone', 1 );
Then let’s run this code:
$stmt = $pdo->prepare(" update t_skinny_customer_phone set customer_id = :customer_id where customer_phone_id = :phone_id "); try { $stmt->execute([ 'customer_id' => 3, 'phone_id' => 11, ]); assert( false ); } catch ( PDOException $ex ) { // 2024-02-13 jj5 - foreign key constraint saved the day! assert( 0 === strpos( $ex->getMessage(), 'SQLSTATE[23000]: Integrity constraint violation: 1452 Cannot add or update a child row: a foreign key constraint fails' ) ); } $stmt->execute([ 'customer_id' => 2, 'phone_id' => 11, ]); // 2024-02-13 jj5 - ... whoops! what we've done here is switch // one of John Doe's phone numbers to be one of Jane Doe's // phone numbers. The foreign key constraint is of no value in // detecting this type of data corruption.
Now I’m not saying that you shouldn’t have foreign key constraints. I’m just pointing out that if you do have foreign key constraints and you’re relying on them for “referential integrity” your door may be open for all manner of bugs (and exploits).
That your foreign key points to an actual record doesn’t necessarily imply that your foreign key points to a correct record.
Database design and normalization
I would like to compare and contrast two options for the design of customer contact information for a relational database.
This is the first option, a “fat” customer table.
create table t_fat_customer ( customer_id bigint not null primary key, customer_name varchar(255) not null, customer_phone_1 varchar(255) not null default '', customer_phone_2 varchar(255) not null default '', customer_tollfree_1 varchar(255) not null default '', customer_tollfree_2 varchar(255) not null default '', customer_mobile_1 varchar(255) not null default '', customer_mobile_2 varchar(255) not null default '', customer_record_created_on timestamp default current_timestamp, customer_record_updated_on timestamp default current_timestamp on update current_timestamp ); insert into t_fat_customer ( customer_id, customer_name, customer_phone_1, customer_phone_2, customer_tollfree_1, customer_tollfree_2, customer_mobile_1, customer_mobile_2 ) values ( 1, 'John Doe', '123-456-7890', '123-456-7891', '800-123-4567', '', '123-456-7892', '' ), ( 2, 'Jane Doe', '123-456-7894', '123-456-7895', '800-123-4569', '', '123-456-7896', '' );
This is the second option, a “skinny” and “properly normalized” option. It has lots of skinny tables, which some will claim are the hallmarks of “good design”.
create table t_skinny_customer ( customer_id bigint not null primary key, customer_name varchar(255) not null, customer_record_created_on timestamp default current_timestamp, customer_record_updated_on timestamp default current_timestamp on update current_timestamp ); create table t_skinny_customer_phone ( customer_phone_id bigint not null primary key, customer_id bigint not null references t_skinny_customer( customer_id ) on delete cascade, phone_number varchar(255) not null, phone_type enum( 'phone', 'tollfree', 'mobile' ) not null default 'phone', phone_type_order tinyint unsigned not null default 0, phone_record_created_on timestamp default current_timestamp, phone_record_updated_on timestamp default current_timestamp on update current_timestamp ); insert into t_skinny_customer ( customer_id, customer_name ) values ( 1, 'John Doe' ), ( 2, 'Jane Doe' ); insert into t_skinny_customer_phone ( customer_phone_id, customer_id, phone_number, phone_type, phone_type_order ) values ( 11, 1, '123-456-7890', 'phone', 1 ), ( 12, 1, '123-456-7891', 'phone', 2 ), ( 13, 1, '800-123-4567', 'tollfree', 1 ), ( 14, 1, '123-456-7892', 'mobile', 1 ), ( 21, 2, '123-456-7894', 'phone', 1 ), ( 22, 2, '123-456-7895', 'phone', 2 ), ( 23, 2, '800-123-4569', 'tollfree', 1 ), ( 24, 2, '123-456-7896', 'mobile', 1 );
Here are some stats from the design options:
Fat option | Skinny option | |
---|---|---|
Lines of code: | 42 | 85 |
Tables: | 1 | 2 |
Columns: | 10 | 11 |
Records: | 2 | 10 |
The first option is simpler than the second option by every measure.
I know, you want to create a more general solution. I know, they say a hallmark of good database design and “proper normalization” is lots of skinny tables. I remind you: the first option is simpler than the second option by every measure.
My advice is don’t rush toward general solutions in an effort to solve problems you don’t have.
The UI isn’t coupled to the database design in any way. You could implement any type of UI on top of either database design alternative, such as:
Primary | Secondary | Operations | |
---|---|---|---|
Phone: | |||
Tollfree: | |||
Mobile: |
Or:
Phone Type | Phone Number | Operations |
---|---|---|
(My opinion is that the first UI option above is superior to the second UI option, it’s also exceedingly easy to implement with the “fat” customer table.)
Here are some implementation exercises to compare the difficulty of working with the database options:
- given a customer ID get the primary tollfree number or the primary phone number if there’s no tollfree number on record
- for each customer, fix the phone number records so that the secondary contact option is promoted to the primary contact option if there isn’t already a primary option on record, for each phone type (phone, tollfree, mobile).
- implement user interface option A above
- implement user interface option B above
- create a view to present the “fat” customer contact options as a table like in the “skinny” t_skinny_customer_phone table
- create a view to present the “skinny” customer contact options as a table like in the “fat” t_fat_customer table
- enforce a business rule which says that not more than two contact options for each phone type should be stored in the database for each customer
- check for duplicate contact phone numbers and remove them when found
- put the customer contact information into a spreadsheet and email to marketing by COB
Humble Bundles
There are a couple of decent looking ebook bundles available via Humble Bundle at the moment:
Relational Theory for Computer Professionals – C.J. Date
I was watching Relational Theory for Computer Professionals – C.J. Date and he made this fun remark:
There’s only three databases in the world: there’s departments and employees, there’s courses and students, and there’s suppliers and parts.