Context object versus global variables

I’m reading A Philosophy of Software Design by John Ousterhout and he says:

The context object unifies the handling of all system-global information and eliminates the need for pass-through variables. If a new variable needs to be added, it can be added to the context object; no existing code is affected except for the constructor and destructor for the context. The context makes it easy to identify and manage the global state of the system, since it is all stored in one place. The context is also convenient for testing: test code can change the global configuration of the application by modifying fields in the context. It would be much more difficult to implement such changes if the system used pass-through variables.

Contexts are far from an ideal solution. The variables stored in a context have most of the disadvantages of global variables; for example, it may not be obvious why a particular variable is present, or where it is used. Without discipline, a context can turn into a huge grab-bag of data that creates nonobvious dependencies throughout the system. Contexts may also create thread-safety issues; the best way to avoid problems is for variables in a context to be immutable. Unfortunately, I haven’t found a better solution than contexts.

Okay, so I’m just gonna step way out of line over here and suggest something heretical… but shouldn’t you just use global variables? You only introduced the context object so you could tweak it in unit tests, and you could just change your tests so that each one ran in a new process. Just sayin’.

…I suppose for the sake of completeness I should add a little more from Ousterhout, from just before the passage quoted above:

Another approach is to store the information in a global variable, as in Figure 7.2(c). This avoids the need to pass the information from method to method, but global variables almost always create other problems. For example, global variables make it impossible to create two independent instances of the same system in the same process, since accesses to the global variables will conflict. It may seem unlikely that you would need multiple instances in production, but they are often useful in testing.

…so he is bending over backward to support multiple tests in one process, but he could just run each test in its own process and his problem evaporates.
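
To make that concrete, here’s a minimal sketch of what I mean (the runner and test file names are hypothetical, my illustration rather than anything from the book): keep the “system-global information” in ordinary module-level globals, and have the test harness give every test file its own process, so each test can tweak those globals without trampling any other test.

```python
#!/usr/bin/env python3
"""Hypothetical test harness: one process per test file, so tests that
mutate the application's module-level globals can't interfere with each
other, and no context object needs threading through the code."""
import subprocess
import sys

# hypothetical test files; each is free to rewrite the app's global config
TEST_FILES = [
    "tests/test_defaults.py",
    "tests/test_tweaked_config.py",
]

exit_code = 0
for path in TEST_FILES:
    # a fresh interpreter means fresh module globals for every test file
    result = subprocess.run([sys.executable, "-m", "pytest", path])
    exit_code = exit_code or result.returncode

sys.exit(exit_code)
```

If you want per-test rather than per-file isolation, plugins like pytest-forked get you roughly the same effect with fork(). You pay in suite runtime, but that’s the price of keeping plain globals.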

Learning Kdenlive: Audio Normalization

I’m slowly learning how to use my video production tools.

Yesterday I discovered the 2 pass audio normalization feature in Kdenlive.

You seem to need to apply this effect to each clip separately.

This 2 pass effect is much easier to configure than the other normalization option, which has far more settings to fiddle with.

With the 2 pass audio normalization you just nominate your desired volume, click Analyze, and you’re done.

Now that I know how to do this, the audio on my videos shouldn’t be as terrible as it has been. Getting better. Slowly.

Transferring large files with Salt file.managed

Well, this took me a few hours to figure out.

If you’re going to be transferring large files using file.managed in your Salt state, make sure you specify show_changes: False, otherwise Salt will start trying to boil the ocean and calculate the unified diff of your enormous files.

The clue that this is the problem you’re having: one of your CPU cores pegs at 100% and your state doesn’t apply in a reasonable amount of time (minutes, I guess).

Also, you probably want to make sure you’re not trying to use your large file as any sort of template. By default file.managed assumes no template, which is what you want for large files that aren’t templates. I’ve never used a large file as a template, but I suspect if you tried that you’d have a bad time.
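
For concreteness, here’s a minimal sketch of the kind of state I mean (the state ID, names, and paths are hypothetical):

```yaml
push_big_archive:
  file.managed:
    - name: /opt/data/big-archive.tar.gz        # hypothetical destination path
    - source: salt://files/big-archive.tar.gz   # hypothetical file on the master
    - show_changes: False   # skip the unified diff of this enormous file
    # no template argument here: file.managed defaults to no templating,
    # which is what you want for a big opaque file
```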

Chromium bug fixed!

For the past month or so I’ve had this intermittent problem where chromium would start crashing, and then keep crashing every time I tried to start it, until I rebooted.

Today I ran chromium under strace and straightaway discovered that it was trying to use shared memory in /dev/shm and it couldn’t because there was no space left.

There was no space left because I had PHP’s xdebug writing dump files there. I’ve just disabled xdebug while I’m not using it, and after clearing out its old files I can now run chromium again.

I added a note about this to my debugging notes in case this or something like it happens to me again.

ChatGPT explains ZFS scrub operation to me

Now I understand ZFS scrub better. Thanks, ChatGPT! I hope we can be friends, you and me and all of your kind.

ChatGPT 3.5

User:
when zpool scrub reports bytes issued, what does it mean?

ChatGPT:
When zpool scrub reports “bytes issued,” it refers to the amount of data that has been read from the storage devices during the scrub operation.

A ZFS scrub is a process that checks all data in the pool for errors and automatically repairs any detected errors. It involves reading all the data in the pool and comparing it to the checksums stored with the data. If errors are found, ZFS will attempt to correct them using redundant copies (if available).

The “bytes issued” metric indicates the total amount of data read from the pool during the scrub. It’s a measure of the workload that the scrub operation has imposed on the storage subsystem. In general, a scrub is a resource-intensive operation, and the “bytes issued” metric can give you an idea of how much I/O (Input/Output) activity is being generated by the scrub process.

Keep in mind that this metric doesn’t necessarily indicate errors or problems with the pool; it’s more of a diagnostic measure to understand the activity and workload associated with the scrubbing process. If you see a high number of bytes issued, it simply means that a significant amount of data has been read from the pool during the scrubbing operation.

User:
if bytes issued is the data read, what is the bytes scanned?

ChatGPT:
In the context of a ZFS scrub operation, the “bytes scanned” metric refers to the total amount of data that has been read and checked for errors. This includes both the data that is actively used (not just the spare or redundant copies) and any parity or checksum information associated with that data.

When ZFS performs a scrub, it reads through all the data in the pool, including primary and spare copies, and it also reads and checks the checksums associated with that data. The “bytes scanned” metric represents the total size of the dataset or pool that has been read during the scrubbing process.

To summarize:

“Bytes issued” is the amount of data read, indicating the workload imposed on the storage subsystem.

“Bytes scanned” is the total amount of data, including both the active data and associated checksums, that has been read and checked for errors during the scrub operation.

These metrics are useful for understanding the scope of the scrubbing process and can be monitored to assess the health and integrity of the ZFS pool. If errors are detected during the scan, ZFS will attempt to repair them using redundancy or other available mechanisms.

User:
why am I doubtful about your answer?

ChatGPT:
If you have doubts about the information provided, I encourage you to seek additional sources or clarify specific points you find confusing. It’s always a good practice to verify technical information, especially when it comes to system administration or storage technologies.

If there’s a particular aspect of my response that you find unclear or if you have additional questions, please feel free to ask for further clarification. I’m here to assist you and provide accurate information to the best of my ability.

User:
how is bytes scanned different to bytes issued? are they both measures of bytes read?

ChatGPT:
I apologize for any confusion in my previous responses. Let me clarify the distinction between “bytes scanned” and “bytes issued.”

In the context of ZFS scrub operations:

Bytes Scanned: This refers to the total amount of data that has been read and checked for errors during the scrub operation. It includes the actual data as well as any associated checksums, metadata, or parity information. “Bytes scanned” represents the entire content of the pool that was examined during the scrub, including both active data and any additional information needed for verification.

Bytes Issued: This metric specifically measures the amount of data issued to the storage subsystem during the scrub. It includes the actual data read from the disks, but it does not include any additional metadata, checksums, or parity information. “Bytes issued” is a subset of “bytes scanned” and represents the workload imposed on the storage devices.

In summary, both metrics involve the amount of data read during a scrub, but “bytes scanned” encompasses all the data and associated information checked during the process, while “bytes issued” focuses specifically on the workload sent to the storage subsystem in the form of data reads.

User:
why is there many gigs of bytes scanned with zero bytes issued?

ChatGPT:
The scenario where you observe many gigabytes of “bytes scanned” with zero “bytes issued” in a ZFS scrub operation can occur when ZFS is checking the integrity of the data without actively reading it from the disks. Let me break down the possible reasons for this:

Checksum Verification: ZFS uses checksums to ensure data integrity. During a scrub, ZFS may verify the checksums of the data without necessarily reading the actual data from disk if the checksums match the expected values. In this case, “bytes scanned” would include the entire dataset, but “bytes issued” would be zero because the actual data doesn’t need to be read.

Caching and Prefetching: ZFS employs caching mechanisms, and if the required data is already in the cache, it might not be necessary to issue new read requests to the storage devices during a scrub. The system can use cached data for checksum verification without incurring additional I/O operations.

Redundancy and Parity Checks: If ZFS detects errors in the metadata, checksums, or redundancy information without finding errors in the actual data blocks, it may not need to read the redundant copies or parity blocks during the scrub. This could contribute to “bytes scanned” being non-zero while “bytes issued” remains at zero.

In summary, the ZFS scrub operation involves both checking data integrity and repairing errors. If the data and associated metadata are consistent and error-free, “bytes issued” may remain zero even though ZFS is scanning the entire dataset to ensure its integrity. This behavior is part of ZFS’s design to optimize performance and minimize unnecessary I/O operations during scrubbing when possible.

User:
does nonzero bytes issue imply there were checksum errors?

ChatGPT:
Yes, a nonzero value for “bytes issued” in the context of a ZFS scrub often implies that there were checksum errors detected during the scrub operation. When ZFS encounters checksum errors, it needs to issue read requests to retrieve the actual data from disk, allowing it to attempt to repair or correct the errors.

Here’s a breakdown of the process:

Checksum Verification (Bytes Scanned): ZFS first scans the entire dataset, checking the integrity of data and associated metadata using checksums. This process contributes to the “bytes scanned” metric.

Checksum Errors Detected: If ZFS detects checksum errors during the scan, it identifies the affected blocks or sectors where the errors occurred.

Data Repair (Bytes Issued): To correct the errors, ZFS issues read requests (“bytes issued”) to retrieve the actual data from redundant copies, spare devices, or parity information. The system then uses this redundant information to repair or reconstruct the corrupted data.

So, if “bytes issued” is nonzero, it indicates that ZFS had to actively read data to correct errors detected during the scrub. This is a key aspect of ZFS’s self-healing capabilities, where it can automatically repair data corruption when redundancy or spare copies are available.
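
For what it’s worth, the place these two numbers actually appear is the scan line of zpool status. On a recent OpenZFS a scrub in progress looks roughly like this (numbers invented for illustration):

```
  scan: scrub in progress since Sun Dec 10 09:14:02 2023
        1.64T scanned at 1.21G/s, 412G issued at 298M/s, 1.75T total
        0B repaired, 22.98% done, 01:18:05 to go
```

Scanned racing ahead of issued is exactly the pattern that prompted my questions above.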