Scientific Computing

Matlab / GNU Octave HTTP user agent

February 25, 2023

Some servers may block Matlab or GNU Octave download traffic from web operations like websave() or webread(). If a web browser on the same computer can reach the same URL, that is a symptom of the server blocking by user agent.

Get the Matlab or GNU Octave HTTP user agent with this script.

This script demonstrates setting a custom HTTP user agent using Matlab or GNU Octave factory function weboptions() to get around servers that block non-allow-listed user agents.

Python urllib.request can also set the user agent.
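A minimal Python sketch of setting a custom user agent with urllib.request. The function names, the example URL, and the "my-client/1.0" string are illustrative assumptions, not part of the scripts mentioned above.

```python
import urllib.request


def default_user_agent() -> str:
    """Return the user agent urllib sends when none is specified."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples applied to every request
    return dict(opener.addheaders)["User-agent"]


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url: str, user_agent: str, timeout: float = 10.0) -> bytes:
    """Download url using the custom user agent (needs network access)."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


if __name__ == "__main__":
    print("default:", default_user_agent())
    # hypothetical URL and agent string, for illustration only
    req = build_request("https://example.invalid/data.txt", "my-client/1.0")
    print("custom:", req.get_header("User-agent"))
```

Note that urllib stores header names with only the first letter capitalized, so the header is read back as "User-agent".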

CMake HTTP user agent

February 24, 2023

Programs using HTTP typically report a user agent to avoid being blocked. User agent metadata helps the server collect statistics about its clients.

CMake’s HTTP user agent is like

curl/<curl version>

as seen with CMake script.

Some servers may block CMake download traffic such as file(DOWNLOAD …). If a web browser on the same computer can reach the same URL, that is a symptom of the server blocking by user agent. This script demonstrates setting a custom HTTP user agent to get around servers that block non-allow-listed user agents.

PulseAudio on macOS

February 23, 2023

PulseAudio is available via Homebrew.

brew install pulseaudio

Start PulseAudio by:

brew services start pulseaudio

Check that PulseAudio is running by:

pactl list sinks

The proper “sink” may need to be selected to hear sound. Inspect the device list for an entry such as “Macbook Speakers” and set the default audio output device like:

pactl set-default-sink 1

if “Sink #1” is the Macbook Speakers and so on.
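The sink lookup can also be scripted. The sketch below is a rough Python illustration that assumes "pactl list sinks" prints a "Sink #N" line followed by an indented "Description:" line for each device. The SAMPLE text is a hypothetical, abbreviated listing, not captured output, and the function names are made up for this example.

```python
import re
import subprocess
from typing import Optional

# Hypothetical, abbreviated listing in the assumed "pactl list sinks" layout
SAMPLE = """Sink #0
\tState: SUSPENDED
\tDescription: External Headphones
Sink #1
\tState: RUNNING
\tDescription: Macbook Speakers
"""


def parse_sinks(pactl_output: str) -> dict[int, str]:
    """Map each sink index to its description text."""
    sinks: dict[int, str] = {}
    current = None
    for line in pactl_output.splitlines():
        header = re.match(r"Sink #(\d+)", line)
        if header:
            current = int(header.group(1))
            continue
        desc = re.match(r"\s+Description:\s*(.*)", line)
        if desc and current is not None:
            sinks[current] = desc.group(1).strip()
    return sinks


def find_sink(sinks: dict[int, str], wanted: str) -> Optional[int]:
    """Return the first sink index whose description contains wanted."""
    for index, description in sinks.items():
        if wanted.lower() in description.lower():
            return index
    return None


def set_default_sink(wanted: str) -> None:
    """Query pactl and select the matching sink (requires PulseAudio running)."""
    listing = subprocess.run(["pactl", "list", "sinks"], check=True,
                             capture_output=True, text=True).stdout
    index = find_sink(parse_sinks(listing), wanted)
    if index is None:
        raise SystemExit(f"no sink matching {wanted!r}")
    subprocess.run(["pactl", "set-default-sink", str(index)], check=True)


if __name__ == "__main__":
    print(find_sink(parse_sinks(SAMPLE), "macbook speakers"))
```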

Related: macOS X11

Python pathlib iterdir subdirectories

February 22, 2023

Python pathlib.Path.iterdir only iterates over a single directory level. It does not recurse directories.

A common task is to iterate over each subdirectory of a top-level directory non-recursively. Given directory structure:

a/b
z
y/1/2

The Python iterdir example would return (in unspecified order):

a
z
y

Notice that the C++ and Python examples use iterators that emit one element at a time rather than building up an entire list at once. Since directory listings are generally unordered, there is no advantage to retrieving every directory name greedily rather than with the lazy generators shown, particularly on networked file systems.
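A minimal Python sketch of the lazy, non-recursive iteration described above. The helper name iter_subdirs and the temporary directory tree built in demo() are illustrative assumptions.

```python
import tempfile
from pathlib import Path
from typing import Iterator


def iter_subdirs(top: Path) -> Iterator[Path]:
    """Lazily yield each immediate subdirectory of top, without recursing."""
    for entry in top.iterdir():
        if entry.is_dir():
            yield entry


def demo() -> list[str]:
    """Build the a/b, z, y/1/2 tree in a temporary directory and list its top level."""
    with tempfile.TemporaryDirectory() as tmp:
        top = Path(tmp)
        for sub in ("a/b", "z", "y/1/2"):
            (top / sub).mkdir(parents=True)
        # Sorted only for a stable printout; iterdir order is unspecified
        return sorted(p.name for p in iter_subdirs(top))


if __name__ == "__main__":
    print(demo())
```

Because iter_subdirs is a generator, a caller can stop early without the remaining directory entries being examined.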

C11 Annex K safe functions

February 21, 2023

The C11 standard defines optional bounds-checking functions with an “_s” suffix in their names in Annex K. There are numerous reasons why these functions aren’t implemented in popular compilers and standard libraries other than MSVC. The most salient points are in the field experience note, which observes that static analysis, dynamic analysis, address sanitization, etc. provide benefits that are largely beyond what the secure functions could provide, without the end-user runtime penalty.

For totally new projects, one could consider programming languages that have inherently more secure memory access, such as Rust. A less dramatic change is to use C++ for string-heavy portions of the project, where the string class can be easier to use than C char arrays.

Git maintainer feature branch workflow

February 20, 2023

Major Git projects commonly have a workflow where other contributors fork a primary Git repository, make changes, and then contribute back to the primary project. This article describes the maintainer workflow. The contributor workflow is in a separate article.

This workflow is also suitable for projects using Git submodules, where you may want a submodule to temporarily use a branch from another repository.

Maintainers of a primary Git repo can make a local copy of the forked Git branch from the contributor’s Git repo to ensure the changes work as desired. Two ways for the maintainer to make this local copy are described in this article.

add remote upstream

This workflow is suited to accommodate regular contributors to a project, for example colleagues or employees. In this example, we assume the primary project branch to merge new contributions into is “main” and that the remote contributor branch is “add-feat1”.

git remote add coworker-42 https://github.invalid/coworker-42/forked_project.git

git fetch coworker-42

git switch -c add-feat1 coworker-42/add-feat1

Ensure that things work as desired. To merge the changes, switch to the primary branch and merge:

git switch main

git merge --no-ff add-feat1

Check the Git history to verify the desired commits.

git log

Push to the primary project as desired.

temporary local branch

This workflow is suitable for occasional contributors. It avoids cluttering .git/config in the local repo with metadata for many remote repos.

On the local copy of the primary project, create a temporary branch in which to put the contributor’s remote branch. Here we assume the remote branch is “add-feat1”:

git switch -c contrib-add-feat1

git pull https://github.invalid/contrib/forked_project.git add-feat1

After testing the new code to see that it’s suitable, merge the changes into the primary project:

git switch main

git merge --no-ff contrib-add-feat1

Check the Git history to verify the desired commits.

git log

Push to the primary project as desired. Finally, delete the temporary local branch:

git branch -d contrib-add-feat1

CMake check TLS capabilities

February 19, 2023

CMake uses vendored libraries including curl and nghttp2 to provide internet connectivity. As web technologies come and go, and given the ubiquity of HTTPS today, some failures may occur across the various library versions, CMake versions, and system configurations. To help catch these failures in new releases of CMake, and to check connectivity on a particular platform and CMake version, try the script check_https.cmake.

Git pull HTTPS, push SSH

February 18, 2023

Public Git repos, GitHub Gists, and GitLab Snippets can use HTTPS for “git fetch”, “git pull”, and other Git download operations. Downloading over HTTPS and verifying the authors’ PGP-signed Git commits can help assert that the content is from the intended authors. Git download operations over HTTPS are perhaps twice as fast as Git over SSH and use less CPU. By default, Git verifies SSL certificates.

Git hosting providers such as GitHub offer SSH key authentication, which many developers prefer for uploads for enhanced security. Since “git push” operations typically take longer than “git pull”, particularly where pre-commit hooks and PGP commit signing are used, the SSH speed penalty on “git push” is often acceptable.

For developers there are speed benefits from a hybrid Git configuration where Git downloads use HTTPS and Git uploads use SSH. Git has intrinsic functionality for this setup in a global configuration. The one-time setup below uses “https://” for the remote repo URL instead of “ssh://”. Replace “your_username” with your GitHub username for SSH GitHub Gists Git push.

git config --global url."ssh://git@github.com/".pushInsteadOf https://github.com/

git config --global url."ssh://git@gitlab.com/".pushInsteadOf https://gitlab.com/

git config --global url."ssh://git@gist.github.com/".pushInsteadOf https://gist.github.com/your_username/

Note that GitLab Snippets doesn’t require an extra configuration setting like GitHub Gists does.

The file ~/.ssh/config is typically set for Git SSH keys.

This makes all GitHub and GitLab public repos push over SSH, unless overridden in a repo’s own Git config. Confirm with “git remote -v” in a repo.
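To illustrate how the pushInsteadOf rules act on a remote URL, here is a small Python sketch that mimics the prefix substitution, using the longest matching prefix when several rules match. It is a simplified model rather than Git's implementation. The function name, the RULES table, the example repo URLs, and the "git@" SSH user are assumptions for illustration.

```python
def rewrite_push_url(url: str, rules: dict[str, str]) -> str:
    """Return the push URL after applying pushInsteadOf-style prefix rules.

    rules maps an original URL prefix to the base that replaces it.
    When several prefixes match, the longest one is used.
    """
    best = ""
    for prefix in rules:
        if url.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    if not best:
        return url  # no rule matched, so the URL is pushed to as-is
    return rules[best] + url[len(best):]


# Assumed rule table mirroring the three git config commands
RULES = {
    "https://github.com/": "ssh://git@github.com/",
    "https://gitlab.com/": "ssh://git@gitlab.com/",
    "https://gist.github.com/your_username/": "ssh://git@gist.github.com/",
}

if __name__ == "__main__":
    # hypothetical repo URLs
    for url in ("https://github.com/someone/project.git",
                "https://gist.github.com/your_username/abc123.git"):
        print(url, "->", rewrite_push_url(url, RULES))
```

The Gist rule strips the username from the path, which is why it needs its own entry.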

Troubleshooting

If experiencing problems on “git push”, check that the remote URL matches the desired Git repo:

git config --get remote.origin.url

In particular, it must NOT have a trailing slash like repo/

If SSH port 22 is blocked by the network, try using Git SSH on port 443 instead.

Git contributor feature branch workflow

February 17, 2023

Major Git projects commonly have a workflow where contributors fork a primary Git repository, make changes, and then contribute back to the primary project. This article describes the contributor workflow. The maintainer workflow is in a separate article.

The contributor forks the primary project Git repo. On the local copy of the fork, create a feature branch:

git switch -c add-feature1

Once the new work is complete, make the branch consistent with the primary repo by pulling in the latest changes from the primary repo:

git switch main
# whatever the primary or development branch of the primary repo is

git remote add upstream https://github.invalid/primary/repo_url

git fetch upstream

git rebase upstream/main

Update the local feature branch to match the updated main branch:

git switch add-feature1

git rebase main

Test the final updated code. Create the Pull Request / Merge Request for the maintainer to evaluate the changes and possibly put them into the primary project.

MPI vader unlink error

February 16, 2023

OpenMPI 4.x uses temporary files for UNIX sockets, which are limited to a path length of 100 characters. This can become an issue on macOS, which by default sets the POSIX environment variable TMPDIR to a pseudorandom path under “/var” that is about 50 characters long. We have experienced this as a race condition that becomes more likely when the number of MPI workers is eight or more.

A workaround is to set an alias for mpiexec or within CMake test properties etc. such that mpiexec gets a sufficiently short working path so that UNIX sockets don’t fail due to excessive path length.
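One way to sketch the workaround from Python is to launch mpiexec with a shortened TMPDIR. This assumes that overriding TMPDIR is what gives mpiexec a short enough path, and that "/tmp" exists and is writable. The function names and the 100 character constant (taken from the figure quoted above) are illustrative.

```python
import os
import subprocess
from typing import Mapping

SOCKET_PATH_LIMIT = 100  # figure quoted in the text; actual platform limits vary


def remaining_budget(tmpdir: str, limit: int = SOCKET_PATH_LIMIT) -> int:
    """Characters left for the socket file name beneath tmpdir."""
    return limit - len(tmpdir)


def short_tmpdir_env(environ: Mapping[str, str], short_tmp: str = "/tmp") -> dict:
    """Copy the environment, replacing TMPDIR with a short path."""
    env = dict(environ)
    env["TMPDIR"] = short_tmp
    return env


def run_mpi(command: list, nproc: int) -> int:
    """Run command under mpiexec with the shortened TMPDIR (needs MPI installed)."""
    env = short_tmpdir_env(os.environ)
    return subprocess.run(["mpiexec", "-n", str(nproc), *command], env=env).returncode


if __name__ == "__main__":
    current = os.environ.get("TMPDIR", "/tmp")
    print("current TMPDIR budget:", remaining_budget(current))
    print("with /tmp:", remaining_budget("/tmp"))
```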

CMake example