Scientific Computing

VS Code Copilot parent repo

When only a subdirectory of a Git repository is opened in Visual Studio Code, repo-root Copilot customizations like .github/copilot-instructions.md are not discovered by default. This can make Copilot ignore repository-wide instructions even though they exist at the top of the current Git repository.

Visual Studio Code has a built-in setting that resolves this issue by enabling parent repository discovery for chat customizations.

{
  "chat.useCustomizationsInParentRepositories": true
}

With this setting enabled, VS Code walks upward from the opened workspace folder until it finds .git. It then discovers chat customizations between the opened folder and the repository root, including:

  • .github/copilot-instructions.md
  • .github/instructions/*.instructions.md
  • prompt files
  • agent files such as AGENTS.md
  • hooks and other chat customizations

This setting is especially useful for monorepos and for workflows that open a focused subdirectory such as content/posts/, src/, or packages/frontend/ instead of the full repository root. Without parent repository discovery, Copilot can miss repository-specific style and validation rules.

A few conditions apply:

  • the opened folder must not itself be a separate Git repository (e.g. a Git submodule)
  • a parent folder must contain .git
  • the parent repository folder must be trusted in VS Code
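
The discovery walk and the first two conditions above can be sketched in Python. This is only an illustration of the described behavior, not VS Code's implementation: the function names find_parent_repo_root and candidate_customizations are hypothetical, only the file locations named in the list above are checked, and the workspace trust condition is not modeled.

```python
from __future__ import annotations

from pathlib import Path


def find_parent_repo_root(opened: Path) -> Path | None:
    """Return the nearest ancestor of `opened` that contains .git.

    Returns None when `opened` itself contains .git (it is its own
    repository) or when no ancestor contains .git.
    """
    opened = Path(opened).resolve()
    if (opened / ".git").exists():  # .git may be a directory or a file
        return None
    for parent in opened.parents:
        if (parent / ".git").exists():
            return parent
    return None


def candidate_customizations(opened: Path) -> list[Path]:
    """List customization files from `opened` up to the repository root."""
    root = find_parent_repo_root(opened)
    if root is None:
        return []
    found: list[Path] = []
    folder = Path(opened).resolve()
    while True:
        github = folder / ".github"
        single = [github / "copilot-instructions.md", folder / "AGENTS.md"]
        found.extend(p for p in single if p.is_file())
        found.extend(sorted((github / "instructions").glob("*.instructions.md")))
        if folder == root:
            break
        folder = folder.parent
    return found
```

Calling candidate_customizations(Path("packages/frontend")) from inside a repository would list the files that parent discovery is expected to pick up under these assumptions.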

To verify that the repository instructions are in use, inspect the References list on a Copilot Chat response. If parent discovery is working, the response references typically include the repo-root customization files.

Purging computer temp folder

A Linux computer temp folder can be purged on a schedule to free up disk space and remove old temporary files. The programs “tmpwatch” or “tmpreaper” can do this. tmpwatch is available on Red Hat-based Linux distributions, while tmpreaper is available on Debian-based Linux distributions.

To do a “dry run” of the purge command to see what files would be deleted, use the “--test” flag:

<tmpwatch|tmpreaper> --test --mtime 7d /tmp

Set the temp path explicitly, especially on HPC systems where scratch space may be under system-specific paths.

  • --mtime 7d: purge files older than 7 days; adjust as desired
  • /tmp: path to the temp folder; adjust as needed
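
The selection rule behind the dry run can be previewed with a short Python sketch. It is not a replacement for tmpwatch or tmpreaper and does not reproduce their safety checks; the function name stale_files is hypothetical, and the sketch only lists regular directory entries whose modification time is older than the cutoff, without deleting anything.

```python
import os
import time
from pathlib import Path


def stale_files(root, max_age_days, now=None):
    """Return files under `root` whose mtime is older than `max_age_days`."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    # os.walk does not follow symbolic links to directories by default
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # lstat: examine a symlink itself, not its target
                if path.lstat().st_mtime < cutoff:
                    stale.append(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(stale)


if __name__ == "__main__":
    for path in stale_files("/tmp", 7):
        print(path)
```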

Linux cron example

On Linux, a cron job can run the purge command on schedule.

Edit the crontab with:

crontab -e

Add a line to run the purge command daily at midnight:

0 0 * * * /usr/bin/tmpwatch --mtime 7d /tmp

Homebrew CMake-GUI install

Homebrew Cask packages GUI (graphical) programs. Many users install only the CMake CLI tools with:

brew install cmake

This does not install the cmake-gui program.

To install CMake-GUI:

brew install --cask cmake

To use cmake-gui from terminal, add this to ~/.zprofile:

export PATH=$PATH:/Applications/CMake.app/Contents/bin

Confirm the /Applications path from the cmake-gui line under Artifacts:

brew info --cask cmake

When launching from terminal, specify -S . and -B build to prefill source and build directories:

cmake-gui -S . -B build

Configure defaults for Bash, Zsh, or PowerShell

The default shell for operating systems is typically:

  • Linux: Bash
  • macOS: Zsh
  • Windows: PowerShell

Each shell has configuration files that change its default parameters. Here are some useful examples:

Remove duplicate entries in shell history

To keep duplicate entries out of the shell history, so that pressing “up” after repeated commands recalls the last distinct command, configure the respective shell as follows.

Bash: “~/.bashrc”: ignore duplicate lines and omit lines that start with a space.

export HISTCONTROL=ignoredups:ignorespace

Zsh: “~/.zshrc”

setopt hist_ignore_dups
setopt hist_ignore_space

PowerShell: “$profile”: set the history to ignore duplicates.

Set-PSReadlineOption -HistoryNoDuplicates
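
The effect of the Bash and Zsh settings above can be modeled in a few lines of Python. This is a sketch of the documented behavior (a line equal to the previous saved entry is not saved, and a line starting with a space is not saved), not the shells' own code; the function name filter_history is hypothetical. PowerShell's option is not modeled here.

```python
def filter_history(commands):
    """Apply ignoredups and ignorespace rules to a sequence of command lines."""
    saved = []
    for line in commands:
        if line.startswith(" "):
            continue  # ignorespace: lines with a leading space are not saved
        if saved and saved[-1] == line:
            continue  # ignoredups: same as the previous saved entry
        saved.append(line)
    return saved


print(filter_history(["ls", "ls", " secret", "ls", "make", "ls"]))
# prints ['ls', 'make', 'ls']
```

Only consecutive duplicates are dropped: the final "ls" is kept because "make" was entered in between.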

Peak RAM usage of process and its children

Measuring the peak RAM usage of a process and all its children can be done using various tools and techniques. OS-dependent tools may be the most accurate, but they can be complex to use. A simpler approach is to periodically sample the RAM usage of the process and its children, for example with scripts for Linux and macOS that use ps.
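
A minimal Python sketch of the sampling approach follows. It is an illustration under stated assumptions, not the scripts mentioned above: the names parse_ps, tree_rss_kib, and peak_tree_rss_kib are hypothetical, the resident set size reported by ps is taken to be in KiB, and the 0.1 second interval is arbitrary. Sampling can miss short spikes, and summing RSS counts pages shared between processes more than once.

```python
import subprocess
import sys
import time

# Separate -o options with empty headers suppress the header line.
PS_CMD = ["ps", "-A", "-o", "pid=", "-o", "ppid=", "-o", "rss="]


def parse_ps(text):
    """Parse 'pid ppid rss' lines into {pid: (ppid, rss_kib)}."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        try:
            pid, ppid, rss = (int(p) for p in parts)
        except ValueError:
            continue
        table[pid] = (ppid, rss)
    return table


def tree_rss_kib(table, root_pid):
    """Sum the RSS of root_pid and all of its descendants."""
    children = {}
    for pid, (ppid, _rss) in table.items():
        children.setdefault(ppid, []).append(pid)
    total, stack, seen = 0, [root_pid], set()
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in table:
            continue
        seen.add(pid)
        total += table[pid][1]
        stack.extend(children.get(pid, []))
    return total


def peak_tree_rss_kib(argv, interval=0.1):
    """Run argv and return the largest sampled RSS total of its process tree."""
    proc = subprocess.Popen(argv)
    peak = 0
    while proc.poll() is None:
        out = subprocess.run(PS_CMD, capture_output=True, text=True, check=True).stdout
        peak = max(peak, tree_rss_kib(parse_ps(out), proc.pid))
        time.sleep(interval)
    return peak


if __name__ == "__main__" and len(sys.argv) > 1:
    print(peak_tree_rss_kib(sys.argv[1:]), "KiB")
```

Saved under a hypothetical name such as peak_rss.py, it would be run as: python peak_rss.py mpiexec -n 4 ./app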

On macOS or Linux it is also possible to use /usr/bin/time, though this is less accurate: it only measures the peak RAM usage of the largest child process, not the total of all children, so it is unsuitable for multiprocess applications like “mpiexec”.

For Linux, a more accurate method is to use cgroup v2, as implemented by cgmemtime. For macOS, the Instruments tool can be used to measure the RAM usage of a process and its children, but it requires a code-signed application and is more complex to set up.

xcrun xctrace record --template "Game Memory" --output bench_game.trace --time-limit 30s --launch -- /path/to/application

open bench_game.trace

CMake trace portion of script

CMake cmake_language(TRACE) enables tracing of selected, nestable portions of a CMake script. Full trace output is large because CMake’s platform independence means that numerous checks are performed even for minimal CMake scripts, which makes it difficult to find the relevant portion when debugging. The cmake_language(TRACE) command narrows trace output to a chosen region of the script, including nested trace regions, instead of emitting the trace for the entire script.

To trace only part of a script, wrap that region with cmake_language(TRACE) as in this CMakeLists.txt example:

cmake_minimum_required(VERSION 4.2)

project(demo LANGUAGES C)

cmake_language(TRACE ON)
find_package(ZLIB)
cmake_language(TRACE OFF)

find_package(LAPACK)

Observe that only the trace output for the find_package(ZLIB) command is emitted, while the find_package(LAPACK) command and compiler discovery are not traced.

Ripgrep is a fast text file search tool

grep is a ubiquitous command-line tool for searching plain-text data sets for lines that match a regular expression. Ripgrep is a modern alternative to grep that is designed to be faster and more efficient at recursive text file searches common to code developers. Ripgrep is used internally by VS Code for its search functionality.

Install ripgrep with:

  • Windows: winget install BurntSushi.ripgrep.GNU
  • macOS: brew install ripgrep
  • Linux: apt install ripgrep or similar

The rg command is used to invoke ripgrep, and it supports a wide range of options for customizing search behavior.

Examples:

Search for the term “TODO” in all .cpp files in the current directory and its subdirectories:

rg "TODO" --glob "*.cpp"

Case-insensitive search:

rg -i todo

Show the line number of each match:

rg -n TODO

Search only certain file types:

rg TODO -g "*.cpp" -g "*.hpp"

Include hidden files and files excluded by ignore rules (such as .gitignore):

rg TODO --hidden --no-ignore

Literal string search (not regex):

rg -F "a+b*c"
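
Why literal mode matters for this pattern can be checked with Python's re module, used here only as a stand-in. Ripgrep uses a different regex engine, but + and * are quantifiers in both, so for this simple pattern the behavior shown is assumed to carry over.

```python
import re

pattern = "a+b*c"
text = "x = a+b*c"

# As a regex: one or more "a", zero or more "b", then "c".
# The literal text "a+b*c" does not fit that shape.
print(re.search(pattern, text) is None)        # prints True
print(re.search(pattern, "aabc") is not None)  # prints True

# Escaping the metacharacters gives a literal search, like rg -F.
print(re.search(re.escape(pattern), text) is not None)  # prints True
```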

List only matching file names:

rg -l TODO

Show files that do not match (in ripgrep, unlike grep, -L means --follow for symbolic links):

rg --files-without-match TODO

Valgrind memory leak detection on macOS

Valgrind is a dynamic analysis tool that can detect memory leaks and other problems in programs including C, C++, and Fortran. Valgrind is available on Linux and BSD for x86 and ARM CPU architectures.

For macOS on Intel x86 CPUs, Valgrind is available in Homebrew.

For macOS on Apple Silicon ARM64 CPUs, Valgrind can be used from an aarch64 Linux virtual machine in native mode for best performance. There is a development fork of Valgrind for Apple Silicon, but it may not yet be ready for production use.

Matlab on macOS .matlab7rc.sh shell config

On macOS, when started from the Applications icon, Matlab does NOT source shell configuration files. Any environment variables set in these files such as those added by package managers like Homebrew or Conda will not be available in the Matlab environment.

On macOS, Matlab sources the shell configuration file only if it is started from the Terminal, like /Applications/MatlabR20*.app/bin/matlab. In that case, the environment variables expected by default in an interactive login shell are also available in Matlab.

Instead of depending on how the user launched Matlab, a more uniform approach is to have users set environment variables in the Matlab startup.m script, which runs regardless of how Matlab is launched. Open the startup.m script for editing with:

edit(fullfile(userpath,'startup.m'))

Add lines like

setenv('PATH', [getenv('HOMEBREW_PREFIX') '/bin/' pathsep getenv('PATH')])

We recommend not running system() scripts in startup.m unless truly necessary to avoid delaying Matlab startup time or affecting startup stability.

Then programs installed by Homebrew like CMake, GCC, etc. will be on the PATH environment variable in Matlab.

Note that the Matlab commands below only affect commands within that same command line:

!source ~/.zshrc && ls
system("source ~/.zshrc && ls")

GitHub Issues search issues by username

In larger projects, one might not remember if they or a colleague has opened or commented on a prior Issue topic. To search the issues by a username, use this search syntax in the GitHub Issues search bar:

  • Issue created by a user:
is:issue author:username
  • Issue commented on by a user:
is:issue commenter:username

Note that the search term is not user:.

This maps into the GitHub API as a query parameter:

?q=is:issue+author:username

For example at the Bootstrap project for one’s self “@me”:

https://github.com/twbs/bootstrap/issues?q=is%3Aissue+author%3A%40me
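
The percent-encoding in such a URL can be generated rather than typed by hand. A minimal sketch with Python's standard urllib.parse follows; the helper name issues_search_url is hypothetical.

```python
from urllib.parse import urlencode


def issues_search_url(owner, repo, query):
    """Build a GitHub Issues search URL with the query percent-encoded."""
    return f"https://github.com/{owner}/{repo}/issues?" + urlencode({"q": query})


print(issues_search_url("twbs", "bootstrap", "is:issue author:@me"))
# prints https://github.com/twbs/bootstrap/issues?q=is%3Aissue+author%3A%40me
```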