Scientific Computing

Map coordinate conversion in Python, Matlab, Fortran

Over the past few years we have created and refined open-source map coordinate conversion programs that are independently available for Python, Matlab, and Fortran.

These programs use syntax similar to the $1000 Matlab Mapping and Aerospace Toolboxes, while being independently developed as open-source software. We use continuous integration on each package to help ensure quality results.

The functions available include

aer2ecef  aer2enu  aer2geodetic  aer2ned
ecef2aer  ecef2enu  ecef2enuv  ecef2geodetic  ecef2ned  ecef2nedv
ecef2eci  eci2ecef  eci2aer  aer2eci
enu2aer  enu2ecef   enu2geodetic
geodetic2aer  geodetic2ecef  geodetic2enu  geodetic2ned
ned2aer  ned2ecef   ned2geodetic
azel2radec radec2azel
vreckon vdist
lookAtSpheroid
track2
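To give a feel for what one of these conversions computes, here is a minimal standalone sketch of a geodetic to ECEF conversion on the WGS84 ellipsoid. The function name follows the list above, but the argument order, the units (degrees and meters), and the implementation are assumptions for illustration, not the code from our packages.

```python
import math

# WGS84 ellipsoid constants
WGS84_A = 6378137.0            # semi-major axis [m]
WGS84_F = 1 / 298.257223563    # flattening


def geodetic2ecef(lat_deg, lon_deg, alt_m):
    """Convert geodetic latitude, longitude [deg] and altitude [m] to ECEF x, y, z [m]."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    e2 = WGS84_F * (2 - WGS84_F)  # first eccentricity squared
    # prime vertical radius of curvature at this latitude
    n = WGS84_A / math.sqrt(1 - e2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + alt_m) * math.sin(lat)
    return x, y, z
```

A point on the equator at the prime meridian and zero altitude lands on the x axis at the semi-major axis distance, which makes a convenient sanity check.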

Extracting raw images from PDF

Instead of taking low-quality screenshots of a PDF to get its images, use Poppler to extract the original high-resolution images from the PDF. Note: only raster images can be exported with Poppler.

Examples of PDF image extraction tasks:

List all PDF images:

pdfimages -list in.pdf

Extract PDF images from all pages, writing every image in in.pdf to the current directory. Filenames start with out-. There might be a lot of images.

pdfimages -all in.pdf out

Extract PDF images from specific pages. This example extracts from page 3 only:

pdfimages -all -f 3 -l 3 in.pdf out

  • -f: first page to extract
  • -l: last page to extract
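If you prefer to drive pdfimages from Python, the commands above could be wrapped roughly like this. The helper names pdfimages_cmd and extract_images are hypothetical; only the pdfimages flags shown above (-all, -f, -l) are used.

```python
import shutil
import subprocess


def pdfimages_cmd(pdf, prefix, first=None, last=None):
    """Build the pdfimages argument list for extracting images in their native format."""
    cmd = ["pdfimages", "-all"]
    if first is not None:
        cmd += ["-f", str(first)]  # first page to extract
    if last is not None:
        cmd += ["-l", str(last)]   # last page to extract
    return cmd + [str(pdf), str(prefix)]


def extract_images(pdf, prefix, first=None, last=None):
    """Run pdfimages, failing early if Poppler is not installed."""
    if shutil.which("pdfimages") is None:
        raise FileNotFoundError("pdfimages not found; install Poppler first")
    subprocess.run(pdfimages_cmd(pdf, prefix, first, last), check=True)
```

For example, extract_images("in.pdf", "out", first=3, last=3) would run the same command as the page 3 example.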

Install Poppler:

  • Linux: apt install poppler-utils
  • macOS / Linux (Homebrew): brew install poppler
  • Windows: use WSL poppler

Related: extracting a page(s) from PDF

Joining PDF files into one

Join multiple PDF files into one:

pdfunite one.pdf two.pdf three.pdf joined.pdf

pdfunite is obtained by installing Poppler:

  • Linux: apt install poppler-utils
  • macOS: brew install poppler
  • Windows: WSL poppler

Recursively convert DOC, DOCX to PDF

We have created Python scripts in LibreOffice Utils that recursively search for files matching a glob pattern (such as *.docx) and convert or print these input documents. They run LibreOffice in headless (console) mode, so each task takes just a single Terminal command.

  • doc2pdf.py: recursively converts directories containing DOC, DOCX, RTF or other word processing files to PDF.
  • doc2print.py: recursively print documents to the default printer
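The general idea behind the conversion script can be sketched as follows. This is not the code of doc2pdf.py itself: the function names are hypothetical, and we assume the LibreOffice executable is available on PATH as soffice.

```python
from pathlib import Path
import shutil
import subprocess


def find_documents(root, pattern="*.docx"):
    """Recursively list files under root that match the glob pattern."""
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())


def convert_cmd(doc, exe="soffice"):
    """Headless LibreOffice command that writes the PDF next to the input file."""
    return [exe, "--headless", "--convert-to", "pdf",
            "--outdir", str(doc.parent), str(doc)]


def convert_tree(root, pattern="*.docx"):
    """Convert every matching document under root to PDF, one file at a time."""
    exe = shutil.which("soffice")  # assumption: LibreOffice is on PATH under this name
    if exe is None:
        raise FileNotFoundError("soffice (LibreOffice) not found on PATH")
    for doc in find_documents(root, pattern):
        subprocess.run(convert_cmd(doc, exe), check=True)
```

Calling convert_tree("~/docs") would need the path expanded first, for example with Path("~/docs").expanduser().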

xarray NetCDF LRU cache replaced autoclose

Python code that reads NetCDF4 files with xarray.open_dataset(), xarray.open_dataarray() or similar functions uses an LRU cache that automatically closes unneeded files.

As of xarray 0.11.0, the obsolete autoclose=True option should no longer be used.

Problems fixed by the LRU cache, which also gives high performance:

  • random segmentation fault while reading NetCDF4 .nc files, where the same file is reopened in the program.
  • OSError from too many open files, where even increasing ulimit doesn’t help
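To illustrate why an LRU cache avoids running out of file handles, here is a conceptual sketch in plain Python. It is not xarray's implementation; the class LRUFileCache and its methods are hypothetical. In xarray itself, the cache size can be adjusted with xarray.set_options(file_cache_maxsize=...).

```python
from collections import OrderedDict


class LRUFileCache:
    """Keep at most maxsize files open; close the least recently used one when full."""

    def __init__(self, maxsize=128, opener=open):
        self.maxsize = maxsize
        self.opener = opener
        self._files = OrderedDict()  # path -> open handle, oldest first

    def get(self, path):
        """Return an open handle for path, reusing a cached one when possible."""
        if path in self._files:
            self._files.move_to_end(path)  # mark as most recently used
            return self._files[path]
        handle = self.opener(path)
        self._files[path] = handle
        if len(self._files) > self.maxsize:
            _, oldest = self._files.popitem(last=False)  # evict least recently used
            oldest.close()
        return handle

    def close_all(self):
        """Close every cached handle."""
        while self._files:
            self._files.popitem()[1].close()
```

Because the number of open handles never exceeds maxsize, a program can loop over thousands of files without hitting the operating system limit, while recently used files stay open for fast repeated access.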

Animate imagesc movies in Matlab

The pause() Matlab / Octave statement causes figures to refresh.

%% create data, 25 frames of 512x512 pixels
data = rand(512,512,25);

%% create blank image
h = imagesc(data(:,:,1));

%% for loop to play "movie"
for i = 1:size(data,3)
  set(h, 'CData', data(:,:,i)) % update latest frame; name-value pair syntax works in both Matlab and Octave
  pause(0.01) % feel free to reduce, but keep greater than 0 to ensure redraw
end

Related: animate plots in Matplotlib

Fix VirtualBox kernel module not found

When upgrading across major versions, say from VirtualBox 5.2 to VirtualBox 6.0, you may find that none of your VMs will start, with an error including “kernel modules do not match”. This may mean that not all components of the old VirtualBox version were removed. Consider using synaptic or

apt list --installed | grep virtualbox

to see if any old VirtualBox components are installed that may be conflicting with the new VirtualBox version.

Fix

You don’t have to uninstall the new VirtualBox version.

  1. as noted above, remove any components from the old VirtualBox install
  2. reboot the computer to fully flush the old VirtualBox components
  3. install the Extension Pack for the new VirtualBox version.

Matplotlib in Windows Subsystem for Linux

To use Matplotlib on WSL, first set up Anaconda Python on WSL.

conda install matplotlib

In general, when plots only need to be saved with .savefig() and not displayed, speed up plotting by selecting the non-interactive Agg backend at the start of the Python program:

import matplotlib
matplotlib.use('agg')

print(matplotlib.get_backend())

You should see agg printed. Figures can then be saved to disk without displaying them on screen.

Brother printers on Linux

Network and USB printing with Brother printers generally work great on Linux and Windows Subsystem for Linux. The Brother installer script automatically downloads and installs necessary programs. Opt out of the last program brscan-skey, as it’s unnecessary.

Download Brother printer driver.

apt install cups

gunzip linux-brprinter-installer-*.gz

bash linux-brprinter-installer*

At the prompt “Will you specify the Device URI? [Y/n]”, answer y if the printer is networked. Then choose “Specify IP address” and, at “Enter IP address”, type the static IP address of the printer.

  • BRscan: YES allows xsane for Brother scanning printers.
  • BRsaneconfig: YES xsane scanner over the network
  • BRscan-Skey: NO don’t install because of potential security risks.

If you accidentally installed brscan-skey, uninstall it with

dpkg -r brscan-skey

As usual, manage/check printer from CUPS by pointing your PC web browser to localhost:631

Scanning documents in Linux is typically done with Xsane, which works with networked or USB-connected scanners.

apt install xsane

xsane

To update scanner IP address use brsaneconfig4

For native Windows (not using Windows Subsystem for Linux), download just the driver, which is about 15-20 MB. Add the printer by IP address, choose “have disk”, and point to the folder where you extracted the driver files.

Force GitHub Linguist language detection

GitHub Linguist is reasonably accurate at automatically detecting the percentages of each code language in Git tracked code. However, as with any automatic coding language detection scheme, mis-detected languages occur. This seems to happen most often with Matlab code. Despite being in the top 10 STEM / data science languages, Matlab / GNU Octave *.m files can be detected as other languages such as Objective-C or Limbo.

Fix

Add to each Matlab repo’s .gitattributes file:

*.m linguist-language=Matlab
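With many Matlab repos, adding this line by hand gets tedious. A small helper along these lines could add the rule only where it is missing. The function name ensure_linguist_rule is hypothetical, and the sketch assumes each repo keeps .gitattributes at its top level.

```python
from pathlib import Path

RULE = "*.m linguist-language=Matlab"


def ensure_linguist_rule(repo, rule=RULE):
    """Append rule to repo/.gitattributes unless already present.

    Returns True if the file was changed.
    """
    path = Path(repo) / ".gitattributes"
    text = path.read_text() if path.exists() else ""
    if rule in (line.strip() for line in text.splitlines()):
        return False  # rule already there, leave the file untouched
    if text and not text.endswith("\n"):
        text += "\n"  # keep the existing last line intact
    path.write_text(text + rule + "\n")
    return True
```

The change still has to be committed and pushed before GitHub picks it up.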

Verify

Simply git push your code and look at the GitHub repo language graph.

If you frequently need to auto-detect code and prefer Python, consider the Linguist Python wrapper of GitHub Linguist.