A line-by-line algorithm for precision radial velocity 

Last updated: 2024-07-09

1. Line-by-line velocity measurements, an outlier-resistant method for precision velocimetry

The line-by-line (LBL) velocity measurement is a numerical technique designed to derive velocity measurements from high-resolution (R>50 000) datasets while accounting for outliers in the spectral data. It is tailored for fiber-fed multi-order spectrographs, in both the optical and near-infrared (up to 2.5 µm) domains. The reader is invited to read the paper describing the technique in detail (Astronomical Journal, arxiv); in short, the spectral domain is split into individual units (lines, hence the name of the technique), the velocity and its associated uncertainty are measured within each line, and the per-line values are combined through a mixture model that allows for the presence of spurious values. In addition to the velocity, other quantities are also derived; the most important is a value (dW) that can be understood (for a Gaussian line) as a change in the line FWHM. These values provide useful stellar activity indicators.
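As a toy illustration of the mixture-model idea (not the actual LBL implementation), the sketch below combines per-line velocities assuming each line is either a valid measurement (Gaussian around the true velocity, with its own uncertainty) or a spurious value (broad uniform distribution). The outlier fraction and width used here are arbitrary assumptions.

```python
import numpy as np

def mixture_rv(v, sig, out_frac=0.05, out_width=1e4, n_iter=10):
    """Combine per-line velocities [m/s] into one RV, allowing outliers.

    Simplified Gaussian + uniform mixture: each line is either 'valid'
    (Gaussian around the true RV with its own uncertainty sig) or
    'spurious' (uniform over a broad velocity range out_width).
    """
    v = np.asarray(v, dtype=float)
    sig = np.asarray(sig, dtype=float)
    # start from the plain inverse-variance weighted mean
    mean = np.nansum(v / sig**2) / np.nansum(1.0 / sig**2)
    p_good = np.ones_like(v)
    for _ in range(n_iter):
        # likelihood of each line under the 'valid' component
        good = (1 - out_frac) * np.exp(-0.5 * ((v - mean) / sig)**2) \
            / (np.sqrt(2 * np.pi) * sig)
        # likelihood under the 'spurious' component (flat density)
        bad = out_frac / out_width
        p_good = good / (good + bad)
        # re-estimate the mean, down-weighting probable outliers
        w = p_good / sig**2
        mean = np.nansum(w * v) / np.nansum(w)
    err = 1.0 / np.sqrt(np.nansum(p_good / sig**2))
    return mean, err
```

With this scheme a handful of wildly discrepant lines end up with a near-zero probability of being valid and no longer bias the mean, which is the behavior the LBL relies on for telluric-contaminated lines.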

The underlying idea for the LBL comes from the work of Xavier Dumusque in the context of activity filtering for solar-type stars (ref) and his suggestion, during a coffee break, to give it a try on SPIRou M dwarf data that was affected by residual telluric absorption.

A number of other quantities are also provided to the user, and not described in the peer-reviewed publication. These are provided ‘as is’ and feedback regarding their usefulness will be very much appreciated.

The LBL technique has been used in the following studies (list not exhaustive).

As always for data reduction codes, the LBL team has a number of ideas for upgrades in future releases, but the current version of the code is stable and provides science-ready data. The code currently works on data from SPIRou, NIRPS, HARPS, CARMENES (optical), and ESPRESSO. Adding new instruments requires a relatively modest effort from the team, as long as the data is in a format similar to that of the above instruments, with spectral orders and a means (either an extension or header keywords) to retrieve a per-frame wavelength solution and the line-of-sight barycentric correction to be applied (most likely a FITS keyword). Note that the LBL algorithm is designed to retrieve velocity measurements for an entire time series, not a single file. Contact the authors if you would like to implement the LBL code for your instrument.

The output of the LBL code is an rdb table that can be uploaded to the online DACE pRV analysis tool.

2. Installation

Step 1: Download the github repository

git clone

Step 2: Install python 3.9 using conda

Using conda, create a new environment and activate it

conda create --name lbl-env python=3.9

conda activate lbl-env

Step 3: Install LBL

From within the cloned repository, run:

cd {LBL_ROOT}

pip install -U -e .

Note: one can also use venv (instead of conda).

Note {LBL_ROOT} is the path to the cloned github repository (i.e. /path/to/lbl)

Developers can refer to the developer documentation for more in-depth instructions.

3. How to use LBL

LBL can be used from a python script we refer to as a "wrap" file (alternatively, lbl can be imported as a standard python module).
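For orientation, a wrap file is typically a short script that fills a parameter dictionary and hands it to the LBL entry point. The sketch below is indicative only: the instrument name, paths, object names, and dictionary keys shown are assumptions, and the exact keys and entry point should be taken from the demo wrap files for your instrument.

```python
# Minimal wrap-file sketch (illustrative; check the demo wrap files
# for the exact keys and entry point for your instrument)
from lbl import lbl_wrap

rparams = dict()
# Instrument and data location (placeholder values)
rparams['INSTRUMENT'] = 'SPIROU'
rparams['DATA_DIR'] = '/path/to/data'
# Science target and template object names (placeholders)
rparams['OBJECT_SCIENCE'] = ['GL699']
rparams['OBJECT_TEMPLATE'] = ['GL699']
# Which steps of the recipe to run
rparams['RUN_LBL_TEMPLATE'] = True
rparams['RUN_LBL_MASK'] = True
rparams['RUN_LBL_COMPUTE'] = True
rparams['RUN_LBL_COMPILE'] = True

if __name__ == '__main__':
    lbl_wrap(rparams)
```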

Once you have an example or if you wish to start from a fresh wrap file, follow the instructions here.

Demos for instruments can be found on the demos page

4. Input/Output data format

5. DTemp

Paper submitted (Artigau et al. 2024)

Section coming soon.

6. Support for additional instruments

We are open to collaborations to add support for new instruments to the LBL code.

Terms of our support:

7. More details on LBL

As the LBL name implies, ‘lines’ are key in the LBL algorithm. These lines are propagated through the code under different names. Here are the main steps within the LBL code in handling lines.

After creating (or inputting) a template, we find the local maxima and minima (see Figure 2 in Artigau et al. 2022). The locations of these maxima and minima are reported in the _pos and _neg (positive and negative) masks in the masks/ folder. There is also a _full file that reports both positive and negative features; it is simply the combination of the two other tables.
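The extrema-finding step can be sketched as a simple neighbour comparison. The actual mask-making code is more elaborate; the function below is illustrative only.

```python
import numpy as np

def find_extrema(wave, flux):
    """Find local maxima and minima of a template spectrum.

    A point is a local maximum (minimum) if it is strictly above
    (below) both of its immediate neighbours; endpoints are excluded.
    Returns the wavelengths of the maxima and minima; their union
    corresponds to what the _full mask contains.
    """
    f = np.asarray(flux, dtype=float)
    w = np.asarray(wave, dtype=float)
    is_max = (f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])
    is_min = (f[1:-1] < f[:-2]) & (f[1:-1] < f[2:])
    inner = w[1:-1]
    return inner[is_max], inner[is_min]
```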

Masks are used primarily to cut the spectral domain into individual regions over which velocities are computed. A secondary use is to construct a CCF (cross-correlation function) on the first iteration of the code to get a rough estimate of the systemic velocity of the star. The LBL algorithm only converges if the velocity of the target is known to within ~1 full width at half maximum of the line profile.
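The first-iteration CCF can be illustrated with a toy sketch (not the LBL implementation): shift the mask line positions by each trial velocity, sum the spectrum flux interpolated at the shifted positions, and take the velocity that minimizes that sum (absorption lines make the CCF a dip) as the rough systemic velocity.

```python
import numpy as np

C_KMS = 299792.458  # speed of light [km/s]

def ccf_from_mask(wave, flux, mask_wave, velocities):
    """Toy CCF: sum the flux at Doppler-shifted mask line positions.

    For each trial velocity [km/s], shift the mask wavelengths and
    interpolate the spectrum there; the CCF minimum falls at the
    velocity where mask lines coincide with absorption features.
    """
    ccf = np.empty(len(velocities))
    for i, v in enumerate(velocities):
        shifted = mask_wave * (1.0 + v / C_KMS)
        ccf[i] = np.sum(np.interp(shifted, wave, flux))
    return ccf
```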

Within the LBL code, if one passes the _full mask, the computation of the LBL velocities will be done between lines corresponding to the local maxima (equivalent to using the _pos mask), but the CCF will be computed using all spectroscopic features (both positive and negative), which works better in very low signal-to-noise datasets. For a bright target (SNR>20), using the _full or _pos mask will not change the significance of the CCF, but for targets with SNR=3-5, this could lead to a better convergence of the velocity on the first iteration.

The _pos, _neg and _full masks are reported at a zero systemic velocity and are independent of the instrument used. One could use a SPIRou _full mask for a NIRPS dataset without issue.

When analyzing a sequence, it is important that the line list always remains the same, such that the behavior of an individual line can be traced through epochs without ambiguity. When analyzing a target for the first time with a given mask, one creates a file within the lblreftable folder, which is basically the _pos line limits (all positive features, even if a _full mask was passed along) projected onto the wavelength grid of the dataset. Many lines will therefore be seen twice in consecutive orders, and the LBL code will handle them as different lines afterward. A line may fall on a cluster of bad pixels within one order and on a clean area in the following one, so such lines are better handled individually even if they trace the same wavelength in the stellar spectrum.
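The projection of mask lines onto per-order wavelength grids can be sketched as follows (illustrative only; function and variable names are assumptions). Each order keeps its own copy of every line that falls inside its wavelength range, so a line in the overlap of two consecutive orders appears twice and is tracked independently thereafter.

```python
import numpy as np

def project_lines_onto_orders(line_starts, order_wave):
    """Build a per-order line list (the idea behind the lblreftable).

    line_starts : wavelengths of the mask lines
    order_wave  : list of per-order wavelength grids
    Returns (order_number, line_wavelength) pairs; lines falling in
    the overlap between orders are duplicated on purpose.
    """
    table = []
    for order_num, wgrid in enumerate(order_wave):
        wmin, wmax = wgrid.min(), wgrid.max()
        for w in line_starts:
            if wmin <= w <= wmax:
                table.append((order_num, w))
    return table
```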

When looping through individual files for a given target, we start with the corresponding lblreftable and add a number of values derived from the spectrum into tables in the lblrv/ folder. Some lines at the edges of orders can come and go because of BERV changes between spectra. We always keep the same line lists, and lines that are missing are simply flagged as NaN. Furthermore, in most instruments and over parts of the domain, the blaze throughput is very low and parts of orders are useless, even though the wavelength grid is defined. The lines falling there are also seen in other orders (close to the blaze peak); they are kept in the lblrv and lblreftable files, but are set to NaN. One therefore expects to see a good fraction of lines set to NaN for bluer orders. The lblrv tables preserve the input header for future reference and provide a number of values that can be used for statistical analysis (see Ould-Elkhim 2023, in press). By construction, all files for a given combination of target+template will have the same number of lines in their lblrv table. The wavelengths reported in these tables are those at the start of the 'line' (the maximum preceding the feature), expressed at zero systemic velocity and without the BERV applied.

The number of lines used for an individual spectrum is the number of non-NaN lines in the corresponding lblrv file, with the understanding that the 'lines' are lines-per-order, with some lines being counted twice. The relative weight of each line is 1/svrad².
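Leaving the outlier handling aside, the combination described above reduces to an inverse-variance weighted mean over the non-NaN lines; a minimal sketch:

```python
import numpy as np

def combine_lines(v, svrad):
    """Inverse-variance combination of per-line velocities.

    Lines flagged as NaN (missing, or in low-blaze regions) are
    ignored; each remaining line gets a weight of 1/svrad**2.
    Returns the combined RV, its uncertainty, and the line count.
    """
    v = np.asarray(v, dtype=float)
    svrad = np.asarray(svrad, dtype=float)
    keep = np.isfinite(v) & np.isfinite(svrad)
    w = 1.0 / svrad[keep] ** 2
    rv = np.sum(w * v[keep]) / np.sum(w)
    err = 1.0 / np.sqrt(np.sum(w))
    return rv, err, int(keep.sum())
```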


GitHub is the best place to report problems:

Lead author:  

Code base:

Core collaborator: