About
This Python 3 code aids in analyzing raw measurements from an Acoustic Doppler Velocimeter (ADV) that produces *.vno and *.vna files. It detects and removes spikes according to Nikora and Goring (1998) and Goring and Nikora (2002).
The code was originally developed in Matlab(R) at the Nepf Environmental Fluid Mechanics Laboratory (Massachusetts Institute of Technology).
Important
*.vno and *.vna files need to comply with the following naming convention:
XX_YY_ZZ_something.vna
where XX, YY, and ZZ are streamwise (x), perpendicular (y), and vertical (z) coordinates in CENTIMETERS, respectively. Anything added after ZZ_ is ignored by the code (it is only copied for the sake of dataset naming).
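The naming convention above can be illustrated with a minimal sketch. The helper name parse_coordinates is hypothetical and is not the implementation shipped with the code. It assumes that leading underscores may be dropped before splitting, and it converts centimeters to meters.

```python
import warnings
from pathlib import Path


def parse_coordinates(file_name):
    """Return [x, y, z] in meters from a name like XX_YY_ZZ_something.vna."""
    # Assumption: leading underscores carry no information and are dropped.
    stem = Path(file_name).stem.lstrip("_")
    parts = stem.split("_")[:3]
    coordinates = []
    for part in parts:
        try:
            coordinates.append(float(part) / 100.0)  # centimeters to meters
        except ValueError:
            warnings.warn(f"Cannot convert '{part}' to a number in {file_name}")
            coordinates.append(float("nan"))
    # Pad with NaN if the name holds fewer than three fields.
    while len(coordinates) < 3:
        coordinates.append(float("nan"))
    return coordinates


if __name__ == "__main__":
    print(parse_coordinates("10_20.5_3_run1.vna"))
```

Everything after the third field (here run1) is left untouched, in line with the convention.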
Note
This documentation is also available as a style-adapted PDF.
Requirements & Installation
Time requirement: 5-10 min.
Get Python
To get the code running, the following software is needed; installation instructions are provided below:
Python >=3.6
NumPy >=1.17.4
Openpyxl 3.0.3
Pandas >=1.3.5
Matplotlib >=3.1.2
Start with downloading and installing the latest version of Anaconda Python. Alternatively, downloading and installing a pure Python interpreter will also work. Detailed information about installing Python is available in the Anaconda Docs and at hydro-informatics.com/python-basics.
To install the NumPy, Openpyxl, Pandas, and Matplotlib libraries after installing Anaconda, open Anaconda Prompt (e.g., click on the Windows icon, type anaconda prompt, and hit enter). In Anaconda Prompt, enter the following command sequence to install the libraries in the base environment. The installation may take a while depending on your internet speed.
conda install -c anaconda numpy
conda install -c anaconda openpyxl
conda install -c conda-forge pandas
conda install -c conda-forge matplotlib
If you are struggling with the dark window and blinking cursor of Anaconda Prompt, worry not. You can also use Anaconda Navigator and install the four libraries (in the above order) in Anaconda Navigator.
Note
Alternatively, create a new conda environment to install the four libraries for this application. However, creating a new environment may eat up a lot of disk space, and installing the Python-omnipresent libraries NumPy, Openpyxl, Pandas, and Matplotlib in the base environment does not hurt.
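To verify that the four libraries are visible to the active Python environment, a short check such as the following sketch may help. The function name installed_versions is hypothetical, and the sketch relies on importlib.metadata, which is part of the standard library only from Python 3.8 onward.

```python
from importlib import metadata  # standard library in Python >= 3.8


def installed_versions(packages=("numpy", "openpyxl", "pandas", "matplotlib")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Compare the printed versions with the minimum versions listed above.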
Download tke-analyst Code
The code can be either started from Terminal (Anaconda Prompt) or within an Integrated Development Environment (IDE). With Anaconda installed, consider using Spyder (Anaconda Navigator > Spyder IDE).
Download tke-calculator.zip and unpack it to the directory where you want to run the code.
Tip
As an alternative to downloading the zip file, you may want to git clone the repository, which enables regular updating of the code (e.g., if there is an update of plot functions available). To use git, make sure that git bash is installed on your computer. Then, open git bash, cd into the directory where you want to download the code, and type:
git clone https://github.com/sschwindt/tke-calculator.git
To update at any time, cd into the directory where tke-calculator lives and type:
git pull --rebase
Usage
Regular Usage
With Python installed and the code living on your computer:
- Copy your data to a sub-folder of tke-analyst (e.g., next to the folder data/test-example that contains three exemplary *.vna files). Make sure the files are named XX_YY_ZZ_something.vna, where XX, YY, and ZZ are streamwise (x), perpendicular (y), and vertical (z) coordinates in CENTIMETERS, respectively.
- Complete the required information on the experimental setup in tke-calculator/input.xlsx (see below figure). IMPORTANT: Never modify column A or any list in the sourcetables sheet (unless you also modify load_input_defs in line 25ff of profile_analyst.py). The code uses the text provided in these areas of input.xlsx to identify setups. If useful, consider substituting the Wood wording in your mind and with a note in column C with your characteristic turbulence objects, but do not modify column A.
- Open Anaconda Prompt (or any other Python-able Terminal) and:
  - cd into the code directory (e.g., cd "C:\research\project\tke-analyst" if you unpacked tke-analyst to a folder living in the directory C:\research\project\)
  - run the code: python profile_analyst.py (uses input.xlsx)
  - ALTERNATIVELY, run with another *.xlsx input file: python profile_analyst.py "input-other-test.xlsx"
  - wait until the code has finished with -- DONE -- ALL TASKS FINISHED --

The interface of the input.xlsx workbook for entering experiment parameters and specifying a despiking method.
- After a successful run, the code will have produced the following files in ...\tke-analyst\TEST (where TEST may correspond to test-example):
  - .xlsx files of full-time series data, with spikes and despiked.
  - .xlsx files of statistic summaries (i.e., average, standard deviation std, TKE) of velocity parameters with x, y, and z positions, with spikes and despiked (see workbook example in the figure below).
  - Two plots (norm-tke-x.png and norm-tke-x-despiked.png) showing normalized TKE plotted against normalized x, with spikes and despiked, respectively (see plot example in the figure below).
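The two run commands in the steps above differ only in an optional workbook name passed as the first command-line argument. A minimal sketch of how a script could pick the workbook is given below. The function select_input_file is hypothetical and only illustrates the behavior described above; it is not taken from profile_analyst.py.

```python
import sys


def select_input_file(argv, default="input.xlsx"):
    """Return the workbook name given on the command line, else the default."""
    if len(argv) > 1:
        return argv[1]
    return default


if __name__ == "__main__":
    print(f"Using input workbook: {select_input_file(sys.argv)}")
```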


Usage Example
For example, download and unpack the code to your hard disk in a folder called C:\my-project\tke-analyst\. To analyze the *.vna files of test-example, they were copied into a test folder that lives in the data folder.
The definitions in the above-shown input.xlsx define x-normalization as a function of a wood log length, in this case, the log diameter of 0.114 m.
Cell B3 containing Input folder name (tke-analyst/) in input.xlsx defines that the input data for test-example live in a subfolder called data/test-example.
Important
The data directory of the subfolder definition in cell B3 must not end with \ or /. Also, make sure to use the / sign for folder name separation (do not use \).
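The two rules for cell B3 can be checked before running the code. The following sketch uses a hypothetical helper, check_folder_name, that is not part of tke-analyst and merely restates the rules above in code.

```python
def check_folder_name(folder_name):
    """Return a list of problems found in a folder name entered in cell B3."""
    problems = []
    if "\\" in folder_name:
        problems.append("use '/' instead of '\\' as separator")
    if folder_name.endswith(("/", "\\")):
        problems.append("remove the trailing separator")
    return problems


if __name__ == "__main__":
    for name in ("data/test-example", "data/test-example/", "data\\test-example"):
        print(name, "->", check_folder_name(name) or "OK")
```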
- To run the code with the example data, open Anaconda Prompt (or any other Python-able Terminal) and:
  - cd into the code directory (e.g., cd "C:\research\project\tke-analyst" if you unpacked tke-analyst to a folder living in the directory C:\research\project\)
  - run the code: python profile_analyst.py (uses input.xlsx)
  - Or: python profile_analyst.py "input.xlsx"
  - wait until the code has finished with -- DONE -- ALL TASKS FINISHED --
- After a successful run, the code will have produced the following files in ...\tke-analyst\data\test-example:
  - .xlsx files of full-time series data, with spikes and despiked.
  - .xlsx files of statistic summaries (i.e., average, standard deviation std, TKE) of velocity parameters with x, y, and z positions, with spikes and despiked.
  - Two plots (norm-tke-x.png and norm-tke-x-despiked.png) showing normalized TKE plotted against normalized x, with spikes and despiked, respectively.
Developer Docs
The following sections provide details of functions, their arguments, and outputs to help tweak the code for individual purposes.
config.py
Global parameter settings (essentially SCRIPT_DIR) and message logging controls.
flowstat.py
- flowstat.flowstat(time, u, v, w1, w2, profile_type='lp')[source]
Calculate ADV data statistics
- Parameters
time (np.array) – time in seconds
u (np.array) – streamwise velocity along x-axis (positive in bulk flow direction)
v (np.array) – perpendicular velocity along y-axis
w1 (np.array) – vertical velocity if side is DOWN
w2 (np.array) – vertical velocity if side is not DOWN
profile_type (str) – orientation of the probe (default: lp, which means the probe looks like a FlowTracker in a river)
- Returns
time_series (dict) – keys correspond to series names and values to full time series
stats (dict(dict)) – keys correspond to series names with STAT for autoreplacement with the STAT type of the nested dictionaries (AVRG, STD, and STDERR)
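The statistics named above (AVRG, STD, STDERR) and the TKE can be sketched with the standard library only. This is a minimal illustration and not the flowstat implementation: the function names are hypothetical, plain lists stand in for NumPy arrays, and the use of the population standard deviation is an assumption. TKE is computed with the common definition of half the sum of the velocity variances.

```python
import math
import statistics


def series_stats(values):
    """Return average, standard deviation, and standard error of a series."""
    std = statistics.pstdev(values)  # assumption: population standard deviation
    return {
        "AVRG": statistics.mean(values),
        "STD": std,
        "STDERR": std / math.sqrt(len(values)),
    }


def turbulent_kinetic_energy(u, v, w):
    """TKE per unit mass in m2/s2: half the sum of the velocity variances."""
    return 0.5 * (
        statistics.pvariance(u) + statistics.pvariance(v) + statistics.pvariance(w)
    )


if __name__ == "__main__":
    u = [0.21, 0.19, 0.22, 0.18]
    v = [0.01, -0.01, 0.02, -0.02]
    w = [0.00, 0.01, -0.01, 0.00]
    print(series_stats(u))
    print(turbulent_kinetic_energy(u, v, w))
```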
profile_analyst.py
Load ADV measurements and calculate TKE with plot options. Originally coded in Matlab at Nepf Lab (MIT). Re-written in Python by Sebastian Schwindt (2022).
- profile_analyst.build_stats_summary(vna_stats_dict, experiment_info, profile_type, bulk_velocity, log_length)[source]
Re-organize the stats dataset and assign probe coordinates
- Parameters
vna_stats_dict (dict) – the result of all vna files processed with the flowstat.flowstat function
experiment_info (dict) – the result of the get_data_info function for retrieving probe positions
profile_type (str) – profile orientation as a function of sensor position; the default is lp corresponding to DOWN (ignores w2 measurements)
bulk_velocity (float) – bulk streamwise flow velocity in m/s (from input.xlsx)
log_length (float) – characteristic log length (either diameter or length) in m (from input.xlsx)
- Returns
Organized overview pandas.DataFrame with measurement stats, ready for dumping to workbook
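How bulk_velocity and log_length may enter the normalization can be sketched as follows. Dividing x by the characteristic log length follows the usage example; dividing TKE by the squared bulk velocity is an assumption made here for illustration and is not confirmed by this documentation. The function name normalize_profile is hypothetical.

```python
def normalize_profile(x, tke, bulk_velocity, log_length):
    """Return (normalized x, normalized TKE) as dimensionless numbers.

    x and log_length are in m, tke in m2/s2, and bulk_velocity in m/s.
    """
    x_norm = x / log_length
    tke_norm = tke / bulk_velocity ** 2  # assumption: TKE scaled by U squared
    return x_norm, tke_norm


if __name__ == "__main__":
    # A probe 0.228 m downstream of a log with a diameter of 0.114 m
    print(normalize_profile(x=0.228, tke=0.002, bulk_velocity=0.2, log_length=0.114))
```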
- profile_analyst.get_data_info(folder_name='test-example')[source]
Get the names of input files and prepare the output matrix according to the number of files
- profile_analyst.load_input_defs(file_name='/home/docs/checkouts/readthedocs.org/user_builds/tke-calculator/checkouts/latest/docs/input.xlsx')[source]
loads provided input file name as pandas dataframe
- profile_analyst.read_vna(vna_file_name)[source]
Read vna file name as pandas dataframe.
- Parameters
vna_file_name (str) – name of a vna file, such as __8_16.5_6_T3.vna
- Returns
pd.DataFrame
- profile_analyst.vna_file_name2coordinates(vna_file_name)[source]
Take vna file name and extract x, y, and z coordinates in meters. Non-convertible numbers are translated into np.nan with warning.
- Parameters
vna_file_name (str) – name of a vna file, such as __8_16.5_6_T3.vna
- Returns
list [x, y, z] coordinates
profile_plotter.py
Plot functions for TKE visualization
Note
The script represents merely a start for plotting normalized TKE against normalized X. If required, enrich this script with more plot functions and integrate them in profile_analyst.process_vna_files at the bottom of the function.
rmspike.py
- rmspike.rmspike(vna_df, u_stats, v_stats, w_stats, w2_stats=None, method='velocity', freq=200.0, lambda_a=1.0, k=3.0, profile_type='lp')[source]
Spike removal and replacement - see Nikora & Goring (1998) and Goring & Nikora (2002).
- Parameters
vna_df (pandas.DataFrame) – matrix-like data array of the vna measurement file
u_stats (pandas.DataFrame) – streamwise velocity stats from flowstat function
v_stats (pandas.DataFrame) – perpendicular velocity stats from flowstat function
w_stats (pandas.DataFrame) – vertical velocity stats from flowstat function
w2_stats (pandas.DataFrame) – sec. vertical velocity stats from flowstat function (only required if profile_type is not lp)
method (str) – determines whether to use acceleration or velocity (default) for despiking
freq (int) – sampling frequency in 1/s (Hz); default is 200 Hz
lambda_a (float) – multiplier of gravitational acceleration (acceleration threshold)
k (float) – multiplier of velocity stdev (velocity threshold)
side (str) – orientation of the probe (default: DOWN, which means the probe looks like a FlowTracker in a river)
Note
Goring & Nikora (2002) suggest lambda_a = 1.0 ~ 1.5 and k = 1.5, but we shall use lambda_a = 1.0 and k = 3 ~ 9. SonTek, Nortek, and Lei recommend the SNR and correlation thresholds to be 15 and 70 respectively. Though data points have high SNR, the correlation can be low.
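The roles of k, lambda_a, and freq can be illustrated with a simplified sketch. It is not the rmspike implementation: the function names are hypothetical, the two criteria are applied separately in a single pass, and replacing spikes with the mean of the remaining samples is an assumption chosen for brevity.

```python
import statistics

GRAVITY = 9.81  # gravitational acceleration in m/s2


def velocity_spikes(u, k=3.0):
    """Indices where the deviation from the mean exceeds k standard deviations."""
    mean = statistics.mean(u)
    std = statistics.pstdev(u)
    return [i for i, value in enumerate(u) if abs(value - mean) > k * std]


def acceleration_spikes(u, freq=200.0, lambda_a=1.0):
    """Indices where the backward-difference acceleration exceeds lambda_a * g."""
    threshold = lambda_a * GRAVITY
    return [
        i for i in range(1, len(u)) if abs((u[i] - u[i - 1]) * freq) > threshold
    ]


def replace_spikes(u, spike_indices):
    """Replace flagged samples with the mean of the other samples (assumption)."""
    spikes = set(spike_indices)
    valid = [value for i, value in enumerate(u) if i not in spikes]
    if not valid:  # nothing left to average, return an unchanged copy
        return list(u)
    fill = statistics.mean(valid)
    return [fill if i in spikes else value for i, value in enumerate(u)]


if __name__ == "__main__":
    series = [0.1] * 20 + [5.0]
    flagged = velocity_spikes(series, k=3.0)
    print(flagged, replace_spikes(series, flagged)[-1])
```

A larger k flags fewer samples, which is consistent with the note above preferring k = 3 to 9 over k = 1.5.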
Disclaimer and License
Disclaimer (general)
No warranty is expressed or implied regarding the usefulness or completeness of the information provided for tke-analyst and its documentation. References to commercial products do not imply endorsement by the Author of tke-analyst. The concepts, materials, and methods used in the codes and described in the docs are for informational purposes only. The Author has made substantial effort to ensure the accuracy of the code and the docs and the Author shall not be held liable, nor their employers or funding sponsors, for calculations and/or decisions made on the basis of application of tke-analyst. The information is provided “as is” and anyone who chooses to use the information is responsible for her or his own choices as to what to do with the code, docs, and data and the individual is responsible for the results that follow from their decisions.
BSD 3-Clause License
Copyright (c) 2022, the Author. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.