About

This Python3 code aids in analyzing raw measurements from an Acoustic Doppler Velocimeter (ADV) that produces *.vno and *.vna files. It detects and removes spikes according to Nikora and Goring (1998) and Goring and Nikora (2002).

The code was originally developed in Matlab(R) at the Nepf Environmental Fluid Mechanics Laboratory (Massachusetts Institute of Technology).

Important

*.vno and *.vna files need to comply with the following naming convention: XX_YY_ZZ_something.vna where XX, YY, and ZZ are streamwise (x), perpendicular (y), and vertical (z) coordinates in CENTIMETERS, respectively. Anything added after ZZ_ is ignored by the code (it is only copied for the sake of dataset naming).

Note

This documentation is also available as a style-adapted PDF.

Requirements & Installation

Time requirement: 5-10 min.

Get Python

To get the code running, the following software is needed (installation instructions are provided below):

  • Python >=3.6

  • NumPy >=1.17.4

  • Openpyxl 3.0.3

  • Pandas >=1.3.5

  • Matplotlib >=3.1.2

Start with downloading and installing the latest version of Anaconda Python. Alternatively, downloading and installing a pure Python interpreter will also work. Detailed information about installing Python is available in the Anaconda Docs and at hydro-informatics.com/python-basics.

To install the NumPy, Openpyxl, Pandas, and Matplotlib libraries after installing Anaconda, open Anaconda Prompt (e.g., click on the Windows icon, type anaconda prompt, and hit enter). In Anaconda Prompt, enter the following command sequence to install the libraries in the base environment. The installation may take a while depending on your internet speed.

conda install -c anaconda numpy
conda install -c anaconda openpyxl
conda install -c conda-forge pandas
conda install -c conda-forge matplotlib

If you are struggling with the dark window and blinking cursor of Anaconda Prompt, worry not. You can also use Anaconda Navigator and install the four libraries (in the above order) in Anaconda Navigator.

Note

Alternatively, create a new conda environment to install the four libraries for this application. However, creating a new environment may eat up a lot of disk space, and installing the Python-omnipresent libraries NumPy, Openpyxl, Pandas, and Matplotlib in the base environment does not hurt.
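To check that the libraries are visible to the Python interpreter in use, a short script along the following lines may help. It is only a sketch: the helper name installed_versions is made up for this illustration and is not part of the code described here.

```python
import importlib


def installed_versions(names=("numpy", "openpyxl", "pandas", "matplotlib")):
    """Return a dict mapping module names to version strings.

    A value of None means the module could not be imported;
    "unknown" means it was imported but exposes no __version__ attribute.
    """
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Print one line per library, e.g., for comparison with the requirements list
for library, version in installed_versions().items():
    print(f"{library}: {version if version is not None else 'NOT INSTALLED'}")
```

If a library shows up as NOT INSTALLED, repeat the corresponding conda install command in the environment that runs the script.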

Download tke-analyst Code

The code can be either started from Terminal (Anaconda Prompt) or within an Integrated Development Environment (IDE). With Anaconda installed, consider using Spyder (Anaconda Navigator > Spyder IDE).

Download tke-calculator.zip and unpack it to the directory where you want to run the code.

Tip

As an alternative to downloading the zip file, you may want to git clone the repository, which enables regular updating of the code (e.g., if there is an update of plot functions available). For using git, make sure that git bash is installed on your computer. Then, open git bash, cd into the directory where you want to download the code and type:

git clone https://github.com/sschwindt/tke-calculator.git

To update any time, cd into the directory where tke-calculator lives and type:

git pull --rebase

Usage

Regular Usage

With Python installed and the code living on your computer:

  • Copy your data to a sub-folder of tke-analyst (e.g., next to the folder data/test-example that contains three exemplary *.vna files). Make sure the files are named with XX_YY_ZZ_something.vna where XX, YY, and ZZ are streamwise (x), perpendicular (y), and vertical (z) coordinates in CENTIMETERS, respectively.

  • Complete the required information on the experimental setup in tke-calculator/input.xlsx (see the figure below). IMPORTANT: Never modify column A or any list in the sourcetables sheet (unless you also modify load_input_defs in line 25ff of profile_analyst.py). The code uses the text provided in these areas of input.xlsx to identify setups. If useful, consider substituting the Wood wording in your mind and with a note in column C with your characteristic turbulence objects, but do not modify column A.

  • Open Anaconda Prompt (or any other Python-able Terminal) and:
    • cd into the code directory (e.g., cd "C:\research\project\tke-analyst" if you unpacked tke-analyst to a folder living in the directory C:\research\project\)

    • run the code: python profile_analyst.py (uses input.xlsx)

    • ALTERNATIVELY, run with another *.xlsx input file: python profile_analyst.py "input-other-test.xlsx"

    • wait until the code has finished with -- DONE -- ALL TASKS FINISHED --

Figure: The interface of the input.xlsx workbook for entering experiment parameters and specifying a despiking method.

  • After a successful run, the code will have produced the following files in ...\tke-analyst\TEST (where TEST may correspond to test-example):
    • .xlsx files of full-time series data, with spikes and despiked.

    • .xlsx files of statistic summaries (i.e., average, standard deviation std, TKE) of velocity parameters with x, y, and z positions, with spikes and despiked (see workbook example in the figure below).

    • Two plots (norm-tke-x.png and norm-tke-x-despiked.png) showing normalized TKE plotted against normalized x, with spikes and despiked, respectively (see plot example in the figure below).

Figure: Example output workbook of tke-calculator.

Figure: Example output plot of normalized TKE.
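When several input workbooks need to be processed one after another, the two command-line calls shown above can also be issued from a small Python script. The following is a minimal sketch that only relies on the documented calls python profile_analyst.py and python profile_analyst.py "input-other-test.xlsx"; the helper names build_command and run_analysis are hypothetical.

```python
import subprocess
import sys


def build_command(input_workbook=None, script="profile_analyst.py"):
    """Assemble the command list for one run of profile_analyst.py.

    Without input_workbook, the script falls back to input.xlsx as
    described in the usage instructions.
    """
    command = [sys.executable, script]
    if input_workbook is not None:
        command.append(input_workbook)
    return command


def run_analysis(input_workbook=None, code_dir="."):
    """Run the analysis inside code_dir and return the process exit code."""
    completed = subprocess.run(build_command(input_workbook), cwd=code_dir, check=False)
    return completed.returncode


# Example (not executed here): process two workbooks in a row
# for workbook in ("input.xlsx", "input-other-test.xlsx"):
#     run_analysis(workbook, code_dir="C:/research/project/tke-analyst")
```

Using sys.executable makes sure that the same interpreter (and thus the same conda environment) runs the analysis.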

Usage Example

For example, download and unpack the code to your hard disk in a folder called C:\my-project\tke-analyst\. For analyzing the *.vna files of test-example, the files were copied into a test folder that lives in the data folder.

The definitions in the above-shown input.xlsx define x-normalization as a function of a wood log length, in this case, the log diameter of 0.114 m.

Cell B3 containing Input folder name (tke-analyst/) in input.xlsx defines that the input data for test-example live in a subfolder called data/test-example.

Important

The subfolder definition in cell B3 must not end with \ or /. Also, make sure to use the / sign for folder name separation (do not use \).
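The two rules for the folder name in cell B3 can be checked before running the code. The following sketch uses a hypothetical helper called check_input_folder, which is not part of the code described here:

```python
def check_input_folder(folder_name):
    """Return folder_name unchanged if it follows the rules for cell B3.

    Raises a ValueError if the name contains a backslash or ends with
    a folder separator.
    """
    if "\\" in folder_name:
        raise ValueError("use / instead of \\ to separate folder names")
    if folder_name.endswith("/"):
        raise ValueError("the folder name must not end with a separator")
    return folder_name


print(check_input_folder("data/test-example"))
```

Here, data/test-example passes the check, while data/test-example/ or data\test-example raise an error.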

To run the code with the example data, open Anaconda Prompt (or any other Python-able Terminal) and:
  • cd into the code directory (e.g., cd "C:\research\project\tke-analyst" if you unpacked tke-analyst to a folder living in the directory C:\research\project\)

  • run the code: python profile_analyst.py (uses input.xlsx)

  • Or: python profile_analyst.py "input.xlsx"

  • wait until the code has finished with -- DONE -- ALL TASKS FINISHED --

  • After a successful run, the code will have produced the following files in ...\tke-analyst\data\test-example:
    • .xlsx files of full-time series data, with spikes and despiked.

    • .xlsx files of statistic summaries (i.e., average, standard deviation std, TKE) of velocity parameters with x, y, and z positions, with spikes and despiked.

    • Two plots (norm-tke-x.png and norm-tke-x-despiked.png) showing normalized TKE plotted against normalized x, with spikes and despiked, respectively.

Developer Docs

The following sections provide details of functions, their arguments, and outputs to help tweak the code for individual purposes.

config.py

Global parameters settings (essentially SCRIPT_DIR) and message logging controls.

flowstat.py

flowstat.flowstat(time, u, v, w1, w2, profile_type='lp')[source]

Calculate ADV data statistics

Parameters
  • time (np.array) – time in seconds

  • u (np.array) – streamwise velocity along x-axis (positive in bulk flow direction)

  • v (np.array) – perpendicular velocity along y-axis

  • w1 (np.array) – vertical velocity if side is DOWN

  • w2 (np.array) – vertical velocity if side is not DOWN

  • profile_type (str) – orientation of the probe (default: lp, which means the probe looks like a FlowTracker in a river)

Returns

  • time_series (dict) – keys correspond to series names and values to the full time series

  • stats (dict of dict) – keys correspond to series names that contain STAT as a placeholder, which is replaced with the statistic type; the nested dictionaries hold AVRG, STD, and STDERR
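To illustrate the kind of statistics named above (AVRG, STD, and STDERR) and the TKE reported in the output workbooks, the following sketch uses textbook definitions. It is not the flowstat implementation: the function names, the population standard deviation (ddof=0), the standard error as STD divided by the square root of the sample count, and TKE as half the sum of the three velocity variances are assumptions made for this example.

```python
import numpy as np


def series_stats(series):
    """Return average, standard deviation, and standard error of a 1d series."""
    series = np.asarray(series, dtype=float)
    std = float(np.std(series))  # population standard deviation (assumption)
    return {
        "AVRG": float(np.mean(series)),
        "STD": std,
        "STDERR": std / np.sqrt(series.size),  # textbook definition (assumption)
    }


def turbulent_kinetic_energy(u, v, w):
    """Textbook TKE per unit mass: 0.5 * (var(u) + var(v) + var(w))."""
    return 0.5 * (np.var(u) + np.var(v) + np.var(w))


u = np.array([1.0, 3.0])
v = np.array([0.0, 0.0])
w = np.array([2.0, 2.0])
print(series_stats(u))
print(turbulent_kinetic_energy(u, v, w))
```

The variance of a velocity component equals the mean of its squared fluctuations, which is why no explicit mean subtraction appears in turbulent_kinetic_energy.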

profile_analyst.py

Load ADV measurements and calculate TKE with plot options. Originally coded in Matlab at Nepf Lab (MIT). Re-written in Python by Sebastian Schwindt (2022).

profile_analyst.build_stats_summary(vna_stats_dict, experiment_info, profile_type, bulk_velocity, log_length)[source]

Re-organize the stats dataset and assign probe coordinates

Parameters
  • vna_stats_dict (dict) – the result of all vna files processed with the flowstat.flowstat function

  • experiment_info (dict) – the result of the get_data_info function for retrieving probe positions

  • profile_type (str) – profile orientation as a function of sensor position; the default is lp corresponding to DOWN (ignores w2 measurements)

  • bulk_velocity (float) – bulk streamwise flow velocity in m/s (from input.xlsx)

  • log_length (float) – characteristic log length (either diameter or length) in m (from input.xlsx)

Returns

Organized overview pandas.DataFrame with measurement stats, ready for dumping to workbook
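The role of bulk_velocity and log_length can be pictured as a normalization step. The sketch below is an illustration only: the column names x, tke, x_norm, and tke_norm, as well as the normalization of TKE with the squared bulk velocity, are assumptions and may differ from what build_stats_summary actually does.

```python
import pandas as pd


def normalize_summary(summary, bulk_velocity, log_length):
    """Add normalized x and TKE columns to a copy of a stats summary.

    summary must provide the (hypothetical) columns "x" in m and "tke" in m2/s2.
    """
    if bulk_velocity <= 0 or log_length <= 0:
        raise ValueError("bulk_velocity and log_length must be positive")
    normalized = summary.copy()
    normalized["x_norm"] = normalized["x"] / log_length
    normalized["tke_norm"] = normalized["tke"] / bulk_velocity ** 2
    return normalized


example = pd.DataFrame({"x": [0.114, 0.228], "tke": [0.01, 0.04]})
print(normalize_summary(example, bulk_velocity=0.2, log_length=0.114))
```

With a log diameter of 0.114 m, the two x positions of the example correspond to one and two log diameters, respectively.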

profile_analyst.get_data_info(folder_name='test-example')[source]

Get the names of the input files and prepare the output matrix according to the number of files

Parameters
  • folder_name (str) – name of the test (experiment) to analyze (default is test-example)

  • input_file_name (str) – name of input file (default is input.xlsx)

Returns

pd.DataFrame with row names corresponding to file names ending in .vna, and columns X, Y, Z in meters

profile_analyst.load_input_defs(file_name='/home/docs/checkouts/readthedocs.org/user_builds/tke-calculator/checkouts/latest/docs/input.xlsx')[source]

Load the provided input file as a pandas DataFrame

Parameters

file_name (str) – name of input file (default is input.xlsx)

Returns

user input of input.xlsx (or custom file, if provided)

Return type

(dict)

profile_analyst.read_vna(vna_file_name)[source]

Read a vna file as a pandas DataFrame.

Parameters

vna_file_name (str) – name of a vna file, such as __8_16.5_6_T3.vna

Returns

pd.DataFrame

profile_analyst.vna_file_name2coordinates(vna_file_name)[source]

Take a vna file name and extract x, y, and z coordinates in meters. Non-convertible numbers are translated into np.nan with a warning.

Parameters

vna_file_name (str) – name of a vna file, such as __8_16.5_6_T3.vna

Returns

list [x, y, z] coordinates
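The conversion from a file name to coordinates can be pictured with the following sketch. It is not the code of vna_file_name2coordinates: the helper name file_name_to_coordinates is hypothetical, and stripping leading underscores (as in __8_16.5_6_T3.vna) is an assumption about how such names are handled.

```python
import logging
import os

import numpy as np


def file_name_to_coordinates(vna_file_name):
    """Return [x, y, z] in meters from a name like XX_YY_ZZ_something.vna.

    XX, YY, and ZZ are read as centimeters; entries that cannot be
    converted to float are replaced with np.nan and a warning is logged.
    """
    base_name = os.path.basename(vna_file_name)
    # Assumption: leading underscores carry no information
    parts = base_name.lstrip("_").split("_")
    coordinates = []
    for index, axis in enumerate("xyz"):
        try:
            coordinates.append(float(parts[index]) / 100.0)  # cm to m
        except (IndexError, ValueError):
            logging.warning("Cannot read %s coordinate from %s", axis, base_name)
            coordinates.append(np.nan)
    return coordinates


print(file_name_to_coordinates("__8_16.5_6_T3.vna"))
```

For the example file name, the sketch yields x = 0.08 m, y = 0.165 m, and z = 0.06 m.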

profile_plotter.py

Plot functions for TKE visualization

Note

The script represents merely a start for plotting normalized TKE against normalized X. If required, enrich this script with more plot functions and integrate them in profile_analyst.process_vna_files at the bottom of the function.

profile_plotter.plot_xy(x, y, file_name)[source]

Plot y data against x data (1d numpy arrays) and add markers for local maxima and minima

Parameters
  • x (numpy.array) – x data

  • y (numpy.array) – y data

Returns

show and save plot in test folder as norm-TKE-x.png
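A plot of this kind can be sketched with NumPy and Matplotlib as follows. This is an illustration and not the code of plot_xy: the function names, the neighbor-comparison rule for finding extrema, the marker styles, and the axis labels are assumptions.

```python
import numpy as np


def local_extrema(y):
    """Return index arrays of strict local maxima and minima of a 1d array."""
    y = np.asarray(y, dtype=float)
    inner = np.arange(1, y.size - 1)
    # A point is an extremum if it is strictly above/below both neighbors
    maxima = inner[(y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])]
    minima = inner[(y[1:-1] < y[:-2]) & (y[1:-1] < y[2:])]
    return maxima, minima


def plot_with_extrema(x, y, file_name):
    """Save a line plot of y against x with markers at local extrema."""
    # Imported here so that local_extrema can be used without Matplotlib
    import matplotlib
    matplotlib.use("Agg")  # write to file without opening a window
    import matplotlib.pyplot as plt

    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    maxima, minima = local_extrema(y)
    fig, ax = plt.subplots()
    ax.plot(x, y, color="black", label="data")
    ax.plot(x[maxima], y[maxima], "v", label="local maxima")
    ax.plot(x[minima], y[minima], "^", label="local minima")
    ax.set_xlabel("normalized x")
    ax.set_ylabel("normalized TKE")
    ax.legend()
    fig.savefig(file_name)
    plt.close(fig)


print(local_extrema([0.0, 1.0, 0.0, -1.0, 0.0]))
```

Additional plot functions can follow the same pattern: compute the quantities to highlight first, then draw and save the figure.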

rmspike.py

rmspike.rmspike(vna_df, u_stats, v_stats, w_stats, w2_stats=None, method='velocity', freq=200.0, lambda_a=1.0, k=3.0, profile_type='lp')[source]

Spike removal and replacement - see Nikora & Goring (1998) and Goring & Nikora (2002).

Parameters
  • vna_df (pandas.DataFrame) – matrix-like data array of the vna measurement file

  • u_stats (pandas.DataFrame) – streamwise velocity stats from flowstat function

  • v_stats (pandas.DataFrame) – perpendicular velocity stats from flowstat function

  • w_stats (pandas.DataFrame) – vertical velocity stats from flowstat function

  • w2_stats (pandas.DataFrame) – sec. vertical velocity stats from flowstat function (only required if profile_type is not lp)

  • method (str) – determines whether to use acceleration or velocity (default) for despiking

  • freq (int) – sampling frequency in 1/s (Hz); default is 200 Hz

  • lambda_a (float) – multiplier of gravitational acceleration (acceleration threshold)

  • k (float) – multiplier of velocity stdev (velocity threshold)

  • profile_type (str) – orientation of the probe (default: lp, which means the probe looks like a FlowTracker in a river)

Note

Goring & Nikora (2002) suggest lambda_a = 1.0 ~ 1.5 and k = 1.5, but we shall use lambda_a = 1.0 and k = 3 ~ 9. SonTek, Nortek, and Lei recommend the SNR and correlation thresholds to be 15 and 70 respectively. Though data points have high SNR, the correlation can be low.
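The meaning of the two thresholds can be illustrated with a strongly simplified, single-component sketch. It is neither the rmspike implementation nor the complete procedure of the cited papers: the function name despike_sketch, the use of 9.81 m/s2 for gravitational acceleration, the first-difference estimate of acceleration, and the replacement of spikes by linear interpolation are assumptions made for this example.

```python
import numpy as np

GRAVITY = 9.81  # m/s2 (assumed value)


def despike_sketch(u, method="velocity", freq=200.0, lambda_a=1.0, k=3.0):
    """Flag and replace spikes in a 1d velocity series (in m/s).

    velocity: a sample is a spike if it deviates from the series mean
        by more than k standard deviations.
    acceleration: a sample is a spike if the change from the previous
        sample, multiplied by freq, exceeds lambda_a * GRAVITY.
    Returns the cleaned series and a boolean spike mask.
    """
    u = np.asarray(u, dtype=float)
    if method == "velocity":
        spikes = np.abs(u - np.mean(u)) > k * np.std(u)
    elif method == "acceleration":
        acceleration = np.diff(u, prepend=u[0]) * freq
        spikes = np.abs(acceleration) > lambda_a * GRAVITY
    else:
        raise ValueError("method must be 'velocity' or 'acceleration'")
    cleaned = u.copy()
    valid = ~spikes
    if spikes.any() and valid.any():
        index = np.arange(u.size)
        # Replace flagged samples by interpolating between valid neighbors
        cleaned[spikes] = np.interp(index[spikes], index[valid], u[valid])
    return cleaned, spikes


series = np.full(20, 0.1)
series[10] = 5.0  # artificial spike
cleaned, mask = despike_sketch(series, method="velocity", k=3.0)
print(int(mask.sum()), cleaned[10])
```

Note that the acceleration criterion of this sketch flags both the jump into a spike and the jump back to regular values, so the sample following an isolated spike is also replaced.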

Disclaimer and License

Disclaimer (general)

No warranty is expressed or implied regarding the usefulness or completeness of the information provided for tke-analyst and its documentation. References to commercial products do not imply endorsement by the Author of tke-analyst. The concepts, materials, and methods used in the codes and described in the docs are for informational purposes only. The Author has made substantial effort to ensure the accuracy of the code and the docs and the Author shall not be held liable, nor their employers or funding sponsors, for calculations and/or decisions made on the basis of application of tke-analyst. The information is provided “as is” and anyone who chooses to use the information is responsible for her or his own choices as to what to do with the code, docs, and data and the individual is responsible for the results that follow from their decisions.

BSD 3-Clause License

Copyright (c) 2022, the Author. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.