HydrOffice news

Plotting attitude data from Kongsberg file in Python

Dec. 27, 2014, midnight, author: gmasetti


In this post, we will use Huddl to read, verify and plot the attitude measurements collected together with acoustic data in a Kongsberg EM Series format file. This format, with extension .all, has a long release history; the most recent format specifications are reported here.    


Aerial view of the Center for Coastal and Ocean Mapping, UNH.


The Huddl Format Description for Kongsberg data format

A Huddl Format Description (HFD) is an XML file that strictly follows the Huddl Core Schemas (HCS). Each HFD file describes the physical and logical implementation of a data format, but can also be used to create documentation explaining the semantic meaning to a human through the use of a suitable XSLT (Extensible Stylesheet Language Transformations) translator, which are widely available. A top-level container, called a Schema, may hold several different descriptions of data formats, each of which has both a Prolog and a Content element.

The Prolog

The Prolog represents a collection of metadata related to the data format, such as the organization that created it, the personnel responsible for its maintenance, or the history of releases. For the Kongsberg EM Series data format, the HFD provides some information related to the manufacturer:
<huddl:format name="Kongsberg EM Series" scope="kng">
        <huddl:title>Instruction Manual - EM Series - Multibeam echo sounders - Datagram formats</huddl:title>
        <huddl:alternateTitle>Kongsberg EM Series formats specification</huddl:alternateTitle>
            <huddl:name>Kongsberg Maritime AS</huddl:name>
            <huddl:revision revID="PreA" date="1996-01-01">
                <huddl:description>'Original' datagrams, first use of EM96 (EM3-type)</huddl:description>
            // ... additional revision description ... //
             <huddl:revision revID="R" date="2013-10-01">
                <huddl:description>Quality factor corrected. Multibeam installation parameters updated.</huddl:description>
This information is required to create a homogeneous and consistent documentation for different data formats, although the extent to which it is implemented can vary between formats – the better the information, the more complete, and useful, the documentation.

The Content

The Content branch is used to describe both the structure and the format of a binary data file in a platform-independent way.
Inside the Content, Blocks may contain any number of fields and available data structures (e.g., 2D array). A Block represents a logically related group of information elements committed to file at the same time instant. A Field is used as a basic value container (e.g., bytes, two- or four-byte integers, floating point numbers, etc.). To allow the reading of complex data structures, blocks can also contain other blocks, optionally as one- and two-dimensional arrays with either fixed or variable sizes defined in the preceding data block. The following HFD snippet shows as example the start of a Content branch for the Kongsberg EM Series with the description of the emhdr block that is present in all the datagrams, used both to identify the type of datagram and to test their validity:
            <!-- header and tail blocks -->
            <huddl:block name="emhdr">
                <huddl:field name="datasize" type="huddl:u32" minValue="100" maxValue="4100"/>
                <huddl:field name="stx" type="huddl:u8" minValue="2" maxValue="2"/>
                <huddl:field name="type" type="huddl:u8"/>
                <huddl:field name="modelnum" type="huddl:u16"/>
                <huddl:field name="date" type="huddl:u32"/>
                <huddl:field name="time" type="huddl:u32" minValue="0" maxValue="86399999"/>
The data structure for the attitude data is then described by two blocks. The topblock em96attitude_revA describes the organization of several fields:
    <huddl:block name="em96attitude_revA">
        <huddl:field name="counter" type="huddl:u16"/>
        <huddl:field name="serialnum" type="huddl:u16"/>
        <huddl:field name="numentries" type="huddl:u16"/>
        <huddl:vector1d name="attitude">
        <huddl:field name="motiondesc" type="huddl:u8">
            <huddl:note>Motion sensor description byte</huddl:note>
        <huddl:note>b[0] => heading active; b[1] => roll inactive; b[2] => pitch inactive; b[3] => heave inactive.</huddl:note>
The attitude entry is described as a vector1d of em96att blocks. This means that the number of elements is reported in the field numentries and that the element type is user-defined as described in another block:
    <huddl:block name="em96att">
        <huddl:field name="latency" type="huddl:u16">
            <huddl:note>Time offset since start of record in ms</huddl:note>
        <huddl:field name="status" type="huddl:u16"/>
        <huddl:field name="roll" type="huddl:s16">
            <huddl:note>In 0.01 deg. steps</huddl:note>
        <huddl:field name="pitch" type="huddl:s16">
            <huddl:note>In 0.01 deg. steps</huddl:note>
        <huddl:field name="heave" type="huddl:s16">
            <huddl:note>in cm steps</huddl:note>
        <huddl:field name="heading" type="huddl:u16">
            <huddl:note>in 0.01 deg. steps</huddl:note>

HFD Validation and Code Generation

Once the format is described in a HFD file, it can be used to automatically generate driver code to access the Kongsberg raw files. This may done most efficiently by using Huddler, the HFD-based compiler, that is part of the suite of the Huddl-based tools. In addition to code-generation, Huddler also allows for optional validation of the HFD content against the Huddl Core Schemas. Thus, a possible list of arguments for the compiler is:
huddler -i both                 # both endianness as input
        -t both                 # both endianness as output
        -o kongsberg_em_series  # output base name
        -l c                    # generate C code
        -b                      # generate interface file
        -v                      # validate using internal xsd
Such a command, after the HFD validation, will create three files:
  • kongsberg_em_series.h: header file
  • kongsberg_em_series.c: source code file
  • kongsberg_em_series.i: interface file
The first two files are valid C files that can be used both in C and in C++ programs. The compiler can be easily extended to other languages. The interface file is used to build a Python binding calling the SWIG compiler: swig -python kongsberg_em_series.i. This will create other two files:
  • kongsberg_em_series_wrap.c: wrapper file
  • kongsberg_em_series.py: Python module
The wrapper file is then compiled as a .pyd file (that is similar to a Windows DLL).

Python script

The Huddl-based Python module can be used to access all the datagrams present in a Kongsberg file. We will now focus on the attitude datagram that provides measures of roll, pitch, heave and heading collected asynchronously to the acoustic data. The script uses several commonly used modules, such as Matplotlib for plotting utilities and Pandas for time series manipulation.
In [1]:
import kongsberg_em_series as kng
import numpy as np
import pandas as pd
import matplotlib
from matplotlib import pyplot as plt
from matplotlib import rcParams
import os.path
import datetime
Some helper methods and data structures are used to store and manage the attitude data content:
In [2]:
class KngData:
    """ container for Kongsberg attitude data """
    time = np.zeros(0, dtype='datetime64[us]')
    roll = np.zeros(0, dtype=np.float)
    pitch = np.zeros(0, dtype=np.float)
    heave = np.zeros(0, dtype=np.float)
    heading = np.zeros(0, dtype=np.float)

def fix_date(date):
    """ convert Kongsberg date """
    day = date % 100
    month = date / 100
    year = month / 100
    date = str("%.4d/%.2d/%.2d" %(year % 10000, month % 100, day % 100))
    return date

def fix_time(time):
    """ convert Kongsberg time """
    msec = time % 1000
    sec = time / 1000
    mins = sec / 60
    hr = mins / 60
    time = str("%.2d:%.2d:%.2d.%.3d" %( hr % 24, mins % 60, sec % 60, msec))
    return time

def make_datetime(date, time):
    """ convert Kongsberg date and time to Python datetime """
    time_string = fix_date(date) + " "+ fix_time(time) + "Z"
    dt = datetime.datetime.strptime(time_string, "%Y/%m/%d %H:%M:%S.%fZ")
    return dt
The attitude data samples are read from each attitude datagram, and added to the KngData class, using the following method:
In [3]:
def read_attitude(data_header, data_content, kng_data):
    """ read and store a Kongsberg attitude datagram """
    nr_entries = data_content.attitude.numentries
    loc_time = np.zeros(nr_entries, dtype='datetime64[us]')
    loc_roll = np.zeros(nr_entries, dtype=np.float)
    loc_pitch = np.zeros(nr_entries, dtype=np.float)
    loc_heave = np.zeros(nr_entries, dtype=np.float)
    loc_heading = np.zeros(nr_entries, dtype=np.float)

    for i in range(nr_entries):
        loc_att = kng.em96att_array_getitem(data_content.attitude.attitude, i)
        loc_time[i] = make_datetime(data_header.date, data_header.time)
        loc_roll[i] = loc_att.roll / 100.0
        loc_pitch[i] = loc_att.pitch / 100.0
        loc_heave[i] = loc_att.heave / 100.0
        loc_heading[i] = loc_att.heading / 100.0

    kng_data.time = np.append(kng_data.time, loc_time)
    kng_data.roll = np.append(kng_data.roll, loc_roll)
    kng_data.pitch = np.append(kng_data.pitch, loc_pitch)
    kng_data.heave = np.append(kng_data.heave, loc_heave)
    kng_data.heading = np.append(kng_data.heading, loc_heading)
This helper method relies on the Pandas module to provide some statistics from the retrieved data:
In [4]:
def print_attitude_info(kng_data):
    """ provide some info about read Kongsberg attitude data """
    df = pd.DataFrame({'Roll': kng_data.roll,
                       'Pitch': kng_data.pitch,
                       'Heave': kng_data.heave,
                       'Heading': kng_data.heading}, index=kng_data.time)
    df = df.tz_localize('UTC')
    print('\n', df.describe())
The data are displayed using the Matplotlib module as four plots:
In [5]:
def plot_attitude(kng_data):
    """ plot and save the read Kongsberg attitude data """
    fig, (ax0, ax1, ax2, ax3) = plt.subplots(ncols=1, nrows=4, figsize=(8, 12), dpi=300)
    xfmt = matplotlib.dates.DateFormatter('%H:%M:%S')
    matplotlib.rcParams.update({'font.size': 8, 'text.usetex': True, 'font.family': 'serif'})

    plot0 = ax0.fill_between(kng_data.time, kng_data.roll, linewidth=0.3, facecolor='green')
    ax0.set_ylim([-5.0, 5.0])
    ax0.grid(color='gray', alpha=0.5, linewidth=0.5)
    ax0.set_ylabel('$value [deg]$')
    ax0.yaxis.set_label_coords(-0.06, 0.5)

    plot1 = ax1.fill_between(kng_data.time, kng_data.pitch, linewidth=0.3, facecolor='orange')
    ax1.set_ylim([-5.0, 5.0])
    ax1.grid(color='gray', alpha=0.5, linewidth=0.5)
    ax1.set_ylabel('$value [deg]$')
    ax1.yaxis.set_label_coords(-0.06, 0.5)

    plot2 = ax2.fill_between(kng_data.time, kng_data.heave, linewidth=0.3, facecolor='red')
    ax2.set_ylim([-1.0, 1.0])
    ax2.grid(color='gray', alpha=0.5, linewidth=0.5)
    ax2.set_ylabel('$value [m]$')
    ax2.yaxis.set_label_coords(-0.06, 0.5)

    plot3 = ax3.plot(kng_data.time, kng_data.heading, linewidth=0.5)
    ax3.set_ylim([0, 360])
    ax3.grid(color='gray', alpha=0.5, linewidth=0.5)
    ax3.set_ylabel('$value [deg]$')
    ax3.yaxis.set_label_coords(-0.06, 0.5)

    fig.savefig('kng_attitude', dpi=600)
The reading mechanism to access the Kongsberg file content is quite simple. It first opens the raw file, then loops through the contents looking for attitude datagrams:
In [6]:
print("Huddl - Pythonic attitude example")
testfile_path = "C:/.../surveyline.all"
if not os.path.isfile(testfile_path):
    raise IOError("the file %s does not exist" % testfile_path)
ip = kng.fopen(testfile_path, "rb")
if ip is None:
    raise IOError("data file not successfully open")

position = kng.ftell(ip)
kng_data = KngData()
data = kng.kng_revR_t()
n_read = 0
abort = False
datagram_nr = 0
resync_event = 0
kng.fseek(ip, 0, 0)  # to be sure that we start from the beginning of file
datagram_dict = dict()

while not abort:
    rc, n_read = kng.read_kng_revR(ip, data, n_read, kng.validate_em96)

    if datagram_nr == 15000:
        abort = True

    if rc == kng.HUDDLER_TOPBLOCK_OK:
        datagram_nr += 1
        datagram_dict[data.header.type] = datagram_dict.get(data.header.type, 0) + 1
        if n_read != (data.header.datasize + 4):
            print("> #", datagram_nr, " -> repositioning")
            kng.fseek(ip, position, 0)  # Return to position before read
            kng.fseek(ip, data.header.datasize + 4, 1)  # Skip size + 4 (the length word in the datagram)

        if data.header.type == kng.KNG_REVR_ATTITUDE:
            read_attitude(data.header, data.datagram, kng_data)
            if (datagram_dict.get(data.header.type) / 100) == 0:
                print("> #", datagram_nr, "[r: ", n_read, "] -> type: ", data.header.type, " -> hex: ",

    elif rc == kng.HUDDLER_TOPBLOCK_UNKNOWN:
        print("> #", datagram_nr, " -> UNKNOWN datagram ", hex(data.header.type), " at offset ", position)
        datagram_nr += 1
        kng.fseek(ip, position, 0)  # Return to position before read
        kng.fseek(ip, data.header.datasize + 4, 1)  # Skip size + 4 (the length word in the datagram)

    elif rc == kng.HUDDLER_NO_HEADER:
        print("> #", datagram_nr, " -> no header at offset ", position, " -> resynch #", resync_event)
        resync_event += 1

        print("> #", datagram_nr, " -> fail reading datagram at offset ", position, " -> resynch #", resync_event)
        resync_event += 1

    elif rc == kng.HUDDLER_INVALID_TAIL:
        print("> #", datagram_nr, " -> invalid tail at offset ", position, " -> resynch #", resync_event)
        resync_event += 1

    elif rc == kng.HUDDLER_EOF_WHILE:
        print("> #", datagram_nr, " -> end of file at offset ", position)
        abort = True

    elif rc == kng.HUDDLER_READ_ERROR:
        print("> #", datagram_nr, " -> unexpected read error at offset ", position, " -> aborting")
        abort = True

        print("> unknown read return: ", rc)
        abort = True

    position = kng.ftell(ip)

print("> last position in file: ", position)
ret = kng.fclose(ip)
if ret == 0:
    print("> data file closed")
Huddl - Pythonic attitude example
> last position in file:  86465224
> data file closed
This script reads through the data file, then the attitude content with all the read entries is summarized:
In [7]:
print("\nRead attitude datagrams:", datagram_dict[kng.KNG_REVR_ATTITUDE], ', entries: ', kng_data.time.shape[0])
Read attitude datagrams: 423 , entries:  42777
                                  Heading  Heave  Pitch  Roll
2012-06-06 17:59:17.165000+00:00   223.92  -0.13  -0.60 -0.96
2012-06-06 17:59:17.165000+00:00   223.93  -0.13  -0.62 -0.96
2012-06-06 17:59:17.165000+00:00   223.95  -0.13  -0.65 -0.96
2012-06-06 17:59:17.165000+00:00   223.96  -0.13  -0.67 -0.97
2012-06-06 17:59:17.165000+00:00   223.97  -0.12  -0.69 -0.97
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-06-06 17:59:17.165000+00:00, ..., 2012-06-06 18:06:23.924000+00:00]
Length: 42777, Freq: None, Timezone: UTC

             Heading         Heave         Pitch          Roll
count  42777.000000  42777.000000  42777.000000  42777.000000
mean     214.174787     -0.132106      0.810069     -0.383255
std       13.779359      0.241842      0.978344      1.366207
min      186.930000     -0.790000     -2.030000     -4.330000
25%      203.110000     -0.310000      0.140000     -1.380000
50%      215.050000     -0.130000      0.830000     -0.430000
75%      225.030000      0.050000      1.510000      0.600000
max      245.320000      0.550000      3.460000      3.360000
Finally, the attitude data are plotted, as time series, using four horizontal panes:
In [8]:
%matplotlib inline


tags: notebook; huddl; attitude