3. Split survey data into lines

We will try to automatically split a survey into flight lines using several techniques and compare their results to the hand-split lines from the published survey data.

[150]:
%load_ext autoreload
%autoreload 2


import geopandas as gpd
import pandas as pd
import plotly.io as pio

import airbornegeo

pio.renderers.default = "notebook"

3.1. Load data

This is a subset of the BAS AGAP survey over Antarctica’s Gamburtsev Subglacial Mountains. The file is downloaded and subset in the notebook AGAP_magnetic_survey.

[152]:
data_df = pd.read_csv("data/AGAP_magnetic_survey_processed_blocked.csv")

# only keep relevant columns
data_df = data_df[
    [
        "easting",
        "northing",
        "latitude",
        "longitude",
        "unixtime",
        "line",
    ]
]

# retain only a subset of lines
data_df = data_df[(data_df.line.between(168, 176)) | (data_df.line.between(90, 127))]

# rename lines starting from 1
data_df["original_line"] = airbornegeo.unique_line_id(data_df, "line")
data_df = data_df.drop(columns="line")

# sort by time so that successive rows are successive measurements
data_df = data_df.sort_values(by=["unixtime"])

# turn to geopandas geodataframe
data_df = gpd.GeoDataFrame(
    data_df,
    geometry=gpd.points_from_xy(x=data_df.easting, y=data_df.northing),
    crs="EPSG:3031",
)

data_df.head()
[152]:
        easting      northing    latitude  longitude  unixtime        original_line  geometry
592564  1307565.129  550917.135  -76.995   67.153     1229692438.500  39             POINT (1307565.129 550917.135)
592565  1307617.217  550827.922  -76.995   67.157     1229692440.500  39             POINT (1307617.217 550827.922)
592566  1307664.827  550736.002  -76.995   67.161     1229692442.500  39             POINT (1307664.827 550736.002)
592567  1307707.061  550641.238  -76.995   67.165     1229692444.500  39             POINT (1307707.061 550641.238)
592568  1307743.369  550543.805  -76.995   67.170     1229692446.500  39             POINT (1307743.369 550543.805)
[153]:
print(f"Originally {len(data_df.original_line.unique())} lines")
Originally 47 lines
[154]:
airbornegeo.plotly_points(
    data_df,
    color_col="original_line",
    hover_cols=[
        "unixtime",
    ],
    robust=False,
    size=3,
)

3.2. Split lines on time gaps

Here we assume any time gap greater than 2 minutes between successive points marks the end of one line and the start of another.
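To make the idea concrete, here is a minimal pandas sketch of gap-based splitting on a toy series of timestamps. The helper `split_on_gaps` is hypothetical and written only for illustration; it is not part of airbornegeo, and `airbornegeo.split_into_segments` may be implemented differently.

```python
import pandas as pd


def split_on_gaps(values: pd.Series, threshold: float) -> pd.Series:
    """Label runs of rows separated by jumps larger than `threshold`.

    Hypothetical helper for illustration, not an airbornegeo function.
    Assumes `values` is already sorted in acquisition order.
    """
    # absolute difference to the previous row; the first row has no predecessor
    gaps = values.diff().abs().fillna(0)
    # every gap above the threshold starts a new segment;
    # the cumulative sum turns those flags into segment ids starting at 1
    return (gaps > threshold).cumsum() + 1


# toy timestamps in seconds: 2 s sampling with a 10 minute pause in the middle
times = pd.Series([0.0, 2.0, 4.0, 604.0, 606.0, 608.0])
segments = split_on_gaps(times, threshold=60 * 2)
print(segments.tolist())  # [1, 1, 1, 2, 2, 2]
```

The same pattern applies to any monotonically recorded column, which is why the data must be sorted by time before splitting.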

[ ]:
data_df["segments_by_time"] = airbornegeo.split_into_segments(
    data_df,
    threshold=60 * 2,  # 2 minutes
    column_name="unixtime",
)
print(f"{len(data_df.segments_by_time.unique())} segments")
46 segments
[156]:
airbornegeo.plotly_points(
    data_df,
    color_col="segments_by_time",
    hover_cols=[
        "original_line",
        "unixtime",
    ],
    robust=False,
    size=3,
)

3.3. Split lines on distance gaps

Here we assume any distance gap greater than 5 km between successive points marks the end of one line and the start of another.
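As a rough illustration of what a distance gap means, the sketch below computes the straight-line distance between successive points from projected coordinates and starts a new segment wherever that distance exceeds 5 km. The function `step_distance` and the toy coordinates are assumptions made for this example; how `airbornegeo.relative_distance` actually computes its values is not shown here.

```python
import numpy as np
import pandas as pd


def step_distance(df: pd.DataFrame, easting: str = "easting", northing: str = "northing") -> pd.Series:
    """Distance (in coordinate units) from each point to the previous one.

    Hypothetical helper for illustration, not an airbornegeo function.
    """
    dx = df[easting].diff()
    dy = df[northing].diff()
    # Euclidean distance in the projected plane; the first point gets 0
    return np.hypot(dx, dy).fillna(0)


# toy projected coordinates in metres: 500 m steps with one 10 km jump
points = pd.DataFrame(
    {
        "easting": [0.0, 300.0, 600.0, 6600.0, 6900.0],
        "northing": [0.0, 400.0, 800.0, 8800.0, 9200.0],
    }
)
distances = step_distance(points)
# a step longer than 5 km starts a new segment
segments = (distances > 5e3).cumsum() + 1
print(segments.tolist())  # [1, 1, 1, 2, 2]
```

This assumes the coordinates are in a projected system with metre units, as is the case for EPSG:3031 eastings and northings.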

[157]:
data_df["relative_distance"] = airbornegeo.relative_distance(
    data_df,
    easting_column="easting",
    northing_column="northing",
)

data_df["segments_by_distance"] = airbornegeo.split_into_segments(
    data_df,
    threshold=5e3,  # 5 km
    column_name="relative_distance",
)
print(f"{len(data_df.segments_by_distance.unique())} segments")
46 segments
[158]:
airbornegeo.plotly_points(
    data_df,
    color_col="segments_by_distance",
    hover_cols=["original_line", "relative_distance"],
    robust=False,
    size=3,
)

3.4. Split lines on track / heading changes

Here we assume any change of track (heading) of more than 40 degrees between successive points marks the end of one line and the start of another. First we need to calculate the track from the latitude and longitude of the data.
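Track values are angles, so a plain difference misbehaves at the 0/360 boundary: going from 359 to 1 degree is a 2 degree turn, not a 358 degree one. The sketch below shows one common way to wrap the difference before thresholding. The helper `angular_step` is hypothetical and only illustrates the idea; it is not necessarily how airbornegeo handles angular differences.

```python
import pandas as pd


def angular_step(track_deg: pd.Series) -> pd.Series:
    """Absolute change in track between successive rows, in degrees (0 to 180).

    Hypothetical helper for illustration, not an airbornegeo function.
    """
    raw = track_deg.diff().fillna(0)
    # wrap the raw difference into [-180, 180) so crossing 0/360 stays small
    wrapped = (raw + 180) % 360 - 180
    return wrapped.abs()


# toy track: a gentle drift across north, then a reversal onto the return line
track = pd.Series([358.0, 359.0, 1.0, 2.0, 181.0, 180.0])
changes = angular_step(track)
print(changes.tolist())  # [0.0, 1.0, 2.0, 1.0, 179.0, 1.0]

# a change above the threshold starts a new segment
segments = (changes > 40).cumsum() + 1
print(segments.tolist())  # [1, 1, 1, 1, 2, 2]
```

Without the wrapping step, the 359 to 1 degree transition would be read as a 358 degree change and would wrongly split the line.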

[183]:
data_df["track"] = airbornegeo.track(
    data_df,
    latitude_column="latitude",
    longitude_column="longitude",
    ellipsoid=False,
)

airbornegeo.plotly_points(
    data_df,
    color_col="track",
    hover_cols=["original_line"],
    size=3,
)
[ ]:
data_df = data_df.sort_values("unixtime")
data_df["segments_by_track"] = airbornegeo.split_into_segments(
    data_df,
    threshold=40,  # 40 degree track change
    column_name="track",
    angular_difference=True,  # the difference between two values on either side of (0/360) should be small
)
print(f"{len(data_df.segments_by_track.unique())} segments")
66 segments
[250]:
airbornegeo.plotly_points(
    data_df,
    color_col="segments_by_track",
    hover_cols=["original_line", "track"],
    robust=False,
    size=3,
)
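One simple way to compare an automatic split with the hand-split lines is to count how many automatic segments each original line was broken into, and how many original lines ended up in each segment. The sketch below does this on a small made-up table; the column name `segment` and the values are invented for illustration and are not results from the survey.

```python
import pandas as pd

# made-up labels: line 1 is split in two, lines 2 and 3 are merged into one segment
labels = pd.DataFrame(
    {
        "original_line": [1, 1, 1, 2, 2, 2, 3, 3],
        "segment": [1, 1, 2, 3, 3, 3, 3, 3],
    }
)

# number of automatic segments found within each hand-split line
segments_per_line = labels.groupby("original_line")["segment"].nunique()
over_split = segments_per_line[segments_per_line > 1]

# number of hand-split lines contained in each automatic segment
lines_per_segment = labels.groupby("segment")["original_line"].nunique()
merged = lines_per_segment[lines_per_segment > 1]

print(over_split.index.tolist())  # [1]
print(merged.index.tolist())  # [3]
```

Applied to the notebook's data, the same grouping could be run with `segments_by_time`, `segments_by_distance`, or `segments_by_track` in place of `segment`.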