3. Split survey data into lines
We will try to split a survey into flight lines automatically using three techniques (time gaps, distance gaps, and track changes) and compare their results with the hand-split lines from the published survey data.
[150]:
%load_ext autoreload
%autoreload 2
import geopandas as gpd
import pandas as pd
import plotly.io as pio
import airbornegeo
pio.renderers.default = "notebook"
3.1. Load data
This is a subset of the BAS AGAP survey over Antarctica’s Gamburtsev Subglacial Mountains. The file is downloaded and subset in the notebook AGAP_magnetic_survey.
[152]:
data_df = pd.read_csv("data/AGAP_magnetic_survey_processed_blocked.csv")
# only keep relevant columns
data_df = data_df[
    [
        "easting",
        "northing",
        "latitude",
        "longitude",
        "unixtime",
        "line",
    ]
]
# retain only a subset of lines
data_df = data_df[(data_df.line.between(168, 176)) | (data_df.line.between(90, 127))]
# renumber lines starting from 1
data_df["original_line"] = airbornegeo.unique_line_id(data_df, "line")
data_df = data_df.drop(columns="line")
# sort by time
data_df = data_df.sort_values(by=["unixtime"])
# convert to a geopandas GeoDataFrame
data_df = gpd.GeoDataFrame(
    data_df,
    geometry=gpd.points_from_xy(x=data_df.easting, y=data_df.northing),
    crs="EPSG:3031",
)
data_df.head()
[152]:
| | easting | northing | latitude | longitude | unixtime | original_line | geometry |
|---|---|---|---|---|---|---|---|
| 592564 | 1307565.129 | 550917.135 | -76.995 | 67.153 | 1229692438.500 | 39 | POINT (1307565.129 550917.135) |
| 592565 | 1307617.217 | 550827.922 | -76.995 | 67.157 | 1229692440.500 | 39 | POINT (1307617.217 550827.922) |
| 592566 | 1307664.827 | 550736.002 | -76.995 | 67.161 | 1229692442.500 | 39 | POINT (1307664.827 550736.002) |
| 592567 | 1307707.061 | 550641.238 | -76.995 | 67.165 | 1229692444.500 | 39 | POINT (1307707.061 550641.238) |
| 592568 | 1307743.369 | 550543.805 | -76.995 | 67.170 | 1229692446.500 | 39 | POINT (1307743.369 550543.805) |
[153]:
print(f"Originally {len(data_df.original_line.unique())} lines")
Originally 47 lines
[154]:
airbornegeo.plotly_points(
    data_df,
    color_col="original_line",
    hover_cols=[
        "unixtime",
    ],
    robust=False,
    size=3,
)
3.2. Split lines on time gaps
Here we assume any time gap greater than 2 minutes between successive points marks the end of one line and the start of another.
[ ]:
data_df["segments_by_time"] = airbornegeo.split_into_segments(
    data_df,
    threshold=60 * 2,  # 2 minutes, in seconds
    column_name="unixtime",
)
print(f"{len(data_df.segments_by_time.unique())} segments")
46 segments
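The gap-splitting idea itself can be sketched in plain pandas. This is only an illustration of the technique under stated assumptions, not how `airbornegeo.split_into_segments` is implemented: the helper name `split_on_gaps` is hypothetical, the input is assumed to be sorted already, and numbering segments from 1 is a choice made here.

```python
import pandas as pd


def split_on_gaps(values: pd.Series, threshold: float) -> pd.Series:
    """Label runs of rows separated by jumps larger than `threshold`.

    Hypothetical helper for illustration only; assumes `values` is sorted.
    """
    # absolute difference between each value and the one before it
    gaps = values.diff().abs()
    # a new segment starts wherever the gap exceeds the threshold
    # (the first row has a NaN gap, which compares as False)
    starts = gaps > threshold
    # the running count of starts gives a segment id, numbered from 1
    return starts.cumsum() + 1


# toy timestamps in seconds: 2 s sampling with one 10 minute gap
times = pd.Series([0.0, 2.0, 4.0, 604.0, 606.0])
segments = split_on_gaps(times, threshold=60 * 2)
print(segments.tolist())
```

The same helper works for any monotonically sampled column, which is why a single threshold-on-differences routine can serve time, distance, and other criteria.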
[156]:
airbornegeo.plotly_points(
    data_df,
    color_col="segments_by_time",
    hover_cols=[
        "original_line",
        "unixtime",
    ],
    robust=False,
    size=3,
)
3.3. Split lines on distance gaps
Here we assume any distance gap greater than 5 km between successive points marks the end of one line and the start of another.
[157]:
data_df["relative_distance"] = airbornegeo.relative_distance(
    data_df,
    easting_column="easting",
    northing_column="northing",
)
data_df["segments_by_distance"] = airbornegeo.split_into_segments(
    data_df,
    threshold=5e3,  # 5 km, in metres
    column_name="relative_distance",
)
print(f"{len(data_df.segments_by_distance.unique())} segments")
46 segments
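For intuition, a distance-gap split can be sketched with NumPy and pandas alone. How `airbornegeo.relative_distance` defines its output is not shown in this notebook, so the sketch below assumes the simplest reading: the straight-line step from each point to the previous one in projected coordinates. The helper name `successive_distance` and the toy coordinates are made up for illustration.

```python
import numpy as np
import pandas as pd


def successive_distance(df, easting="easting", northing="northing"):
    """Distance from each point to the previous one, in coordinate units.

    Hypothetical helper for illustration; assumes projected (metric) coordinates.
    """
    dx = df[easting].diff()
    dy = df[northing].diff()
    # the first row has no predecessor; use 0 so it never triggers a split
    return np.hypot(dx, dy).fillna(0.0)


# toy track: 500 m steps with one 9 km jump between the third and fourth points
points = pd.DataFrame(
    {
        "easting": [0.0, 300.0, 600.0, 9600.0, 9900.0],
        "northing": [0.0, 400.0, 800.0, 800.0, 1200.0],
    }
)
points["step"] = successive_distance(points)
# start a new segment wherever the step exceeds 5 km
points["segment"] = (points["step"] > 5e3).cumsum() + 1
print(points)
```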
[158]:
airbornegeo.plotly_points(
    data_df,
    color_col="segments_by_distance",
    hover_cols=["original_line", "relative_distance"],
    robust=False,
    size=3,
)
3.4. Split lines on track / heading changes
Here we assume any change in track (heading) of more than 40 degrees between successive points marks the end of one line and the start of another. First we need to calculate the track from the latitude and longitude of the data.
[183]:
data_df["track"] = airbornegeo.track(
    data_df,
    latitude_column="latitude",
    longitude_column="longitude",
    ellipsoid=False,
)
airbornegeo.plotly_points(
    data_df,
    color_col="track",
    hover_cols=["original_line"],
    size=3,
)
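As a rough guide to what a track calculation involves, the sketch below computes the initial great-circle bearing between consecutive points on a sphere. It is an assumption that this resembles what `airbornegeo.track` does with `ellipsoid=False`; the function name `initial_bearing` and the sample coordinates are illustrative only.

```python
import numpy as np


def initial_bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2.

    Inputs are in degrees; the result is in degrees clockwise from north,
    in the range [0, 360). Spherical approximation, for illustration only.
    """
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dlon = np.radians(np.asarray(lon2) - np.asarray(lon1))
    # east and north components of the direction toward point 2
    east = np.sin(dlon) * np.cos(phi2)
    north = np.cos(phi1) * np.sin(phi2) - np.sin(phi1) * np.cos(phi2) * np.cos(dlon)
    return np.degrees(np.arctan2(east, north)) % 360.0


# three points along one meridian, moving toward the north
lat = np.array([-77.0, -76.9, -76.8])
lon = np.array([67.0, 67.0, 67.0])
# bearing of each step, from every point to the next one
track = initial_bearing(lat[:-1], lon[:-1], lat[1:], lon[1:])
print(track)
```

Near the pole, bearings change quickly along a straight projected line, so a threshold on track change may need tuning for surveys at very high latitudes.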
[ ]:
data_df = data_df.sort_values("unixtime")
data_df["segments_by_track"] = airbornegeo.split_into_segments(
    data_df,
    threshold=40,  # 40 degree track change
    column_name="track",
    angular_difference=True,  # treat values on either side of the 0/360 wrap as close together
)
print(f"{len(data_df.segments_by_track.unique())} segments")
66 segments
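The reason for an angular difference is that tracks of 359 and 1 degrees differ by 2 degrees, not 358. A minimal sketch of a wrap-aware difference, and of splitting on it, is given below. The helper name `angular_difference` is chosen here for illustration and is not claimed to match what the library does internally.

```python
import numpy as np


def angular_difference(a, b):
    """Smallest absolute difference between two directions, in degrees (0 to 180)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    # shift into [-180, 180) so that the 0/360 wrap is handled correctly
    return np.abs((diff + 180.0) % 360.0 - 180.0)


# toy track values: heading roughly north, then turning to roughly south
track = np.array([358.0, 1.0, 3.0, 182.0, 179.0])
change = angular_difference(track[1:], track[:-1])
# the first point has no predecessor, so it never starts a new segment
starts = np.concatenate([[False], change > 40])
segments = np.cumsum(starts) + 1
print(change)
print(segments)
```

A plain difference would flag the step from 358 to 1 degrees as a 357 degree change and split the line there, which is the case the wrap-aware version avoids.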
[250]:
airbornegeo.plotly_points(
    data_df,
    color_col="segments_by_track",
    hover_cols=["original_line", "track"],
    robust=False,
    size=3,
)
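Beyond comparing the maps by eye, one possible way to compare an automatic split with the hand-split lines is to count how many original lines were broken into several segments and how many segments span more than one original line. The sketch below uses only pandas; the function name `compare_labels` and the toy data are hypothetical.

```python
import pandas as pd


def compare_labels(df, reference, candidate):
    """Summarise how candidate segments disagree with reference lines.

    Hypothetical helper for illustration only.
    """
    # distinct candidate segments inside each reference line (>1: over-split)
    pieces = df.groupby(reference)[candidate].nunique()
    # distinct reference lines inside each candidate segment (>1: merged)
    merged = df.groupby(candidate)[reference].nunique()
    return {
        "over_split_lines": int((pieces > 1).sum()),
        "merged_segments": int((merged > 1).sum()),
    }


# toy labels: line 1 is split in two, lines 2 and 3 are merged into one segment
toy = pd.DataFrame(
    {
        "original_line": [1, 1, 1, 2, 2, 3, 3],
        "segments": [1, 1, 2, 3, 3, 3, 3],
    }
)
summary = compare_labels(toy, "original_line", "segments")
print(summary)
```

With the notebook's data, the same call could be made with `"original_line"` as the reference and any of the `segments_by_*` columns as the candidate.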