A ship-to-cloud machine learning pipeline built on the open-source Python Echostack software tools

Abstract

Successful application of machine learning (ML) methodology requires iterative development and testing of not only the models but also the entire workflow, on the platform and in the operating scenario that the development aims to serve, before the framework is generalized to other settings. In this work we present our implementation of a ship-to-cloud ML pipeline during the 2023 Pacific hake acoustic-trawl survey. Hake is a keystone species in the northern California Current ecosystem and supports the largest fishery on the west coast of the U.S. By integrating an echogram semantic segmentation model targeting hake with the “Echostack” suite of open-source Python software packages, our pipeline transformed raw instrument-generated binary files into hake aggregation predictions, which were displayed in two ways: in a configurable Python dashboard that can be shared widely with collaborators, and in Echoview for alignment with live screening. We transmitted reduced-resolution data products and the corresponding ML predictions to the cloud in sub-realtime, allowing shore-side interaction. In the future, we plan to incorporate biomass estimation based on initial fish biometric measurements, automate the orchestration of this ship-to-cloud pipeline, and prototype an ML-driven annotation framework.

Date
Apr 11, 2024 9:00 AM
Location
Brest, France
Wu-Jung Lee
Senior Oceanographer
Valentina Staneva
Senior Data Scientist
Dingrui Lei

I am interested in building applications!
