ERMITES brings together
leading international researchers and gives participants the
opportunity to gain deeper insight into advanced research trends in
scaled audiovisual information retrieval within an interdisciplinary framework. It is organized as a series
of long talks, during which attendees are invited to interact (links
to online videos of previous editions).
The goal of this edition is to train you to improve the performance of
soundscape or bioacoustic pattern detection and classification, at low
signal-to-noise ratio, and within the Big Data paradigm.
Thus, the objectives are three-fold: (a) to make the signal
representation more robust, (b) to develop more efficient supervised or unsupervised classification
of complex bioacoustic patterns,
(c) to collect and manage Big Bioacoustic Data
to improve model performance with respect to the variability of the targets.
Therefore ERMITES'15 will focus on methods
scaled to environmental survey using passive acoustics: design methods for accurate mid-level or high-level features based on
advanced signal decomposition,
compressed sensing for large-scale analyses, deep neural networks for accurate classification, and methods for real-time 3D tracking.
Illustrations will range from cetacean and bird
songs to bat and dolphin biosonars, and other animals from deep forest and
Biodiversity analysis and environmental protection projects are some of the
direct outcomes of these algorithms. You'll learn more how to
systems and how to conduct research in this area.
A special session will present a synthesis of data processing methods
for passive acoustic biotope survey with software and hardware
demonstrations under monitoring scenarios (such as those developed and
tested within the SABIOD.ORG project: Jason and Bombyx platforms, waveShark, SoundChaser, SeaPro, ...).
Introduction, presentation of the sessions and objectives. (10 min)
P. Flandrin, Research Director, CNRS, Académie des Sciences, Ecole Normale Supérieure Lyon
A time-frequency perspective on biosonar signals. (2 hours)
My talk will be devoted to a time-frequency description and
interpretation of some natural sonar signals used in bioacoustic active
navigation systems, namely bat echolocation calls. We'll observe and
explain some properties of the acoustic waveforms emitted by bats, which
consist of ultrasonic transient chirps of a few ms, with a wideband
spectrum of some tens of kHz between 40 and 100 kHz. We'll theoretically
show that their performance is close to optimality, with adaptation to
multiple tasks (detection, estimation, recognition, interference
rejection, ...). Besides biosonar issues, we'll also give a comprehensive
overview of the mathematical tools (based on Fourier, Wigner-Ville and
other time-frequency representations) that are involved in the analysis.
You can see this introduction to my talk with C. Villani and some detailed mathematics here.
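As a rough illustration of the time-frequency reading described in this abstract, the sketch below synthesizes a downward linear chirp and tracks its dominant frequency over time. It is only a minimal sketch under stated assumptions: the 3 ms duration, the 100 to 40 kHz sweep, the 400 kHz sample rate and all function names are chosen for illustration, and a plain short-time Fourier transform stands in for the Wigner-Ville and related representations mentioned in the talk.

```python
import cmath
import math


def linear_chirp(f_start, f_end, duration, sample_rate):
    """Cosine whose instantaneous frequency moves linearly from f_start to f_end."""
    n = round(duration * sample_rate)
    sweep_rate = (f_end - f_start) / duration  # Hz per second
    samples = []
    for i in range(n):
        t = i / sample_rate
        phase = 2 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t)
        samples.append(math.cos(phase))
    return samples


def peak_frequency_track(signal, sample_rate, frame_len=128, hop=64):
    """Dominant frequency (Hz) of each Hann-windowed frame, via a naive DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / (frame_len - 1))
              for k in range(frame_len)]
    track = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + k] * window[k] for k in range(frame_len)]
        best_bin, best_power = 0, -1.0
        for b in range(frame_len // 2 + 1):
            coeff = sum(frame[k] * cmath.exp(-2j * math.pi * b * k / frame_len)
                        for k in range(frame_len))
            power = abs(coeff) ** 2
            if power > best_power:
                best_bin, best_power = b, power
        track.append(best_bin * sample_rate / frame_len)
    return track


# Hypothetical bat-like call: 100 kHz down to 40 kHz in 3 ms, sampled at 400 kHz.
SAMPLE_RATE = 400_000
chirp = linear_chirp(100_000, 40_000, 0.003, SAMPLE_RATE)
track = peak_frequency_track(chirp, SAMPLE_RATE)
```

With these assumed parameters, `track` should start near the top of the sweep and end near the bottom, which is the falling ridge one would read off a spectrogram of such a call.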
V. Lostanlen, PhD student, Dpt of Computer Science, Ecole Normale Supérieure Paris
Multivariate scattering for bioacoustic
similarity retrieval. (40 min)
Most of the relevant information in bioacoustic signals is carried in their
transient structure. However, this structure is subject to many factors of
intra-class variability: pitch shifts, changes in chirping contour and
formantic profile, rhythmic drifts and so forth. Thus, a desirable
representation for environmental sounds should disentangle these factors of
variability, hence providing a simple characterization of long-range
interactions. To this aim, we will present the scattering transform, a
multiscale operator that is able to accurately represent acoustic transients
while remaining stable to small deformations in the time-frequency plane. We
will report a numerical evaluation of this approach on the large-scale species
identification challenge BirdCLEF.
H. Glotin, Pr. Univ. Toulon, Inst. Universitaire de France (IUF), CNRS
LSIS, with J. Razik, S. Paris, Toulon
Methods for communicative sounds mining: large scale bird & whale song classification: how sparse coding allows efficient bioacoustic indexing. (1 hour)
Sound is the primary carrier of communication and exploration for most
animals, enabling fast transfer of information in every ecosystem, from deep forest (bird,
insect, frog...) to long distances (thousands of km for whales), from infrasound to ultrasound
(bats...). Scaled passive acoustic monitoring has recently been developed to assess
changes in fauna composition and biodiversity w.r.t. anthropic impact, and to improve
management of species or natural resources (forest, ocean...).
We'll present recent advances within
the 'Scaled Acoustic Biodiversity' SABIOD network of CNRS MASTODONS.
Terabytes are now recorded each month within SABIOD, in forest and
deep ocean, by innovative autonomous sensor arrays.
The main difficulties
are the time-frequency variability of the sources, the heterogeneity and velocity
of the recordings, the complexity of the acoustic paths, and the mixture of sources.
We'll demonstrate strategies to tackle these issues by advanced
signal processing and machine learning methods that we develop to
detect, classify and localize bioacoustic sources in various ecosystems.
More precisely, we'll show how pattern analyses, including sparse coding and scattering,
are useful for tropical forest soundscape analyses (cf.
NIPS4B, ICML4B, and the LifeCLEF 2014 and 2015 1000-species challenges), and
undersea acoustic surveys (on the Toulon astrophysics observatory, ONC Victoria...).
We demonstrate efficient sparse coding classification of worldwide whale song recordings, and perspectives for bird song analyses.
P. Dugan (by videoconference), Dr., Cornell Univ., New York
High-Performance Computing Platform for
Analyzing Big Bioacoustic Data. (25 min)
Marine mammals are dependent on access to their normal acoustic habitats for basic life functions, including communication, food finding, navigation and predator detection. Cetaceans are adapted to produce and perceive a great variety of sounds that collectively span 4-6 orders of magnitude along the dimensions of frequency, time and space. Sounds from human activities (vessel noise, energy exploration, commercial shipping) can result in measurable losses of marine mammal acoustic habitats, which drives the need for technology capable of finding whale sounds in large databases of sounds. When sounds are converted to pictures using spectrograms, the human visual system is very good at finding whale calls and song, but the task is inefficient and tedious for the human operator.
This talk will focus on advanced developments in computer algorithm technologies designed to automatically find whale sounds in large datasets of acoustic recordings. Recent developments in advanced computing have allowed researchers to unlock new information about marine mammals in large datasets. The authors will summarize specific examples, recorded in the Stellwagen Bank National Marine Sanctuary, MA, USA, of processing large quantities of continuous sound data using advanced detection-classification analytics. This talk will also cover the application of a high-performance computing system called the acoustic data accelerator (HPC-ADA) to explore the spatio-temporal dynamics of a suite of acoustically active marine mammals (fin, humpback, minke, and right whales). The mechanics of the HPC-ADA will be discussed, along with how distributed processing tackles large datasets and high sample rates (200 kHz). The results yield insights into the acoustic behavior of marine mammals, with the goal of better understanding the marine ecology of large cetaceans.
(See here for some implementation details.)
A. Joly, Researcher, INRIA, Montpellier
Towards multimodal environmental data indexing (crowd-sourced audiovisual contents in LifeCLEF). (1 hour)
We will discuss the applicability of content-based indexing methods for
the identification of living organisms in crowd-sourced audio-visual
data. We first review the basics of content-based indexing techniques
and give a comprehensive overview of the state-of-the-art of this
Then we present how such methods can be efficiently deployed as
instance-based classifiers for the identification of living organisms,
in particular in large-scale corpora built from crowd-sourced
observations. We will therefore introduce LifeCLEF, an international
evaluation campaign gathering tens of research groups worldwide and
working on 3 real-world biodiversity challenges (Pl@ntNet images,
xeno-canto bird recordings, Fish4knowledge fish videos).
We will more particularly describe the 2014 participation of our
research group in these challenges and demonstrate the genericity of
content-based indexing methods for handling radically different
B) Research in the field / industrial applications
G. Pavan, Pr. Univ. Pavia, Italy
Long term bioacoustic and ecological analysis of marine and terrestrial soundscapes combined with noise monitoring. (1 hour)
The long term monitoring of the natural environment and of the changes
induced by human activities is an emerging issue worldwide. The acoustic
monitoring of the underwater world has been mainly developed in the
last two decades to monitor marine mammal presence and distribution in
relation to the impact of anthropogenic underwater sound sources (Navy
sonar, seismic exploration, ship noise). In the terrestrial environment
the acoustic studies have been mainly driven by the interest in studying
individual species and, more recently, in monitoring the biodiversity
of habitats, either pristine or under anthropogenic pressures.
Examples of the NEMO-ONDE-SMO experiments to monitor marine sounds and
of the SABIOD project focused on the monitoring of an Integral Nature
Reserve in central Italy will be presented. Original hardware and
software solutions will also be illustrated.
C. Gervaise, Researcher, GIPSA Lab, France
Biophonics and 3D transect for the classification of marine biotopes. (25 min)
For several years, underwater soundscapes have emerged as a promising
fingerprint of the state of marine ecosystems. Marine soundscapes carry
information about the three major ecosystem components: biotic
life, abiotic forcing, and anthropic pressures. Thanks to recent instrumental
development, marine soundscapes may be recorded over long time periods to
account for several relevant temporal scales. In addition to this
temporal characterization, spatial characterization (trends,
variabilities, environmental factors) must be assessed. In the
presentation, we will present new developments to study spatial
variabilities of soundscapes at two different scales (local-scale
100 m x 100 m, meso-scale 3 km x 3 km). In situ protocol and instrumentation,
data processing and examples on real data will be presented in detail.
D. Mauuary, PhD, PDG Cyberio, France
Real-time tracking of chiroptera and Crowdsourcing : 'My City my Bats Project'. (1 hour)
Bats use wideband acoustic signals (10-100 kHz) to echolocate their
immediate surroundings and prey. We show that these signals are also ideal for
being tracked by lightweight multi-sensor devices that are easily deployed
in the field. Consequently, biologists may now easily gain new knowledge
about the animal behavior and ecology.
Acoustic tracking of bats may be considered at several scales:
microscopic, mesoscopic and macroscopic. We propose different kinds
of land coverage by acoustic sensors to address these different scales.
The mesoscopic scale is currently under experimental investigation in an urban context ('My City my Bats Project').
Finally, we suggest that flight path tracking technological development
gives rise to new insights/research paradigms about how the bat performs
the acoustical tracking in its brain.
N. Boucher, PDG, SoundId,
Solutions for long distance recordings, azimuth and classification. (30 min)
I'll present our R&D advances in detection and recognition. First,
I'll present a system that is already developed to extract the callers
from a dawn chorus faster than real time. I will then discuss our
methods to estimate the direction and range of the source with an n x 8
channel system for air and underwater use, and how to obtain
good S/N with this system. To improve these approaches, we'll
show our new scaled algorithm that finds how many unique
calls are present in a reference sample of any size, automatically and
quickly. We will show real-time recognition: we can compare millions of
full spectral images with full spectral references per second. Finally,
I'll present a 1+ km air sound recorder (and an extension to a 10+ km
system using cheapish parts, <$2000). The system is recording
successfully in wind and rain.
G. Pavan, H. Glotin, D. Mauuary, P. Arlotto, N. Boucher
From individual sounds to soundscape analysis. Demonstration of
hardware and software tools. (40 min)
We present efficient hardware and software solutions for long-term and precise
recording, used in the field in the SABIOD.ORG project. Some of them are
developed by Univ. Pavia and Univ. Toulon, Gipsa or Cyberio, to record
Alpine fauna, whales off Toulon, or bats in Port-Cros National
Park (BOMBYX / JASON, involving 30 researchers in Toulon and international collaborations).
C) Other communications
Dr. T. Papadopoulos, S. Roberts, K. Willis,
James Martin Research Fellow,
University of Oxford, UK
Detecting bird sound in unknown acoustic background using crowdsourced training data. (25 min)
Biodiversity monitoring using audio recordings is achievable at a truly
global scale via large-scale deployment of inexpensive, unattended
recording stations or by large-scale crowdsourcing using recording and
species recognition on mobile devices. The ability, however, to reliably
identify vocalizing animal species is limited by the fact that acoustic
signatures of interest in such recordings are typically embedded in a
diverse and complex acoustic background. To avoid the problems
associated with modeling such backgrounds, we build generative models
of bird sounds and use the concept of novelty detection to screen
recordings to detect sections of data which are likely to be bird
vocalisations. We present detection results against various acoustic
environments and different signal-to-noise ratios. We discuss the issues
related to selecting the cost function and setting detection thresholds
in such algorithms. Our methods are designed to be scalable and
automatically applicable to arbitrary selections of species depending on
the specific geographic region and time period of deployment.
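To make the screening idea in this abstract concrete, here is a minimal sketch assuming a much simpler model than the one the authors describe: a single Gaussian is fitted to one hypothetical per-frame feature measured on known bird sounds, and new frames are kept only when their log-likelihood under that model reaches a threshold. The feature values, the threshold rule (the lowest training log-likelihood) and all function names are assumptions for illustration only.

```python
import math


def fit_gaussian(values):
    """Maximum-likelihood mean and variance of a 1-D feature."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def log_likelihood(x, mean, variance):
    """Log-density of x under a Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * variance) + (x - mean) ** 2 / variance)


def screen_frames(frames, mean, variance, threshold):
    """Flag frames that the bird-sound model explains well enough."""
    return [log_likelihood(x, mean, variance) >= threshold for x in frames]


# Hypothetical training feature (e.g. dominant frequency in kHz) from labelled bird sounds.
training = [3.9, 4.1, 4.0, 4.2, 3.8]
mean, variance = fit_gaussian(training)

# Assumed threshold rule: accept anything at least as likely as the worst training frame.
threshold = min(log_likelihood(x, mean, variance) for x in training)

# Unlabelled frames: two resemble the training data, two do not.
flags = screen_frames([4.05, 1.0, 3.95, 7.5], mean, variance, threshold)
```

The background is never modelled here: frames are rejected only because the bird-sound model assigns them low likelihood. Moving `threshold` trades missed calls against false alarms, which is the threshold-setting issue the abstract raises.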
A. Eldridge (a), M.Casey (b), P. Moscoso (a), M. Peck (a) & N. Morales (c) - (a) University of Sussex, UK (b) University of Dartmouth, US (c) Santa Lucia Forest Reserve, Ecuador
Toward the Extraction of Ecologically-Meaningful Soundscape Objects: A New Direction for Soundscape Ecology and Rapid Acoustic Biodiversity Assessment?
Efficient methods of biodiversity assessment and monitoring are central to ecological research and crucial in conservation management; technological advances in remote acoustic sensing inspire new approaches. In line with the emerging field of Soundscape Ecology, the acoustic approach is based on the rationale that the ecological processes occurring within a landscape are tightly linked to and reflected in the high-level structure of the patterns of sounds emanating from those landscapes: the soundscape. Rather than attempting to recognise species-specific calls, either manually or automatically, analysis of the high-level structure of the soundscape tackles the problem of diversity assessment at the community (rather than species) level. Preliminary work has attempted to make a case for community-level acoustic indices. Existing indices provide simple statistical summaries (e.g. Shannon entropy calculated on the frequency or time domain signal). In doing so, structural complexities arising from spectro-temporal partitioning are lost, limiting their power both as monitoring and investigative tools. In this paper we consider sparse-coding and source separation algorithms as a means to access and summarise ecologically-meaningful sound objects. In doing so we highlight a potentially fruitful union of the conceptual framework of Soundscape Ecology and source separation methods as a new direction for understanding and assessing ecologically relevant interactions in the soundscape.
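The kind of simple statistical summary mentioned in this abstract can be sketched as a Shannon entropy computed on a normalized power spectrum. This is one common formulation, assumed here for illustration and not necessarily the index used by the authors; the function names and test signals are hypothetical.

```python
import cmath
import math
import random


def power_spectrum(signal):
    """Power at each non-negative frequency bin, via a naive DFT."""
    n = len(signal)
    spectrum = []
    for b in range(n // 2 + 1):
        coeff = sum(signal[k] * cmath.exp(-2j * math.pi * b * k / n)
                    for k in range(n))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(signal):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    power = power_spectrum(signal)
    total = sum(power)
    if total == 0.0:
        return 0.0  # silent signal: no spectral content to summarize
    probs = [p / total for p in power]
    entropy = -sum(q * math.log(q) for q in probs if q > 0.0)
    return entropy / math.log(len(power))


n = 256
# A pure tone concentrates its energy in one bin; white noise spreads it over all bins.
tone = [math.sin(2 * math.pi * 16 * k / n) for k in range(n)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

tone_entropy = spectral_entropy(tone)
noise_entropy = spectral_entropy(noise)
```

The index collapses the whole spectrum to one number, so two soundscapes with very different spectro-temporal structure can receive similar values. That loss of structure is the limitation the abstract points to.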
University of Hull,
The Yorkshire Soundscape Project. (25 min)
Recent developments in the field of ecoacoustics have yielded different
approaches to environmental monitoring through the use of sound
recording, including the capturing and analysis of entire soundscapes
(rather than individual species). May 2015 will mark exactly forty years
since the World Soundscape Project (from Simon Fraser University,
Vancouver) visited places within Europe in order to produce Five Village
Soundscapes and the accompanying European Sound Diary (both 1977). This
project focuses on the capturing and analysis of a range of soundscapes
using rural Yorkshire as its study area, developing effective sound
recording and analysis techniques in order to enhance environmental
monitoring; taking an interdisciplinary approach, combining sonic arts
and ecology practices. It begins with a comparative analysis of archived
sound recordings from 1975 as part of the World Soundscape Project
database at Simon Fraser University.
This paper will elaborate on the current, early, stages of the project
which have involved relocating the soundmarks of said catalogue within
the Yorkshire Dales, making preliminary notes on the changes that have
occurred, sonically and geographically, and developing hypotheses to
work from once rigorous data collection has taken place during the
anniversary period in May 2015. It will review relevant field and
compositional techniques including the creation of sound maps (e.g.
Waldlock 2011 and Pijanowski et al 2011), soundscape composition,
spectral analysis, the ontological challenges of working with aural
archives (e.g. Chasalow 2006), and a comparison of technological changes
in the field of sound recording (e.g. Stern 2003). In this sense,
practical and theoretical methodologies work hand-in-hand.
G. Lauvin, P. Sinclair (a), H. Glotin (b) -
(a) Univ. AMU, ESA, Bourges ENSAB, (b) Univ. Toulon
Locus Sonus Open Microphone Project and Split Soundscape: a multi medium listening and analysis. (25 min)
Locus Sonus is a research group maintained by the art colleges of Aix-en-Provence (ESA Aix) and Bourges (ENSAB). Its main aim is to explore the ever-evolving relationship between sound, place and usage.
Since 2006 Locus Sonus has been organizing and maintaining a network of 'Open Microphones' that permanently stream live audio captures of 'soundscapes' to the internet.
During a recent encounter between H. Glotin, head of the SABIOD project, and Peter Sinclair, head of Locus Sonus, it appeared that there are unexpected similarities between the research projects of their respective labs.
G. Lauvin and Locus Sonus are developing a sound installation for the second edition of the sound art exhibition Domaine des Murmures, at the Chateau d'Avignon, in Camargue. The artwork will use the system developed by SABIOD to capture and transmit audio data from a bio-acoustically rich location close to the Chateau: aquatic, ultrasonic and (audible) sonic data will be used to (re)create a natural symphony. The naturally occurring acoustic environment will be orchestrated with sonified data corresponding to the sub-aquatic and ultrasonic sounds.
This project is part of G. Lauvin's doctoral research, Split Soundscape, that focuses on the paradigm of real-time transmission and reconstitution of soundscape.
U. Moliterno de Camargo,
University of Helsinki,
Identification uncertainty and probabilistic classification methods: from DNA
sequences to bird species identity. (25 min)
Just as next-generation sequencers have fueled the growth of genomics, autonomous recording units have brought rapid growth in bioacoustic data processing. The jump in the amount of data generated by autonomous recorders, compared to manual listening by field observers, is comparable to the jump from conventional sequencing methods to massively parallel sequencing. Species identification through genetic barcoding is based on comparing the similarity of a query sequence to reference sequences obtained from well-identified samples. Similarly, acoustic species identification is based on comparing the similarity of a query vocalization to those present in a reference database. Both approaches have traditionally been based on arbitrary threshold levels of similarity between samples, resulting in a poor description of identification uncertainty. I will present a Bayesian approach to estimate identification uncertainty using DNA barcoding data and illustrate how I am adapting this framework to the automated identification of bird vocalizations.
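The contrast drawn in this abstract between a fixed similarity threshold and a probabilistic statement of uncertainty can be sketched as follows. This is a toy example, not the speaker's model: the exponential likelihood, the `sharpness` parameter, the species names and the similarity scores are all assumptions made for illustration.

```python
import math


def threshold_matches(similarities, threshold):
    """Classical rule: every reference species at or above the threshold is a match."""
    return [name for name, sim in similarities.items() if sim >= threshold]


def species_posterior(similarities, priors, sharpness=10.0):
    """Posterior over candidate species from Bayes' rule.

    Assumed likelihood: proportional to exp(sharpness * similarity).
    """
    weights = {name: priors[name] * math.exp(sharpness * sim)
               for name, sim in similarities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical similarity of one query vocalization to three reference species.
similarities = {"species_a": 0.92, "species_b": 0.90, "species_c": 0.40}
priors = {"species_a": 1 / 3, "species_b": 1 / 3, "species_c": 1 / 3}

matches = threshold_matches(similarities, 0.9)
posterior = species_posterior(similarities, priors)
```

With these assumed scores, the threshold rule returns two matches and says nothing about how to choose between them. The posterior instead splits most of the probability between the two close candidates, making the identification uncertainty explicit.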
J. Sebastian Ulloa (c), A. Gasc (a), P. Gaucher (b), J. Sueur (a)
(a) Inst. Systematique, Evolution, Biodiversite, ISYEB UMR 7205 CNRS MNHN UPMC EPHE, Museum Nat. d'Histoire Naturelle, Sorbonne Univ., Paris ; (b) CNRS USR 3456 Guyane, Cayenne, France ; (c) Equipe Communications Acoustiques, Neuro-PSI, UMR 9197 CNRS-Univ. Paris-Sud,
Screening big audio data to determine the time and space distribution of a bird in a
tropical forest. (25 min)
The tropical forest is a complex acoustic environment: besides background noise from wind
and rain, there are multiple sources of sound due to the diversity of co-existing species. Using
signal processing and pattern recognition techniques, we managed to implement and tune an
automatic system designed to detect a specific acoustic signature: the song of the Amazonian
bird Lipaugus vociferans. The detection system scanned nearly 60 thousand files revealing
novel ecological information and raising further questions about this species. Our results
showed that the vocal activity of L. vociferans follows a clear circadian rhythm. Spatial
distribution patterns corresponded to specific habitat features that we identified by comparing
the results with hydrology and vegetation maps.
Registration fees are only 80 euros including 2 meals, coffee breaks, and proceedings
(additional 20 euros to subscribe to the gala dinner).