-
Notifications
You must be signed in to change notification settings - Fork 0
/
spectrograms.qmd
652 lines (381 loc) · 18.6 KB
/
spectrograms.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
---
title: <font size="7"><b>Building spectrograms</b></font>
toc: true
toc-depth: 2
toc-location: left
number-sections: true
highlight-style: pygments
format:
html:
df-print: kable
code-fold: show
code-tools: true
css: styles.css
link-external-icon: true
link-external-newwindow: true
---
::: {.alert .alert-info}
## **Objetives** {.unnumbered .unlisted}
- Understand how spectrograms are created
- Learn to create and customize spectrograms in R
:::
```{r, echo = FALSE}
library(knitr)
opts_chunk$set(tidy = TRUE, warning = FALSE, message = FALSE)
```
The spectrogram is a fundamental tool in the study of acoustic communication in vertebrates. They are basically a visual representation of the sound where the variation in energy (or power spectral density) is shown on both the frequency and the time domains. Spectrograms allow us to visually explore acoustic variation in our study systems, which makes it easy to distinguish structural differences at small temporal/spectral scales that our ears cannot detect.
We will use the **seewave** package and its sample data:
```{r, echo=TRUE, warning=FALSE, message=FALSE}
library(seewave)
# load examples
data(tico)
data(orni)
```
## Fourier transformation
In order to understand the information contained in a spectrogram it is necessary to understand, at least briefly, the [Fourier transformation](https://en.wikipedia.org/wiki/Fourier_transform). In simple words, this is a mathematical transformation that detects the periodicity in time series, identifying the different frequencies that compose them and their relative energy. Therefore it is said that it transforms the signals from the time domain to the frequency domain.
To better understand how it works, we can simulate time series composed of pre-defined frequencies. In this example we simulate 3 frequencies and join them in a single time series:
```{r}
# freq
f <- 11025
# time sequence
t <- seq(1/f, 1, length.out=f)
# period
pr <- 1/440
w0 <- 2 * pi/pr
# frec 1
h1 <- 5 * cos(w0*t)
plot(h1[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
# frec 2
h2 <- 10 * cos(2 * w0 * t)
plot(h2[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
# frec 3
h3 <- 15 * sin(3 * w0 * t)
plot(h3[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
```
This is what the union of the three frequencies looks like:
```{r}
H0 <- 0.5 + h1 + h2 + h3
plot(H0[1:75], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
```
Now we can apply the Fourier transform to this time series and graph the frequencies detected using a periodogram:
```{r}
fspc <- Mod(fft(H0))
plot(fspc, type="h", col="blue",
xlab="Frecuency (Hz)",
ylab="Amplitude (no units)")
abline(v = f / 2, lty = 2)
text(x = (f / 2) + 1650, y = 8000, "Nyquist Frequency")
```
We can make zoom in to frequencies below the Nyquist frequency:
```{r}
plot(fspc[1:(length(fspc) / 2)], type="h", col="blue",
xlab="Frecuency (Hz)",
ylab="Amplitude (no units)")
```
This diagram (taken from Sueur 2018) summarizes the process we just simulated:
```{r songs 2.3, out.width = "400px", echo = FALSE, fig.align= "center"}
knitr::include_graphics("images/FFT_decomp.png")
```
<font size="1"><b>Tomado de Sueur 2018</b></font>
The periodogram next to the spectrogram of these simulated sounds looks like this:
```{r, echo = FALSE, warning=FALSE, message=FALSE}
opar <- par()
par(mfrow = c(1, 2), mar =c(4, 4, 1, 0))
plot(fspc[1:(length(fspc) / 2)], 1:(length(fspc) / 2), type="l", col="blue", ylab="Frequency (Hz)",
xlab="Amplitude (no units)", ylim = c(110, 2890), lwd = 3, xlim = c(10000, 90000))
par(mar = c(4, 0, 1, 1))
spectro(H0, palette = reverse.gray.colors.1, scale = FALSE, grid = FALSE, f = f, flim = c(0.1, 2.9), flab = NULL, axisY = FALSE)
par(opar)
```
## From the Fourier transformation to the spectrogram
The spectrograms are constructed of the spectral decomposition of discrete time segments of amplitude values. Each segment (or window) of time is a column of spectral density values in a frequency range. Take for example this simple modulated sound, which goes up and down in frequency:
```{r, echo = FALSE}
norm <- dnorm(-4000:3999, sd=1000)
s <- synth2(env=norm, ifreq=500+(norm/max(norm))*1000, f=8000, plot=FALSE, osc=TRUE, ovlp=0)
s.wav <- tuneR::Wave(left = s, samp.rate = 8000, bit = 16)
monitoR::viewSpec(s.wav, ovlp = 95, main = NA, frq.lim = c(0, 3))
```
If we divide the sound into 10 segments and make periodograms for each of them we can see this pattern in the frequencies:
```{r, eval = F, echo = FALSE}
# animacion de periodograma junto a espectrograma
norm <- dnorm(-4000:3999, sd=1000)
s <- synth2(env=norm, ifreq=500+(norm/max(norm))*1000, f=8000, plot=FALSE, osc=TRUE, ovlp=0)
library("animation")
num.segmts <- 10
sq <- round(seq(1, length(s), length.out = num.segmts + 1))
sq.spc <- seq(0, duration(s, f), length.out = num.segmts + 1)
# loop over different number of time windows
saveGIF(
for(x in 2:length(sq))
{
# modulus fft
fspc1 <- Mod(fft(s[sq[(x-1)]:sq[x]]))
par(mfrow = c(1, 2), mar =c(4, 4, 1, 0), xaxt = "n", yaxt = "n")
plot(fspc1[1:(length(fspc1) / 2)], 1:(length(fspc1) / 2), type="l", col="blue", ylab="Frecuency (Hz)", xlab="Amplitude (no units)", lwd = 3, ylim = c(20, 220))#, xlim = c(100, max(fspc)))
usr <- par("usr")
axis(side = 2, at = seq(usr[3], usr[4], length.out = 4), labels = seq(0, 3, length.out = 4))
par(mar = c(4, 0, 1, 1))
spectro(s, palette = reverse.gray.colors.1, scale = FALSE, grid = FALSE, f = f, flim = c(0.2, 3), flab = NULL, axisY = FALSE, collevels = seq(-100, 0 , 5), wl = 250)
rect(xleft = sq.spc[x - 1], xright = sq.spc[x], ytop = 4, ybottom = 0, col = adjustcolor("blue", 0.1), border = NA)
text(x = mean(sq.spc[x - 1], sq.spc[x]), y = 2.5, labels = (1:num.segmts)[x - 1], cex = 3, pos = 4)
},
movie.name = "periodogram_spectro.gif", interval = 0.7, ani.width = 480 * 1.5)
```
```{r, eval = F, echo = FALSE}
# animacion de periodograma
norm <- dnorm(-4000:3999, sd=1000)
s <- synth2(env=norm, ifreq=500+(norm/max(norm))*1000, f=8000, plot=FALSE, osc=TRUE, ovlp=0)
# plot(s[1:1000], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
#
# plot(s[7000:8000], type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
#
#
# plot(s, type = "l", col = "blue", xlab = "Time (samples)", ylab = "Amplitude (no units)")
library("animation")
sq <- round(seq(1, length(s), length.out = 10))
# loop over different number of time windows
saveGIF(
for(x in 2:length(sq))
{
# modulus fft
fspc <- Mod(fft(s[sq[(x-1)]:sq[x]]))
plot(fspc, type="h", col="blue",
xlab="Frecuency (Hz)",
ylab="Amplitude (no units)",
xlim = c(0, 400),
xaxt = "n",
yaxt = "n"
)
},
movie.name = "periodogram.gif", interval = 0.5, ani.width = 480 * 1.5)
```
<img src="images/periodogram_spectro.gif" alt="Spectrogram resolution" width="600"/>
This animation shows in a very simple way the logic behind the spectrograms: if we calculate Fourier transforms for short segments of time through a sound (e.g. amplitude changes in time) and concatenate them, we can visualize the variation in frequencies over time.
## Overlap
When frequency spectra are combined to produce a spectrogram, the frequency and amplitude modulations are not gradual:
```{r, echo = FALSE}
s.wav <- tuneR::Wave(left = s, samp.rate = 8000, bit = 16)
monitoR::viewSpec(s.wav, main = NA, frq.lim = c(0, 3))
```
There are several "tricks" to smooth out the contours of signals with high modulation in a spectrogram, although the main and most common is window overlap. The overlap recycles a percentage of the amplitude samples of a window to calculate the next window. For example, the sound used as an example, with a window size of 512 points divides the sound into 15 segments:
```{r, echo = FALSE}
monitoR::viewSpec(s.wav, main = NA, frq.lim = c(0, 3), wl = 512)
num.segmts <- 15
usr <- par("usr")
sq.spc <- seq(usr[1], usr[2], length.out = num.segmts + 1)
for(x in 2:length(sq.spc))
rect(xleft = sq.spc[x - 1], xright = sq.spc[x], ytop = (usr[4] - usr[1]) / 2, ybottom = usr[3], col = adjustcolor("blue", 0.1), border = "blue")
```
A 50% overlap generates windows that share 50% of the amplitude values with the adjacent windows. This has the visual effect of making modulations much more gradual:
```{r, echo = FALSE}
monitoR::viewSpec(s.wav, main = NA, frq.lim = c(0, 3), wl = 512)
num.segmts <- 15
usr <- par("usr")
sq.spc <- seq(usr[1], usr[2], length.out = num.segmts + 1)
for(x in 2:length(sq.spc))
rect(xleft = sq.spc[x - 1], xright = sq.spc[x], ytop = (usr[4] - usr[1]) / 2, ybottom = usr[3], col = adjustcolor("blue", 0.1), border = "blue")
sq.spc <- sq.spc + sq.spc[1]
for(x in 2:(length(sq.spc) - 1))
rect(xleft = sq.spc[x - 1], xright = sq.spc[x], ybottom = (usr[4] - usr[1]) / 3, ytop = (usr[4] - usr[1]) / 8 * 7, col = adjustcolor("blue", 0.1), border = "blue")
```
Which increases (in some way artificially) the number of time windows, without changing the resolution in frequency. In this example, the number of time windows is doubled:
```{r, echo = FALSE}
monitoR::viewSpec(s.wav, main = NA, frq.lim = c(0, 3), wl = 512, ovlp = 50)
num.segmts <- 30
usr <- par("usr")
sq.spc <- seq(usr[1], usr[2], length.out = num.segmts + 1)
for(x in 2:length(sq.spc))
rect(xleft = sq.spc[x - 1], xright = sq.spc[x], ytop = usr[4], ybottom = usr[3], col = adjustcolor("blue", 0.1), border = "blue")
```
Therefore, the greater the overlap the greater the smoothing of the contours of the sounds:
```{r, eval = FALSE, echo = FALSE}
# animacion resolucion spectrograma
library(animation)
# loop over different number of time windows
saveGIF(for(i in c(seq(0, 99, by = 5), rep(99, 15)))
{
monitoR::viewSpec(s.wav, main = paste("overlap =", i), frq.lim = c(0, 3), wl = 512, ovlp = i)
num.segmts <- 15 * 100 / (100 - i)
usr <- par("usr")
sq.spc <- seq(usr[1], usr[2], length.out = num.segmts + 1)
for(x in 2:length(sq.spc))
rect(xleft = sq.spc[x - 1], xright = sq.spc[x], ytop = usr[4], ybottom = usr[4] * 3/4, col = adjustcolor("blue", 0.1), border = adjustcolor("blue", 1 - (i / 100)))
},movie.name = "spectro.overlap.gif", interval = 0.2, ani.width = 480 * 1.5)
```
<img src="images/spectro.overlap.gif" alt="Spectrogram overlap" width="600"/>
This increases the number of windows as a function of the overlap for this particular sound:
```{r, echo = FALSE}
vnts <- sapply(seq(0, 99, 1), function(i) (8000/512) * 100 / (100 - i))
par(mar = c(4, 4, 1, 1))
plot(seq(0, 99, 1), vnts, type = "l", col = "red", lwd = 4, ylab = "# of time windows", xlab = "Overlap (%)")
```
This increase in spectrogram sharpness does not come without a cost. The longer the time windows, the greater the number of Fourier transformations to compute, and therefore, the greater the duration of the process. This graphic shows the increase in duration as a function of the number of windows on my computer:
```{r, eval = FALSE, echo = FALSE}
tme <- sapply(seq(0, 99, 1), function(i)
a <- system.time(monitoR::viewSpec(s.wav, main = paste("overlap =", i), frq.lim = c(0, 3), wl = 512, ovlp = i, plot = FALSE)))
return(a[3])
)
write.csv(data.frame(t(tme)), "time of overlap.csv", row.names = FALSE)
```
```{r, eval = TRUE, echo = FALSE}
tme <- read.csv("./data/time of overlap.csv")[, 3]
par(mar = c(4, 4, 1, 1))
plot(vnts, tme * 1000, type = "l", col = "red", lwd = 4, xlab = "# of time windows", ylab = "Duration (ms)")
```
It is necessary to take this cost into account when producing spectrograms of long sound files (\> 1 min).
## Limitations
However, there is a trade-off between the resolution between the 2 domains: the higher the frequency resolution, the lower the resolution in time. The following animation shows, for the sound of the previous example, how the resolution in frequency decreases as the resolution in time increases:
<img src="images/spectro.precision.tradeoff2.gif" alt="Spectrogram resolution 2"/>
```{r, eval = F, echo = FALSE}
# animacion resolucion spectrograma
library(animation)
# loop over different number of time windows
saveGIF(for(x in seq(10, 402, 4))
{
# time windows
tw <- seq(1, 8000, length.out = x)
# fourier transformation for each time window
fspc <- lapply(2:length(tw), function(i) {
# transform and log
ft <- log(Mod(fft(s[tw[i - 1]:tw[i]])))
# standardize
ft <- ft / max(ft)
return(ft)
})
# put results together in a matrix
spc <- do.call(rbind, fspc)
# remover valores sobre frec Nyquist
spc <- spc[, 1:(ncol(spc) / 2)]
spc[spc > 1] <- 2.2
# plot spectrogram
image(z = spc, col = c("red", "white"), main = paste0(x, " time windows (~", round(1/x * 1000, 1), " ms each)"), xlab = "Time (s)", axes = FALSE, ylab = "Frequency (kHz)", ylim = c(0, 0.75))
# add axis
axis(side = 1)
axis(side = 2, at = seq(0, 0.75, length.out = 4), labels = seq(0, 3, length.out = 4))
# wait
# Sys.sleep(0.1)
},movie.name = "spectro.precision.tradeoff2.gif", interval = 0.12, ani.width = 480 * 1.5)
```
This is the relationship between frequency resolution and time resolution for the example signal:
```{r, eval = TRUE, echo = FALSE}
out <- lapply(seq(10, 402, 4), function(x)
{
tw <- seq(1, 8000, length.out = x)
# transform and log
ft <- log(Mod(fft(s[tw[1]:tw[2]])))
return(data.frame(wl = x, res.t = 1 / x, res.f = 4000 / length(ft)))
})
res <- do.call(rbind, out)
par( mar =c(4, 4, 1, 1))
plot(res$res.t, res$res.f, type = "l", col = "blue", lwd = 4, xlab = "Time window size (s)", ylab = "Frequency bin size (Hz)")
```
## Creating spectrograms in R
There are several R packages with functions that produce spectrograms in the graphical device. This chart (taken from Sueur 2018) summarizes the functions and their arguments: <img src="images/r-spectrogram-funs.png" alt="Spectrogram functions" width="700"/>
We will focus on making spectrograms using the `spectro ()` function of **seewave**:
```{r, eval = TRUE, echo = TRUE}
tico2 <- cutw(tico, from = 0.55, to = 0.9, output = "Wave")
spectro(tico2, f = 22050, wl = 512, ovlp = 90,
collevels = seq(-40, 0, 0.5),
flim = c(2, 6), scale = FALSE)
```
::: {.alert .alert-info}
<font size="5">Exercise</font>
- How can I increase the overlap between time windows?
- How much longer it takes to create a 99%-overlap spectrogram compare to a 5%-overlap spectrogram?
- What does the argument 'collevels' do? Increase the range and look at the spectrogram.
- What do the 'flim' and 'tlim' arguments determine?
- Run the examples that come in the `spectro()` function documentation
:::
Almost all components of a spectrogram in **seewave** can be modified. We can add scales:
```{r, eval = TRUE, echo = TRUE}
spectro(tico2, f = 22050, wl = 512, ovlp = 90,
collevels = seq(-40, 0, 0.5),
flim = c(2, 6), scale = TRUE)
```
Change the color palette:
```{r, eval = TRUE, echo = TRUE}
spectro(tico2, f = 22050, wl = 512, ovlp = 90,
collevels = seq(-40, 0, 0.5),
flim = c(2, 6), scale = TRUE,
palette = reverse.cm.colors)
```
```{r, eval = TRUE, echo = TRUE}
spectro(tico2, f = 22050, wl = 512, ovlp = 90,
collevels = seq(-40, 0, 0.5),
flim = c(2, 6), scale = TRUE,
palette = reverse.gray.colors.1)
```
Remove the vertical lines:
```{r, eval = TRUE, echo = TRUE}
spectro(tico2, f = 22050, wl = 512, ovlp = 90,
collevels = seq(-40, 0, 0.5),
flim = c(2, 6), scale = TRUE,
palette = reverse.gray.colors.1,
grid = FALSE)
```
Add oscillograms (waveforms):
```{r, eval = TRUE, echo = TRUE}
spectro(tico2, f = 22050, wl = 512, ovlp = 90,
collevels = seq(-40, 0, 0.5),
flim = c(2, 6), scale = TRUE,
palette = reverse.gray.colors.1,
grid = FALSE,
osc = TRUE)
```
::: {.alert .alert-info}
<font size="5">Exercise</font>
- Change the color of the oscillogram to red
- These are some of the color palettes that fit well the gradients in spectrograms:
<img src="images/palletes.png" alt="Spectrogram palletes" width="480"/>
<font size="1"><b>From Sueur 2018</b></font>
Use at least 3 palettes to generate the "tico2" spectrogram
- Change the relative height of the oscillogram so that it corresponds to 1/3 of the height of the spectrogram
- Change the relative width of the amplitude scale so that it corresponds to 1/8 of the spectrogram width
- What does the "zp" argument do? (hint: try `zp = 100` and notice the effect on the spectrogram)
- Which value of "wl" (window size) generates smoother spectrograms for the example "orni" object?
- The package `viridis` provides some color palettes that are better perceived by people with forms of color blindness and/or color vision deficiency. Install the package and try some of the color palettes available (try `?viridis`)
:::
## Dynamic spectrograms
The package [dynaSpec](https://marce10.github.io/dynaSpec) allows to create static and dynamic visualizations of sounds, ready for publication or presentation. These dynamic spectrograms are produced natively with base graphics, and are save as an .mp4 video in the working directory:
```{r, eval = FALSE}
ngh_wren <- read_sound_file("https://www.xeno-canto.org/518334/download")
custom_pal <- colorRampPalette( c("#2d2d86", "#2d2d86", reverse.terrain.colors(10)[5:10]))
library(dynaSpec)
scrolling_spectro(wave = ngh_wren, wl = 600,
t.display = 3, ovlp = 95, pal = custom_pal,
grid = FALSE, flim = c(2, 8), width = 700,
height = 250, res = 100, collevels = seq(-40, 0, 5),
file.name = "../nightingale_wren.mp4", colbg = "#2d2d86", lcol = "#FFFFFFE6")
```
<iframe width="100%" height="297" src="images/nightingale_wren.mp4" allowtransparency="true" style="background: #FFFFFF;" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen>
</iframe>
------------------------------------------------------------------------
## References
1. Araya-Salas, Marcelo and Wilkins, Matthew R. (2020), dynaSpec: dynamic spectrogram visualizations in R. R package version 1.0.0.
2. Sueur J, Aubin T, Simonis C. 2008. Equipment review: seewave, a free modular tool for sound analysis and synthesis. Bioacoustics 18(2):213--226.
3. Sueur, J. (2018). Sound Analysis and Synthesis with R.
------------------------------------------------------------------------
<font size="4">Session information</font>
```{r session info, echo=F}
sessionInfo()
```