Håkan Lundström and Jan-Olof Svantesson
Appendix 1
Software used

In the process of analysis, various software packages have been used, often in combination with other kinds of graphic transcription or musical notation. Certain types of software are well known to linguists, others to musicologists. They are therefore briefly presented here, with a few examples which – in the case of Praat and Melodyne – also serve as guides for understanding the graphs that occur in the text.1

Praat

Praat, a free program developed by Paul Boersma and David Weenink, is useful for the analysis and annotation of short recordings or of segments of longer recordings.2 Example 135 is a screen capture of a complete annotation. At the top is the soundwave, and below it the movement of the fundamental frequency (F0). The grey shading in the same area is the spectrogram, in which overtones or timbre can be analysed. Tiers can be added and named according to preference. In this case, syllables were separated (manually) and written into the segments, with pitch measurements in the mel scale written into tier 4 below. The pitch measurements can also be annotated in semitones or at different Hertz settings; the curves do not look hugely different, but the numbers recorded would of course differ. Other possible tiers could include pitch ranges for a particular section, or some measure of voice quality. Syllable-sized tiers may contain morphological information or other annotations.
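For readers who wish to reproduce such pitch measurements outside the Praat interface, the following sketch uses the parselmouth library, which exposes Praat's analysis routines in Python. The file name and the pitch floor and ceiling are hypothetical and would need to be adapted to the recording; the mel conversion uses the common formula mel = 2595·log10(1 + f/700), which may differ slightly from Praat's own mel definition.

```python
import math
import parselmouth  # Python interface to Praat's analysis routines


def hertz_to_mel(f_hz):
    """Common mel formula; Praat's internal mel definition may differ slightly."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


snd = parselmouth.Sound("raven_song.wav")      # hypothetical file name
pitch = snd.to_pitch(time_step=0.01,           # one F0 estimate every 10 ms
                     pitch_floor=75.0,         # hypothetical settings; adjust to the voice
                     pitch_ceiling=500.0)

times = pitch.xs()                             # analysis times in seconds
f0_hz = pitch.selected_array['frequency']      # F0 in Hz, 0.0 where unvoiced

for t, f in zip(times, f0_hz):
    if f > 0:                                  # skip unvoiced frames
        print(f"{t:7.3f} s  {f:6.1f} Hz  {hertz_to_mel(f):6.1f} mel")
```

The printed values correspond to the kind of numbers that were written manually into tier 4 in Example 135.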

Example 136 is a picture exported from the same file in Praat. It shows the pitch range in mels on the left on a more detailed scale, and the annotations below the pitch track. It is a little easier to read in terms of pitch and, since it is basically a line drawing, it is easier to display in print.

Example 135 Screen capture of an annotated Praat graph (Raven song). There is a segmentation based on phrases in tiers 1–4, segmentation built on syllables in tier 3, and pitch measurements of these in tier 4. The mel scale is on the right side (spanning from 83.35 to 170.6). The numbers on the left side denote loudness and Hertz, respectively. These depend on the settings, which can be changed in order to adapt the display to matters under investigation. The vertical dotted line is the playback cursor, and the figure on top of it shows where the cursor is placed in the clip. 14 Raven song

Example 136 Exported annotated Praat graph of Example 135 (Raven song). 14 Raven song

The mel scale used for measuring the pitch in Examples 135–136 may be changed to a scale of semitones. The scale is shown on the right (Example 137). The exported version is shown in Example 138. The semitones are numbered in relation to a fixed standard of 100 Hz, showing how many semitones above or below 100 Hz the pitch lies. By adjusting the settings, it is possible to transpose a melody to a level where, for instance, the tonic is at 100 Hz; the semitone values are then given in relation to the tonic.
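The relation between Hertz and semitones re 100 Hz is a simple logarithmic one, and 'transposing' a melody so that its tonic lies at 100 Hz amounts to scaling all frequencies by the same factor. A minimal sketch of the arithmetic (the tonic and pitch values are hypothetical):

```python
import math


def semitones_re_100hz(f_hz):
    # how many semitones f_hz lies above (positive) or below (negative) 100 Hz
    return 12.0 * math.log2(f_hz / 100.0)


tonic_hz = 138.0                 # hypothetical tonic of the performance
factor = 100.0 / tonic_hz        # scaling that moves the tonic to 100 Hz

for f in (138.0, 155.0, 184.0):  # hypothetical pitches from the melody
    print(f"{f:6.1f} Hz -> {semitones_re_100hz(f * factor):5.2f} semitones above the tonic")
```

After such a transposition, the semitone scale in the exported graph can be read directly as intervals above or below the tonic.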

Praat was developed for linguistic analysis. As shown here, it is also useful for some musical analysis.3 It can be used for monophonic music, that is, music consisting of a melody only, without harmony or musical accompaniment. Since most of our material in this study is monophonic, Praat has been useful in this context. It is also possible to look at musical and linguistic factors in the same display.

Example 137 Developments on the same file (Raven song) with new tiers showing functions of syllables (tier 5) and durations in milliseconds (tier 6). The pitch is rendered in semitones relative to 100 Hz. 14 Raven song

Example 138 Exported Praat graph of Example 137 (Raven song). 14 Raven song

Melodyne

Melodyne was originally constructed for pitch correction.4 It offers many possibilities, but here it has been used only for graphic description of performances. Melodyne can handle long performances, and the size (width and height) of the graph can be adjusted. The playback speed can also be changed for closer listening, for instance to half speed. The pitches are given in semitones in Example 139, but this can be changed to Hertz. Time is measured in seconds on the top scale, and by clicking anywhere in the graph, the pitch in Hertz at that particular moment may be obtained. Since Melodyne was developed particularly for music, the thin line in the graph shows the F0 of the performance. The wider fields – the blobs – partly surrounding this line show amplitude but also approximate the pitch curve to tone levels, much in the same way as a musicologist does when transcribing.
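The way the blobs 'approximate the pitch curve to tone levels' can be thought of as rounding each F0 value to the nearest equal-tempered semitone. The following sketch illustrates the principle only; it is not how Melodyne itself is implemented, and the reference frequency is an assumption.

```python
import math


def nearest_tone_level(f_hz, ref_hz=100.0):
    """Round a frequency to the nearest equal-tempered semitone relative to ref_hz."""
    semitones = 12.0 * math.log2(f_hz / ref_hz)
    return ref_hz * 2.0 ** (round(semitones) / 12.0)


for f in (96.0, 104.0, 151.0):   # hypothetical F0 values from a pitch track
    print(f"{f:6.1f} Hz -> tone level {nearest_tone_level(f):6.1f} Hz")
```

A transcriber performs essentially the same rounding by ear when assigning notes to a wavering pitch line.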

Above the graph, a transcription in notation may be selected. This notation is often useful when the movement is slow and pitches are relatively fixed. When necessary, horizontal lines (instead of arches) have been added above notes in order to indicate that they are tied together. Other signs that have been added are self-explanatory. In more complex performances, manual revision is necessary; this is not possible in Melodyne but has to be accomplished in a separate notation program. The musical notations were made in Sibelius and ScoreCloud.5

Example 139 The same Raven song as in Examples 135–138 in a Melodyne graph, with words added manually. Vertical: pitch; horizontal: time (1 shaded column = 1 second). 14 Raven song

It is not possible to write text directly in the graph so as to add the words of a performance; instead, a screen capture must be made and the words typed onto it. Any other annotation can be achieved in the same manner. When one listens to a performance while the cursor moves along the graph, Melodyne gives a very intuitive picture of the sound, and it can cope with long recordings.

ELAN

ELAN was developed by The Language Archive (TLA) of the Max Planck Institute for Psycholinguistics.6 It can be used for various kinds of measurements of audio and video files. By adding tiers for various parameters, much information can be inserted and time-aligned with the audio/video recording. ELAN can handle long, continuous recordings.
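Tiers of the kind described here can also be created or edited outside the ELAN interface, for instance with the pympi-ling library for Python. The sketch below builds a small .eaf file with two of the tiers used in Example 140; the file names and annotation values are hypothetical, and the exact call names should be checked against pympi's documentation.

```python
import pympi  # pympi-ling: reading and writing ELAN .eaf files

eaf = pympi.Elan.Eaf()                 # start from an empty annotation document
eaf.add_tier("melody")
eaf.add_tier("translation")

# times are in milliseconds; the values are hypothetical illustrations
eaf.add_annotation("melody", 0, 2500, "eg aa ed")
eaf.add_annotation("translation", 0, 2500, "first phrase of the song")

eaf.to_file("seediq_example.eaf")      # hypothetical output file name
```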

Example 140 A Seediq performance on a video file in ELAN. The tiers used in this case are, from the top: melody, Seediq, gloss, translation, comments, tags.

In Example 140, the first tier that starts with ‘eg aa ed - …’ is an example of ‘letter notation’ that can also be used in Praat, although it is not technically possible to use standard music notation. This problem is also recognized by Morgan Sleeper, who uses an ABC notation developed by Christopher Walshaw for including music transcriptions in ELAN.7
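ABC notation of the kind Sleeper uses represents a melody as plain text, which is what makes it possible to place inside an ELAN tier. The fragment below is a purely hypothetical illustration of the format, built as a Python string; it is not a transcription of any of the songs discussed here.

```python
# A hypothetical ABC tune illustrating the plain-text format.
abc_tune = "\n".join([
    "X:1",                    # reference number
    "T:Illustration of ABC notation",
    "M:none",                 # free (non-metric) rhythm
    "L:1/8",                  # default note length
    "K:C",                    # key
    "E G A G | E D C2 |",     # invented melody in ABC letters
])
print(abc_tune)
```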

Audacity

Audacity is useful for capturing analogue sound and for converting it to .wav or .aiff files.8 The speed can be changed, and the captured sound file can be edited: cut, amplified, noise-reduced, etc. (Example 141). The factors that can be measured are amplitude and time. Audacity is easy to use for identifying and exporting audio files of, for instance, songs within comparatively long recordings that also include discussion or speech.
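The extraction of individual songs from a long field recording can also be scripted, as an alternative to cutting by hand in Audacity. The sketch below uses the soundfile library; the file names and time points are hypothetical.

```python
import soundfile as sf  # reads and writes .wav and .aiff files

data, sr = sf.read("long_field_recording.wav")   # hypothetical input file
start_s, end_s = 83.0, 127.5                     # hypothetical song boundaries, found by listening

segment = data[int(start_s * sr):int(end_s * sr)]
sf.write("raven_song_excerpt.wav", segment, sr)  # hypothetical output file
```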

Example 141 The same Raven song as in Examples 135–138 in Audacity. 14 Raven song

Notation software

There seems to be no notation software that is particularly suitable for musical transcription. The major program, Sibelius, is very flexible but quite difficult to use unless one uses it on a regular basis, and it is also quite costly.9 Sibelius has been used for some of the notations. ScoreCloud has also been used for some notations.10 It is easier to use; but, like several of the free notation programs, it was constructed in order to make the writing of conventional scores easy. For non-metric or metrically complex notations, it is therefore necessary to find ways of neutralizing a number of automatic functions. Since the number of signs is limited, it is sometimes necessary to add signs manually to the final notation. Adding words is simple, and various phonetic signs can be used. ScoreCloud has a function for importing audio files for automatic notation. Editing such notations is, however, more time consuming than transcribing the recording manually. While far from perfect, the automatic notation in Melodyne is actually faster and can be used to facilitate manual transcription (cf. Example 139).
