musiccaps
3 rows where audioset_names contains "Didgeridoo", audioset_names contains "Musical instrument" and audioset_names contains "Speech"
ytid | url | caption | aspect_list | audioset_names | author_id | start_s | end_s | is_balanced_subset | is_audioset_eval | audioset_ids |
---|---|---|---|---|---|---|---|---|---|---|
7k3M0pQvzhY | | This audio contains someone playing a low didgeridoo while people are talking in the background. The didgeridoo sounds amplified. This song may be playing live at a festival. | ["amateur recording", "didgeridoo", "background noises", "people talking"] | ["Didgeridoo", "Music", "Musical instrument", "Speech"] | 6 | 200 | 210 | 0 | 0 | ["/m/02bxd", "/m/04rlf", "/m/04szw", "/m/09x0r"] |
argBwTHDDVI | | This clip features an instruction by a male voice on how to use an effect on a synth. After the narration, three notes are played on a synth. The first one is loud, the second note is lower and softer. The third note is lower but louder in volume. There are no other instruments in this song. | ["instructional audio", "synth sounds", "male narrator", "no vocal melody", "no other instruments"] | ["Didgeridoo", "Music", "Musical instrument", "Speech", "Synthesizer"] | 0 | 370 | 380 | 0 | 0 | ["/m/02bxd", "/m/04rlf", "/m/04szw", "/m/09x0r", "/m/0l14qv"] |
yWU0zNEy2_I | | The low quality recording features a didgeridoo melody playing outdoors. There are also some crowd chattering, water fountains and birds chirping sounds. The recording is noisy, as it was probably recorded with a phone. | ["low quality", "noisy", "birds chirping", "crowd talking", "water fountain sounds", "didgeridoo melody"] | ["Didgeridoo", "Music", "Musical instrument", "Speech"] | 4 | 90 | 100 | 0 | 0 | ["/m/02bxd", "/m/04rlf", "/m/04szw", "/m/09x0r"] |
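
The list-valued columns in the rows above (aspect_list, audioset_names, audioset_ids) look like JSON arrays stored as text, and start_s and end_s look like numbers stored as strings. The following is a minimal Python sketch of decoding one row under that assumption. The helper name `decode_row` is hypothetical, and pairing audioset_ids with audioset_names by position is an assumption based on the two lists having the same length and order in these rows.

```python
import json

# One row copied from the table above; only the columns needed here are kept.
row = {
    "ytid": "7k3M0pQvzhY",
    "aspect_list": '["amateur recording", "didgeridoo", "background noises", "people talking"]',
    "audioset_names": '["Didgeridoo", "Music", "Musical instrument", "Speech"]',
    "audioset_ids": '["/m/02bxd", "/m/04rlf", "/m/04szw", "/m/09x0r"]',
    "start_s": "200",
    "end_s": "210",
}


def decode_row(row):
    """Return a copy with JSON list columns parsed and clip times cast to int.

    Hypothetical helper: assumes the list columns hold valid JSON arrays
    and that start_s / end_s hold integer strings.
    """
    out = dict(row)
    for col in ("aspect_list", "audioset_names", "audioset_ids"):
        out[col] = json.loads(row[col])
    out["start_s"] = int(row["start_s"])
    out["end_s"] = int(row["end_s"])
    return out


decoded = decode_row(row)

# Assumption: ids and names are aligned by position within a row.
label_by_id = dict(zip(decoded["audioset_ids"], decoded["audioset_names"]))

# Length of the captioned clip in seconds.
clip_seconds = decoded["end_s"] - decoded["start_s"]
```

Casting the times before subtracting matters because subtracting the raw strings would raise a TypeError in Python.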
CREATE TABLE [musiccaps] ( [ytid] TEXT PRIMARY KEY, [url] TEXT, [caption] TEXT, [aspect_list] TEXT, [audioset_names] TEXT, [author_id] TEXT, [start_s] TEXT, [end_s] TEXT, [is_balanced_subset] INTEGER, [is_audioset_eval] INTEGER, [audioset_ids] TEXT );
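
The three "contains" filters in the page summary can be approximated locally as substring matches against the audioset_names text column. The following is a sketch using Python's built-in sqlite3 module with an in-memory database. It assumes that "contains" means a `LIKE '%term%'` match; the function names `build_db` and `matching_ytids` are hypothetical, and the fourth row is an invented non-matching record included only to show that the filter excludes something.

```python
import sqlite3

# Schema copied from this page.
SCHEMA = """
CREATE TABLE [musiccaps] (
  [ytid] TEXT PRIMARY KEY, [url] TEXT, [caption] TEXT, [aspect_list] TEXT,
  [audioset_names] TEXT, [author_id] TEXT, [start_s] TEXT, [end_s] TEXT,
  [is_balanced_subset] INTEGER, [is_audioset_eval] INTEGER, [audioset_ids] TEXT
);
"""

# (ytid, audioset_names) for the three rows on this page,
# plus one hypothetical row that should NOT match the filter.
ROWS = [
    ("7k3M0pQvzhY", '["Didgeridoo", "Music", "Musical instrument", "Speech"]'),
    ("argBwTHDDVI", '["Didgeridoo", "Music", "Musical instrument", "Speech", "Synthesizer"]'),
    ("yWU0zNEy2_I", '["Didgeridoo", "Music", "Musical instrument", "Speech"]'),
    ("hypothetical_id", '["Music", "Speech"]'),
]


def build_db():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO musiccaps (ytid, audioset_names) VALUES (?, ?)", ROWS
    )
    return conn


def matching_ytids(conn, terms):
    """Return ytids whose audioset_names text contains every term."""
    # One LIKE clause per term, joined with AND; "1" matches all rows
    # when no terms are given.
    where = " AND ".join("[audioset_names] LIKE ?" for _ in terms) or "1"
    params = [f"%{term}%" for term in terms]
    sql = f"SELECT ytid FROM musiccaps WHERE {where} ORDER BY ytid"
    return [r[0] for r in conn.execute(sql, params)]


conn = build_db()
result = matching_ytids(conn, ["Didgeridoo", "Musical instrument", "Speech"])
```

Because this is a plain substring match, a term such as "Music" would also match rows that only carry "Musical instrument". Wrapping the term in double quotes inside the pattern would restrict the match to whole labels.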