musiccaps
3 rows where aspect_list contains "flat male vocal", aspect_list contains "mono", aspect_list contains "noisy" and aspect_list contains "wooden percussions"
ytid ▼ | url | caption | aspect_list | audioset_names | author_id | start_s | end_s | is_balanced_subset | is_audioset_eval | audioset_ids |
---|---|---|---|---|---|---|---|---|---|---|
-w4HLksto_k | | The low quality recording features a flat male vocal singing over digital piano melody and wooden percussion. The recording is noisy, in mono, loud and distorted. | ["low quality", "mono", "noisy", "flat male vocal", "wooden percussions", "digital piano melody", "loud", "distorted"] | ["Tabla", "Drum", "Percussion"] | 4 | 30 | 40 | 0 | 0 | ["/m/01p970", "/m/026t6", "/m/0l14md"] |
3B-YYTbpFZE | | The low quality recording features a flat male vocal narrating over flame sounds and wooden surface scratching sounds, after which there is a cut to a shimmering shaker and wooden percussion playing. The recording is noisy and in mono. | ["low quality", "mono", "noisy", "flat male vocal", "wooden percussions", "flame sounds", "shimmering shaker", "wooden surface scratching sounds", "documentary"] | ["Wood block", "Percussion"] | 4 | 30 | 40 | 0 | 0 | ["/m/01sm1g", "/m/0l14md"] |
FKBryvLMTY4 | | The low quality recording features a traditional song that consists of a flat male vocal, alongside monotone female vocal, singing over sustained strings melody, wooden percussion, shimmering shakers and some claps. It sounds passionate and the recording is in mono and noisy. | ["low quality", "noisy", "mono", "clapping", "flat male vocal", "monotone female vocal", "wooden percussions", "shimmering shakers", "traditional", "sustained strings melody", "passionate"] | ["Folk music", "Music"] | 4 | 30 | 40 | 0 | 0 | ["/m/02w4v", "/m/04rlf"] |
CREATE TABLE [musiccaps] ( [ytid] TEXT PRIMARY KEY, [url] TEXT, [caption] TEXT, [aspect_list] TEXT, [audioset_names] TEXT, [author_id] TEXT, [start_s] TEXT, [end_s] TEXT, [is_balanced_subset] INTEGER, [is_audioset_eval] INTEGER, [audioset_ids] TEXT );
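
The four-way "contains" filter in the page heading can be reproduced against this schema. The following is a minimal sketch, not necessarily the query the page itself runs: it assumes `aspect_list` holds a JSON array serialized as text (as the rows above suggest) and matches each required aspect with a quoted `LIKE` pattern. The helper name `find_rows`, the in-memory database, the abbreviated sample rows, and the `hypothetical_id` row are all illustrative assumptions.

```python
import json
import sqlite3

# Schema copied from the CREATE TABLE statement above.
SCHEMA = """
CREATE TABLE [musiccaps] (
    [ytid] TEXT PRIMARY KEY, [url] TEXT, [caption] TEXT, [aspect_list] TEXT,
    [audioset_names] TEXT, [author_id] TEXT, [start_s] TEXT, [end_s] TEXT,
    [is_balanced_subset] INTEGER, [is_audioset_eval] INTEGER, [audioset_ids] TEXT
);
"""

# Abbreviated sample data: the first two ytids come from the table above
# (with shortened aspect lists); the third row is hypothetical and is
# included only to show a non-matching case.
SAMPLE_ROWS = [
    ("-w4HLksto_k", ["low quality", "mono", "noisy", "flat male vocal",
                     "wooden percussions"]),
    ("3B-YYTbpFZE", ["low quality", "mono", "noisy", "flat male vocal",
                     "wooden percussions", "documentary"]),
    ("hypothetical_id", ["mono", "noisy"]),
]


def find_rows(conn, required_aspects):
    """Return ytids whose aspect_list text contains every required aspect."""
    if not required_aspects:
        raise ValueError("at least one aspect is required")
    # One LIKE clause per aspect. Including the JSON quotes in the pattern
    # avoids matching a substring of a longer aspect. Note that SQLite's
    # LIKE is case-insensitive for ASCII characters.
    where = " AND ".join("[aspect_list] LIKE ?" for _ in required_aspects)
    params = [f'%"{aspect}"%' for aspect in required_aspects]
    sql = f"SELECT [ytid] FROM [musiccaps] WHERE {where} ORDER BY [ytid]"
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO [musiccaps] ([ytid], [aspect_list]) VALUES (?, ?)",
    [(ytid, json.dumps(aspects)) for ytid, aspects in SAMPLE_ROWS],
)

matches = find_rows(
    conn, ["flat male vocal", "mono", "noisy", "wooden percussions"]
)
print(matches)
```

Binding the patterns as parameters keeps the aspect strings out of the SQL text. Where SQLite's JSON functions are available, joining against `json_each([aspect_list])` would be a stricter alternative that compares whole array elements instead of text patterns.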