musiccaps
2 rows where aspect_list contains "dance music", aspect_list contains "female voice", aspect_list contains "low quality recording" and aspect_list contains "moderate tempo"
ytid | url | caption | aspect_list | audioset_names | author_id | start_s | end_s | is_balanced_subset | is_audioset_eval | audioset_ids |
---|---|---|---|---|---|---|---|---|---|---|
CN2QSmhP-HI | | This salsa song features a female voice singing the main melody. This is accompanied by the congas. The beat is a dance beat. Trumpets and a saxophone play fills in between lines. A piano plays a melody at the end of the song. The song starts with the voice singing a melody at a moderate tempo. After the piano plays, the tempo of the song increases. Other instruments cannot be heard as the quality of the recording is low. This song can be played in a Latin dance sequence in a movie. | ["low quality recording", "salsa song", "congas", "saxophone", "trumpet", "piano", "female voice", "moderate tempo", "dance music", "seductive rhythm"] | ["Music", "Salsa music"] | 0 | 30 | 40 | 0 | 1 | ["/m/04rlf", "/m/0ln16"] |
MEew7OQ17HY | | This audio clip features a female voice singing the main melody. The quality of the audio recording is low. The voice is accompanied by Latin style percussion. Male voices sing backing vocals. This is a dance song at a moderate tempo. The sound of a camera shutter is played at the beginning and end of the clip. Other musical instruments are barely audible due to the low quality of audio recording. | ["low quality recording", "female voice", "latin percussion", "male backing voices", "moderate tempo", "dance music", "camera shutter sound"] | ["Music", "Single-lens reflex camera", "Inside, small room"] | 0 | 10 | 20 | 0 | 1 | ["/m/04rlf", "/m/07bjf", "/t/dd00125"] |
CREATE TABLE [musiccaps] (
   [ytid] TEXT PRIMARY KEY,
   [url] TEXT,
   [caption] TEXT,
   [aspect_list] TEXT,
   [audioset_names] TEXT,
   [author_id] TEXT,
   [start_s] TEXT,
   [end_s] TEXT,
   [is_balanced_subset] INTEGER,
   [is_audioset_eval] INTEGER,
   [audioset_ids] TEXT
);
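
The four-tag filter described at the top of the page can be approximated against this schema with plain SQLite. The sketch below is a minimal illustration, not the query the page itself runs: it assumes aspect_list holds a JSON array serialized as text, and it matches each tag with a quoted LIKE pattern. The helper name matching_ytids and the third row (ytid HYPOTHETICAL) are made up for the example; only the two ytids and their aspect lists come from the table above, and all other columns are left NULL.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE [musiccaps] (
    [ytid] TEXT PRIMARY KEY, [url] TEXT, [caption] TEXT, [aspect_list] TEXT,
    [audioset_names] TEXT, [author_id] TEXT, [start_s] TEXT, [end_s] TEXT,
    [is_balanced_subset] INTEGER, [is_audioset_eval] INTEGER, [audioset_ids] TEXT
);
"""

# The first two aspect lists are copied from the rows shown above.
# The third row is hypothetical and exists only to show a non-match.
ROWS = [
    ("CN2QSmhP-HI", json.dumps([
        "low quality recording", "salsa song", "congas", "saxophone",
        "trumpet", "piano", "female voice", "moderate tempo",
        "dance music", "seductive rhythm"])),
    ("MEew7OQ17HY", json.dumps([
        "low quality recording", "female voice", "latin percussion",
        "male backing voices", "moderate tempo", "dance music",
        "camera shutter sound"])),
    ("HYPOTHETICAL", json.dumps(["dance music", "male voice"])),
]

WANTED = ["dance music", "female voice", "low quality recording", "moderate tempo"]


def matching_ytids(conn, wanted):
    """Return ytids whose aspect_list text contains every tag in `wanted`."""
    # One LIKE clause per tag. Wrapping the tag in double quotes means
    # "male voice" does not match inside "female voice".
    where = " AND ".join("[aspect_list] LIKE ?" for _ in wanted)
    params = ['%"' + tag + '"%' for tag in wanted]
    sql = f"SELECT [ytid] FROM [musiccaps] WHERE {where} ORDER BY [ytid]"
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO [musiccaps] ([ytid], [aspect_list]) VALUES (?, ?)", ROWS
)

print(matching_ytids(conn, WANTED))  # ['CN2QSmhP-HI', 'MEew7OQ17HY']
```

The quoted LIKE pattern is a deliberate simplification: it works on any SQLite build, but it would misbehave for tags that contain %, _ or double quotes. Where SQLite's JSON functions are available, iterating the array with json_each is likely the more robust choice.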