1
00:00:06,320 --> 00:00:11,499
[Music]

2
00:00:16,320 --> 00:00:20,800
welcome back everyone um thank you

3
00:00:19,439 --> 00:00:23,359
for

4
00:00:20,800 --> 00:00:24,160
bearing with us until the end of today

5
00:00:23,359 --> 00:00:25,840
um

6
00:00:24,160 --> 00:00:27,800
next up we have

7
00:00:25,840 --> 00:00:30,480
daniel giving us a talk about

8
00:00:27,800 --> 00:00:33,680
musicbrainz.org and wikidata.org

9
00:00:30,480 --> 00:00:35,920
um daniel is a frequent lca attendee and

10
00:00:33,680 --> 00:00:38,000
i remember him coming to our first ever

11
00:00:35,920 --> 00:00:40,079
go glam which was fantastic so it's good

12
00:00:38,000 --> 00:00:42,320
to see familiar faces giving talks i've

13
00:00:40,079 --> 00:00:43,120
been going to as many lcas as you

14
00:00:42,320 --> 00:00:44,399
have

15
00:00:43,120 --> 00:00:47,039
oh geez

16
00:00:44,399 --> 00:00:49,120
it's been a long time then um he's also

17
00:00:47,039 --> 00:00:52,719
a music enthusiast and i will hand

18
00:00:49,120 --> 00:00:54,960
straight over to daniel thanks

19
00:00:52,719 --> 00:00:56,800
okay

20
00:00:54,960 --> 00:00:58,480
hey everyone um

21
00:00:56,800 --> 00:01:00,640
i'm gonna

22
00:00:58,480 --> 00:01:03,920
give an introduction to two projects

23
00:01:00,640 --> 00:01:07,200
that i've been editing in

24
00:01:03,920 --> 00:01:08,799
um for many many years

25
00:01:07,200 --> 00:01:09,840
um

26
00:01:08,799 --> 00:01:11,040
and

27
00:01:09,840 --> 00:01:13,680
hopefully

28
00:01:11,040 --> 00:01:16,000
you learn something

29
00:01:13,680 --> 00:01:16,000
um

30
00:01:19,200 --> 00:01:24,560
so about me um

31
00:01:21,520 --> 00:01:29,520
i'm daniel sobi i'm from adelaide

32
00:01:24,560 --> 00:01:32,400
i attend the conf um sort of in my

33
00:01:29,520 --> 00:01:35,240
spare time sitting in front of the tv

34
00:01:32,400 --> 00:01:37,200
i occasionally edit a website called

35
00:01:35,240 --> 00:01:38,640
musicbrainz.org

36
00:01:37,200 --> 00:01:42,560
um

37
00:01:38,640 --> 00:01:45,119
i have 31 000 edits and i'm not one of

38
00:01:42,560 --> 00:01:47,119
the software developers or

39
00:01:45,119 --> 00:01:49,920
there's things called auto editors that

40
00:01:47,119 --> 00:01:51,280
can auto approve things i'm just a

41
00:01:49,920 --> 00:01:53,520
normal

42
00:01:51,280 --> 00:01:55,840
person that's just stuck around for

43
00:01:53,520 --> 00:01:55,840
ages

44
00:01:56,240 --> 00:01:59,280
um

45
00:01:57,200 --> 00:02:00,960
so the two projects

46
00:01:59,280 --> 00:02:02,640
that i'm going to talk about first is

47
00:02:00,960 --> 00:02:04,719
musicbrainz.org

48
00:02:02,640 --> 00:02:07,360
it's a website

49
00:02:04,719 --> 00:02:10,239
it's got music data it's got people

50
00:02:07,360 --> 00:02:12,239
things about artists labels

51
00:02:10,239 --> 00:02:14,400
recordings it's

52
00:02:12,239 --> 00:02:16,480
not necessarily all music

53
00:02:14,400 --> 00:02:18,480
there are things like audio books and

54
00:02:16,480 --> 00:02:20,640
other things that are sort of music

55
00:02:18,480 --> 00:02:22,480
related

56
00:02:20,640 --> 00:02:24,800
we don't really care if it's a

57
00:02:22,480 --> 00:02:26,400
commercial cd or

58
00:02:24,800 --> 00:02:28,160
someone's soundcloud that they've

59
00:02:26,400 --> 00:02:30,560
uploaded something it just

60
00:02:28,160 --> 00:02:34,239
has to exist

61
00:02:30,560 --> 00:02:35,879
and uh the other project um

62
00:02:34,239 --> 00:02:37,599
i'm going to talk about is uh

63
00:02:35,879 --> 00:02:40,319
wikidata.org

64
00:02:37,599 --> 00:02:40,319
it's sort of a

65
00:02:40,800 --> 00:02:45,920
parallel database used by wikipedia

66
00:02:43,920 --> 00:02:47,440
to um

67
00:02:45,920 --> 00:02:49,760
to link

68
00:02:47,440 --> 00:02:51,760
all the sites together and to store

69
00:02:49,760 --> 00:02:53,680
factual information

70
00:02:51,760 --> 00:02:55,040
so there's going to be things like the

71
00:02:53,680 --> 00:02:57,519
height of

72
00:02:55,040 --> 00:02:59,280
everest and the height of all the large

73
00:02:57,519 --> 00:03:01,360
mountains

74
00:02:59,280 --> 00:03:02,840
presidents

75
00:03:01,360 --> 00:03:05,280
all sorts of things

76
00:03:02,840 --> 00:03:07,200
films you name it there's probably an

77
00:03:05,280 --> 00:03:09,280
entry for it

78
00:03:07,200 --> 00:03:10,319
and

79
00:03:09,280 --> 00:03:12,159
it's a

80
00:03:10,319 --> 00:03:16,159
another database

81
00:03:12,159 --> 00:03:16,159
that anyone can edit and they sort of

82
00:03:16,239 --> 00:03:20,720
have two different approaches to how

83
00:03:17,840 --> 00:03:22,720
they're structured a bit so it's sort of

84
00:03:20,720 --> 00:03:23,680
related sort of do things differently

85
00:03:22,720 --> 00:03:26,239
and

86
00:03:23,680 --> 00:03:30,200
hopefully if i get to it i'll

87
00:03:26,239 --> 00:03:30,200
explain some of that on the way

88
00:03:32,080 --> 00:03:34,319
so

89
00:03:34,879 --> 00:03:40,400
musicbrainz it's a

90
00:03:37,280 --> 00:03:42,720
database of audio recordings

91
00:03:40,400 --> 00:03:45,680
um each entity has a

92
00:03:42,720 --> 00:03:46,400
long string a uuid

93
00:03:45,680 --> 00:03:49,840
that's

94
00:03:46,400 --> 00:03:51,840
really important because

95
00:03:49,840 --> 00:03:54,000
there's an awful lot of name collision

96
00:03:51,840 --> 00:03:56,000
and that sort of thing so there are

97
00:03:54,000 --> 00:03:58,560
two bands called pendulum

98
00:03:56,000 --> 00:04:00,640
they both have aria awards

99
00:03:58,560 --> 00:04:01,840
you have to be able to say which one it

100
00:04:00,640 --> 00:04:03,840
is

101
00:04:01,840 --> 00:04:06,000
when dealing with

102
00:04:03,840 --> 00:04:08,239
such a big database

103
00:04:06,000 --> 00:04:09,360
it's all user pretty much all user

104
00:04:08,239 --> 00:04:13,040
edited

105
00:04:09,360 --> 00:04:15,519
there's pretty much no bots that edit

106
00:04:13,040 --> 00:04:16,320
um it's

107
00:04:15,519 --> 00:04:17,680
um

108
00:04:16,320 --> 00:04:19,440
hosted

109
00:04:17,680 --> 00:04:20,959
and run by a

110
00:04:19,440 --> 00:04:23,919
non-profit organization called

111
00:04:20,959 --> 00:04:26,960
metabrainz.org

112
00:04:23,919 --> 00:04:26,960
they do get funding

113
00:04:27,199 --> 00:04:31,440
the data is all public

114
00:04:30,479 --> 00:04:34,160
um

115
00:04:31,440 --> 00:04:36,960
mostly creative commons zero license

116
00:04:34,160 --> 00:04:39,199
uh they get funding from uh things like

117
00:04:36,960 --> 00:04:40,080
google and spotify and

118
00:04:39,199 --> 00:04:42,720
other

119
00:04:40,080 --> 00:04:43,140
big music companies

120
00:04:42,720 --> 00:04:44,479
um

121
00:04:43,140 --> 00:04:46,960
[Music]

122
00:04:44,479 --> 00:04:49,360
so we provide the raw data and

123
00:04:46,960 --> 00:04:51,680
they'll use the data to help fix up

124
00:04:49,360 --> 00:04:54,000
their system and that's sort of

125
00:04:51,680 --> 00:04:57,280
um been enough to help

126
00:04:54,000 --> 00:05:00,000
fund the organization and keep a few

127
00:04:57,280 --> 00:05:04,000
key employees keeping the lights on

128
00:05:00,000 --> 00:05:04,000
fixing bugs adding new features

129
00:05:04,800 --> 00:05:08,479
um

130
00:05:06,800 --> 00:05:11,919
there's a public api

131
00:05:08,479 --> 00:05:13,680
anyone can um use it

132
00:05:11,919 --> 00:05:15,120
uh the thing that we need

133
00:05:13,680 --> 00:05:17,199
from that if you're gonna use the public

134
00:05:15,120 --> 00:05:19,840
api is

135
00:05:17,199 --> 00:05:24,080
in your http headers you need to include

136
00:05:19,840 --> 00:05:25,759
a string to say who you are so we can

137
00:05:24,080 --> 00:05:27,440
let you know if you're doing something

138
00:05:25,759 --> 00:05:30,479
really bad

139
00:05:27,440 --> 00:05:32,639
that has occasionally been needed and

140
00:05:30,479 --> 00:05:33,680
if you have a qnap

141
00:05:32,639 --> 00:05:36,000
nas

142
00:05:33,680 --> 00:05:36,880
they would be

143
00:05:36,000 --> 00:05:39,840
um

144
00:05:36,880 --> 00:05:41,840
hitting the api every nas that

145
00:05:39,840 --> 00:05:42,800
was in existence was hitting the api so

146
00:05:41,840 --> 00:05:44,960
much

147
00:05:42,800 --> 00:05:47,759
that eventually they decided to block

148
00:05:44,960 --> 00:05:50,479
them and if you need to

149
00:05:47,759 --> 00:05:53,759
use the api you need to say can you

150
00:05:50,479 --> 00:05:56,160
please unblock me until they updated the

151
00:05:53,759 --> 00:05:59,520
software to not hit the api every

152
00:05:56,160 --> 00:05:59,520
single every five minutes

153
00:05:59,840 --> 00:06:03,919
um

154
00:06:01,039 --> 00:06:06,720
each entry has an edit history

155
00:06:03,919 --> 00:06:07,919
so you can easily say who was the idiot

156
00:06:06,720 --> 00:06:09,840
that

157
00:06:07,919 --> 00:06:12,800
made this change and

158
00:06:09,840 --> 00:06:15,199
okay hopefully it's not you that made it

159
00:06:12,800 --> 00:06:17,360
stuff things up

160
00:06:15,199 --> 00:06:17,360
so

161
00:06:18,000 --> 00:06:21,520
the basic data structure is we've got

162
00:06:20,080 --> 00:06:23,440
artists

163
00:06:21,520 --> 00:06:25,199
we've got release groups

164
00:06:23,440 --> 00:06:27,039
which i'll get to in a minute but that's

165
00:06:25,199 --> 00:06:30,080
sort of the

166
00:06:27,039 --> 00:06:32,080
overall concept of an album so

167
00:06:30,080 --> 00:06:35,120
there is a release which is a specific

168
00:06:32,080 --> 00:06:38,639
one so you might have a cd with

169
00:06:35,120 --> 00:06:39,600
13 tracks and a special edition with 18

170
00:06:38,639 --> 00:06:41,680
tracks

171
00:06:39,600 --> 00:06:43,520
and another edition with

172
00:06:41,680 --> 00:06:44,560
two cds

173
00:06:43,520 --> 00:06:45,360
so

174
00:06:44,560 --> 00:06:46,639
to

175
00:06:45,360 --> 00:06:48,880
fit

176
00:06:46,639 --> 00:06:51,199
all the concept of these different

177
00:06:48,880 --> 00:06:53,440
editions there's the release group so

178
00:06:51,199 --> 00:06:54,639
anything that fits in the

179
00:06:53,440 --> 00:06:56,319
overall

180
00:06:54,639 --> 00:06:58,560
properties

181
00:06:56,319 --> 00:07:01,199
the data's stored on the release group

182
00:06:58,560 --> 00:07:03,840
anything that's specific to a

183
00:07:01,199 --> 00:07:03,840
a

184
00:07:04,000 --> 00:07:10,240
particular cd goes on the release

185
00:07:07,840 --> 00:07:11,599
there is medium so

186
00:07:10,240 --> 00:07:12,880
there's

187
00:07:11,599 --> 00:07:15,039
we can have

188
00:07:12,880 --> 00:07:17,680
data about things like

189
00:07:15,039 --> 00:07:20,240
records and

190
00:07:17,680 --> 00:07:22,240
wax cylinders and all sorts of other

191
00:07:20,240 --> 00:07:23,599
medium types

192
00:07:22,240 --> 00:07:25,039
uh

193
00:07:23,599 --> 00:07:27,199
you can add

194
00:07:25,039 --> 00:07:29,360
of course i've got recordings

195
00:07:27,199 --> 00:07:31,039
we'll get to that bit later

196
00:07:29,360 --> 00:07:32,720
there's works

197
00:07:31,039 --> 00:07:35,120
which is sort of the writing credits and

198
00:07:32,720 --> 00:07:38,080
that sort of thing as opposed to

199
00:07:35,120 --> 00:07:38,080
who sung or not

200
00:07:38,240 --> 00:07:43,840
um there's labels

201
00:07:40,639 --> 00:07:45,919
um there's series

202
00:07:43,840 --> 00:07:48,319
uh relationships between the two and

203
00:07:45,919 --> 00:07:51,360
then there's a few other things like

204
00:07:48,319 --> 00:07:53,759
um we've got a definitive list of

205
00:07:51,360 --> 00:07:55,520
all possible instruments

206
00:07:53,759 --> 00:07:58,000
and things like that that are maintained

207
00:07:55,520 --> 00:08:00,080
by the

208
00:07:58,000 --> 00:08:01,840
the editors and developers

209
00:08:00,080 --> 00:08:03,120
so

210
00:08:01,840 --> 00:08:05,919
here's

211
00:08:03,120 --> 00:08:08,240
an example of what you'd go to see daft

212
00:08:05,919 --> 00:08:11,240
punk

213
00:08:08,240 --> 00:08:11,240
um

214
00:08:12,639 --> 00:08:16,960
one thing that you might want to

215
00:08:15,120 --> 00:08:19,599
take notice of

216
00:08:16,960 --> 00:08:20,720
from the glam perspective is

217
00:08:19,599 --> 00:08:23,120
uh we

218
00:08:20,720 --> 00:08:26,879
try to link to other databases and have

219
00:08:23,120 --> 00:08:30,560
identifiers as much as we can so

220
00:08:26,879 --> 00:08:31,440
ipi and isni are

221
00:08:30,560 --> 00:08:33,200
um

222
00:08:31,440 --> 00:08:36,000
there for

223
00:08:33,200 --> 00:08:40,640
rights societies and that sort of thing

224
00:08:36,000 --> 00:08:41,519
to track a particular person or a group

225
00:08:40,640 --> 00:08:43,680
um

226
00:08:41,519 --> 00:08:45,519
there's attributes to say

227
00:08:43,680 --> 00:08:46,560
who are the members of the group they'll

228
00:08:45,519 --> 00:08:48,959
be another

229
00:08:46,560 --> 00:08:50,959
artist entry so it

230
00:08:48,959 --> 00:08:53,200
so you can have

231
00:08:50,959 --> 00:08:55,839
further detail

232
00:08:53,200 --> 00:08:55,839
um

233
00:08:57,680 --> 00:09:02,880
we link to official home pages um

234
00:09:00,720 --> 00:09:04,240
youtube

235
00:09:02,880 --> 00:09:06,959
pages

236
00:09:04,240 --> 00:09:09,680
twitter accounts that sort of thing

237
00:09:06,959 --> 00:09:12,480
so if you've got a soundcloud

238
00:09:09,680 --> 00:09:14,000
we want that sort of information so that

239
00:09:12,480 --> 00:09:16,800
makes it easier for

240
00:09:14,000 --> 00:09:20,800
if someone finds an artist they can

241
00:09:16,800 --> 00:09:20,800
find more information about them

242
00:09:21,440 --> 00:09:24,160
so

243
00:09:22,880 --> 00:09:26,240
yeah lots of

244
00:09:24,160 --> 00:09:27,440
lots of urls lots of

245
00:09:26,240 --> 00:09:29,440
extra data

246
00:09:27,440 --> 00:09:31,839
so

247
00:09:29,440 --> 00:09:35,200
uh release groups

248
00:09:31,839 --> 00:09:37,200
sort of the overall thing so

249
00:09:35,200 --> 00:09:37,920
one thing that i'd like

250
00:09:37,200 --> 00:09:40,959
to

251
00:09:37,920 --> 00:09:43,680
to include is if you

252
00:09:40,959 --> 00:09:46,080
look down a bit further that

253
00:09:43,680 --> 00:09:48,560
um you can have um

254
00:09:46,080 --> 00:09:48,560
link to

255
00:09:48,839 --> 00:09:53,600
singles linked to other releases and

256
00:09:51,680 --> 00:09:56,560
there's a whole bunch of relationships

257
00:09:53,600 --> 00:09:58,399
of this relates to this other thing

258
00:09:56,560 --> 00:10:00,160
so everything's built

259
00:09:58,399 --> 00:10:03,600
with relationships

260
00:10:00,160 --> 00:10:05,279
that you as an editor can add and

261
00:10:03,600 --> 00:10:08,160
expanding

262
00:10:05,279 --> 00:10:09,839
um the knowledge so

263
00:10:08,160 --> 00:10:12,640
so if you look at

264
00:10:09,839 --> 00:10:14,480
associated singles from

265
00:10:12,640 --> 00:10:19,399
random access memories they've got get

266
00:10:14,480 --> 00:10:19,399
lucky instant crush etc

267
00:10:20,560 --> 00:10:26,160
so getting back down to the

268
00:10:23,440 --> 00:10:29,680
further level of a release

269
00:10:26,160 --> 00:10:30,959
you can have things like barcode

270
00:10:29,680 --> 00:10:33,640
there's

271
00:10:30,959 --> 00:10:35,600
cover art isn't hosted by

272
00:10:33,640 --> 00:10:38,399
musicbrainz.org

273
00:10:35,600 --> 00:10:39,760
that's hosted by archive.org on our

274
00:10:38,399 --> 00:10:40,880
behalf

275
00:10:39,760 --> 00:10:44,160
so if you

276
00:10:40,880 --> 00:10:44,160
upload cover art

277
00:10:44,640 --> 00:10:49,760
archive.org will take

278
00:10:47,680 --> 00:10:52,160
will host it for us and

279
00:10:49,760 --> 00:10:54,560
if there's a copyright issue

280
00:10:52,160 --> 00:10:57,040
they can deal with it as another

281
00:10:54,560 --> 00:11:00,040
organization

282
00:10:57,040 --> 00:11:00,040
um

283
00:11:01,200 --> 00:11:05,360
you've got a list of tracks

284
00:11:03,360 --> 00:11:07,040
there

285
00:11:05,360 --> 00:11:09,600
there'll be catalog numbers and that

286
00:11:07,040 --> 00:11:09,600
sort of thing

287
00:11:10,800 --> 00:11:14,800
um

288
00:11:12,079 --> 00:11:17,120
so mediums

289
00:11:14,800 --> 00:11:17,120
um

290
00:11:17,200 --> 00:11:21,040
there'll be a type so it'll be cd or dvd

291
00:11:20,240 --> 00:11:22,959
or

292
00:11:21,040 --> 00:11:24,240
just digital download

293
00:11:22,959 --> 00:11:26,480
um

294
00:11:24,240 --> 00:11:28,399
one of the edge cases or two

295
00:11:26,480 --> 00:11:31,200
edge cases that i'd like to

296
00:11:28,399 --> 00:11:31,200
point out is

297
00:11:31,440 --> 00:11:35,760
a thing called a pre-gap track

298
00:11:34,560 --> 00:11:39,760
um

299
00:11:35,760 --> 00:11:43,360
some cds when people were experimenting

300
00:11:39,760 --> 00:11:45,920
um a cd is not a doesn't have a table of

301
00:11:43,360 --> 00:11:47,200
con a proper table of contents it's not

302
00:11:45,920 --> 00:11:49,680
a data track

303
00:11:47,200 --> 00:11:50,480
it's just pcm audio

304
00:11:49,680 --> 00:11:52,240
and

305
00:11:50,480 --> 00:11:55,120
it's got a

306
00:11:52,240 --> 00:11:58,320
when you put in a cd it reads

307
00:11:55,120 --> 00:11:59,839
the table of contents that says start at

308
00:11:58,320 --> 00:12:02,160
one minute start at three minutes

309
00:11:59,839 --> 00:12:04,639
start at five minutes for the next track

310
00:12:02,160 --> 00:12:07,680
with a two second gap

311
00:12:04,639 --> 00:12:09,200
um there are a handful of cds that have

312
00:12:07,680 --> 00:12:11,279
hidden tracks

313
00:12:09,200 --> 00:12:14,320
so you put the cd in

314
00:12:11,279 --> 00:12:15,200
you press rewind and there's hidden data

315
00:12:14,320 --> 00:12:16,240
there

316
00:12:15,200 --> 00:12:18,720
so

317
00:12:16,240 --> 00:12:20,800
if you've got some of the early hilltop

318
00:12:18,720 --> 00:12:23,519
hoods

319
00:12:20,800 --> 00:12:25,600
they have hidden tracks

320
00:12:23,519 --> 00:12:27,519
it's just them chatting but

321
00:12:25,600 --> 00:12:28,399
some cool things like that

322
00:12:27,519 --> 00:12:30,320
um

323
00:12:28,399 --> 00:12:32,560
the other example

324
00:12:30,320 --> 00:12:34,320
um

325
00:12:32,560 --> 00:12:37,040
is from

326
00:12:34,320 --> 00:12:38,720
nine inch nails broken

327
00:12:37,040 --> 00:12:41,279
what they did is

328
00:12:38,720 --> 00:12:44,240
the maximum tracks that you can have on

329
00:12:41,279 --> 00:12:46,000
a cd is 99

330
00:12:44,240 --> 00:12:48,160
so they had

331
00:12:46,000 --> 00:12:49,839
the first six were normal

332
00:12:48,160 --> 00:12:53,600
then

333
00:12:49,839 --> 00:12:56,160
a whole bunch of tracks that had

334
00:12:53,600 --> 00:12:57,519
a fraction of a second of audio and then

335
00:12:56,160 --> 00:13:00,720
the last two

336
00:12:57,519 --> 00:13:02,959
are the tracks 98 and 99 so

337
00:13:00,720 --> 00:13:05,519
if you've got a cd player

338
00:13:02,959 --> 00:13:07,519
it would skip to the last two and you

339
00:13:05,519 --> 00:13:10,800
can't go back really easily because

340
00:13:07,519 --> 00:13:10,800
there's a whole bunch of silence

341
00:13:12,320 --> 00:13:15,920
so

342
00:13:13,760 --> 00:13:17,440
recordings is where you add an awful lot

343
00:13:15,920 --> 00:13:21,279
of data

344
00:13:17,440 --> 00:13:22,240
it's sort of up to the users how

345
00:13:21,279 --> 00:13:24,240
um

346
00:13:22,240 --> 00:13:26,560
good the data is but

347
00:13:24,240 --> 00:13:29,120
uh the schema allows for things like

348
00:13:26,560 --> 00:13:32,839
who played bass guitar who played

349
00:13:29,120 --> 00:13:35,760
keyboard who who

350
00:13:32,839 --> 00:13:38,320
sung so

351
00:13:35,760 --> 00:13:39,279
with some of these really popular tracks

352
00:13:38,320 --> 00:13:42,000
you get

353
00:13:39,279 --> 00:13:44,079
extra data that you can use to tag your

354
00:13:42,000 --> 00:13:45,920
music

355
00:13:44,079 --> 00:13:48,639
and

356
00:13:45,920 --> 00:13:50,800
the other thing is usually if

357
00:13:48,639 --> 00:13:52,800
someone's taken the time

358
00:13:50,800 --> 00:13:54,880
they'll have works

359
00:13:52,800 --> 00:13:56,959
so works is

360
00:13:54,880 --> 00:13:57,920
who wrote the thing

361
00:13:56,959 --> 00:14:00,399
um

362
00:13:57,920 --> 00:14:02,800
it's got um

363
00:14:00,399 --> 00:14:04,800
ids from rights societies

364
00:14:02,800 --> 00:14:07,279
allowing someone else to

365
00:14:04,800 --> 00:14:08,320
double check your work

366
00:14:07,279 --> 00:14:11,120
and

367
00:14:08,320 --> 00:14:11,120
they're all linked to

368
00:14:11,199 --> 00:14:15,440
to who actually wrote it and

369
00:14:14,320 --> 00:14:16,560
yeah

370
00:14:15,440 --> 00:14:18,240
um

371
00:14:16,560 --> 00:14:19,760
the other thing that

372
00:14:18,240 --> 00:14:21,920
uh

373
00:14:19,760 --> 00:14:23,279
you want to take note of is uh classical

374
00:14:21,920 --> 00:14:24,399
music

375
00:14:23,279 --> 00:14:26,800
uh

376
00:14:24,399 --> 00:14:28,399
in musicbrainz it's sort of

377
00:14:26,800 --> 00:14:31,040
more important

378
00:14:28,399 --> 00:14:32,639
to have works than it is to have

379
00:14:31,040 --> 00:14:35,120
artists on track

380
00:14:32,639 --> 00:14:37,279
because the classical music

381
00:14:35,120 --> 00:14:38,240
they tend to deal with

382
00:14:37,279 --> 00:14:41,360
um

383
00:14:38,240 --> 00:14:45,440
works which have movements and sub

384
00:14:41,360 --> 00:14:47,199
sub subsections so having that sort of

385
00:14:45,440 --> 00:14:49,440
structure

386
00:14:47,199 --> 00:14:50,959
as a series of works that link to other

387
00:14:49,440 --> 00:14:51,920
sub works

388
00:14:50,959 --> 00:14:54,000
allows

389
00:14:51,920 --> 00:14:55,519
people that are interested in classical

390
00:14:54,000 --> 00:14:59,839
music to say

391
00:14:55,519 --> 00:14:59,839
i want to listen to this bit

392
00:15:00,240 --> 00:15:04,320
um

393
00:15:02,480 --> 00:15:05,760
so

394
00:15:04,320 --> 00:15:07,839
if you've got time

395
00:15:05,760 --> 00:15:10,320
feel free to look at

396
00:15:07,839 --> 00:15:13,279
there's also labels

397
00:15:10,320 --> 00:15:16,240
which is sort of who owns the

398
00:15:13,279 --> 00:15:17,680
copyright and who is the publisher

399
00:15:16,240 --> 00:15:19,360
and uh

400
00:15:17,680 --> 00:15:20,800
there's a

401
00:15:19,360 --> 00:15:22,480
list of string

402
00:15:20,800 --> 00:15:23,920
list of uh

403
00:15:22,480 --> 00:15:25,279
series is sort of something that they've

404
00:15:23,920 --> 00:15:27,760
added

405
00:15:25,279 --> 00:15:30,560
sort of five years ago for things like

406
00:15:27,760 --> 00:15:33,040
compilation albums which

407
00:15:30,560 --> 00:15:34,639
it's going to be the ministry of sound

408
00:15:33,040 --> 00:15:36,639
year number

409
00:15:34,639 --> 00:15:38,560
that sort of thing

410
00:15:36,639 --> 00:15:40,399
so everything is a

411
00:15:38,560 --> 00:15:42,399
sort of fixed structure

412
00:15:40,399 --> 00:15:42,870
um creating

413
00:15:42,399 --> 00:15:45,040
um

414
00:15:42,870 --> 00:15:46,560
[Music]

415
00:15:45,040 --> 00:15:48,320
creating new relationships that's sort

416
00:15:46,560 --> 00:15:49,680
of hard coded

417
00:15:48,320 --> 00:15:51,600
so

418
00:15:49,680 --> 00:15:54,399
there's a lot of

419
00:15:51,600 --> 00:15:57,199
asking in forums discussing tickets on

420
00:15:54,399 --> 00:15:57,199
what gets added

421
00:15:58,720 --> 00:16:03,360
and some things like musical instruments

422
00:16:01,440 --> 00:16:05,839
there's a person that

423
00:16:03,360 --> 00:16:08,480
has a list of possible music instruments

424
00:16:05,839 --> 00:16:11,759
and you've got to ask them to add it to

425
00:16:08,480 --> 00:16:12,800
the system in the background back end

426
00:16:11,759 --> 00:16:15,040
um

427
00:16:12,800 --> 00:16:17,120
there's a query api which

428
00:16:15,040 --> 00:16:18,800
you send it a string and it'll return

429
00:16:17,120 --> 00:16:20,000
you with the album or the barcode or

430
00:16:18,800 --> 00:16:23,040
something like that

431
00:16:20,000 --> 00:16:26,240
and then you probably want to look up

432
00:16:23,040 --> 00:16:28,079
do a lookup thing to say give me

433
00:16:26,240 --> 00:16:29,120
info about this id

434
00:16:28,079 --> 00:16:31,120
this

435
00:16:29,120 --> 00:16:33,360
album

436
00:16:31,120 --> 00:16:34,480
release group etc

437
00:16:33,360 --> 00:16:36,880
so the other

438
00:16:34,480 --> 00:16:39,360
sort of related thing

439
00:16:36,880 --> 00:16:41,600
if i've got time is um

440
00:16:39,360 --> 00:16:44,320
let's talk about wikidata

441
00:16:41,600 --> 00:16:46,639
so that's sort of built in

442
00:16:44,320 --> 00:16:49,600
a different sort of thing it's

443
00:16:46,639 --> 00:16:52,079
everything is item property value

444
00:16:49,600 --> 00:16:54,480
so everything's built from that sort of

445
00:16:52,079 --> 00:16:55,360
basic structure

446
00:16:54,480 --> 00:16:56,720
so

447
00:16:55,360 --> 00:17:00,560
if you go to

448
00:16:56,720 --> 00:17:00,560
daft punk's wikidata entry

449
00:17:00,800 --> 00:17:07,120
it's an instance of an electronic

450
00:17:03,600 --> 00:17:09,120
uh electronic duo which is a

451
00:17:07,120 --> 00:17:12,559
instance of a musical group

452
00:17:09,120 --> 00:17:12,559
they've got start and end years

453
00:17:13,600 --> 00:17:18,439
it's good for things like um

454
00:17:18,640 --> 00:17:23,439
awards received

455
00:17:20,319 --> 00:17:25,120
you can easily add more awards so they

456
00:17:23,439 --> 00:17:28,160
won a grammy

457
00:17:25,120 --> 00:17:28,160
record that properly

458
00:17:28,400 --> 00:17:32,799
and it's another thing for

459
00:17:30,960 --> 00:17:35,200
good source of

460
00:17:32,799 --> 00:17:36,400
including external identifiers

461
00:17:35,200 --> 00:17:38,400
so you can

462
00:17:36,400 --> 00:17:40,400
from musicbrainz you can link to

463
00:17:38,400 --> 00:17:42,640
wikidata and from wikidata you can link

464
00:17:40,400 --> 00:17:46,240
back to musicbrainz so

465
00:17:42,640 --> 00:17:49,120
looking at one you can look at the other

466
00:17:46,240 --> 00:17:50,480
um properties are added so relationships

467
00:17:49,120 --> 00:17:54,799
are added

468
00:17:50,480 --> 00:17:54,799
through a voting process

469
00:17:55,200 --> 00:17:58,640
you sort of

470
00:17:56,960 --> 00:18:01,280
go through that process and say i want

471
00:17:58,640 --> 00:18:04,320
to have

472
00:18:01,280 --> 00:18:05,760
a property for something so going from a talk

473
00:18:04,320 --> 00:18:08,400
earlier

474
00:18:05,760 --> 00:18:09,520
if i could

475
00:18:08,400 --> 00:18:10,960
add my

476
00:18:09,520 --> 00:18:12,720
web id

477
00:18:10,960 --> 00:18:14,640
from the talk earlier

478
00:18:12,720 --> 00:18:18,160
that was a

479
00:18:14,640 --> 00:18:18,160
something that was proposed

480
00:18:18,240 --> 00:18:22,960
and

481
00:18:19,360 --> 00:18:24,880
give it the format of your urls uh

482
00:18:22,960 --> 00:18:27,679
and

483
00:18:24,880 --> 00:18:29,039
pattern matching and that's how wikidata

484
00:18:27,679 --> 00:18:31,440
extends the

485
00:18:29,039 --> 00:18:32,640
possible things that you can

486
00:18:31,440 --> 00:18:33,679
uh

487
00:18:32,640 --> 00:18:35,440
use

488
00:18:33,679 --> 00:18:36,880
so querying

489
00:18:35,440 --> 00:18:42,120
um

490
00:18:36,880 --> 00:18:42,120
is done through a thing called a graphql

491
00:18:42,320 --> 00:18:48,640
it's done through a thing called a spa

492
00:18:45,840 --> 00:18:50,640
through sparql i mean

493
00:18:48,640 --> 00:18:52,320
it sort of looks like this

494
00:18:50,640 --> 00:18:53,679
you

495
00:18:52,320 --> 00:18:54,840
you have a

496
00:18:53,679 --> 00:18:58,960
list of select

497
00:18:54,840 --> 00:19:01,120
statements so you define what you want

498
00:18:58,960 --> 00:19:02,960
you'll have

499
00:19:01,120 --> 00:19:05,120
a list of where sort of

500
00:19:02,960 --> 00:19:08,720
clauses to say

501
00:19:05,120 --> 00:19:10,240
this property on an entry

502
00:19:08,720 --> 00:19:12,799
look for things with this property on

503
00:19:10,240 --> 00:19:14,320
the entry

504
00:19:12,799 --> 00:19:17,200
where's

505
00:19:14,320 --> 00:19:18,080
exist or not exist this such and such

506
00:19:17,200 --> 00:19:20,799
limit

507
00:19:18,080 --> 00:19:21,840
so this is sort of the

508
00:19:20,799 --> 00:19:23,280
that's

509
00:19:21,840 --> 00:19:25,679
thing um

510
00:19:23,280 --> 00:19:28,160
so

511
00:19:25,679 --> 00:19:31,039
changing gears to

512
00:19:28,160 --> 00:19:34,000
if you wanted to look at

513
00:19:31,039 --> 00:19:35,520
uh things with the acmi id

514
00:19:34,000 --> 00:19:37,280
this is the sort of query that you'd

515
00:19:35,520 --> 00:19:39,120
have to say

516
00:19:37,280 --> 00:19:42,720
give me the

517
00:19:39,120 --> 00:19:42,720
acmi id which is that there

518
00:19:43,039 --> 00:19:47,360
and

519
00:19:44,960 --> 00:19:48,720
optionally return genre if they've got

520
00:19:47,360 --> 00:19:49,840
that on the

521
00:19:48,720 --> 00:19:53,679
film

522
00:19:49,840 --> 00:19:55,919
optionally include the country the director

523
00:19:53,679 --> 00:19:57,120
and based on so

524
00:19:55,919 --> 00:19:59,200
there's all sorts of things that you can

525
00:19:57,120 --> 00:20:00,640
do with

526
00:19:59,200 --> 00:20:02,559
with that so

527
00:20:00,640 --> 00:20:04,159
i think my time is

528
00:20:02,559 --> 00:20:06,880
nearly up

529
00:20:04,159 --> 00:20:09,120
so

530
00:20:06,880 --> 00:20:10,720
yes thank you so much and we've actually

531
00:20:09,120 --> 00:20:12,080
got quite a few questions for you which

532
00:20:10,720 --> 00:20:14,480
is fantastic

533
00:20:12,080 --> 00:20:17,120
um first question is are there any

534
00:20:14,480 --> 00:20:21,240
metadata standards for this stuff or is

535
00:20:17,120 --> 00:20:21,240
musicbrainz just the de facto

536
00:20:22,559 --> 00:20:26,880
it's we've got an api and you use the

537
00:20:24,640 --> 00:20:28,000
api it's yeah

538
00:20:26,880 --> 00:20:28,960
it's all

539
00:20:28,000 --> 00:20:31,159
um

540
00:20:28,960 --> 00:20:34,240
it's available in

541
00:20:31,159 --> 00:20:36,000
json xml rdf

542
00:20:34,240 --> 00:20:38,720
it's sort of

543
00:20:36,000 --> 00:20:41,280
it used to be

544
00:20:38,720 --> 00:20:43,840
they've de-emphasized rdf because

545
00:20:41,280 --> 00:20:44,960
no one was really using it it was an api

546
00:20:43,840 --> 00:20:47,280
that no one

547
00:20:44,960 --> 00:20:49,039
really used so everyone just uses json

548
00:20:47,280 --> 00:20:50,080
yeah pretty much

549
00:20:49,039 --> 00:20:52,880
okay

550
00:20:50,080 --> 00:20:56,720
um how does musicbrainz deal with

551
00:20:52,880 --> 00:20:56,720
erroneous or conflicting data

552
00:20:56,880 --> 00:21:03,120
um

553
00:20:58,720 --> 00:21:05,840
it's usually someone will edit it out so

554
00:21:03,120 --> 00:21:06,880
i think the data quality is it's pretty

555
00:21:05,840 --> 00:21:11,520
much

556
00:21:06,880 --> 00:21:13,679
there it's 90 to 95 percent accurate

557
00:21:11,520 --> 00:21:16,000
looking at random cd

558
00:21:13,679 --> 00:21:16,880
find the occasional typo here or there

559
00:21:16,000 --> 00:21:18,000
but

560
00:21:16,880 --> 00:21:20,080
um sort of

561
00:21:18,000 --> 00:21:22,720
because the way that the editing system

562
00:21:20,080 --> 00:21:26,159
is it enforces constraints so that

563
00:21:22,720 --> 00:21:28,080
people are less likely to make idiot

564
00:21:26,159 --> 00:21:30,159
moves you sort of

565
00:21:28,080 --> 00:21:31,760
feeds you down the path to the most

566
00:21:30,159 --> 00:21:34,480
correct

567
00:21:31,760 --> 00:21:39,200
data and someone can then

568
00:21:34,480 --> 00:21:39,200
um come back later and fix your mistakes

569
00:21:40,000 --> 00:21:45,840
um in the streaming world how does

570
00:21:42,480 --> 00:21:45,840
musicbrainz work

571
00:21:46,080 --> 00:21:50,559
it

572
00:21:48,240 --> 00:21:54,000
doesn't really make a difference it's an

573
00:21:50,559 --> 00:21:54,799
album it's got a list of tracks

574
00:21:54,000 --> 00:21:55,760
it

575
00:21:54,799 --> 00:21:57,280
yeah

576
00:21:55,760 --> 00:21:59,520
it

577
00:21:57,280 --> 00:22:01,280
there's a digital medium so

578
00:21:59,520 --> 00:22:04,000
you might get

579
00:22:01,280 --> 00:22:05,039
it that it's digital and that's it

580
00:22:04,000 --> 00:22:08,000
okay

581
00:22:05,039 --> 00:22:09,760
um do record labels or artists send data

582
00:22:08,000 --> 00:22:11,840
to musicbrainz in a way that can be

583
00:22:09,760 --> 00:22:14,640
ingested easily or are they just

584
00:22:11,840 --> 00:22:17,280
generally unhelpful

585
00:22:14,640 --> 00:22:19,679
uh they're generally unhelpful there's

586
00:22:17,280 --> 00:22:22,240
there's been a few proposals of back-end

587
00:22:19,679 --> 00:22:22,240
systems but

588
00:22:22,799 --> 00:22:27,600
they've sort of been

589
00:22:24,960 --> 00:22:28,559
against automation a fair bit it's sort

590
00:22:27,600 --> 00:22:30,799
of

591
00:22:28,559 --> 00:22:34,960
we want quality instead of

592
00:22:30,799 --> 00:22:34,960
random junk that the label sent you so

593
00:22:35,440 --> 00:22:41,120
yeah um and one last question is it only

594
00:22:38,799 --> 00:22:43,360
via api or are there forms that people

595
00:22:41,120 --> 00:22:45,280
can fill in to

596
00:22:43,360 --> 00:22:48,880
add data

597
00:22:45,280 --> 00:22:50,799
so you just go to the website

598
00:22:48,880 --> 00:22:53,280
if you want to

599
00:22:50,799 --> 00:22:53,280
find a

600
00:22:53,600 --> 00:22:58,960
find daft punk

601
00:22:56,640 --> 00:23:02,240
punk

602
00:22:58,960 --> 00:23:02,240
so say if i wanted to

603
00:23:03,760 --> 00:23:07,440
edit this and

604
00:23:05,360 --> 00:23:09,600
mess with the title or something

605
00:23:07,440 --> 00:23:10,640
forget it

606
00:23:09,600 --> 00:23:13,120
and

607
00:23:10,640 --> 00:23:14,559
that's the title ah cool

608
00:23:13,120 --> 00:23:17,280
change the

609
00:23:14,559 --> 00:23:17,280
article

610
00:23:17,360 --> 00:23:21,280
something else

611
00:23:19,440 --> 00:23:23,120
click next next

612
00:23:21,280 --> 00:23:25,760
and

613
00:23:23,120 --> 00:23:28,240
click accept but you need to put in that

614
00:23:25,760 --> 00:23:29,360
note to say what you're doing so

615
00:23:28,240 --> 00:23:31,120
yeah

616
00:23:29,360 --> 00:23:33,360
sort of now that seems quite simple you

617
00:23:31,120 --> 00:23:34,799
follow the bounding box and hopefully

618
00:23:33,360 --> 00:23:36,559
get there

619
00:23:34,799 --> 00:23:39,120
we should um we should all contribute to

620
00:23:36,559 --> 00:23:41,360
this um it's relatively easy to do so

621
00:23:39,120 --> 00:23:43,840
um that's fantastic well thank you so

622
00:23:41,360 --> 00:23:46,960
much daniel that's um

623
00:23:43,840 --> 00:23:48,559
it was great to see those those two

624
00:23:46,960 --> 00:23:50,400
those two sites for us to contribute

625
00:23:48,559 --> 00:23:52,080
to because i always know that i'm happy

626
00:23:50,400 --> 00:23:55,440
to put in

627
00:23:52,080 --> 00:23:58,080
more data where possible to help out so

628
00:23:55,440 --> 00:24:00,400
thank you so much

629
00:23:58,080 --> 00:24:02,880
go to musicbrainz.org

630
00:24:00,400 --> 00:24:05,279
download a program called picard

631
00:24:02,880 --> 00:24:07,679
that's the tagger that they work on and

632
00:24:05,279 --> 00:24:10,320
there's a few other taggers so

633
00:24:07,679 --> 00:24:12,880
some of the cd rippers will

634
00:24:10,320 --> 00:24:16,320
you put the cd in it'll

635
00:24:12,880 --> 00:24:16,320
automatically retrieve the data

636
00:24:16,480 --> 00:24:18,880
wonderful

637
00:24:17,520 --> 00:24:22,520
thank you

638
00:24:18,880 --> 00:24:22,520
okay thank you

