Current state: LinguaLibre's naming convention relies mostly on - as the main field separator.
The parentheses ( and ) also serve as field delimiters.
Review
The LinguaLibre Query Service lets you see the current situation.
List of speakers by name, with the presence of - as a separator:
sparql
SELECT * WHERE {
  ?id prop:P2 entity:Q3 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
    ?id rdfs:label ?name .
  }
  BIND (regex(STR(?name), "-") AS ?has_separator)
}
ORDER BY DESC(?has_separator)
How many:
sparql
SELECT ?has_separator (COUNT(?has_separator) AS ?found) WHERE {
  ?id prop:P2 entity:Q3 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
    ?id rdfs:label ?name .
  }
  BIND (regex(STR(?name), "-") AS ?has_separator)
  # filter( regex(?name, "-" ))
}
# ORDER BY DESC (?has_separator)
GROUP BY (?has_separator)
51/1000 (5%) of the speakers' usernames contain -, which makes regex matching on their file names less predictable: a - inside a username cannot be told apart from a field separator. A better field separator would be welcome.
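The ambiguity can be sketched in a few lines of Python. This is only an illustration, assuming file names follow the pattern `LL-Q<id> (<lang>)-<speaker>-<word>.wav`; the helper `naive_split` and the username `Alice` are hypothetical, not taken from the dataset.

```python
def naive_split(filename: str) -> list[str]:
    """Drop the extension, then split the stem on every "-"."""
    stem = filename.rsplit(".", 1)[0]
    return stem.split("-")


# Hypothetical username without "-": four fields, as the convention intends.
print(naive_split("LL-Q150 (fra)-Alice-vert.wav"))

# Username containing "-": five fields, so the boundary between
# speaker and word can no longer be recovered from the split alone.
print(naive_split("LL-Q150 (fra)-Roll-Morton-vert.wav"))
```

With five fields instead of four, a parser has to guess whether the speaker is `Roll` or `Roll-Morton`.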
Suggestion
We can, and should, pick a rarer character, at a minimum one that is not among the characters present on our keyboards.
U+FF0D － FULLWIDTH HYPHEN-MINUS: LL－Q150 (fra)－Roll-Morton－vert.wav
Or double the separator, to create a unique convention: LL--Q150 (fra)--Roll-Morton--vert.wav
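A minimal Python sketch of how either proposal would make the split unambiguous for this example. It assumes the same four-field layout as above; the helper `split_fields` and the constant `FULLWIDTH` are hypothetical names introduced for illustration.

```python
FULLWIDTH = "\uFF0D"  # U+FF0D FULLWIDTH HYPHEN-MINUS


def split_fields(filename: str, sep: str) -> list[str]:
    """Drop the extension, then split the stem on the given separator."""
    stem = filename.rsplit(".", 1)[0]
    return stem.split(sep)


# Doubled separator: the single "-" inside "Roll-Morton" is left alone.
doubled = split_fields("LL--Q150 (fra)--Roll-Morton--vert.wav", "--")

# Fullwidth separator: the ASCII "-" in the username is a different character.
name = f"LL{FULLWIDTH}Q150 (fra){FULLWIDTH}Roll-Morton{FULLWIDTH}vert.wav"
fullwidth = split_fields(name, FULLWIDTH)

print(doubled)
print(fullwidth)
```

Both calls return four fields with `Roll-Morton` intact. The doubled separator is not fully safe, though: a username that itself contains `--`, or that ends in `-`, would still shift a field boundary.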