Le Petit Larousse, a witness to the history of the 20th century?
Introducing the Nénufar project: a digital diachronic edition

Hervé Bohbot (1), Giancarlo Luxardo (1), Agnès Steuckardt (1), Chantal Wionet (2)
(1) Praxiling UMR 5267 (Université Paul-Valéry Montpellier 3, CNRS)
(2) Centre Norbert Elias UMR 8562 (Université d’Avignon et des Pays de Vaucluse, CNRS)

ICHLL-9 - Santa Margherita Ligure – 22/6/2018

Plan
1. Larousse dictionaries
   - Petit Larousse illustré, birth and evolution
   - Rare digital resources
   - Opportunity and rationale: « orthographic rectifications » of 1990
2. The Nénufar project: corpus and web site
3. Witnessing social and terminology changes
   - Nomenclature changes 1906-1925
   - Designating/concealing: two opposite evolutions across the 20th century
     • From ethnicity and clichés to political correctness
     • From prudishness to sexual education
Conclusions

Larousse dictionaries
Pierre Larousse (1817–1875)
• 1856: Nouveau dictionnaire de la langue française (700 p.)
• 1865-1876: Grand Dictionnaire universel du XIXe siècle (15 vol. in-4°, 20,000 p. + 2 suppl. 1878 & 1890)
Librairie Larousse
dir. Claude Augé (1854-1924)
• 1889: Dictionnaire complet illustré
• 1897-1904: Nouveau Larousse Illustré (7 vol.)
• 1905: Petit Larousse illustré (PLI)
• 1907–1908: Larousse pour tous (2 vol.)
• 1922: Larousse universel (2 vol.)
• 1924: Nouveau Petit Larousse illustré
dir. Paul Augé (1881-1951)
• 1927-1933: Larousse du XXe siècle (6 vol.)
• 1948: Nouveau Larousse universel (2 vol.)

PLI: birth and evolution
Predecessors:
• 1856: Nouveau dictionnaire de la langue française (3 sections: language, locutions, proper names).
• 1904: Nouveau Larousse illustré (nomenclature).
- First edition published in July 1905 (dated 1906).
- Handy and affordable, with each edition selling several hundred thousand copies.
- Several editions (tirages) per year (millésime), with minor updates (ca. 230 editions between 1905 and 1924).
- Rewriting (refonte) in 1924 (renamed Nouveau Petit Larousse Illustré).
- About 430 editions between 1924 and 1947 (« semi-refonte »: 1936).
- Rewriting in 1947 (renamed Petit Larousse Illustré again).
[Images: 1905 and 1924 editions]

Rare digital resources
• On the Web, few digitized copies:
  - Google Books: Petit Larousse Illustré 1906 (5th edition).
  - Gallica: Petit Larousse Illustré 1922 (185th edition).
  Poor image quality: legible, but nothing more.
• Petit Larousse illustré 1905 en ligne
  - Produced by UMR Lexique, Dictionnaires et Informatique: H. Manuélian, A. Bruscand, N. Cholewka and A.M. Hetzel, under the direction of Jean Pruvost.
  - Developed from 2004 to 2009.
  - Full-text search, metadata, TEI-based (« entryFree »).
  - Language part of the first edition.
  - Official site: down since 2015. Mirror site: without images.
An important milestone, but with limited query features, no updates and not open source.

Orthographic « rectifications »
In the rectifications orthographiques du français (1990), the recommended spelling is nénufar, in accordance with the etymology:
(13th century) From Persian نیلوفر, nīlūfar, or from Arabic نينوفر, nīnūfar; further back, from Sanskrit नीलोत्पल, nīlotpala ("blue lotus"), composed of नील, nīla ("blue-black") and उत्पल, utpala ("lotus").
[Facsimiles of the entry in the 1906, 1948, 1955 and 2012 editions]
NÉNUPHAR, → NÉNUFAR n. m. (ar. Ninūfar) Plante aquatique...

2. The Nénufar project
Nouvelle Édition Numérique de Fac-similés de Référence (New Digital Edition of Reference Facsimiles)
Editions published before 1948 are in the public domain under French law; however, no open full-text resource is currently available.
The project aims to:
- Develop a diachronic corpus as a new lexicographic resource of heritage significance, targeting both a scientific audience (linguists, historians, sociologists…) and the general public, for the study of the recent history of language, culture and technology.
- Contribute to current trends in the modeling and standardization of lexicographic resources: a reference digital edition format based on XML-TEI, for long-term preservation and expert querying; integration with Linked Open Data projects.
- Make the data available under a Creative Commons licence.

Web site
Coming soon (summer 2018) with the 1906 to 1924 editions.
– New data will be added regularly.

XML TEI

(Aiming at) an interface providing the best features...
Plain or advanced search in full text or illustrations: multicriteria, parts of words, year and type of modification...

3. Witnessing changes
In their foreword, the editors claim to give the most complete picture of the language (for this format), updated year after year, and a neutral point of view.
We can check this claim and:
• Retrieve all incoming and outgoing entries in the nomenclature.
• Trace changes for each entry (evolution of senses, definitions, examples).

Comparison 1906-1924
• 258 new entries.
• 37 entries removed.
  → Less than 1 % of changes in nomenclature.
• 2,860 entries modified.
  → ca. 7 % of modified entries.
Almost half of the new/deleted entries fall in the first three years: adjustment/completion of the first edition.
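The comparison described above (incoming entries, outgoing entries, modified entries) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual tooling: the function name `diff_editions`, the `{headword: entry text}` representation and the sample data are all hypothetical, and a real nomenclature would also need to handle homographs, which a plain dictionary keyed by headword cannot distinguish.

```python
from typing import Dict, NamedTuple, Set


class EditionDiff(NamedTuple):
    added: Set[str]      # headwords present only in the later edition
    removed: Set[str]    # headwords present only in the earlier edition
    modified: Set[str]   # headwords kept, but whose entry text differs


def diff_editions(earlier: Dict[str, str], later: Dict[str, str]) -> EditionDiff:
    """Compare two editions given as {headword: entry text} mappings.

    Hypothetical helper: assumes one entry per headword (no homographs).
    """
    earlier_words, later_words = set(earlier), set(later)
    kept = earlier_words & later_words
    return EditionDiff(
        added=later_words - earlier_words,
        removed=earlier_words - later_words,
        modified={w for w in kept if earlier[w] != later[w]},
    )


# Toy sample with placeholder entry texts (not the project's data).
earlier = {"BOTTIN": "text A", "ÉRECTION": "text B", "NÉNUFAR": "text C"}
later = {"BOTYS": "text D", "ÉRECTION": "text B, revised", "NÉNUFAR": "text C"}

diff = diff_editions(earlier, later)
print(sorted(diff.added))     # ['BOTYS']
print(sorted(diff.removed))   # ['BOTTIN']
print(sorted(diff.modified))  # ['ÉRECTION']
```

Running the same comparison on each pair of consecutive editions would give yearly counts of new, deleted and modified entries.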
Year    New entries
1906    32
1907    25
1908    47
1909    12
1910    12
1911    15
1912    26
1913    12
1914    13
1915    12
1916    18
1917    12
1918    8
1919    1
1920    1
1921    7
1922    5
1923    0
1924    0
Total   258
Deletions (non-empty yearly values, in order; years with no deletion are blank): 12, 1, 5, 1, 2, 3, 5, 1, 3, 3, 1. Total: 37

Some new entries
Words used in a military context: 1912 (antimilitarisme, antimilitariste), 1916 (encercler), 1917 (dreadnought, endeuillé, sidérant, sidéré), 1918 (boche, crapouillot, lacrymogène, serbe, torpillage), 1919 (camoufler), 1920 (alerter), 1921 (cagna ou cagnat, gelure)
Politics/social/ideas: 1912 (septennat), 1916 (scientisme, scientiste, syndicalisme, syndicaliste), 1921 (bolchévik, bolchévisme, bolchéviste), 1922 (pogrom ou pogrome, soviet)
Techniques: 1912 (cardan, freinage, freiner, technicien, monoplan), 1913 (biplan, polycopie), 1914 (sténotypie), 1915 (dactylotype, hydroaéroplane, motoculture), 1916 (avion, carter, fuselage), 1917 (sidérographie), 1918 (hydroavion, hydravion)

Some new entries (continued)
Medicine: 1911 (galactagogue, hypertrophié, insensibilisation, métrite, pileux), 1912 (puériculture), 1914 (trachome, uricémie), 1918 (sinusite)
Rural life: 1912 (fongicide), 1913 (doucin), 1914 (caune), 1915 (brabant, écorage, écorer), 1922 (piégeage)
Physics/chemistry: 1911 (pechblende), 1913 (ethane, valence), 1915 (coulomb), 1922 (mazout)
Anglicisms: 1910 (smoking), 1911 (skiff), 1912 (select), 1913 (fox), 1914 (tract), 1915 (copyright), 1916 (carter, golf), 1917 (dreadnought)
Familiar words: 1911 (fripouille), 1912 (voyou, romanichel), 1918 (pinard), 1921 (chichi)

A gap to be filled...
[Facsimiles: 1906 and 1912 editions]
The arrival of the opportune entry BOTYS was probably due to a legal issue with the use of the trade name BOTTIN (BOTYS was dropped in 1948 and BOTTIN came back in 1990).
→ At that time, updates had to deal with page composition: the last word of a page must not change, so as to avoid « propagation ».
For each addition, you can look for a matching shortening elsewhere on the page (and vice versa).

A major new edition
The Nouveau Petit Larousse Illustré (NPLI) in 1924 (dated 1925)
• All pages changed.
• « Tirage » number reset to 1.
• 1,000+ new entries.
  → 2 % of changes in nomenclature.
• ca. 13,000 modifications (mostly details).
  → 30 % of modified entries.

From ethnicity and clichés to political correctness
First half of the century: words designating the Other that would now be considered highly offensive.
[Facsimiles of entries from the 1948, 1960, 1968, 1981 and 1998 editions]
1981: Enfant de race noire.
From 1952: the PLI recommends using Noir instead of Nègre.
From 1960: négritude.

ARABE adj. et n. [..] Fam. Usurier, homme dur en affaire. [..]
[Facsimile: 1959]
JUDAÏQUE (da-i-ke) adj. (du lat. judæus, juif). Qui appartient aux juifs : la loi judaïque. Qui s'attache mesquinement à la lettre en négligeant l'esprit, comme le faisaient les pharisiens juifs : interprétation judaïque.
JUIF, IVE n. et adj. [..] Qui professe la religion judaïque (en ce sens s'écrit avec une minuscule) : il existe beaucoup de juifs en Pologne. N. m. Fig. Usurier. [..]
JUIVERIE (rî) n. f. Quartier d'une ville habité par les juifs. Fam. et par dénigr., ensemble des juifs. Boutique d'usurier. Rapacité sordide.
In 1948, new entries: ENJUIVER, YOUDI, YOUPIN and YOUTRE!
YOUDI and YOUTRE are dropped in 1962, ENJUIVER in 1965 and YOUPIN in 1985.

From prudishness to sexual education
[Facsimile: 1967]
From 1906 to 1924, the Petit Larousse has no entry for any anatomical term for the genitalia or for any venereal disease.
A need to protect a young audience from inappropriate terms?
Still, some of these words can be found inside harmless entries:
AMAUROSE (mô-ro-ze) n. f. (gr. amaurôsis, obscurcissement). Cécité plus ou moins complète, causée par l'atrophie du nerf optique, la syphilis, etc. Vulgairement goutte sereine.
BISTOURNAGE (bis) n. m. Castration, par torsion sous-cutanée, du cordon testiculaire, principalement chez le taureau.
Unless it is not what you think:
ÉJACULATION (si-on) n. f. Action d'éjaculer. Courte prière émise avec ferveur.
ÉJACULER (lé) v. a. [..] Darder, lancer avec force hors de soi.

From prudishness to sexual education (continued)
- 1925: blennorragie, syphilis, syphilitique, testicule, vagin, vaginal, vaginisme, vaginite, vulve
But...
- 1960: clitoris, pénis, verge
- 1973: vulvaire
- 1981: clitoridien, pénien, testiculaire
An example of a growing definition:
1906: ÉRECTION (rèk-si-on) n. f. [..] Etat de tension de certains tissus.
1948: ÉRECTION (rèk-si-on) n. f. [..] Etat de tension de certains tissus organiques.
1973: ÉRECTION (rèk-si-on) n. f. [..] Etat de gonflement de certains tissus organiques.
1981: ÉRECTION (rèk-si-on) n. f. [..] Etat de gonflement de certains tissus organiques, de certains organes, en particulier le pénis, en état de turgescence.
1998: ÉRECTION (rèk-si-on) n. f. [..] Physiol. Gonflement et durcissement temporaire de certains organes ou tissus, par afflux de sang. – Spécialem. Durcissement du pénis.

Conclusions
- Studying recent dictionaries is fun!
- The Petit Larousse may not be the best dictionary from a lexicographic or linguistic point of view, but it has been from the beginning, and still is, by far the best-known and most widely used dictionary of the French language.
- It is definitely a witness (not a neutral one) and probably, to some unquantifiable degree, an actor of the 20th and now the 21st century.
- This part of recent French popular culture will soon be available to all through the Nénufar web site and XML/Lemon linked open resources and, we hope, will lead to various studies.
- The next step (2019) will be the digitization of the historical part (proper names) of the dictionary.