Wikidata:FactGrid
The FactGrid project aims to become a Wikibase instance for historical research in its widest sense. Why did we decide to run a separate Wikibase instance, in an official cooperation with Wikimedia Berlin? Basically because we will try to attract "original research". Researchers should be confident that the minutest observation they contribute can be made visible as a micro publication on our platform.
We jumped into the project with data collected on the Illuminati, but the platform is open to historical research from all sides. As it is not much fun to work on an empty and potentially isolated platform, we are presently in a joint venture with Germany's national library with the aim of a massive GND input (more on this below).
Our blog at https://blog.factgrid.de should keep you informed about all our projects and events. If you want to enter data into the FactGrid, you are welcome. Contact us and get a real-name account! We have neither implemented notability criteria nor do we bother researchers with an original research policy. You can generate your family's genealogy on our platform or use the database for a five-year research project at your university. The FactGrid is CC0 and open to be referenced by Wikidata.
2019-04-17 Memorandum of Understanding between the University of Erfurt and the German National Library – to base the FactGrid on GND data in a joint project
We are proud to announce a new and massive Wikibase project that should keep a large community busy for far more than a year: Last month the president of the University of Erfurt, Prof. Dr. Walter Bauer-Wabnegg, and Dr. Elisabeth Niggemann, director-general of the German National Library in Frankfurt and Leipzig (DNB), signed a memorandum of understanding that aims to bring GND data into the FactGrid on a grand scale.
The GND, the German Integrated Authority File, is an authority file of millions of persons plus corporate bodies, conferences and events, geographic information, topics and works – designed to shape the exchange between libraries, archives and academic projects in the DACH countries of Germany, Austria and Switzerland.
Integrating the GND into the FactGrid had been our constant topic of discussion during the last year. A Wikibase instance becomes a cool thing to contribute to as soon as it becomes the research tool that you would use yourself in your research. GND data links into the world of open data; they clarify who or what you are speaking of in your research in all German-language contexts, and they will reach out to the other global authority files and to the universe of library data.
In April 2018 it became clearer that the FactGrid would eventually be one of several Wikibase instances, which could and should in this case aim for a larger federation. Early in June it transpired that the German National Library was on its way to test Wikibase in a software evaluation, with the aim of possibly running about ten Wikibase instances in constant exchange with each other. That was when we contacted the DNB with our own agenda to import their data. Our proposal was that the FactGrid could become a platform for "original research", a platform without GND or Wikidata criteria of notability, in the evolving network of Wikibase platforms. Users will be allowed to create Q-Numbers on the FactGrid for infants who died right after birth, and the GND and Wikidata will be free to decide under their criteria of notability and relevance whether they would like to use our information, which they can now quote as original research from the FactGrid platform (with the detailed information of the projects behind this research).
Whilst the GND is CC0 and free to be copied, the open joint venture with the German National Library aims to bring transparency into the data input. The more transparency we can bring into all the design decisions at this early stage, the better the Wikibase platforms we are heading towards will eventually be able to communicate with each other.
Now a team has to be formed. The German National Library and the Gotha research institutions of the University of Erfurt will send members into the team. The question is: Will we be able to broaden this team? We should have experts from the Wikimedia communities on board – people who know Wikibase and Wikidata, people who are used to community work on a regular wiki.
- We would like to attract people who know how to formulate SPARQL searches and who will be able to test data models and make suggestions for the improved data models we should use, in order to handle the massive data sets we are expecting.
- We're looking for Wikibase experts who know how to bring tens of millions of records into a Wikibase installation, and who know how to interconnect these records with genealogical and geographical links.
- We do not yet know how we will keep the FactGrid manageable with respect to the wave of duplicates and name parallels we are facing: The GND has these name parallels in unprecedented numbers. We will have to find ways to quickly inform researchers whether a person they have found in a document is already on the FactGrid or whether they will have to create the item. The hunt for items to be merged will become a permanent issue and we do not yet know how to technically support a community on this collective quest.
- We will create new and complex fields of expertise: Millions of personal data sets will come with career statements. The FactGrid will turn all these statements into Q-Items, which we will have to organise in order to allow sociological searches for instance. The FactGrid project on historical jobs and their evolution will be only one of these projects.
- We need players with Wikipedia experience: Though we will restrict ourselves to real-name accounts, we widely invite users with ambitions ranging from professional to private to join the platform with their projects, whether they are focused on private genealogy or on publicly funded historical research.
- We will have to provide a simplified FactGrid user interface that will bypass the SPARQL QueryService and the mushrooming Wikibase input pages. Magnus Manske’s Reasonator might become our standard interface for regular users, who will access the FactGrid as if they were accessing library catalogues, through organised input forms.
- We will eventually need help with database maintenance. It is particularly unfortunate that our project is primarily the work of historians, who do not always have a keen eye for how best to use this technology.
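The duplicate problem named in the list above can be illustrated with a minimal sketch. It assumes that item labels are available as plain strings and that a crude normalisation (diacritics removed, lower case, name parts sorted) is enough to flag candidates for human review. The function names and the sample records are hypothetical and are not taken from the FactGrid or the GND; real matching would also have to compare dates, places and professions.

```python
import re
import unicodedata
from collections import defaultdict


def name_key(label: str) -> str:
    """Return a crude matching key: no diacritics, lower case, tokens sorted."""
    decomposed = unicodedata.normalize("NFKD", label)
    ascii_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = re.findall(r"[a-z]+", ascii_only.lower())
    return " ".join(sorted(tokens))


def duplicate_candidates(records):
    """Group (item_id, label) pairs whose labels share the same key."""
    groups = defaultdict(list)
    for item_id, label in records:
        groups[name_key(label)].append(item_id)
    # Only keys shared by more than one item are candidates for a merge.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical records; the identifiers are placeholders, not real items.
records = [
    ("item-1", "Johann Müller"),
    ("item-2", "Müller, Johann"),
    ("item-3", "Johanna Müller"),
]
candidates = duplicate_candidates(records)
print(candidates)
```

A key of this kind only proposes candidates. Two people with the same name remain two people, so the decision to merge stays with the researcher.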
The FactGrid will grow – and it will offer plenty of space for people to develop their own projects within this growth.
Scan of the Memorandum of Understanding (in German)
2018-11-16/17: Wikibase/Illuminati Order Data-Mining Workshop: 10 travel grants to Gotha available
The Forschungszentrum Gotha (Gotha Research Centre) holds an annual "Illuminati workshop" with the aim of bringing together current research on the secret order of the 1770s and 1780s.
Having made some 5,000 fairly complex sets of metadata on the files of the Illuminati Order available in a Wikibase instance this year, in cooperation with Wikimedia Deutschland, we would like to use this call to invite Wikidata enthusiasts to venture a first look into this data trove together with researchers.
- Which visualisations (networks, timelines, geographic mappings…) can be drawn from the data?
- How would the data have to be structured in order to address entirely different research questions?
- How can this data best be linked with full-text transcripts of documents?
- How can our existing conventional wiki, the Gotha Illuminati Research Base, draw information from the database?
- How do we model data objects in a way that avoids redundant information?
- How would one enter data in order to produce time slices (and representations on old maps)?
- How can we make smart use of our Reasonator integration to build a multilingual information service that provides a clear overview?
- How do we make our work optimally usable in the Wikidata universe?
This year's workshop is meant to bring data analysis and research together. We want to invite participants who are familiar with SPARQL database queries and visualisations to enter into conversation with researchers. We have the data, the database and detailed knowledge of the archival record of the Illuminati Order, with its roughly 1,350 members and thousands of internal documents, to make the data mining exciting, but at the moment we barely grasp what insights our own material yields once it is made accessible by entirely new technical means. We have experience with our previous working tool, a conventional wiki, but are only beginning to grasp what additional options for practical work the more complex medium of Wikidata technology offers us.
Anyone who wants to take part in the full academic programme can join our research discussions at the Forschungszentrum Gotha from 9:00 on Friday, 16 November 2018.
The data-mining workshop, to which this invitation specifically applies, will bring researchers and Wikidata experts together in the afternoon. In the course of the afternoon the events will split into a hands-on Wikibase workshop and the series of more specialised research papers.
A joint dinner is scheduled for 19:00. The research centre will nevertheless remain open all night afterwards for those keen to keep tinkering.
On the second workshop day we want to bring the groups back together at around 11:00 in order to learn from each other:
- What can researchers gain from the database?
- What research work would database enthusiasts like to see in order to go considerably deeper into data mining?
- How would one best organise a larger project of cataloguing archival files with this database?
We can cover travel costs (as the budget currently stands, certainly from within the German-speaking countries), accommodation and daily allowances. Requests to participate, with a more detailed statement of working interests, should be sent by 5 November 2018 to [email protected].
First searches and visualisations for inspiration
Sample queries: https://database.factgrid.de/wiki/Sample_queries
Ongoing data input
See https://database.factgrid.de/wiki/Data_Input for the spreadsheets we are processing at the moment.
Contact us if you have experience with such things; all help is welcome.
The Illuminati Files Online: Some remarks on the FactGrid's first - roughly 16,000 - datasets
The FactGrid (still lacking a unique design) has, over the last six weeks, just received its first project data - data from Halle’s and Gotha’s Illuminati research of the last 20 years, which I will briefly outline in the following. Some of the datasets are still lacking the complexity we are aiming at, others have gained quite some depth:
We are presently listing some 2,000 people, about 1,350 of whom were Illuminati members in the 1780s; the rest are mostly people mentioned in the correspondences which we are trying to map. We are holding back here on the far more complex biographical input which we have in store, since this would be better done in joint ventures with the GND and with Katrin Möller, University of Halle, who has become the leading expert on 18th-century careers and professions. We are negotiating in both directions. Our roughly 2,000 titles of research on the Illuminati come in equally rudimentary sets. Again, we have realized that it would be more interesting to base this work on library catalogue norms from the outset.
But we are quite detailed with the more than 9,000 documents currently associated with the original Illuminati: documents that survived with the publications of Illuminati documents by the Bavarian state of 1786 and 1787 and with the archives of Adam Weishaupt, Johann Joachim Christoph Bode, of Gotha’s first lodge "The Compass", and of Ernest II of Saxony-Gotha-Altenburg.
Some 3,000 of these documents received detailed and complex metadata on our database; they will shed light on Gotha's Illuminati research of the last five years. Led by Martin Mulsow, Markus Meumann and I have been mapping and analysing Illuminati essays collected in the “Schwedenkiste”, the famous “Swedish Box”. Our work started with volume 13 and soon moved beyond it. Junior scholars joined us with their research. Christian Wirkner dealt extensively with the minutes of the meetings collected in volume 15 and with exemplary strands of the “Quibus Licet”/“Reproch” communication which the Order had designed to guide its members. Gotha’s research therefore comes with an emphasis on volumes 11 to 16 of the “Swedish Box”.
The Gotha Illuminati Research Base, a conventional MediaWiki which we had set up in order to organise the team’s research, became within a year of its existence the pool of ongoing work for the entire (small) community of Illuminati researchers. It attracted the first massive external data inputs: Hermann Schüttler offered us a list of members which he eventually connected with biographies to all our documents. He moved on with a technological relic: an Access database of the research he and Reinhard Markner had conducted for the two volumes of The Illuminati Correspondence (published in 2005 and 2013 respectively). These data are now the deeper layer under the special item Item:Q11305.
Schüttler provides the following overview of this special database:
1. What is in it? Letters a) collected by Adam Weishaupt, now located in Hamburg’s State Archive as a deposit of the “United Five” (lodges) of Hamburg, and b) those of the “Schwedenkiste”, collected by Johann Joachim Christoph Bode and augmented by Ernest II of Saxony-Gotha-Altenburg, now located in the Prussian Privy State Archives, Berlin (formerly located in the State Archive of the GDR, Merseburg). We have added materials from the Illuminati context from other sources wherever we could spot them.
2. Who has been working on this from when to when? The data were collected during the years of the IZEA project between early 2000 and mid-2005 by Hermann Schüttler and have been partially supplemented by Reinhard Markner.
3. Which archives have been visited? Hamburg’s State Archive, the Prussian Privy State Archives, Berlin, the Bavarian Main State Archives, Munich (Weishaupt letters in the “Kasten Schwarz” collection), the Palatine State Library, Speyer (collection Schwanckhardt).
4. Which publications are linked to this? The titles listed under Item:Q11305.
For the first time, the FactGrid can therefore present the tip of the iceberg of the Illuminati files online. Future projects will be able to add to this now that the grid is laid out, but we could also enter a far more dynamic process, offering the digitized documents together with the metadata we have collected here. This will only be conceivable in a project for which the Grand Lodges and the archives involved still have to be won over.
The creative part of the work can now begin
Compared to the previous MediaWiki, the FactGrid is at a structural disadvantage in one respect: the wiki offers transcripts and summaries of hundreds of documents and sessions of the Order (compare what we did with Document 70, Schwedenkiste volume 13, in our conventional wiki with the respective database record, Item:Q6641). It would be exciting to make the transcripts available in the database itself. We will advertise two work contracts: one to bring database information into the conventional wiki, the other to develop a concept of how we could use the FactGrid as our sole repository for data, extended digests, transcripts and media files.
The database is otherwise far superior to the previous MediaWiki: It creates the grid which future research can use while it is opening the door to the data mining that can now begin. One will now be able to map the entire work of the Order between the late 1770s and 1788 (as far as it is still documented). It will be possible to examine membership careers (as soon as we get on with the personal data in greater detail): Who got into the Order on whose proposal? What kind of networking did members unfold within the Order at what stage of its brief existence?
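The question of who entered the Order on whose proposal can be pictured as a simple analysis over pairs of proposer and new member. The following is only a sketch under stated assumptions: it presumes that such pairs have already been exported from the database, and the member names, the `proposals` list and both function names are hypothetical placeholders rather than FactGrid data or an existing tool.

```python
from collections import Counter

# Hypothetical (proposer, new_member) pairs; placeholders, not FactGrid data.
proposals = [
    ("member_a", "member_b"),
    ("member_a", "member_c"),
    ("member_b", "member_d"),
]


def recruits_per_proposer(pairs):
    """Count how many new members each proposer brought into the Order."""
    return Counter(proposer for proposer, _ in pairs)


def proposal_chain(pairs, member):
    """Follow proposals back from a member to the earliest known proposer."""
    proposed_by = {new_member: proposer for proposer, new_member in pairs}
    chain = [member]
    # Stop when no proposer is recorded or when the data loops back on itself.
    while chain[-1] in proposed_by and proposed_by[chain[-1]] not in chain:
        chain.append(proposed_by[chain[-1]])
    return chain


print(recruits_per_proposer(proposals).most_common(1))
print(proposal_chain(proposals, "member_d"))
```

The counts point to the most active recruiters, while the chains show through how many hands a membership was passed on. Both depend on how completely the proposals are documented.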
The project will now need groups to bring their own research into the database. It should at the same time begin to attract "data scientists" and "data visualizers". And, of course, we are immensely curious about what information others will make visible where we can only offer our gut feelings and sketchy ideas of personal networks that developed in this web. The FactGrid blog will be a good place to present data analytics, which SPARQL enthusiasts can now do far better than the researchers behind the data resource.
More
In alphabetical order: Lorenza Castella, Erik Liebscher, Reinhard Markner, Markus Meumann, Martin Mulsow, Hermann Schüttler, Olaf Simons and Christian Wirkner
Blog
See also Wikidata:FactGrid/blog-archive