If you are considering contributing to this project, we highly recommend that you read and follow our Team privacy guide before you continue reading.
A list of active Soldiers is available on the bulletin providers' page.
Schema Soldiers are developers from all around the world who, in a collaborative effort to help clean up the public sector, decide to donate a bit of their time to adding bulletin definitions (called schemas) of their choice to StateMapper. This task requires:
- advanced knowledge of Regular Expressions
- basic knowledge of the JSON format
- a GitHub user account
- a working installation of StateMapper
To enroll as a Soldier, it is enough to follow these instructions to the end and successfully push any bulletin schema to us :)
- First, follow these instructions to install StateMapper locally.
- Then read the entire Schemas documentation.
- Implement your own schemas (.json) and avatars (.png):
  - a country or continent schema: if missing, use the country template.
        mkdir schemas/XX
        cp documentation/schemas/templates/COUNTRY.json schemas/XX/XX.json
  - an institution schema: if missing, use the provider template.
        cp documentation/schemas/templates/PROVIDER.json schemas/XX/YOUR_PROVIDER_ID.json
  - your new bulletin schema: this is where the hard work happens. Use the bulletin template.
        cp documentation/schemas/templates/XX/BULLETIN.json schemas/XX/YOUR_BULLETIN_ID.json
    See the documentation further below to understand how to do so.
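The copy steps above, including the "if missing" rule, can be sketched in Python as follows. This is only an illustration and not part of StateMapper: the function name `scaffold` and its parameters are hypothetical, and the template paths are taken literally from the commands above (including the `XX` directory in the bulletin template path).

```python
import shutil
from pathlib import Path


def scaffold(root, country, provider_id, bulletin_id):
    """Copy each template into schemas/<country>/ unless the target exists.

    Hypothetical helper: `root` is assumed to be the StateMapper checkout.
    Returns the names of the files that were actually created.
    """
    templates = Path(root) / "documentation" / "schemas" / "templates"
    target_dir = Path(root) / "schemas" / country
    target_dir.mkdir(parents=True, exist_ok=True)  # same effect as: mkdir schemas/XX

    # (template, destination) pairs mirroring the three cp commands.
    plan = [
        (templates / "COUNTRY.json", target_dir / f"{country}.json"),
        (templates / "PROVIDER.json", target_dir / f"{provider_id}.json"),
        # Path copied as written in the instructions (assumption: "XX" is literal).
        (templates / "XX" / "BULLETIN.json", target_dir / f"{bulletin_id}.json"),
    ]

    created = []
    for src, dst in plan:
        if dst.exists():
            continue  # "if missing": never overwrite an existing schema
        shutil.copy(src, dst)
        created.append(dst.name)
    return created
```

Calling `scaffold(".", "XX", "YOUR_PROVIDER_ID", "YOUR_BULLETIN_ID")` from the project root would create only the files that do not exist yet, so running it twice is harmless.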
- Curate your bulletin schema by going through each tab:

| Tab | Instructions | Schema part |
| --- | --- | --- |
| fetch | First, specify how to retrieve the bulletin for a given date, id, format, or any combination of them. This is mostly done using parameterized URLs with patterns like {date:format(m/d/Y)}. | fetchProtocoles |
| parse | Then describe, for each format (currently available: pdf, xml and html), how to interpret the retrieved bulletin. You will often use XPath and Regular Expressions. Use follow: true to fetch relevant sub-documents. | parsingProtocoles |
| rewind | You should now be able to download many years of bulletins to your machine. Make sure the daemon is started, and enable the spider from the rewind tab. | |
| extract | Now you can focus on extracting statuses. Describe how to obtain them from a similar structure, with the attributes described in the Extraction section of the Schemas documentation. | extractProtocoles |
| rewind | Rewind again, enabling the extract option of your spider. | extractProtocoles |
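To make the fetch and parse steps more concrete, here is a rough Python sketch of the two ideas involved: expanding a URL pattern such as {date:format(m/d/Y)} for a given date, and combining an XPath-style lookup with a Regular Expression. It does not reproduce StateMapper's actual implementation or schema syntax. The function names, the example URL, the sample XML and the regular expression are all invented for illustration, and only the format letters d, m and Y are handled.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

# Assumed mapping from the format letters in {date:format(m/d/Y)}
# to Python strftime directives. Other letters are passed through unchanged.
FORMAT_LETTERS = {"d": "%d", "m": "%m", "Y": "%Y"}
PLACEHOLDER = re.compile(r"\{date:format\(([^)]*)\)\}")


def expand_url(pattern, day):
    """Replace each {date:format(...)} placeholder with the formatted date."""
    def render(match):
        fmt = "".join(FORMAT_LETTERS.get(ch, ch) for ch in match.group(1))
        return day.strftime(fmt)
    return PLACEHOLDER.sub(render, pattern)


# Invented sample document standing in for a retrieved XML bulletin.
SAMPLE_XML = """<bulletin>
  <item><title>Resolution 12/2018 appointing Jane Doe</title></item>
  <item><title>Notice without a number</title></item>
</bulletin>"""

# Invented pattern: captures a resolution number and year from a title.
TITLE_RE = re.compile(r"^Resolution (?P<number>\d+)/(?P<year>\d{4})\b")


def parse_titles(xml_text):
    """Select title nodes with an XPath-style path, then filter them by regex."""
    root = ET.fromstring(xml_text)
    results = []
    for node in root.findall("./item/title"):
        match = TITLE_RE.match(node.text or "")
        if match:
            results.append(match.groupdict())
    return results


url = expand_url("https://example.org/bulletin?day={date:format(m/d/Y)}", date(2018, 3, 5))
print(url)
print(parse_titles(SAMPLE_XML))
```

With the sample inputs, the expanded URL ends in `day=03/05/2018` and only the first title matches. In a real schema, the same selection and matching would be declared in the fetchProtocoles and parsingProtocoles parts rather than written as code.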
- Push your schema to the project's repository:
      git add schemas/XX/YOUR_SCHEMA.*
      git commit -m "a descriptive comment about your last changes"
      git push # and enter your credentials
Reading the Developer's guide can also be useful, since it contains important information about the software's processing layers.
Copyright © 2017-2018 StateMapper.net · Licensed under GNU AGPLv3