family — MediaWiki families#

Objects representing MediaWiki families.

class family.Family[source]#

Bases: object

Parent singleton class for all wiki families.

Families are immutable and an initializer is unsupported. Any class modification should go into the __post_init__() class method.

Changed in version 3.0: the family class is immutable. If an __init__ initializer method is defined, a NotImplementedWarning is given.

Changed in version 8.0: alphabetic, alphabetic_revised and fyinterwiki attributes were removed.

Changed in version 8.2: obsolete setter was removed.

Changed in version 8.3: If an initializer method is defined, a FutureWarning is given.

Changed in version 9.0: raises RuntimeError if an initializer method was found; __post_init__() classmethod should be used instead.

classmethod __post_init__()#

Post-init processing for Family class.

The allocator calls this class method after the Family class has been created, provided that no __init__() method is used and __post_init__() is defined in your Family subclass. It can be used, for example, to expand Family attribute lists.

Warning

The __post_init__() classmethod cannot be inherited from a superclass. The current family file class is considered only.

Caution

Never modify the current attributes directly; always use a copy. Otherwise the base class is modified which leads to unwanted side-effects.

Example:

@classmethod
def __post_init__(cls):
    """Add 'yue' code alias."""
    aliases = cls.code_aliases.copy()
    aliases['yue'] = 'zh-yue'
    cls.code_aliases = aliases

Added in version 8.3.
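
The caution above can be illustrated without Pywikibot. The following is a minimal sketch with two plain classes; BaseFamily, MyFamily and extend_aliases are hypothetical stand-ins introduced here, not Pywikibot names. It shows that copying a class attribute before changing it leaves the base class untouched.

```python
class BaseFamily:
    """Hypothetical stand-in for a base family class."""

    code_aliases = {'dk': 'da'}


class MyFamily(BaseFamily):
    """Hypothetical subclass that extends the aliases safely."""

    @classmethod
    def extend_aliases(cls):
        # Copy first, so the dict shared with BaseFamily stays untouched.
        aliases = cls.code_aliases.copy()
        aliases['yue'] = 'zh-yue'
        # Rebinding creates a new attribute on the subclass only.
        cls.code_aliases = aliases


MyFamily.extend_aliases()
print(MyFamily.code_aliases)
print(BaseFamily.code_aliases)
```

Writing `cls.code_aliases['yue'] = 'zh-yue'` without the copy would change the dict that BaseFamily and every other subclass share.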

apipath(code)[source]#

Return path to api.php.

Return type:

str

archived_page_templates: dict[str, tuple[str, ...]] = {}#

A dict of tuples for different sites with names of archive templates that indicate an edit of non-archive bots should be avoided.

base_url(code, uri, protocol=None)[source]#

Prefix uri with port and hostname.

Parameters:
  • code (str) – The site code

  • uri (str) – The absolute path after the hostname

  • protocol – The protocol which is used. If None it’ll determine the protocol from the code.

Returns:

The full URL ending with uri

Return type:

str
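
As a rough illustration of the composition described above, the sketch below joins a protocol, a hostname and an absolute path into a full URL. It is not the library's implementation: build_url and its explicit hostname argument are assumptions made here, whereas the documented method derives protocol and hostname from the site code.

```python
def build_url(protocol: str, hostname: str, uri: str) -> str:
    """Return a full URL ending with uri (hypothetical helper)."""
    if not uri.startswith('/'):
        raise ValueError('uri must be an absolute path')
    return f'{protocol}://{hostname}{uri}'


print(build_url('https', 'en.wikipedia.org', '/w/api.php'))
```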

categories_last: list[str] = []#

Should categories come after interwiki links when both are placed at the bottom of the page?

TODO: T86284 Needed on Wikia sites, as it uses the CategorySelect extension which puts categories last on all sites. TO BE DEPRECATED!

category_attop: list[str] = []#

attop is a list of languages that prefer to have the category links at the top of the page.

category_on_one_line: list[str] = []#

on_one_line is a list of languages that want the category links one-after-another on a single line.

category_redirect_templates: dict[str, Sequence[str]] = {'_default': []}#

A list of category redirect template names in different languages.

category_redirects(code, fallback='_default')[source]#

Return list of category redirect templates.

Parameters:

fallback (str)

category_text_separator = '\n\n'#

String used as separator between category links and the text

closed_wikis: list[str] = []#

Not open for edits; stewards can still edit.

code_aliases: dict[str, str] = {}#

Code mappings which are only an alias, and there is no ‘old’ wiki. For all except ‘nds_nl’, subdomains do exist as a redirect, but that should not be relied upon.

codes#

classproperty Get list of codes used by this family.

Return type:

set[str]

cross_allowed: list[str] = []#

A list with the name in the cross-language flag permissions.

cross_projects: set[str] = {}#

A set of projects that share cross-project sessions.

cross_projects_cookies = ['centralauth_Session', 'centralauth_Token', 'centralauth_User']#

A list with the names of the cross-project cookies; the default is for Wikimedia CentralAuth extensions.

crossnamespace: CrossnamespaceType = {}#

Allows crossnamespace interwiki linking.

Lists the possible cross-namespace combinations. Keys are the originating namespace. Values are dicts whose keys are the originating language code, or _default, and whose values are dicts whose keys are the languages that can be linked to from that language and namespace, or _default; the innermost values are lists of namespace numbers.

Examples:

Allowing linking to pt 102 namespace from any other lang 0 namespace is:

crossnamespace[0] = {
    '_default': { 'pt': [102]}
}

While allowing linking from pt 102 namespace to any other lang 0 namespace is

crossnamespace[102] = {
    'pt': { '_default': [0]}
}
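
The nested structure above can be read with two lookups that fall back to _default. The following is a minimal sketch under that assumption; allowed_namespaces is a hypothetical helper written for this illustration and is not part of the documented API.

```python
def allowed_namespaces(crossnamespace, from_ns, from_code, to_code):
    """Return namespace numbers linkable from (from_code, from_ns).

    Hypothetical helper: '_default' is used when a code has no entry
    of its own at either level.
    """
    by_source = crossnamespace.get(from_ns, {})
    by_target = by_source.get(from_code, by_source.get('_default', {}))
    return by_target.get(to_code, by_target.get('_default', []))


# The two examples from the text, combined into one mapping.
crossnamespace = {
    0: {'_default': {'pt': [102]}},
    102: {'pt': {'_default': [0]}},
}

print(allowed_namespaces(crossnamespace, 0, 'en', 'pt'))
print(allowed_namespaces(crossnamespace, 102, 'pt', 'de'))
print(allowed_namespaces(crossnamespace, 0, 'en', 'fr'))
```

An empty list means that no cross-namespace link is configured for that combination.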

dbName(code)[source]#

Return the name of the MySQL database.

Return type:

str

disambcatname: dict[str, str] = {}#

A dict with the name of the category containing disambiguation pages for the various languages. Only one category per language, and without the namespace, so add things like:

'en': 'Disambiguation'

disambig(code, fallback='_default')[source]#

Return list of disambiguation templates.

Raises:

KeyError – unknown title for disambig template

Parameters:

fallback (str | None)

Return type:

list[str]
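
The lookup with a fallback key and a KeyError for unknown codes might look like the sketch below. templates_for is a hypothetical helper, the template names are made-up sample data, and the exact conditions under which the real method raises are an assumption.

```python
def templates_for(templates, code, fallback='_default'):
    """Return the template list for code, or for fallback if given.

    Hypothetical helper mirroring the documented KeyError behaviour.
    """
    if code in templates:
        return list(templates[code])
    if fallback is not None and fallback in templates:
        return list(templates[fallback])
    raise KeyError(f'no disambiguation templates known for {code!r}')


# Illustrative sample data, not taken from any real family file.
disambiguation_templates = {
    '_default': ['Disambig'],
    'fr': ['Homonymie'],
}

print(templates_for(disambiguation_templates, 'fr'))
print(templates_for(disambiguation_templates, 'de'))
```

Passing fallback=None in this sketch disables the default entry, so an unknown code raises KeyError.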

disambiguationTemplates: dict[str, Sequence[str]] = {'_default': []}#

A list of disambiguation template names in different languages.

domains#

classproperty Get list of unique domain names included in this family.

These domains may also exist in another family.

Return type:

set[str]

edit_restricted_templates: dict[str, tuple[str, ...]] = {}#

A dict of tuples for different sites with names of templates that indicate an edit should be avoided.

encoding(code)[source]#

Return the encoding for a specific language wiki.

Return type:

str

encodings(code)[source]#

Return list of historical encodings for a specific language wiki.

eventstreams_host(code)[source]#

Hostname for EventStreams.

Added in version 3.0.

eventstreams_path(code)[source]#

Return path for EventStreams.

Added in version 3.0.

from_url(url)[source]#

Return whether this family matches the given url.

It first checks whether a domain of this family is in the domain of the URL. If so, it checks all codes and verifies that a path generated via APISite.articlepath and Family.path matches the path of the URL together with the hostname for that code.

Family.domains is used for the initial domain check; the method then iterates over Family.codes to determine which code applies.

Parameters:

url (str) – the URL which may contain a $1. If it’s missing it is assumed to be at the end.

Returns:

The language code of the url. None if that url is not from this family.

Raises:

RuntimeError – When there are multiple languages in this family which would work with the given URL.

Return type:

str | None
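
The matching logic described above can be sketched with the standard library alone. This is a simplified illustration, not the library's code: code_from_url and the sites mapping are hypothetical, and the sketch compares hostname and article path directly instead of consulting Family.domains, Family.path and APISite.articlepath.

```python
from urllib.parse import urlsplit


def code_from_url(url, sites):
    """Return the code whose host and article path match url, else None.

    sites maps a code to (hostname, article path with '$1'); both the
    helper and this mapping are hypothetical stand-ins.
    """
    if '$1' not in url:
        url += '$1'  # a missing '$1' is assumed to be at the end
    parts = urlsplit(url)
    matches = [
        code for code, (hostname, path) in sites.items()
        if parts.netloc == hostname and parts.path == path
    ]
    if len(matches) > 1:
        raise RuntimeError(f'{url} matches several codes: {matches}')
    return matches[0] if matches else None


sites = {
    'en': ('en.wikipedia.org', '/wiki/$1'),
    'de': ('de.wikipedia.org', '/wiki/$1'),
}
print(code_from_url('https://en.wikipedia.org/wiki/$1', sites))
print(code_from_url('https://de.wikipedia.org/wiki/', sites))
print(code_from_url('https://example.org/wiki/$1', sites))
```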

get_address(code, title)[source]#

Return the path to title using index.php with redirects disabled.

Return type:

str

get_archived_page_templates(code)[source]#

Return tuple of archived page templates.

Added in version 3.0.

get_edit_restricted_templates(code)[source]#

Return tuple of edit restricted templates.

Added in version 3.0.

hostname(code)[source]#

The hostname to use for standard http connections.

instance#

classproperty Get the singleton instance.

This is a placeholder to invoke allocator before it’s allocated. Allocator will override this classproperty.

interface(code)[source]#

Return interface to use for code.

Parameters:

code (str)

Return type:

str

interwiki_attop: list[str] = []#

attop is a list of languages that prefer to have the interwiki links at the top of the page.

interwiki_forward: str | None = None#

Some families, e.g. commons and meta, are not multilingual and forward interlanguage links to another family (wikipedia). These families can set this variable to the name of the target family.

interwiki_on_one_line: list[str] = []#

on_one_line is a list of languages that want the interwiki links one-after-another on a single line.

interwiki_putfirst: dict[str, str] = {}#

Which languages have a special order for putting interlanguage links, and what order is it? If a language is not in interwiki_putfirst, alphabetical order on language code is used. For languages that are in interwiki_putfirst, interwiki_putfirst is checked first, and languages are put in the order given there. All other languages are put after those, in code-alphabetical order.
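
The ordering rule above (preferred codes first in their given order, all remaining codes alphabetically by code) can be sketched as follows. order_interwiki is a hypothetical helper, and passing the preferred order for a single language as a plain sequence is an assumption of this sketch.

```python
def order_interwiki(codes, putfirst=()):
    """Sort codes: putfirst entries in their given order, then the rest.

    Hypothetical helper; putfirst is the preferred order for one language.
    """
    present = set(codes)
    # Keep only preferred codes that actually occur, in preferred order.
    first = [code for code in putfirst if code in present]
    # Everything else follows in code-alphabetical order.
    rest = sorted(present.difference(first))
    return first + rest


print(order_interwiki(['fr', 'de', 'en', 'ar'], putfirst=['en', 'de']))
print(order_interwiki(['fr', 'de', 'en', 'ar']))
```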

interwiki_removals#

classproperty Return a list of interwiki codes to be removed from wiki pages.

Codes that should be removed, usually because the site has been taken down.

Changed in version 8.2: changed from list to invariant frozenset.

Return type:

frozenset[str]

interwiki_replacements#

classproperty Return an interwiki code replacement mapping.

Language codes that no longer exist, mapped to the language codes that replace them. If, for example, the language with code xx: should now get code yy:, add {'xx': 'yy'} to code_aliases.

Changed in version 8.2: changed from dict to invariant mapping.

Return type:

Mapping[str, str]

interwiki_text_separator = '\n\n'#

String used as separator between interwiki links and the text.

isPublic(code)[source]#

Check whether the wiki requires logging in before viewing it.

Return type:

bool

langs: dict[str, str] = {}#

language_groups = {'arab': ['ar', 'ary', 'arz', 'azb', 'ckb', 'fa', 'glk', 'ks', 'lrc', 'mzn', 'ps', 'sd', 'ur', 'ha', 'kk', 'ku', 'pnb', 'ug'], 'chinese': ['wuu', 'zh', 'zh-classical', 'zh-yue', 'gan', 'ii', 'ja', 'za'], 'cyril': ['ab', 'av', 'ba', 'be', 'be-tarask', 'bg', 'bxr', 'ce', 'cu', 'cv', 'kbd', 'koi', 'kv', 'ky', 'mk', 'lbe', 'mdf', 'mn', 'mo', 'myv', 'mhr', 'mrj', 'os', 'ru', 'rue', 'sah', 'tg', 'tk', 'udm', 'uk', 'xal', 'ha', 'kk', 'sh', 'sr', 'tt'], 'grec': ['el', 'grc', 'pnt'], 'latin': ['aa', 'ace', 'af', 'ak', 'als', 'an', 'ang', 'ast', 'ay', 'bar', 'bat-smg', 'bcl', 'bi', 'bm', 'br', 'bs', 'ca', 'cbk-zam', 'cdo', 'ceb', 'ch', 'cho', 'chy', 'co', 'crh', 'cs', 'csb', 'cy', 'da', 'de', 'diq', 'dsb', 'ee', 'eml', 'en', 'eo', 'es', 'et', 'eu', 'ext', 'ff', 'fi', 'fiu-vro', 'fj', 'fo', 'fr', 'frp', 'frr', 'fur', 'fy', 'ga', 'gag', 'gd', 'gl', 'gn', 'gv', 'hak', 'haw', 'hif', 'ho', 'hr', 'hsb', 'ht', 'hu', 'hz', 'ia', 'id', 'ie', 'ig', 'ik', 'ilo', 'io', 'is', 'it', 'jbo', 'jv', 'kaa', 'kab', 'kg', 'ki', 'kj', 'kl', 'kr', 'ksh', 'kw', 'la', 'lad', 'lb', 'lg', 'li', 'lij', 'lmo', 'ln', 'lt', 'ltg', 'lv', 'map-bms', 'mg', 'mh', 'mi', 'ms', 'mt', 'mus', 'mwl', 'na', 'nah', 'nap', 'nds', 'nds-nl', 'ng', 'nl', 'nn', 'no', 'nov', 'nrm', 'nv', 'ny', 'oc', 'om', 'pag', 'pam', 'pap', 'pcd', 'pdc', 'pfl', 'pih', 'pl', 'pms', 'pt', 'qu', 'rm', 'rn', 'ro', 'roa-rup', 'roa-tara', 'rw', 'sc', 'scn', 'sco', 'se', 'sg', 'simple', 'sk', 'sl', 'sm', 'sn', 'so', 'sq', 'srn', 'ss', 'st', 'stq', 'su', 'sv', 'sw', 'szl', 'tet', 'tl', 'tn', 'to', 'tpi', 'tr', 'ts', 'tum', 'tw', 'ty', 'uz', 've', 'vec', 'vi', 'vls', 'vo', 'wa', 'war', 'wo', 'xh', 'yo', 'zea', 'zh-min-nan', 'zu', 'az', 'chr', 'ckb', 'ha', 'iu', 'kk', 'ku', 'rmy', 'sh', 'sr', 'tt', 'ug', 'za'], 'scand': ['da', 'fo', 'is', 'nb', 'nn', 'no', 'sv']}#

Some languages belong to a group where the possibility is high that equivalent articles have identical titles among the group.
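
One way to use such a mapping is to collect every code that shares a group with a given code. The sketch below works on a small excerpt of the mapping; group_peers is a hypothetical helper and not part of the documented API.

```python
def group_peers(language_groups, code):
    """Return the codes sharing at least one group with code, sorted.

    Hypothetical helper; a code may belong to several groups.
    """
    peers = set()
    for members in language_groups.values():
        if code in members:
            peers.update(members)
    peers.discard(code)
    return sorted(peers)


# Small excerpt of the mapping shown above.
language_groups = {
    'grec': ['el', 'grc', 'pnt'],
    'scand': ['da', 'fo', 'is', 'nb', 'nn', 'no', 'sv'],
}
print(group_peers(language_groups, 'grc'))
print(group_peers(language_groups, 'en'))
```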

ldapDomain = ()#

LDAP domain if your wiki uses LDAP authentication.

linktrail(code)[source]#

Return regex for trailing chars displayed as part of a link.

Note

Returns a string, not a compiled regular expression object.

Deprecated since version 7.3.

Parameters:

code (str)

Return type:

str

static load(fam=None)[source]#

Import the named family.

Parameters:

fam (str | None) – family name (if omitted, uses the configured default)

Returns:

a Family instance configured for the named family.

Raises:

pywikibot.exceptions.UnknownFamilyError – family not known

maximum_GET_length(code)[source]#

Return the maximum URL length for GET instead of POST.

Deprecated since version 8.0: Use config.maximum_GET_length instead.

name: str | None = None#

The family name

property obsolete: mappingproxy[str, str | None]#

Old codes that are not part of the family.

Interwiki replacements override removals for the same code.

Returns:

mapping of old codes to new codes (or None)
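
The precedence rule stated above can be sketched with a dict merge wrapped in a read-only view. build_obsolete is a hypothetical helper and the codes are placeholders; the sketch only illustrates the rule and is not the library's implementation.

```python
from types import MappingProxyType


def build_obsolete(removals, replacements):
    """Merge removals and replacements into a read-only mapping.

    Hypothetical helper: removed codes map to None, and a replacement
    wins when a code appears in both inputs.
    """
    merged = dict.fromkeys(removals)  # every removed code -> None
    merged.update(replacements)       # replacements override removals
    return MappingProxyType(merged)


obsolete = build_obsolete({'xx', 'zz'}, {'xx': 'yy'})
print(obsolete['xx'])
print(obsolete['zz'])
```

Because the result is a mappingproxy, item assignment on it raises TypeError.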

path(code)[source]#

Return path to index.php.

Return type:

str

post_get_convert(site, getText)[source]#

Do a conversion on the retrieved text from the Wiki.

For example, an X-conversion in Esperanto.

pre_put_convert(site, putText)[source]#

Do a conversion on the text to insert on the Wiki.

For example, an X-conversion in Esperanto.

protocol(code)[source]#

The protocol to use to connect to the site.

May be overridden to return ‘http’. Other protocols are not supported.

Changed in version 8.2: https is returned instead of http.

Parameters:

code (str) – language code

Returns:

protocol that this family uses

Return type:

str

querypath(code)[source]#

Return path to query.php.

Return type:

str

removed_wikis: list[str] = []#

Completely removed sites.

scriptpath(code)[source]#

The prefix used to locate scripts on this wiki.

This is the value displayed when you enter {{SCRIPTPATH}} on a wiki page (often displayed at [[Help:Variables]] if the wiki has copied the master help page correctly).

The default value is the one used on Wikimedia Foundation wikis, but needs to be overridden in the family file for any wiki that uses a different value.

Parameters:

code (str) – Site code

Raises:

KeyError – code is not recognised

Returns:

URL path without ending ‘/’

Return type:

str

shared_image_repository(code)[source]#

Return the shared image repository, if any.

shared_urlshortner_wiki: tuple[str, str] | None = None#

Some wiki farms have the UrlShortener extension enabled only on the main site. This value specifies that site as a (lang, family) tuple.

ssl_hostname(code)[source]#

The hostname to use for SSL connections.

ssl_pathprefix(code)[source]#

The path prefix for secure HTTP access.

Return type:

str

title_delimiter_and_aliases = ' _'#

Titles usually are delimited by a space and the alias is replaced to this delimiter; e.g. “Main page” is the title with spaces as delimiters but “Main_page” also works. Other families may have different settings.

Note

The first character is used as delimiter, the others are aliases.

Warning

This attribute is used within the re.sub() function. Use an escape sequence if necessary.

Added in version 7.0.
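
The delimiter and alias convention can be sketched with re.sub(). normalize_title is a hypothetical helper, not the library's code. Unlike the attribute described above, the sketch takes raw characters and escapes them itself, so an already escaped value must not be passed to it.

```python
import re


def normalize_title(title, delimiter_and_aliases=' _'):
    """Replace runs of alias characters with the delimiter.

    Hypothetical helper; the first character is the delimiter and the
    remaining characters are aliases, as described above.
    """
    delimiter = delimiter_and_aliases[0]
    aliases = delimiter_and_aliases[1:]
    if not aliases:
        return title
    # re.escape guards against aliases that are special in a regex.
    pattern = '[' + re.escape(aliases) + ']+'
    # A callable replacement avoids backslash processing of the delimiter.
    return re.sub(pattern, lambda _match: delimiter, title)


print(normalize_title('Main_page'))
```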

verify_SSL_certificate(code)[source]#

Return whether a HTTPS certificate should be verified.

Added in version 5.3: renamed from ignore_certificate_error

Parameters:

code (str) – language code

Returns:

flag to verify the SSL certificate; set it to False to allow access if certificate has an error.

Return type:

bool

family.AutoFamily(name, url)[source]#

Family that automatically loads the site configuration.

Parameters:
  • name (str) – Name for the family

  • url (str) – API endpoint URL of the wiki

Returns:

Generated family class

Return type:

SingleSiteFamily

class family.DefaultWikibaseFamily[source]#

Bases: WikibaseFamily

A base class for a Wikimedia Wikibase Family.

This class holds defaults for calendarmodel(), default_globe() and globes() to prevent code duplication.

Warning

You may have to adjust the repository site in WikibaseFamily.entity_sources() to get the valid entity.

Added in version 8.2.

calendarmodel(code)[source]#

Default calendar model for WbTime datatype.

Return type:

str

default_globe(code)[source]#

Default globe for Coordinate datatype.

Return type:

str

globes(code)[source]#

Supported globes for Coordinate datatype.

class family.FandomFamily[source]#

Bases: Family

Common features of Fandom families.

Added in version 3.0: renamed from WikiaFamily

langs: dict[str, str]#

classproperty Property listing family languages.

scriptpath(code)[source]#

Return the script path for this family.

class family.SingleSiteFamily[source]#

Bases: Family

Single site family.

domains#

classproperty Return the full domain name of the site.

hostname(code)[source]#

Return the domain as the hostname.

class family.SubdomainFamily[source]#

Bases: Family

Multi site wikis that are subdomains of the same top level domain.

domains#

classproperty Return the domain name of the sites in this family.

langs: dict[str, str]#

classproperty Property listing family languages.

Return type:

dict[str, str]

class family.WikibaseFamily[source]#

Bases: Family

A base class for a Wikibase Family.

Added in version 8.2.

entity_sources(code)[source]#

Provide repository site information for entity types.

The result must be structured as follows:

{<entity type>: (<family code>, <family name>)}

for example:

{'property': ('test', 'wikidata')}

If an empty dict is returned, all entity types are found in the current DataSite.

The result is used by DataSite.get_repo_for_entity_type

Parameters:

code (str)

Return type:

dict[str, tuple[str, str]]
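
A consumer of this structure could resolve the repository for an entity type as in the sketch below. repo_for_entity_type is a hypothetical helper and not DataSite.get_repo_for_entity_type. Falling back to the current site for any type missing from the dict is an assumption that generalises the documented empty-dict case.

```python
def repo_for_entity_type(entity_sources, entity_type, current_site):
    """Return the (code, family) pair hosting entity_type.

    Hypothetical helper: entity types missing from entity_sources are
    assumed to live on current_site, matching the empty-dict case.
    """
    return entity_sources.get(entity_type, current_site)


current = ('wikidata', 'wikidata')
sources = {'property': ('test', 'wikidata')}
print(repo_for_entity_type(sources, 'property', current))
print(repo_for_entity_type({}, 'item', current))
```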

interface(code)[source]#

Return ‘DataSite’ for Wikibase family.

Return type:

str

class family.WikimediaFamily[source]#

Bases: Family

Class for all wikimedia families.

Changed in version 8.0: known_codes attribute was added.

code_aliases: dict[str, str] = {'be-x-old': 'be-tarask', 'dk': 'da', 'jp': 'ja', 'minnan': 'zh-min-nan', 'mo': 'ro', 'nan': 'zh-min-nan', 'nb': 'no', 'nds_nl': 'nds-nl', 'zh-cn': 'zh', 'zh-tw': 'zh'}#

Code mappings which are only an alias, and there is no ‘old’ wiki. For all except ‘nds_nl’, subdomains do exist as a redirect, but that should not be relied upon.

content_families = {'commons', 'incubator', 'lingualibre', 'mediawiki', 'species', 'wikibooks', 'wikidata', 'wikifunctions', 'wikinews', 'wikipedia', 'wikiquote', 'wikisource', 'wikiversity', 'wikivoyage', 'wiktionary'}#

cross_projects: set[str] = {'commons', 'incubator', 'lingualibre', 'mediawiki', 'meta', 'outreach', 'species', 'strategy', 'wikibooks', 'wikidata', 'wikifunctions', 'wikimania', 'wikimediachapter', 'wikinews', 'wikipedia', 'wikiquote', 'wikisource', 'wikiversity', 'wikivoyage', 'wiktionary'}#

A set of projects that share cross-project sessions.

disambcatname: dict[str, str] = {'wikidata': 'Q1982926'}#

A dict with the name of the category containing disambiguation pages for the various languages. Only one category per language, and without the namespace, so add things like:

‘en’: “Disambiguation”

domain#

classproperty Domain property.

eventstreams_host(code)[source]#

Return ‘https://stream.wikimedia.org’ as the stream hostname.

Return type:

str

eventstreams_path(code)[source]#

Return path for EventStreams.

Return type:

str

known_codes = ['aa', 'ab', 'ace', 'ady', 'af', 'ak', 'als', 'alt', 'am', 'ami', 'an', 'ang', 'ar', 'arc', 'ary', 'arz', 'as', 'ast', 'atj', 'av', 'avk', 'awa', 'ay', 'az', 'azb', 'ba', 'ban', 'bar', 'bat-smg', 'bcl', 'be', 'be-tarask', 'bg', 'bh', 'bi', 'bjn', 'blk', 'bm', 'bn', 'bo', 'bpy', 'br', 'bs', 'bug', 'bxr', 'ca', 'cbk-zam', 'cdo', 'ce', 'ceb', 'ch', 'cho', 'chr', 'chy', 'ckb', 'co', 'cr', 'crh', 'cs', 'csb', 'cu', 'cv', 'cy', 'da', 'dag', 'de', 'din', 'diq', 'dk', 'dsb', 'dty', 'dv', 'dz', 'ee', 'el', 'eml', 'en', 'eo', 'es', 'et', 'eu', 'ext', 'fa', 'ff', 'fi', 'fiu-vro', 'fj', 'fo', 'fr', 'frp', 'frr', 'fur', 'fy', 'ga', 'gag', 'gan', 'gcr', 'gd', 'gl', 'glk', 'gn', 'gom', 'gor', 'got', 'gu', 'guw', 'gv', 'ha', 'hak', 'haw', 'he', 'hi', 'hif', 'ho', 'hr', 'hsb', 'ht', 'hu', 'hy', 'hyw', 'hz', 'ia', 'id', 'ie', 'ig', 'ii', 'ik', 'ilo', 'inh', 'io', 'is', 'it', 'iu', 'ja', 'jam', 'jbo', 'jv', 'ka', 'kaa', 'kab', 'kbd', 'kbp', 'kcg', 'kg', 'ki', 'kj', 'kk', 'kl', 'km', 'kn', 'ko', 'koi', 'kr', 'krc', 'ks', 'ksh', 'ku', 'kv', 'kw', 'ky', 'la', 'lad', 'lb', 'lbe', 'lez', 'lfn', 'lg', 'li', 'lij', 'lld', 'lmo', 'ln', 'lo', 'lrc', 'lt', 'ltg', 'lv', 'mad', 'mai', 'map-bms', 'mdf', 'mg', 'mh', 'mhr', 'mi', 'min', 'mk', 'ml', 'mn', 'mni', 'mnw', 'mo', 'mr', 'mrj', 'ms', 'mt', 'mus', 'mwl', 'my', 'myv', 'mzn', 'na', 'nah', 'nan', 'nap', 'nb', 'nds', 'nds-nl', 'ne', 'new', 'ng', 'nia', 'nl', 'nn', 'no', 'nov', 'nqo', 'nrm', 'nso', 'nv', 'ny', 'oc', 'olo', 'om', 'or', 'os', 'pa', 'pag', 'pam', 'pap', 'pcd', 'pcm', 'pdc', 'pfl', 'pi', 'pih', 'pl', 'pms', 'pnb', 'pnt', 'ps', 'pt', 'pwn', 'qu', 'rm', 'rmy', 'rn', 'ro', 'roa-rup', 'roa-tara', 'ru', 'rue', 'rw', 'sa', 'sah', 'sat', 'sc', 'scn', 'sco', 'sd', 'se', 'sg', 'sh', 'shi', 'shn', 'si', 'simple', 'sk', 'skr', 'sl', 'sm', 'smn', 'sn', 'so', 'sq', 'sr', 'srn', 'ss', 'st', 'stq', 'su', 'sv', 'sw', 'szl', 'szy', 'ta', 'tay', 'tcy', 'te', 'tet', 'tg', 'th', 'ti', 'tk', 'tl', 'tn', 'to', 'tpi', 'tr', 'trv', 'ts', 
'tt', 'tum', 'tw', 'ty', 'tyv', 'udm', 'ug', 'uk', 'ur', 'uz', 've', 'vec', 'vep', 'vi', 'vls', 'vo', 'wa', 'war', 'wo', 'wuu', 'xal', 'xh', 'xmf', 'yi', 'yo', 'za', 'zea', 'zh', 'zh-classical', 'zh-cn', 'zh-min-nan', 'zh-tw', 'zh-yue', 'zu']#

property languages_by_size: list[str]#

Language codes of the largest wikis.

They should be roughly sorted by size.

Changed in version 9.0: Sorting order is retrieved via wikistats for each call.

Raises:

NotImplementedError – Family is not a member of multi_language_content_families

multi_language_content_families = ['wikibooks', 'wikinews', 'wikipedia', 'wikiquote', 'wikisource', 'wikiversity', 'wikivoyage', 'wiktionary']#

other_content_families = ['lingualibre', 'mediawiki', 'wikidata', 'wikifunctions']#

shared_image_repository(code)[source]#

Return Wikimedia Commons as the shared image repository.

shared_urlshortner_wiki: tuple[str, str] | None = ('meta', 'meta')#

Some wiki farms have the UrlShortener extension enabled only on the main site. This value specifies that site as a (lang, family) tuple.

wikimedia_org_content_families = ['commons', 'incubator', 'species']#

wikimedia_org_families = {'commons', 'incubator', 'meta', 'outreach', 'species', 'strategy', 'wikimania', 'wikimediachapter', 'wikitech'}#

wikimedia_org_meta_families = ['meta', 'outreach', 'strategy', 'wikimediachapter', 'wikimania']#

wikimedia_org_other_families = ['wikitech']#

class family.WikimediaOrgFamily[source]#

Bases: SingleSiteFamily, WikimediaFamily

Single site family for sites hosted at *.wikimedia.org.

domain#

classproperty Return the parent's domain with a subdomain prefix.

Return type:

str