- dbstore1007:3312 (T299481)
- db2148:3306
- db2138:3312
- db2126:3306
- db2125:3306
- db2107:3306
- db2104:3306 (codfw master)
- db2101:3312 (backup T299876)
- db2095:3312
- db2088:3312
- db1182:3306
- db1170:3312
- db1162:3306
- db1156:3306
- db1155:3312
- db1146:3312
- db1139:3312 (backup T299876)
- db1129:3306
- db1122:3306 (master)
- db1105:3312
- db1102:3312 (backup T299876)
- clouddb1021:3312 (T299480)
- clouddb1018:3312 (T299480)
- clouddb1014:3312 (T299480)
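The checklist above follows a regular `host:port (note)` shape, so it can be read mechanically. The following is a minimal sketch, assuming only that shape; the function names `parse_checklist` and `group_by_port` are hypothetical helpers introduced for illustration and are not part of any WMF tooling.

```python
import re
from collections import defaultdict

# Entry shape copied from the checklist: "- host:port (optional note)".
ENTRY = re.compile(
    r"^-\s*(?P<host>[a-z]+\d+):(?P<port>\d+)(?:\s*\((?P<note>[^)]*)\))?\s*$"
)


def parse_checklist(text):
    """Return one dict (host, port, note) per checklist line that matches."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append(
                {
                    "host": match["host"],
                    "port": int(match["port"]),
                    "note": match["note"] or "",
                }
            )
    return entries


def group_by_port(entries):
    """Group host names by the port listed next to them."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["port"]].append(entry["host"])
    return dict(grouped)


# A few lines taken verbatim from the checklist.
SAMPLE = """\
- db2104:3306 (codfw master)
- db2101:3312 (backup T299876)
- db1170:3312
- db1122:3306 (master)
"""

entries = parse_checklist(SAMPLE)
by_port = group_by_port(entries)
```

Grouping by port separates the hosts listed on 3306 from those listed on 3312, which can help when planning which instances need separate depool steps.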
Description
Details
Status | Assigned | Task
---|---|---
Open | None | T31744 FlaggedRev installation (deployment) requests (tracking)
Stalled | None | T143886 Activating Flagged revisions on ar.wikinews
Stalled | None | T204354 Flagged Revisions for Vietnamese Wikipedia
Stalled | None | T205145 Deploy FlaggedRevs on bn.wikibooks
Stalled | None | T221933 Enable Flagged Revisions (for trial run purpose) at the Chinese Wikipedia
Open | None | T185664 Code stewardship review: FlaggedRevs
Resolved | Ladsgroup | T277883 Drop all low-use and unused features of FlaggedRevs to make it more maintainable
Resolved | Ladsgroup | T300774 Drop fr_img_* columns
Open | None | T291916 Tracking task for Bullseye migrations in production
Resolved | Marostegui | T298585 Upgrade WMF database-and-backup-related hosts to bullseye
Resolved | Ladsgroup | T300510 Upgrade s2 to Bullseye
Resolved | Marostegui | T306417 Switchover s2 master (db1122 -> db1162)
Event Timeline
Change 761675 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):
[operations/puppet@production] db2088: Disable notifications
Change 761675 merged by Ladsgroup:
[operations/puppet@production] db2088: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-02-10T17:39:33Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20558 and previous config saved to /var/cache/conftool/dbconfig/20220210-173932-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-10T17:39:58Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2088:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20559 and previous config saved to /var/cache/conftool/dbconfig/20220210-173957-ladsgroup.json
Cookbook cookbooks.sre.hosts.reimage was started by ladsgroup@cumin1001 for host db2088.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by ladsgroup@cumin1001 for host db2088.codfw.wmnet with OS bullseye completed:
- db2088 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202101740_ladsgroup_32617_db2088.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-operations) [2022-02-10T18:25:48Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20567 and previous config saved to /var/cache/conftool/dbconfig/20220210-182547-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-10T18:31:08Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2088:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20570 and previous config saved to /var/cache/conftool/dbconfig/20220210-183107-ladsgroup.json
Cookbook cookbooks.sre.hosts.reimage was started by ladsgroup@cumin1001 for host db2104.codfw.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by ladsgroup@cumin1001 for host db2104.codfw.wmnet with OS bullseye completed:
- db2104 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202151145_ladsgroup_12094_db2104.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Change 762804 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):
[operations/puppet@production] db1170: Disable notifications
Change 762804 merged by Ladsgroup:
[operations/puppet@production] db1170: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-02-15T12:40:36Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20785 and previous config saved to /var/cache/conftool/dbconfig/20220215-124035-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-15T12:42:08Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20787 and previous config saved to /var/cache/conftool/dbconfig/20220215-124207-ladsgroup.json
Cookbook cookbooks.sre.hosts.reimage was started by ladsgroup@cumin1001 for host db1170.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by ladsgroup@cumin1001 for host db1170.eqiad.wmnet with OS bullseye completed:
- db1170 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202151243_ladsgroup_5262_db1170.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-operations) [2022-02-15T13:28:58Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20790 and previous config saved to /var/cache/conftool/dbconfig/20220215-132857-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-15T14:14:12Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20797 and previous config saved to /var/cache/conftool/dbconfig/20220215-141411-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-15T14:25:12Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20800 and previous config saved to /var/cache/conftool/dbconfig/20220215-142511-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-15T15:10:26Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20808 and previous config saved to /var/cache/conftool/dbconfig/20220215-151026-ladsgroup.json
Change 762982 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):
[operations/puppet@production] db1156: Disable notifications
Change 762982 merged by Ladsgroup:
[operations/puppet@production] db1156: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-02-16T05:47:51Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20852 and previous config saved to /var/cache/conftool/dbconfig/20220216-054749-ladsgroup.json
Cookbook cookbooks.sre.hosts.reimage was started by ladsgroup@cumin1001 for host db1156.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by ladsgroup@cumin1001 for host db1156.eqiad.wmnet with OS bullseye completed:
- db1156 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202160551_ladsgroup_29873_db1156.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-operations) [2022-02-16T06:26:11Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20853 and previous config saved to /var/cache/conftool/dbconfig/20220216-062610-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-16T07:11:25Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20856 and previous config saved to /var/cache/conftool/dbconfig/20220216-071125-ladsgroup.json
Change 763177 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):
[operations/puppet@production] db1146: Disable notifications
Change 763177 merged by Ladsgroup:
[operations/puppet@production] db1146: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-02-16T08:05:33Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20859 and previous config saved to /var/cache/conftool/dbconfig/20220216-080531-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-16T08:07:17Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20860 and previous config saved to /var/cache/conftool/dbconfig/20220216-080717-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-16T09:07:37Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20865 and previous config saved to /var/cache/conftool/dbconfig/20220216-090737-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-16T09:09:25Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'T300510', diff saved to https://phabricator.wikimedia.org/P20866 and previous config saved to /var/cache/conftool/dbconfig/20220216-090924-ladsgroup.json
Cookbook cookbooks.sre.hosts.reimage was started by ladsgroup@cumin1001 for host db1146.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by ladsgroup@cumin1001 for host db1146.eqiad.wmnet with OS bullseye completed:
- db1146 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202160923_ladsgroup_20285_db1146.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-operations) [2022-02-16T10:23:03Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20872 and previous config saved to /var/cache/conftool/dbconfig/20220216-102302-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-16T11:08:17Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20879 and previous config saved to /var/cache/conftool/dbconfig/20220216-110816-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-16T11:21:45Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20881 and previous config saved to /var/cache/conftool/dbconfig/20220216-112145-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-16T12:07:00Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20891 and previous config saved to /var/cache/conftool/dbconfig/20220216-120659-ladsgroup.json
Change 763568 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):
[operations/puppet@production] db1105: Disable notifications
Mentioned in SAL (#wikimedia-operations) [2022-02-17T17:25:06Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20992 and previous config saved to /var/cache/conftool/dbconfig/20220217-172504-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-17T17:26:50Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20993 and previous config saved to /var/cache/conftool/dbconfig/20220217-172650-ladsgroup.json
Cookbook cookbooks.sre.hosts.reimage was started by ladsgroup@cumin1001 for host db1105.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by ladsgroup@cumin1001 for host db1105.eqiad.wmnet with OS bullseye completed:
- db1105 (WARN)
- Downtimed on Icinga
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202202171729_ladsgroup_29670_db1105.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-operations) [2022-02-17T18:09:00Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20999 and previous config saved to /var/cache/conftool/dbconfig/20220217-180900-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-17T18:54:15Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P21004 and previous config saved to /var/cache/conftool/dbconfig/20220217-185414-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-17T19:07:48Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P21006 and previous config saved to /var/cache/conftool/dbconfig/20220217-190748-ladsgroup.json
Mentioned in SAL (#wikimedia-operations) [2022-02-17T19:53:02Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P21009 and previous config saved to /var/cache/conftool/dbconfig/20220217-195302-ladsgroup.json
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1001 for host db1122.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1001 for host db1122.eqiad.wmnet with OS bullseye completed:
- db1122 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204260808_marostegui_2917823_db1122.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
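The SAL entries in the timeline above share one format, so the time each instance spent depooled can be estimated from them. The following is a minimal sketch, assuming only the line format visible in this log; `parse_sal` and `depool_windows` are hypothetical helper names, and the two sample lines are copied from the db2088:3311 entries. Where the log records several repool commits for one instance, the sketch keeps the last one.

```python
import re
from datetime import datetime, timezone

# Matches the "[timestamp] <user@host> dbctl commit (dc=all): 'message'" part.
SAL = re.compile(
    r"\[(?P<ts>[^\]]+)\] <(?P<user>[^@>]+)@[^>]+> "
    r"dbctl commit \(dc=all\): '(?P<msg>[^']*)'"
)
# Matches the two commit message shapes used for pooling changes in this log.
ACTION = re.compile(
    r"^(?P<action>Depooling|Repooling after maintenance) "
    r"(?P<instance>\S+) \((?P<task>T\d+)\)$"
)


def parse_sal(line):
    """Return a dict describing a depool/repool entry, or None if no match."""
    sal = SAL.search(line)
    if not sal:
        return None
    action = ACTION.match(sal["msg"])
    if not action:
        return None  # e.g. a commit whose message is only the task ID
    when = datetime.strptime(sal["ts"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "when": when.replace(tzinfo=timezone.utc),
        "user": sal["user"],
        "action": "depool" if action["action"] == "Depooling" else "repool",
        "instance": action["instance"],
        "task": action["task"],
    }


def depool_windows(lines):
    """Map each instance to (first depool time, last repool time)."""
    windows = {}
    for line in lines:
        event = parse_sal(line)
        if event is None:
            continue
        start, end = windows.get(event["instance"], (None, None))
        if event["action"] == "depool" and start is None:
            start = event["when"]
        elif event["action"] == "repool":
            end = event["when"]
        windows[event["instance"]] = (start, end)
    return windows


LINES = [
    "Mentioned in SAL (#wikimedia-operations) [2022-02-10T17:39:33Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20558 and previous config saved to /var/cache/conftool/dbconfig/20220210-173932-ladsgroup.json",
    "Mentioned in SAL (#wikimedia-operations) [2022-02-10T18:25:48Z] <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20567 and previous config saved to /var/cache/conftool/dbconfig/20220210-182547-ladsgroup.json",
]

windows = depool_windows(LINES)
```

Each value in `windows` is a pair of timezone-aware datetimes, so subtracting them gives the elapsed time between the first depool and the last repool for that instance.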