02 Aggr Wafl v2


Module 2.

WAFL Inconsistencies


DOT 7.0 Update Aggregate Performance

Student Guide

Objectives

At the conclusion of this module, you will be able to:

- Describe how to fix WAFL inconsistencies with aggregates and flexible volumes
- Execute WAFL_check and wafliron on flexible volumes and aggregates


Aggregates & Flexible Volumes

With Aggregates & Flexible Volumes

- Data is actually stored in the aggregate
- Therefore, file system inconsistencies actually live in the aggregate
- Aggregates can be inconsistent
- WAFL_check and wafliron are aggregate operations


Tools for Aggregates

Filer offline

- WAFL_check
  Prompts to check all aggregates (also checks all flexible volumes contained inside each aggregate)
- WAFL_check <aggrname>
  Checks the flexible volumes contained in the named aggregate

Filer offline momentarily

- wafliron (from the 1-5 menu)
  Checks every volume. This includes aggregates with flexvols, traditional volumes, and even the root volume
- aggr wafliron (from the command prompt)
  Non-root aggregates

Utilized on traditional volumes as before

- vol wafliron (from the command prompt)
  Can use aggr wafliron as well


Running WAFL_check

Running WAFL_check will check the flexible volumes in the aggregate

    Selection (1-5)? WAFL_check aggr1
    Checking aggr1...
    WAFL_check NetApp Release RanchorsteamN_040517_2215
    Starting at Wed May 19 00:43:27 GMT 2004
    Phase 1: Verify fsinfo blocks.
    Phase 2: Verify metadata indirect blocks.
    ...
    Phase 5: Check volumes.
    Phase 5a: Check volume inodes
    Phase 5a time in seconds: 0
    Phase 5b: Check volume contents


Running WAFL_check
    Checking volume vol1...
    Phase [5.1]: Verify fsinfo blocks.
    Phase [5.2]: Verify metadata indirect blocks.
    ...
    Phase [5.6d]: Check blocks used.
    Phase [5.6d] time in seconds: 0
    Phase [5.6] time in seconds: 0
    WAFL_check time in seconds: 7
    (No filesystem state changed.)
    Phase 5b time in seconds: 7

Note: The [ ] around the phase number indicates an operation on a flexible volume.

    Phase 6: Clean up.
    Phase 6a: Find lost nt streams.
    Phase 6a time in seconds: 0
    Phase 6b: Find lost files.
    Phase 6b time in seconds: 0
    Phase 6c: Find lost blocks.
    Phase 6c time in seconds: 0
    Phase 6d: Check blocks used.
    Phase 6d time in seconds: 0
    Phase 6 time in seconds: 0
    WAFL_check total time in seconds: 14
    (No filesystem state changed.)

Changes to flexible volumes are batched up and run at the end of the job.
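
When reading a long WAFL_check transcript, it can help to separate the bracketed flexible-volume phases from the unbracketed ones. The following is a minimal Python sketch, assuming the phase header lines look exactly like the ones in the output above; the function name classify_phase and the regular expression are illustrative and are not part of Data ONTAP.

```python
import re

# Matches phase header lines such as "Phase 5b: Check volume contents" or
# "Phase [5.6d]: Check blocks used."  Timing lines ("... time in seconds: 0")
# have no colon directly after the phase id, so they are not matched.
PHASE_RE = re.compile(
    r"^Phase (?:\[(?P<flex>[\w.]+)\]|(?P<plain>\w+)): (?P<title>.+)$"
)


def classify_phase(line):
    """Return (is_flexvol, phase_id, title) for a phase header, else None.

    is_flexvol is True when the phase id is wrapped in [ ], which the
    output above uses for operations on a flexible volume.
    """
    m = PHASE_RE.match(line.strip())
    if m is None:
        return None
    if m.group("flex") is not None:
        return (True, m.group("flex"), m.group("title"))
    return (False, m.group("plain"), m.group("title"))


if __name__ == "__main__":
    for sample in ("Phase 5a: Check volume inodes",
                   "Phase [5.1]: Verify fsinfo blocks.",
                   "Phase [5.6d] time in seconds: 0"):
        print(sample, "->", classify_phase(sample))
```

Only header lines are classified; timing lines and progress lines return None so that they can be handled separately.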


Running WAFL_check

Upon completion of WAFL_check, you will be prompted to apply changes.
These changes are queued to be fixed on the aggregate and on the flexible volumes inside it.


Running wafliron

    filer*> aggr wafliron start aggr1
    Wed May 19 00:38:20 GMT [wafl.iron.start:notice]: Starting wafliron on aggregate aggr1.
    Wed May 19 00:38:21 GMT [wafl.iron.start:notice]: Starting wafliron on volume vol1.
    filer*>
    Wed May 19 00:39:05 GMT [wafl.scan.iron.done:info]: Volume vol1, wafliron completed.
    Wed May 19 00:39:06 GMT [wafl.scan.iron.done:info]: Aggregate aggr1, wafliron completed.
    Wed May 19 00:39:12 GMT [wafl.scan.typebits.done:info]: Type bit scan done on vol vol1.
    filer*>

Note: wafliron will iron the flexible volumes in the aggregate.


wafliron status

    filer*> aggr wafliron status
    wafliron is active on aggregate: aggr1
    Scanning (7% done).
    filer*>
    filer*> wafl scan status
    Aggregate aggr1:
     Scan id   Type of scan      progress
           3   wafliron demand   73 (27/27) of 1092
    Volume vol1:
     Scan id   Type of scan      progress
           4   wafliron demand   69 (20/20) of 693

Notice there are 2 scanners that are active.

- Scan id 3 is ironing the aggregate aggr1
- Scan id 4 is ironing the flexible volume vol1 that is contained inside of aggr1
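
If the status output is captured to a text file, the two scanners can be picked out programmatically. The sketch below is a hedged Python example: it assumes one row per line in the layout "scan id, type of scan, progress" under an "Aggregate <name>:" or "Volume <name>:" header, which is inferred from the output above. The function names parse_percent_done and parse_scan_rows are hypothetical, and the progress column is kept as an uninterpreted string because its fields are not defined in this guide.

```python
import re

# "Scanning (7% done)." from `aggr wafliron status`
PERCENT_RE = re.compile(r"Scanning \((\d+)% done\)")
# "Aggregate aggr1:" / "Volume vol1:" headers from `wafl scan status`
HEADER_RE = re.compile(r"^(Aggregate|Volume) (\S+):\s*$")
# Assumed row layout: <scan id> wafliron <subtype> <progress text>
ROW_RE = re.compile(r"^\s*(?P<scan_id>\d+)\s+wafliron\s+\w+\s+(?P<progress>.+)$")


def parse_percent_done(status_text):
    """Return the integer percentage from 'Scanning (N% done)', or None."""
    m = PERCENT_RE.search(status_text)
    return int(m.group(1)) if m else None


def parse_scan_rows(scan_text):
    """Map (kind, name) -> list of (scan_id, raw progress string)."""
    scans = {}
    current = None
    for line in scan_text.splitlines():
        header = HEADER_RE.match(line.strip())
        if header:
            current = (header.group(1).lower(), header.group(2))
            scans[current] = []
            continue
        row = ROW_RE.match(line)
        if row and current is not None:
            scans[current].append(
                (int(row.group("scan_id")), row.group("progress").strip())
            )
    return scans
```

With the output shown above, parse_scan_rows would report one wafliron scan under the aggregate aggr1 and one under the volume vol1, matching the two active scanners.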

Running wafliron with the -f option forces a wafliron on volumes which are read-only (snapshots/SyncMirror).


Logs and Errors


Logs

- wafliron still logs errors with syslog messages
- WAFL_check is the same as before for flexible volumes
- Logged to /etc/crash/wafl/
- For aggregates, it's a two-step process:
  1. /vol/<aggrname>/WAFL_check_logs/WAFL_check
     (kept in the metadir in the aggregate, as the aggregate is not mountable)
  2. Once booted, info is placed into:
     /vol/<rootvol>/etc/WAFL_check_logs/<aggrname>/WAFL_check
- If there are no errors, a log will NOT be created

Errors

    filer*> vol wafliron start vol1
    Cannot run wafliron on a flexible volume.
    filer*> vol wafliron start aggr1
    vol wafliron: 'aggr1' is an aggregate; use 'aggr wafliron'
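
The rules behind these errors can be summarized in a small helper. This is a hedged Python sketch and not a NetApp tool: the function name wafliron_start_command and the kind labels "aggregate", "traditional", and "flexible" are assumptions made for illustration, and the command strings are the ones shown in this module.

```python
def wafliron_start_command(name, kind):
    """Return the console command that starts wafliron on `name`.

    kind: "aggregate", "traditional", or "flexible" (labels are this
    sketch's own; they are not Data ONTAP keywords).
    """
    if kind == "aggregate":
        return f"aggr wafliron start {name}"
    if kind == "traditional":
        # Traditional volumes are handled as before with vol wafliron.
        return f"vol wafliron start {name}"
    if kind == "flexible":
        # Mirrors the console error: a flexible volume cannot be ironed
        # directly; it is ironed as part of its containing aggregate.
        raise ValueError("Cannot run wafliron on a flexible volume.")
    raise ValueError(f"unknown kind: {kind!r}")


if __name__ == "__main__":
    print(wafliron_start_command("aggr1", "aggregate"))
    print(wafliron_start_command("vol0", "traditional"))
```

To iron a flexible volume, start wafliron on the aggregate that contains it, as in the aggr wafliron start aggr1 example earlier.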


Internal information
File System Inconsistencies With Aggregates and Flexible Volumes


Additional Information

Anchor Steam TOI

- Lots of bit-level and engineering-level information
- http://web.netapp.com/engineering/projects/wafl/vv/vvol_wack_notes.txt

Eng notes

These notes are truncated; for the full notes, go to the URL above.

Notes and thoughts about WAFL_check and (hybrid) virtual volumes
Andy Kahn
Last updated Feb 7th, 2003
$Id: //depot/doc/main/project/wafl/vv/vvol_wack_notes.txt#1 $

This document describes the changes to WAFL_check needed to support virtual volumes. Specifically, this only applies to hybrid virtual volumes. For completeness, pure virtual volumes will be mentioned briefly at the end of this document. It is assumed that the reader is reasonably familiar with the overall changes needed for hybrid virtual volumes (refer to design doc).


In this document, the following terms are used:

- "pvbn's" or "physical vbn's" - VBN's in the physical volume.
- "vvbn's" or "virtual vbn's" - VBN's in the virtual volume.
- "pvol" - physical volume.
- "vvol" - virtual volume.
- "vvid" - virtual volume id.

Depending on the context, "vvol" may be used interchangeably to refer to the vvol container file/inode.

The aggregate/blakegrate terminology hasn't been adopted in this document. In the meantime, feel free to substitute:

- "pvbn" -> "aggvbn" or "avbn" or "blakevbn" or "bvbn"
- "vvbn" -> "vbn"
- "pvol" -> "aggregate" or "blakegrate"
- "vvol" -> "volume"
- "vvid" -> "volume id"

Sections in this document:

1. New to the physical volume
2. WAFL_check on the physical volume
3. WAFL_check on a pvol with vvols being destroyed
4. WAFL_check on a vvol
5. Pure virtual volumes


1. New to the physical volume
-----------------------------------------------------------------------------
To support hybrid vvol's, the pvol has these additional changes:

- A new metafile at fileid 88, the "vvol_owner", or the "owner map". This file maps each pvbn to the vvol which owns it and to the corresponding vvbn in that vvol.

- Inodes with a new type, WAFL_TYPE_VVOL. These are the "wafl" container inodes which reside in the pvol's metadirectory. The contents of this container file are the vvol itself.

- Regular inodes for the "raid" file, which contains raid-like information for a vvol. These live in the same directory as their corresponding container file.

2. WAFL_check on the physical volume
-----------------------------------------------------------------------------
Running WAFL_check on the pvol requires WAFL_check to be aware of the new additions, and to fix things if they are inconsistent. The general order of execution will thus be:

1. Make sure all used blocks in the pvol show up as being owned in the owner map by the pvol itself.
2. Make sure all WAFL_TYPE_VVOL inodes are accounted for.
3. Make sure owner map entries look sane and are owned by either the pvol or a vvol.
4. WAFL_check all vvol's.


Step 1: pvol blocks.

- After checking the pvol's inofile's buftree, check the owner map's buftree.
- Rescan the inofile's buftree, but this time, each block needs to be checked with the owner map. If the owner map shows that the block is not owned by the pvol, clear the entry (to indicate that it *is* owned by the pvol).
- For all remaining files, including metafiles, the buftree scan of their inode also checks the owner map to ensure ownership by the pvol. The only exception is for the vvol container blocks; they reside in the pvol, but are owned by the vvol.

Step 2: WAFL_TYPE_VVOL inodes

- While scanning inodes in the pvol, all fileid's of the WAFL_TYPE_VVOL inodes are stored in a list. This list will be used later, after we've finished checking the pvol.
- Once WAFL_check is completed with the pvol, all vvol's in the pvol are configured (aka "discovered"). Specifically, the pvol's metadirectory is scanned for any vvol's. All vvol's found are not mounted at this point in time.
- Compare the WAFL_TYPE_VVOL list found during the inofile scan against the list of vvol's discovered. For brevity, "wack list" refers to the first list while "pvol vvol list" refers to the latter.


There are four possibilities which can result:

- Case 0: If an inode is in neither list, do nothing (trivial case).
- Case 1: If an inode is in both lists, then we only need to check if the WAFL_FLAG_METAFILE is set. If it isn't, set it.
- Case 2: If an inode is in the pvol's vvol list, but not in the wack list, then that means the inode is *not* of type WAFL_TYPE_VVOL. Change its type to WAFL_TYPE_VVOL, and set WAFL_FLAG_METAFILE if it isn't set already.
- Case 3: If an inode is in the wack list, but not in the pvol's vvol list, then this is either data corruption or a lost vvol. Set its type to WAFL_TYPE_REGULAR, clear WAFL_FLAG_METAFILE, and check if fbn's 1 or 2 look like valid volinfo's. If either does, it's likely we found a lost vvol, so move it to lost+found.
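
The four cases amount to a comparison of two sets of fileid's. The following Python sketch illustrates that comparison only; it is not the WAFL_check source, the names classify and reconcile are hypothetical, and the actions are plain strings paraphrasing the cases above.

```python
# Actions per case, paraphrased from the four cases in these notes.
ACTIONS = {
    0: [],
    1: ["set WAFL_FLAG_METAFILE if it is not set"],
    2: ["change type to WAFL_TYPE_VVOL",
        "set WAFL_FLAG_METAFILE if it is not set"],
    3: ["set type to WAFL_TYPE_REGULAR",
        "clear WAFL_FLAG_METAFILE",
        "if fbn 1 or 2 looks like a valid volinfo, move to lost+found"],
}


def classify(in_wack_list, in_pvol_vvol_list):
    """Return the case number (0-3) for one inode."""
    if in_wack_list and in_pvol_vvol_list:
        return 1
    if in_pvol_vvol_list:
        return 2
    if in_wack_list:
        return 3
    return 0


def reconcile(wack_list, pvol_vvol_list):
    """Return {fileid: (case, actions)} for every fileid in either list."""
    wack, discovered = set(wack_list), set(pvol_vvol_list)
    plan = {}
    for fileid in sorted(wack | discovered):
        case = classify(fileid in wack, fileid in discovered)
        plan[fileid] = (case, ACTIONS[case])
    return plan


if __name__ == "__main__":
    # Fileid's 100-102 are made-up example values.
    for fileid, (case, actions) in reconcile([100, 101], [101, 102]).items():
        print(fileid, "case", case, actions)
```

Case 0 never appears in the plan because only fileid's present in at least one list are visited, which matches its "do nothing" definition.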

Step 3: owner map

- Check the owner map by scanning through each entry. If the entry corresponds to a physical block that is marked in-use by either the active map or the summary map, then:
    - Check if the vvol id is zero (aka, owned by the physical volume). If so, the vvbn value should also be zero. Clear
    - Check if the vvol id in the entry exists in the pvol's vvol list. If it does not, clear the entry. Note that the vvbn value is not checked, because we allow sparse vvol's, which can have a vvbn range that is larger than the pvol's vvbn range.
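
The per-entry check in Step 3 can be sketched as a small decision function. This Python example is illustrative only: the OwnerEntry type, the function name check_owner_entry, and the result labels are assumptions. Because the notes are truncated after "Clear", the sketch only reports a nonzero vvbn on a pvol-owned entry and does not assume what the repair is.

```python
from dataclasses import dataclass


@dataclass
class OwnerEntry:
    vvid: int  # 0 means the block is owned by the physical volume
    vvbn: int  # the owning vvol's vvbn for this physical block


def check_owner_entry(entry, in_use, known_vvids):
    """Classify one owner map entry.

    in_use: True if the active map or the summary map marks the
            physical block as in use.
    known_vvids: vvol ids present in the pvol's vvol list.
    Returns "skip", "ok", "nonzero-vvbn", or "clear".
    """
    if not in_use:
        return "skip"  # only entries for in-use blocks are examined
    if entry.vvid == 0:
        # Owned by the pvol: the vvbn should also be zero.
        return "ok" if entry.vvbn == 0 else "nonzero-vvbn"
    if entry.vvid not in known_vvids:
        return "clear"  # owner is not a known vvol
    # The vvbn is deliberately not range-checked (sparse vvol's).
    return "ok"


if __name__ == "__main__":
    print(check_owner_entry(OwnerEntry(vvid=7, vvbn=42), True, {1, 2}))
```

The example values (vvid 7, known ids 1 and 2) are made up; an entry whose owner is not in the vvol list comes back as "clear".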

Step 4: Run WAFL_check on each individual vvol. See section 4.


4. WAFL_check on a vvol
-----------------------------------------------------------------------------
When checking all the vvol's within a pvol, offline vvol's are also made available. They are not actually mounted or brought online, so their mount state is unchanged and they stay offline after the WAFL_check.

The code path is mostly the same as it is for a pvol. The vvol does:

    wafl_load_superblks()
    wack_boot_volume_from_disk()
    wack_dowack_vol()
    wack_dowack_vol_finish()

Vvol's which were offline also do wack_unload_volume().

The main differences for a vvol are:

- The owner map (fileid 88) doesn't exist, so it is never loaded nor checked.
- All blocks in the vvol are checked against its pvol's owner map for correct ownership. If a virtual volume correctly owns a block, mark this block as "claimed" in the bufs_claimed status file that WAFL_check uses.
    - If the vvol does not own a block, then check if the block is already claimed. If it is, then this vvol doesn't own it and cannot claim it. Prune this block from the buftree.
    - Otherwise, this vvol can claim it.

Once all vvol's have been checked, the bufs_claimed file is scanned to find any unclaimed blocks. Unclaimed blocks are then given to the physical volume (clear the corresponding entry in the owner map), and moved to lost+found.
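
The claiming rules can be walked through with ordinary dictionaries. This is a minimal Python sketch under stated assumptions, not the WAFL_check implementation: owner_map is modeled as a dict from pvbn to owning vvid, the bufs_claimed file as a dict from pvbn to the claiming vvid, and "unclaimed" is taken to mean a block the owner map assigns to a vvol that no vvol claimed. The function names are hypothetical.

```python
def check_vvol_blocks(vvid, pvbns, owner_map, claimed):
    """Process the blocks referenced by one vvol.

    Updates `claimed` in place and returns the list of pvbn's that
    must be pruned from this vvol's buftree.
    """
    pruned = []
    for pvbn in pvbns:
        if owner_map.get(pvbn) == vvid:
            claimed[pvbn] = vvid      # correctly owned: mark as claimed
        elif pvbn in claimed:
            pruned.append(pvbn)       # someone else has it: cannot claim
        else:
            claimed[pvbn] = vvid      # not owned, not claimed: may claim
    return pruned


def unclaimed_blocks(owner_map, claimed):
    """Blocks assigned to some vvol (vvid != 0) that nobody claimed."""
    return sorted(p for p, owner in owner_map.items()
                  if owner != 0 and p not in claimed)


if __name__ == "__main__":
    # Made-up example: vvol 1 owns pvbn 10 and 11, vvol 2 owns 12 and 13.
    owner_map = {10: 1, 11: 1, 12: 2, 13: 2}
    claimed = {}
    print(check_vvol_blocks(1, [10, 11], owner_map, claimed))
    print(check_vvol_blocks(2, [12, 11], owner_map, claimed))
    print(unclaimed_blocks(owner_map, claimed))
```

In this made-up example, vvol 2 references pvbn 11, which vvol 1 already claimed, so it is pruned; pvbn 13 is never referenced, so it would be handed back to the physical volume and moved to lost+found.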


5. Pure virtual volumes
-----------------------------------------------------------------------------
If a physical volume only contained pure virtual volumes, then there would be no need for the owner map. In this case, WAFL_check only needs to be aware of the new inode type (WAFL_TYPE_VVOL) and to compare the list of WAFL_TYPE_VVOL inodes encountered during the inofile scan against the list of vvols found during the vvol discovery phase. Otherwise, running WAFL_check on a pure virtual volume will behave just like it does on a present day physical volume.


Topic Review

- If you run WAFL_check at the Special Boot Menu, what will be checked?
- If you run wafliron at the Special Boot Menu, what would be ironed?
- If you wanted to check the status of wafliron, what command would you use?
- If you wanted to wafliron a non-root aggregate, which command would you use?


Exercises


Exercise: WAFL_check and wafliron


Objective
When you have completed this module, you will be able to do the following:

- Execute WAFL_check at the special boot prompt
- Recognize the differences when checking aggregates vs. traditional volumes
- Execute wafliron and view the status of the operation

Exercise Overview
This exercise highlights the differences in output when executing WAFL_check and wafliron on aggregates and flexible volumes.

Time Estimate
20 Minutes

Required Hardware, Software, and Tools

- Hardware: Standard class setup
- Software: DOT 7.0

Start of Exercise
WAFL_check on aggregates and volumes.

Step  Action
1.    Start the simulator by executing the maytag.L file. When prompted, enter Y to the floppy boot question.
2.    After the 1-5 menu appears, enter 22/7 to view the hidden commands.


3.    Enter the following: WAFL_check
      When prompted, answer Yes or y for each aggregate you wish to WAFL_check.
      The system will output what is occurring with the WAFL_check. Notice that the output in the [ ] shows operations being performed on flexible volumes.
4.    The system will now display the status and ask you to reboot. Watch the messages file for a status.

This is just an example of a WAFL_check on a system with a traditional vol0 and aggr1 containing a flexible volume.

    Selection (1-5)? WAFL_check
    Mon Oct 18 20:58:59 GMT [fmmbx_instanceWorke:info]: Disk 3a.33 is a primary mailbox disk
    Mon Oct 18 20:58:59 GMT [fmmbx_instanceWorke:info]: Disk 3a.17 is a primary mailbox disk
    Mon Oct 18 20:58:59 GMT [fmmbx_instanceWorke:info]: normal mailbox instance on primary side
    Mon Oct 18 20:59:00 GMT [raid.assim.disk.brokenPreAssim:error]: Broken Disk 3a.24 Shelf ? Bay ? [NETAPP X270_SCHT6036F10 NA05] S/N [3JA72MSH000074297CC7] detected prior to assimilation. It should be removed.
    Mon Oct 18 20:59:00 GMT [raid.disk.unload.done:info]: Unload of Disk 3a.24 Shelf 1 Bay 8 [NETAPP X270_SCHT6036F10 NA05] S/N [3JA72MSH000074297CC7] has completed successfully
    Check vol0? y
    Check aggr1? y
    Checking vol0...
    WAFL_check NetApp Release RanchorsteamN_041010_2215
    Starting at Mon Oct 18 20:59:09 GMT 2004
    Phase 1: Verify fsinfo blocks.
    Phase 2: Verify metadata indirect blocks.
    Phase 3: Scan inode file.
    Phase 3a: Scan inode file special files.
    Phase 3a time in seconds: 1
    Phase 3b: Scan inode file normal files.
    (inodes 5%) (inodes 10%) (inodes 15%) (inodes 20%) (inodes 25%) (inodes 30%) (inodes 35%) (inodes 40%) (inodes 45%) (inodes 50%) (inodes 55%) (inodes 60%) (inodes 65%) (inodes 70%) (inodes 75%) (inodes 80%) (inodes 85%) (inodes 90%) (inodes 95%)
    Phase 3b time in seconds: 5
    Phase 3 time in seconds: 6
    Phase 4: Scan directories.
    (dirs 15%) (dirs 26%) (dirs 45%) (dirs 46%) (dirs 46%) (dirs 66%) (dirs 75%) (dirs 81%) (dirs 92%)
    Phase 4 time in seconds: 2
    Phase 6: Clean up.
    Phase 6a: Find lost nt streams.
    Phase 6a time in seconds: 0
    Phase 6b: Find lost files.
    Phase 6b time in seconds: 4
    Phase 6c: Find lost blocks.
    Phase 6c time in seconds: 0
    Phase 6d: Check blocks used.
    Phase 6d time in seconds: 7
    Phase 6 time in seconds: 11
    WAFL_check total time in seconds: 19
    (No filesystem state changed.)
    Checking aggr1...
    WAFL_check NetApp Release RanchorsteamN_041010_2215
    Starting at Mon Oct 18 20:59:29 GMT 2004
    Phase 1: Verify fsinfo blocks.
    Phase 2: Verify metadata indirect blocks.
    Phase 3: Scan inode file.
    Phase 3a: Scan inode file special files.
    Phase 3a time in seconds: 0
    Phase 3b: Scan inode file normal files.
    (inodes 5%) (inodes 10%) (inodes 15%) (inodes 20%) (inodes 25%) (inodes 30%) (inodes 35%) (inodes 41%) (inodes 46%) (inodes 51%) (inodes 56%) (inodes 61%) (inodes 66%) (inodes 71%) (inodes 76%) (inodes 82%) (inodes 87%) (inodes 92%) (inodes 97%)
    Phase 3b time in seconds: 2
    Phase 3 time in seconds: 2
    Phase 4: Scan directories.
    Phase 4 time in seconds: 0
    Phase 5: Check volumes.
    Phase 5a: Check volume inodes
    Phase 5a time in seconds: 0
    Phase 5b: Check volume contents
    Checking volume flexvol1...
    Phase [5.1]: Verify fsinfo blocks.
    Phase [5.2]: Verify metadata indirect blocks.
    Phase [5.3]: Scan inode file.
    Phase [5.3a]: Scan inode file special files.
    Phase [5.3a] time in seconds: 0
    Phase [5.3b]: Scan inode file normal files.
    (inodes 5%) (inodes 10%) (inodes 15%) (inodes 20%) (inodes 25%) (inodes 30%) (inodes 35%) (inodes 40%) (inodes 45%) (inodes 50%) (inodes 55%) (inodes 60%) (inodes 65%) (inodes 70%) (inodes 75%) (inodes 80%) (inodes 85%) (inodes 90%) (inodes 95%)
    Phase [5.3b] time in seconds: 1
    Phase [5.3] time in seconds: 3
    Phase [5.4]: Scan directories.
    Phase [5.4] time in seconds: 0
    Phase [5.6]: Clean up.
    Phase [5.6a]: Find lost nt streams.
    Phase [5.6a] time in seconds: 1
    Phase [5.6b]: Find lost files.
    Phase [5.6b] time in seconds: 2
    Phase [5.6c]: Find lost blocks.
    Phase [5.6c] time in seconds: 0
    Phase [5.6d]: Check blocks used.
    Phase [5.6d] time in seconds: 1
    Phase [5.6] time in seconds: 4
    Volume flexvol1 WAFL_check time in seconds: 8
    (No filesystem state changed.)
    Phase 5b time in seconds: 8
    Phase 6: Clean up.
    Phase 6a: Find lost nt streams.
    Phase 6a time in seconds: 0
    Phase 6b: Find lost files.
    Phase 6b time in seconds: 2
    Phase 6c: Find lost blocks.
    Phase 6c time in seconds: 0
    Phase 6d: Check blocks used.
    Phase 6d time in seconds: 1
    Phase 6 time in seconds: 3
    WAFL_check total time in seconds: 13
    (No filesystem state changed.)
    Press any key to reboot system.
    [LCD:info] Rebooting


wafliron on the root aggregate

Step  Action
1.    Start the simulator by executing the maytag.L file.
2.    When prompted, enter Y to the floppy boot question.
3.    At the 1-5 menu, enter 22/7 to display the secret list of commands. Note that wafliron is listed at the bottom.
4.    At the 1-5 menu, enter wafliron. The filer will start wafliron and begin to boot up. View the console messages as the system boots.
5.    Once booted, enter the status commands occasionally and note the progress of wafliron on the volumes:

          filer> priv set advanced
          filer> vol wafliron status
          filer> aggr wafliron status
