Isilon Troubleshooting Guide File System Locking

Isilon Troubleshooting Guide: File Systems - Locking
This troubleshooting guide applies to OneFS 7.2 - 8.0.
Revised December 5, 2016

IMPORTANT!
If you arrived at this guide from a Protocols guide, consult a coach or SME.

Start

Are you connecting to the cluster over SMB, NFS, HTTP, or FTP?
  Yes: Go to: Isilon Troubleshooting Guide: Protocols - Protocol Routing
  No: Continue to the next question.

Is the cluster unresponsive to isi commands?
  Yes: Go to 2A
  No: Continue to the next question.

Is the issue related to a node that has split from the cluster, and that cannot reconnect?
  Yes: Go to 4A
  No: Continue to the next question.

Are hangdumps appearing in the /var/log/messages log?
  Yes: Go to 5A
  No: Consult a coach. If a coach is not available, consult your supervisor or manager for further direction.
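
The hangdump check in the last question can be sketched as a simple log search. This is a minimal sketch, not an OneFS tool: the function name hangdump_lines is hypothetical, and the search word "hangdump" is an assumption, because the guide does not show what a hangdump entry in /var/log/messages looks like.

```shell
# Hypothetical helper (not part of OneFS): print every line of a log file
# that mentions "hangdump", ignoring case.
# Assumption: hangdump entries contain the word "hangdump"; the guide does
# not show the actual message text.
hangdump_lines() {
    # Default to the log named in the guide when no file is given.
    grep -i 'hangdump' -- "${1:-/var/log/messages}"
}
```

Run hangdump_lines with no argument to search /var/log/messages, or pass the path of a copied log file. No output means no matching lines were found under the stated assumption.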
We appreciate your help in improving this document.
Submit your feedback at http://bit.ly/isi-docfeedback.
© Copyright EMC Corporation. All rights reserved.
Indeterminate transactions
Revised December 5, 2016

2A
Check for indeterminate transactions by running the following command:
isi_for_array -sX sysctl efs.journal.indeterminate_txns
See example output at the bottom of this page.

Did the command return anything other than zero?
  Yes: Refer to:
       OneFS: Node expectedly reboots and/or any of the following errors are seen in messages: "Double failure detected for txn_p" "txn (X:xxxxxxxxxx) is not resolved" "error = 98dexitcode: XXXX: EJDEADLOCK", 467837
  No: Refer to:
      Isilon OneFS: Nodes that have run for more than 248.5 consecutive days may restart without warning which may lead to potential data unavailability, 462835
      and
      UPDATE: ETA 202452: Isilon OneFS: Nodes that have run for 497 consecutive days may restart without warning, 301837

Caution:
After initiating a Code Red Engagement, per the previous KB, do not make changes to the cluster until you get a response to your escalation.

Continue through this guide, checking for known issues and gathering as much information as you can.

Go to 3A

Example isi_for_array -sX sysctl efs.journal.indeterminate_txns output:

cluster-2# isi_for_array -sX sysctl efs.journal.indeterminate_txns
cluster-1: efs.journal.indeterminate_txns: 0
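
On a cluster with many nodes, the "anything other than zero" check can be sketched as a filter over the command output. This is a minimal sketch that assumes every output line has the node: name: value shape shown in the example above. The function name nonzero_txn_nodes is hypothetical, and the sample value 3 is made up for illustration.

```shell
# Hypothetical helper (not part of OneFS): print the node name for every
# line whose efs.journal.indeterminate_txns value is not zero.
# Assumption: each line looks like "cluster-1: efs.journal.indeterminate_txns: 0".
nonzero_txn_nodes() {
    awk -F': ' '$2 == "efs.journal.indeterminate_txns" && $3 + 0 != 0 { print $1 }'
}

# On a cluster, the filter would read the output of the command from step 2A:
#   isi_for_array -sX sysctl efs.journal.indeterminate_txns | nonzero_txn_nodes
# Here it runs against sample text only; the second line is made up.
printf '%s\n' \
    'cluster-1: efs.journal.indeterminate_txns: 0' \
    'cluster-2: efs.journal.indeterminate_txns: 3' | nonzero_txn_nodes
```

With the sample text, the filter prints cluster-2. Empty output corresponds to the "No" branch of the question in step 2A.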

_________________
Deadlocks
Revised December 5, 2016

Note
Two common symptoms of deadlocks:
- isi commands are unresponsive
- Clients cannot access the cluster
For more information, see:
- Researching the causes of lock contention and deadlock, 471792
- How to recover from a cluster-wide deadlock, 303990

CAUTION!
If isi commands are not responding when you receive the case, do not run any additional isi commands as you try to troubleshoot.

Note
If the terminal appears blank, press Ctrl + C. This method prevents running a command unintentionally.

3A
Is more than one node in the cluster unresponsive?
  Yes: Go to: OneFS: How to recover from a cluster-wide deadlock?, 303990
       Did documentation solve the problem?
         Yes: End
         No: Consult a coach. If a coach is not available, consult your supervisor or manager for further direction.
  No: Continue to the next question.

Do hangdumps appear in the /var/log/messages log?
  Yes: Go to 4A
  No: Consult a coach. If a coach is not available, consult your supervisor or manager for further direction.


_________________
Merge Lock
Revised December 5, 2016

4A
Verify that a shared merge lock is failing by running the following command:
isi_for_array -s sysctl efs.gmp.merge_lock_state
Output similar to the following appears:
node 1: efs.gmp.merge_lock_state: NO_EXCLUSIVE
node 2: efs.gmp.merge_lock_state: NO_EXCLUSIVE
node 3: efs.gmp.merge_lock_state: EXCLUSIVE_WAITING
Note the output for node 3, whose state is EXCLUSIVE_WAITING. This status indicates a merge lock failure. (There may be more than one node in this state.)
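
Because more than one node may be in this state, the output can be filtered rather than read by eye. This is a minimal sketch that assumes the output lines match the sample above. The function name merge_lock_waiters is hypothetical.

```shell
# Hypothetical helper (not part of OneFS): print the node label for every
# line that reports EXCLUSIVE_WAITING.
# Assumption: each line looks like "node 3: efs.gmp.merge_lock_state: EXCLUSIVE_WAITING".
merge_lock_waiters() {
    awk -F': ' '$2 == "efs.gmp.merge_lock_state" && $3 == "EXCLUSIVE_WAITING" { print $1 }'
}

# On a cluster, the filter would read the output of the command from step 4A:
#   isi_for_array -s sysctl efs.gmp.merge_lock_state | merge_lock_waiters
# Here it runs against the sample output from this page.
printf '%s\n' \
    'node 1: efs.gmp.merge_lock_state: NO_EXCLUSIVE' \
    'node 2: efs.gmp.merge_lock_state: NO_EXCLUSIVE' \
    'node 3: efs.gmp.merge_lock_state: EXCLUSIVE_WAITING' | merge_lock_waiters
```

With the sample output, the filter prints node 3. Each node it lists is a candidate for the isi_bug_info step.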

Are any nodes in a merge lock state?
  Yes: Run the following command on each node that has a merge lock:
       isi_bug_info > /var/crash/isi_bug_info.txt
       This command gathers information about the merge lock, and redirects it to a text file in the /var/crash directory.
       Consult a coach. If a coach is not available, consult your supervisor or manager for further direction.
  No: Rule out hardware issues. Go to: Isilon Troubleshooting Guide: Hardware - Top Level and continue troubleshooting.
      If hardware was not the cause of the issue, consult a coach. If a coach is not available, consult your supervisor or manager for further direction.


_________________
Hangdumps
Revised December 5, 2016

5A
Is this an active issue, or are you performing post-event root-cause analysis?
  Active issue: Consult a coach. If a coach is not available, consult your supervisor or manager for further direction.
  Post-event: Complete the following steps.

Establish an SSH connection to:
elvis.igs.corp

Change the directory to the location of the log set by running the following command, where <path name> is the path to the cluster logs:
cd /logs/<path name>

Note
The path name will be included in the case notes. Alternatively, you can locate the path name with the Log Search engine.

Run the following command to launch the log analysis application:
nilp

At the prompt, type hang, and then press Enter.

Type the number that corresponds to the date of the hangdump you want to graph, and then press Enter.

Go to 6A

_________________
Hangdumps, continued
Revised December 5, 2016

6A
Follow the instructions provided in the "Identifying the cause of lock contention or deadlocks" section of Researching the causes of lock contention and deadlock, 471792.

In the log directory, find the folder called lockviz. Within that directory, find the .svg file. This file contains output similar to the diagram in Fig. 1.

Examine the diagram to identify the source of the lock contention.

Add your findings to the case notes, and then consult a coach. If a coach is not available, consult your supervisor or manager for further direction.

Fig. 1: Lock contention diagram

How to interpret lock contention diagrams
- The green boxes identify the two entities in contention for a lock. The entity at the top of the diagram is trying to obtain the lock. It is known as the locker. The entity at the bottom of the diagram has possession of the contested object. This entity is known as the owner.
- The red oval is known as a waiter. Waiters are lockers that have requested a lock, but not yet acquired it.
- The yellow diamond identifies the object the locker and owner are contesting.
- The blue oval identifies the type of lock the owner has placed on the contested object.
For a more thorough explanation, see a coach or SME.
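
The lockviz lookup in step 6A can be sketched with find. This is a minimal sketch, assuming the lockviz folder sits somewhere below the log directory you changed into in step 5A. The function name find_lockviz_svgs is hypothetical.

```shell
# Hypothetical helper (not part of OneFS or the log analysis tooling):
# list every .svg file located below a directory named lockviz.
# The search starts at the directory given as the first argument, or at the
# current directory when no argument is given.
find_lockviz_svgs() {
    find "${1:-.}" -type f -path '*/lockviz/*' -name '*.svg'
}
```

From inside the log directory, running find_lockviz_svgs with no argument prints the path of each diagram file. If more than one file is listed, the guide does not say which to prefer, so check the case notes or ask a coach.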

_________________
