Solaris x86 / SPARC Boot Troubleshooting


SPARC: The Boot PROM

Each SPARC based system has a PROM (programmable read-only memory) chip with a program
called the monitor. The monitor controls the operation of the system before the Solaris kernel is
available. When a system is turned on, the monitor runs a quick self-test procedure to check the
hardware and memory on the system. If no errors are found, the system begins the automatic
boot process.
Note
Some older systems might require PROM upgrades before they will work with the Solaris
system software. Contact your local service provider for more information.

SPARC: The Boot Process


The following table describes the boot process on SPARC based systems.
Table 15-1 SPARC: Description of the Boot Process

Boot PROM
  1. The PROM displays system identification information and then runs
     self-test diagnostics to verify the system's hardware and memory.
  2. Then, the PROM loads the primary boot program, bootblk, whose purpose
     is to load the secondary boot program (located in the ufs file system)
     from the default boot device.

Boot Programs
  3. The bootblk program finds and executes the secondary boot program,
     ufsboot, and loads it into memory.
  4. After the ufsboot program is loaded, the ufsboot program loads the
     kernel.

Kernel Initialization
  5. The kernel initializes itself and begins loading modules by using
     ufsboot to read the files. When the kernel has loaded enough modules
     to mount the root (/) file system, the kernel unmaps the ufsboot
     program and continues, using its own resources.
  6. The kernel creates a user process and starts the /sbin/init process,
     which starts other processes by reading the /etc/inittab file.

init
  7. The /sbin/init process starts the run control (rc) scripts, which
     execute a series of other scripts. These scripts (/sbin/rc*) check and
     mount file systems, start various processes, and perform system
     maintenance tasks.

IA: The Boot Process


The following table describes the boot process on IA based systems.
Table 15-3 IA: Description of the Boot Process

BIOS
  1. When the system is turned on, the BIOS runs self-test diagnostics to
     verify the system's hardware and memory. The system begins to boot
     automatically if no errors are found. If errors are found, error
     messages are displayed that describe recovery options. The BIOS of
     additional hardware devices are run at this time.
  2. The BIOS boot program tries to read the first physical sector from the
     boot device. This first disk sector on the boot device contains the
     master boot record, mboot, which is loaded and executed. If no mboot
     file is found, an error message is displayed.

Boot Programs
  3. The master boot record, mboot, which contains disk information needed
     to find the active partition and the location of the Solaris boot
     program, pboot, loads and executes pboot.
  4. The Solaris boot program, pboot, loads bootblk, the primary boot
     program, whose purpose is to load the secondary boot program that is
     located in the ufs file system.
  5. If there is more than one bootable partition, bootblk reads the fdisk
     table to locate the default boot partition, and builds and displays a
     menu of available partitions. You have a 30-second interval to select
     an alternate partition from which to boot. This step occurs only if
     there is more than one bootable partition present on the system.
  6. bootblk finds and executes the secondary boot program, boot.bin or
     ufsboot, in the root (/) file system. You have a 5-second interval to
     interrupt the autoboot to start the Solaris Device Configuration
     Assistant.
  7. The secondary boot program, boot.bin or ufsboot, starts a command
     interpreter that executes the /etc/bootrc script, which provides a
     menu of choices for booting the system. The default action is to load
     and execute the kernel. You have a 5-second interval to specify a boot
     option or to start the boot interpreter.

Kernel Initialization
  8. The kernel initializes itself and begins loading modules by using the
     secondary boot program (boot.bin or ufsboot) to read the files. When
     the kernel has loaded enough modules to mount the root (/) file
     system, the kernel unmaps the secondary boot program and continues,
     using its own resources.
  9. The kernel creates a user process and starts the /sbin/init process,
     which starts other processes by reading the /etc/inittab file.

init
  10. The /sbin/init process starts the run control (rc) scripts, which
      execute a series of other scripts. These scripts (/sbin/rc*) check
      and mount file systems, start various processes, and perform system
      maintenance tasks.

Extended Diagnostics: If diag-switch? and diag-level are set, additional diagnostics will
appear on the system console.

auto-boot?:

If the auto-boot? PROM parameter is set, the boot process will begin. Otherwise,
the system will drop to the ok> PROM monitor prompt, or (if sunmon-compat? and
security-mode are set) the > security prompt.
The boot process will use the boot-device and boot-file PROM parameters unless
diag-switch? is set. In that case, the boot process will use diag-device and diag-file.
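
From a running system, these PROM parameters can be checked or changed with the eeprom
command; the values shown here are only examples:

# eeprom auto-boot?
auto-boot?=true
# eeprom diag-switch?
diag-switch?=false
# eeprom boot-device="disk net"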
bootblk: The OBP (Open Boot PROM) program loads the bootblk primary boot program from
the boot-device (or diag-device, if diag-switch? is set). If the bootblk is not present or
needs to be regenerated, it can be installed by running the installboot command after booting
from a CDROM or the network. A copy of the bootblk is available at
/usr/platform/`arch -k`/lib/fs/ufs/bootblk.

ufsboot: The secondary boot program, /platform/`arch -k`/ufsboot is run. This program
loads the kernel core image files. If this file is corrupted or missing, a bootblk: can't find
the boot program or similar error message will be returned.
kernel: The kernel is loaded and run. For 32-bit Solaris systems, the relevant files are:

/platform/`arch -k`/kernel/unix

/kernel/genunix

For 64-bit Solaris systems, the files are:

/platform/`arch -k`/kernel/sparcV9/unix

/kernel/genunix

As part of the kernel loading process, the kernel banner is displayed to the screen. This includes
the kernel version number (including patch level, if appropriate) and the copyright notice.
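
For reference, the banner looks roughly like the following; the release, patch, and copyright
strings are only illustrative and vary by system:

SunOS Release 5.10 Version Generic_147440-01 64-bit
Copyright (c) 1983, 2013, Oracle and/or its affiliates. All rights reserved.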
The kernel initializes itself and begins loading modules, reading the files with the ufsboot
program until it has loaded enough modules to mount the root filesystem itself. At that point,
ufsboot is unmapped and the kernel uses its own drivers. If the system complains about not
being able to write to the root filesystem, it is stuck in this part of the boot process.
The boot -a command single-steps through this portion of the boot process. This can be a useful
diagnostic procedure if the kernel is not loading properly.
/etc/system: The /etc/system file is read by the kernel, and the system parameters are set.

The following types of customization are available in the /etc/system file:

moddir: Changes path of kernel modules.

forceload: Forces loading of a kernel module.

exclude: Excludes a particular kernel module.

rootfs: Specify the file system type for the root file system. (ufs is the default.)

rootdev: Specify the physical device path for root.

set: Set the value of a tuneable system parameter.
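
A short, purely illustrative /etc/system using these directive types might look like the
following; the module names, device path, and tunable value are placeholders, not
recommendations:

* /etc/system sample entries (illustrative only)
moddir: /kernel /usr/kernel
rootfs:ufs
rootdev:/sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@0,0:a
forceload: drv/sd
exclude: lofs
set maxusers=64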

If the /etc/system file is edited, it is strongly recommended that a copy of the working file be
made to a well-known location. In the event that the new /etc/system file renders the system
unbootable, it might be possible to bring the system up with a boot -a command that specifies
the old file. If this has not been done, the system may need to be booted from CD or network so
that the file can be mounted and edited.
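
A minimal sketch of that workflow, assuming the backup copy was saved as /etc/system.orig
(the exact prompt wording varies slightly between releases):

# cp /etc/system /etc/system.orig        (before editing the live file)
...
ok boot -a
...
Name of system file [etc/system]: /etc/system.orig
(Entering /dev/null at that prompt boots with default kernel parameters.)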
kernel initialized: The kernel creates PID 0 ( sched). The sched process is sometimes called the
"swapper."
init: The kernel starts PID 1 (init).
init: The init process reads the /etc/inittab and /etc/default/init files and follows the
instructions in those files.
Some of the entries in the /etc/inittab are:

fs: sysinit (usually /etc/rcS)

is: default init level (usually 3, sometimes 2)

s#: script associated with a run level (usually /sbin/rc#)

rc scripts: The rc scripts execute the files in the /etc/rc#.d directories. They are run by the
/sbin/rc# scripts, each of which corresponds to a run level.
Debugging can often be done on these scripts by adding echo lines to a script to print either an "I
got this far" message or the value of a problematic variable.
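
For example, lines like these can be dropped into a troublesome rc script (the script name and
variable are hypothetical):

echo "S99myapp: reached the mount section" > /dev/console
echo "S99myapp: MOUNTPOINT is '$MOUNTPOINT'" > /dev/console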
x86: Boot Process

The following table describes the boot process on x86 based systems.
Table 16-2 x86: Description of the Boot Process

BIOS
  1. When the system is turned on, the BIOS runs self-test diagnostics to
     verify the system's hardware and memory. The system begins to boot
     automatically if no errors are found. If errors are found, error
     messages are displayed that describe recovery options. The BIOS of
     additional hardware devices are run at this time.
  2. The BIOS boot program tries to read the first disk sector from the
     boot device. This first disk sector on the boot device contains the
     master boot record, mboot, which is loaded and executed. If no mboot
     file is found, an error message is displayed.

Boot Programs
  3. The master boot record, mboot, which contains disk information needed
     to find the active partition and the location of the Solaris boot
     program, pboot, loads and executes pboot.
  4. The Solaris boot program, pboot, loads bootblk, the primary boot
     program. The purpose of bootblk is to load the secondary boot program,
     which is located in the UFS file system.
  5. If there is more than one bootable partition, bootblk reads the fdisk
     table to locate the default boot partition, and builds and displays a
     menu of available partitions. You have 30 seconds to select an
     alternate partition from which to boot. This step occurs only if there
     is more than one bootable partition present on the system.
  6. bootblk finds and executes the secondary boot program, boot.bin or
     ufsboot, in the root (/) file system. You have five seconds to
     interrupt the autoboot to start the Solaris Device Configuration
     Assistant.
  7. The secondary boot program, boot.bin or ufsboot, starts a command
     interpreter that executes the /etc/bootrc script. This script provides
     a menu of choices for booting the system. The default action is to
     load and execute the kernel. You have a 5-second interval to specify a
     boot option or to start the boot interpreter.

Kernel Initialization
  8. The kernel initializes itself and begins loading modules by using the
     secondary boot program (boot.bin or ufsboot) to read the files. When
     the kernel has loaded enough modules to mount the root (/) file
     system, the kernel unmaps the secondary boot program and continues,
     using its own resources.
  9. The kernel creates a user process and starts the /sbin/init process,
     which starts other processes by reading the /etc/inittab file.

init
  10. In this Oracle Solaris release, the /sbin/init process starts
      /lib/svc/bin/svc.startd, which starts system services that do the
      following:
        Check and mount file systems
        Configure network and devices
        Start various processes and perform system maintenance tasks
      In addition, svc.startd executes the run control (rc) scripts for
      compatibility.
x86: Boot Files

In addition to the run control scripts and boot files, there are additional boot files that are
associated with booting x86 based systems.
Table 16-3 x86: Boot Files

/etc/bootrc
  Contains menus and options for booting the Oracle Solaris release.

/boot
  Contains files and directories needed to boot the system.

/boot/mdboot
  DOS executable that loads the first-level bootstrap program (strap.com)
  into memory from disk.

/boot/mdbootbp
  DOS executable that loads the first-level bootstrap program (strap.com)
  into memory from diskette.

/boot/rc.d
  Directory that contains install scripts. Do not modify the contents of
  this directory.

/boot/solaris
  Directory that contains items for the boot subsystem.

/boot/solaris/boot.bin
  Loads the Solaris kernel or stand-alone kmdb. In addition, this
  executable provides some boot firmware services.

/boot/solaris/boot.rc
  Prints the Oracle Solaris OS name on an x86 system and runs the Device
  Configuration Assistant in DOS-emulation mode.

/boot/solaris/bootconf.exe
  DOS executable for the Device Configuration Assistant.

/boot/solaris/bootconf.txt
  Text file that contains internationalized messages for the Device
  Configuration Assistant (bootconf.exe).

/boot/solaris/bootenv.rc
  Stores eeprom variables that are used to set up the boot environment.

/boot/solaris/devicedb
  Directory that contains the master file, a database of all possible
  devices supported with realmode drivers.

/boot/solaris/drivers
  Directory that contains realmode drivers.

/boot/solaris/itup2.exe
  DOS executable run during the install time update (ITU) process.

/boot/solaris/machines
  Obsolete directory.

/boot/solaris/nbp
  File associated with network booting.

/boot/solaris/strap.rc
  File that contains instructions on what load module to load and where in
  memory it should be loaded.

/boot/strap.com
  DOS executable that loads the second-level bootstrap program into memory.

Note
rpc.bootparamd, which is usually a requirement on the server side for performing a network
boot, is not required for a GRUB based network boot.
The GRUB menu.lst file lists the contents of the GRUB main menu. The GRUB main
menu lists boot entries for all the OS instances that are installed on your system,
including Solaris Live Upgrade boot environments. The Solaris software upgrade
process preserves any changes that you make to this file.

Solaris 10 boot process : SPARC



The boot process for the SPARC platform involves five phases. There is a slight difference
between the boot process of a SPARC based and an x86/x64 based Solaris operating system.

Boot PROM phase

1. The boot PROM runs the power on self test (POST) to test the hardware.
2. The boot PROM displays the banner with the following information:
   Model type
   Processor type
   Memory
   Ethernet address and host ID
3. Boot PROM reads the PROM variable boot-device to determine the boot device.
4. Boot PROM reads the primary boot program bootblk (sectors 1 to 15 of the boot device) and executes it.
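
For example, the boot device can be checked or overridden at the ok prompt; the disk1 alias is
only an example and depends on your hardware:

ok printenv boot-device
boot-device = disk net
ok setenv boot-device disk1
ok boot disk1 -v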
Boot program phase

1. bootblk loads the secondary boot program ufsboot into memory.


2. ufsboot reads and loads the kernel. The kernel is composed of two parts:

   unix (platform specific kernel)
   genunix (platform independent kernel)

3. ufsboot combines these two kernel files into one complete kernel and loads it into memory.
Kernel initialization phase

1. The kernel reads the configuration file /etc/system.


2. Kernel initializes itself and loads the kernel modules. The modules usually reside in the /kernel
and /usr/kernel directories. (Platform specific drivers are in the /platform/`uname -i`/kernel and
/platform/`uname -m`/kernel directories.)
Init phase

1. Kernel starts the /etc/init daemon (with PID 1).


2. The /etc/init daemon starts the svc.startd process which is responsible for starting and stopping
the services.
3. The /etc/init daemon uses a file called /etc/inittab to boot up the system to the appropriate run
level mentioned in this file.
Legacy Run Levels
A run level specifies the state in which specific services and resources are available to users.

0       - System running PROM monitor (ok> prompt).
s or S  - Single-user mode with critical file systems mounted. (A single user can access the OS.)
1       - Single-user administrative mode with access to all file systems. (A single user can access the OS.)
2       - Multi-user mode. Multiple users can access the system. NFS and some other network-related daemons do not run.
3       - Multi-user-server mode. Multi-user mode with NFS and all other network resources available.
4       - Not implemented.
5       - Transitional run level. The OS is shut down and the system is powered off.
6       - Transitional run level. The OS is shut down and the system is rebooted to the default run level.
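
The current run level can be checked with who -r; the output below is illustrative:

# who -r
   .       run-level 3  Mar 10 09:15     3      0  S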

svc.startd phase

1. After the kernel starts the svc.startd daemon, svc.startd executes the rc scripts in the /sbin
directory based upon the run level.
rc scripts
Now, each run level has an associated script in the /sbin directory.
# ls -l /sbin/rc?
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc0
-rwxr--r--   1 root     sys         2031 Sep 20  2012 /sbin/rc1
-rwxr--r--   1 root     sys         2046 Sep 20  2012 /sbin/rc2
-rwxr--r--   1 root     sys         1969 Sep 20  2012 /sbin/rc3
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc5
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc6
-rwxr--r--   1 root     sys         4069 Sep 20  2012 /sbin/rcS

Each rc script runs the corresponding /etc/rc?.d/K* and /etc/rc?.d/S* scripts. For example for a
run level 3, below scripts will be executed by /sbin/rc3 :
/etc/rc3.d/K*
/etc/rc3.d/S*

The syntax of the start and stop run control scripts is:

S##name_of_script - start run control script
K##name_of_script - stop (kill) run control script

Note that the S and K are capitalized. Scripts starting with a lowercase s or k are ignored.
This can be used to disable a script for a particular run level (see the example below).
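
For example, to disable a start script for run level 2 without deleting it (the script name
S90samba is hypothetical):

# cd /etc/rc2.d
# mv S90samba s90samba     (lowercase first letter, so /sbin/rc2 ignores it)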
Solaris 10 boot process : x86/x64

In the last post we saw the boot process of Solaris 10 on the SPARC platform. The boot process on
x86/x64 hardware is a bit different from that on SPARC hardware, but it also involves the same
five-phase boot process.

Boot PROM phase

1. The BIOS (Basic Input Output System) ROM runs the power on self test (POST) to test the
hardware.
2. BIOS tries to boot from the device mentioned in the boot sequence. (We can change this by
pressing F12 or F2).
3. When booting from the boot disk, BIOS reads the master boot program (mboot) on the first
sector and the FDISK table.
Boot program phase

1. mboot finds the active partition in FDISK table and loads the first sector containing GRUB
stage1.
2. GRUB stage1 in-turn loads GRUB stage2.
3. GRUB stage2 locates the GRUB menu file /boot/grub/menu.lst and displays GRUB main
menu.
4. Here user can select to boot the OS from partition or disk or network.
5. GRUB commands in /boot/grub/menu.lst are executed to load a pre-constructed primary boot
archive (usually /platform/i86pc/boot_archive in Solaris 10). A sample menu.lst entry is shown
after this list.


6. GRUB loads a program called multiboot, which assembles core kernel modules from
boot_archive and starts the OS by mounting the root filesystem.
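
A typical Solaris 10 entry in /boot/grub/menu.lst looks roughly like this; the title and the
disk/slice in the root line are illustrative:

title Solaris 10
root (hd0,0,a)
kernel /platform/i86pc/multiboot
module /platform/i86pc/boot_archive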
Kernel initialization phase

1. The kernel reads the configuration file /etc/system.


2. Kernel initializes itself and loads the kernel modules. The modules usually reside in the /kernel
and /usr/kernel directories. (Platform specific drivers are in the /platform/`uname -i`/kernel and
/platform/`uname -m`/kernel directories.)
Init phase

1. Kernel starts the /etc/init daemon (with PID 1).


2. The /etc/init daemon starts the svc.startd process which is responsible for starting and stopping
the services.
3. The /etc/init daemon uses the file /etc/inittab. A sample inittab file looks like the one below:

ap::sysinit:/sbin/autopush -f /etc/iu.ap
sp::sysinit:/sbin/soconfig -f /etc/sock2path
smf::sysinit:/lib/svc/bin/svc.startd >/dev/msglog 2<>/dev/msglog </dev/console
p3:s1234:powerfail:/usr/sbin/shutdown -y -i5 -g0 >/dev/msglog 2<>/dev/msglog

The inittab file as shown above contains four fields:

id:rstate:action:process

The process field specifies the process to execute for the action keyword. For example,
/usr/sbin/shutdown -y -i5 -g0 is the process to execute for the powerfail action.
Legacy Run Levels
A run level specifies the state in which specific services and resources are available to users.
Below are the run levels available in Solaris:

0       - System running PROM monitor (ok> prompt).
s or S  - Single-user mode with critical file systems mounted. (A single user can access the OS.)
1       - Single-user administrative mode with access to all file systems. (A single user can access the OS.)
2       - Multi-user mode. Multiple users can access the system. NFS and some other network-related daemons do not run.
3       - Multi-user-server mode. Multi-user mode with NFS and all other network resources available.
4       - Not implemented.
5       - Transitional run level. The OS is shut down and the system is powered off.
6       - Transitional run level. The OS is shut down and the system is rebooted to the default run level.

svc.startd phase

1. After the kernel starts the svc.startd daemon, svc.startd executes the rc scripts in the /sbin
directory according to the run level.

rc scripts
Now, each run level has an associated script in the /sbin directory.
# ls -l /sbin/rc?
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc0
-rwxr--r--   1 root     sys         2031 Sep 20  2012 /sbin/rc1
-rwxr--r--   1 root     sys         2046 Sep 20  2012 /sbin/rc2
-rwxr--r--   1 root     sys         1969 Sep 20  2012 /sbin/rc3
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc5
-rwxr--r--   3 root     sys         1678 Sep 20  2012 /sbin/rc6
-rwxr--r--   1 root     sys         4069 Sep 20  2012 /sbin/rcS

Each rc script runs the corresponding /etc/rc?.d/K* and /etc/rc?.d/S* scripts. For example for a
run level 3, below scripts will be executed by /sbin/rc3 :
/etc/rc3.d/K*
/etc/rc3.d/S*

The syntax of the start and stop run control scripts is:

S##name_of_script - start run control script
K##name_of_script - stop (kill) run control script

Note that the S and K are capitalized. Scripts starting with a lowercase s or k are ignored.
This can be used to disable a script for a particular run level.

GRUB booting on Solaris x86


Let's talk a bit more about booting Solaris on the x86 architecture (NOT on SPARC).
Now, on Solaris 10, the default boot loader is GRUB (GRand Unified Bootloader).
GRUB loads the boot archive into the system's memory.
Okay, so what's a boot archive now?
Well, simply put, it's a bunch of critical files that are needed during boot time before the / (root)
file system is mounted. These critical files are kernel modules and configuration
files.
Sun says, "The boot archive is the interface that is used to boot the Solaris OS". Remember,
there is no boot archive on SPARC, only on the x86 architecture.
GRUB has a menu so you can select the OS instance you want to boot.
Sometimes you may need to perform the two tasks below and luckily, there are
nice commands for that. I'll talk about them more later.

Rebuild the boot archive

Install the GRUB boot loader

But let's do first things first, and here is an overview of booting.

The system is powered on.

BIOS (yes this is x86, not SPARC) initializes CPU, memory and platform
hardware

BIOS loads boot loader (GRUB) from boot device

GRUB takes control now

GRUB shows a menu with boot options (predefined in the configuration file
/boot/grub/menu.lst)

There are also options for the edit command - press "e" - or for the CLI - press "c"

Example: Boot in single user mode

  Press e when the GRUB main menu shows up

  Move to the kernel /platform/i86pc/multiboot line. It has additional
  options if you boot from ZFS.

  Press e to edit this command. The grub edit> prompt shows up.

  Type -s at the end of the line. Pressing Enter brings you back to the main menu.

  Once in the main menu, press b to boot into single user mode.

  The same can be done for a reconfiguration boot by adding -r

  If you need verbose output for better troubleshooting, use -v
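
After appending -s, the edited line should read something like this (assuming the stock
Solaris 10 multiboot entry):

kernel /platform/i86pc/multiboot -s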

GRUB boots the primary boot archive (see the menu.lst line: module
/platform/i86pc/boot_archive) and the multiboot program (see the menu.lst
line: kernel /platform/i86pc/multiboot).

The primary boot archive is a file system image containing kernel modules and
data. It is loaded into memory at this point.

The multiboot program is an executable file and takes control from GRUB.

The multiboot program reads the boot archive and assembles the core kernel modules
into memory.

GRUB functional components are:

stage-1 installed on first sector of fdisk partition.

stage-2 installed in reserved area in fdisk partition. This is the core image of
GRUB.

/boot/grub/menu.lst

The boot behavior can be modified using the eeprom command, which edits the file
/boot/solaris/bootenv.rc. See the file for more info.
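
For example (the property and value are illustrative), listing or changing a boot property
looks like this:

# eeprom                      (lists all bootenv.rc properties)
# eeprom console=ttya         (example: send the console to the first serial port)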
Update a corrupt boot archive
Well, sooner or later you will have to do this, trust me :(

Boot "Solaris failsafe"

You may get a prompt to sync an out-of-date boot archive on, say,
/dev/dsk/c0t0d0s0 - do it with Y

Mount the device with the corrupted boot archive on /a

And forcibly update the corrupted boot archive on the alternate root:
bootadm update-archive -f -R /a

umount /a

init 6

Good luck!

Tip: set up a cron job to run bootadm update-archive on a regular basis, and do it
manually after a system upgrade or patch install (see the example entry below).
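
A sketch of such a cron entry, added with crontab -e as root; the schedule is arbitrary:

# run bootadm update-archive every Sunday at 03:30
30 3 * * 0 /sbin/bootadm update-archive > /dev/null 2>&1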

The primary boot archive contains the files below (if any of them is updated, rebuild the boot
archive with "bootadm update-archive").
boot/solaris/bootenv.rc
boot/solaris.xpm
etc/dacf.conf
etc/devices
etc/driver_aliases
etc/driver_classes
etc/match
etc/name_to_sysnum
etc/path_to_inst

etc/rtc_config
etc/system
kernel
platform/i86pc/biosint
platform/i86pc/kernel

Installing GRUB
This is also something you may need to do; say you are mirroring two disks using
SVM and want to install GRUB on the second disk in case you need to boot from there.
So to install GRUB in the master boot sector, run (replace c1t0d0s0 with yours if
needed):
installgrub -fm /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t0d0s0
Actually, besides the primary boot archive, there is one more - the failsafe boot archive. It
can boot on its own, requires no maintenance, and is created during OS installation.
SPARC: Installing a Boot Block on a System Disk

The following example shows how to install the boot block on a UFS root file system.

# installboot /usr/platform/sun4u/lib/fs/ufs/bootblk /dev/rdsk/c0t0d0s0

The following example shows how to install the boot block on a ZFS root file system.

# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0

SPARC EXAMPLES
The ufs bootblock is in /usr/lib/fs/ufs/bootblk. To install
the bootblock on slice 0 of target 0 on controller 1, use:
example# /usr/sbin/installboot /usr/lib/fs/ufs/bootblk \
/dev/rdsk/c1t0d0s0

x86 EXAMPLES
The ufs bootblock is in /usr/lib/fs/ufs/pboot. To install
the bootblock on slice 2 of target 0 on controller 1, use:
example# /usr/sbin/installboot /usr/lib/fs/ufs/pboot \
/usr/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s2


installboot - install bootblocks in a disk partition

FILES
/usr/platform/platform-name/lib/fs/ufs
        Directory where ufs boot objects reside.
SPARC SYNOPSIS
installboot bootblk raw-disk-device

To install a ufs bootblock on slice 0 of target 0 on controller 1 of the platform where the command is being run,
use:

Solaris 2.x

example# /usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk \


/dev/rdsk/c1t0d0s0

Solaris 2.8
# installboot /usr/platform/sun4u/lib/fs/ufs/bootblk /dev/rdsk/cXtXdXsX

Here is the doc from sun:

"The file just loaded does not appear to be executable."

This error occurs when the default bootblock has been corrupted. To overcome
this problem do the following:

1. Get down to the ok prompt by either typing init 0, hitting <stop> a,


hitting <L1> a or unplugging the keyboard.

2. Boot to single user mode from the Solaris 2.6 OS CDROM:

OK boot cdrom -s

3. Run an fsck on your disk:


#fsck /dev/rdsk/c#t#d#s#

4. Mount to your root file system:


#mount /dev/dsk/c#t#d#s# /a

5. Make sure that the restoresymtable file does not exist. If it does,
remove it.

6. Next install the bootblock on your disk:


#installboot /usr/platform/sun4u/lib/fs/ufs/bootblk /dev/rdsk/c#t#d#s#

7. Unmount your root file system:

#umount /a

8. Run another fsck on your disk:


#fsck /dev/rdsk/c#t#d#s#

9. Reboot your machine:


#init 6

Note:
* Location of the bootblk file may differ depending on the type of hardware
platform. Example: /usr/platform/sun4u vs /usr/platform/sun4m
* Not sure you really need to mount the partition? I guess it is to verify you have
the right one.

----------------- Other CPU type examples ------------------

x86 EXAMPLES


To install the ufs bootblock and partition boot program on
slice 2 of target 0 on controller 1 of the platform where
the command is being run, use:

example# installboot /usr/platform/`uname -i`/lib/fs/ufs/pboot \

/usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s2

PowerPC Edition EXAMPLES


To install the ufs bootblock and openfirmware program on
target 0 on controller 1 of the platform where the command
is being run, use:

example# installboot -f /platform/`uname -i`/openfirmware.x41 \


/usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s2


SunOs-------SunOs---------SunOs-------------SunOs

OPTIONS
-h   Leave the a.out header on the bootblock when installed on disk.

-l   Print out the list of block numbers of the boot program.

-t   Test. Display various internal test messages.

-v   Verbose. Display detailed information about the size of the boot
     program, etc.

EXAMPLE
To install the bootblocks onto the root partition on a Xylogics disk:
example% cd /usr/kvm/mdec
example% installboot -vlt /boot bootxy /dev/rxy0a

For an SD disk, you would use bootsd and /dev/rsd0a, respectively, in place of bootxy and /dev/rxy0a.

example: /usr/kvm/mdec/installboot /boot bootsd /dev/rsd3a


or
% cd /usr/kvm/mdec
./installboot /boot bootsd /dev/rsd3a

NOTE: The "/boot" is the boot file that resides in the root directory.

# redo boot block


echo 'doing boot block install'
cd /usr/kvm/mdec

./installboot /kahuna/boot bootsd /dev/rsd2a

NOTE: inside the /usr/kvm/mdec dir there must be bootsd (for SCSI devices)
and bootfd (for floppies); if these aren't there, installboot
isn't going to work.

If all goes well you should see something like this:


Secondary boot: /mnt/boot
Boot device: /dev/rsd0a
Block locations:
startblk  size
     720    10
     730    10
     740    10
     750    10
     760    10
     770    10
     780    10
     790    10
     7a0    10
     7b0    10
     7c0    10
     7d0    10
     7f0    10
     800    10

Bootblock will contain a.out header

Boot size: 0x1af10
Boot checksum: 0x3a5a8103
Boot block installed

The four lines at the bottom are the real tellers ! ! ! ! ! ! !

example detail:

/usr/kvm/mdec/installboot -vlt /mnt2/boot bootsd /dev/rsd1a

  /mnt2/boot  - the boot file (where sd1a is mounted and boot resides)
  bootsd      - the SCSI boot device file
  /dev/rsd1a  - the raw device (mounted on /mnt2)

Options:
-l   Print out the list of block numbers of the boot program.

-t   Test. Display various internal test messages.

-v   Verbose. Display detailed information about the size of the boot
     program, etc.

---- More Sunos ------- More Sunos ------- More Sunos ------- More Sunos ---

So you have some older Sunos 4.X.x dump images you want to put on
another machine.

# mount
/dev/sd0a on / type 4.2 (rw)
/dev/sd0g on /usr type 4.2 (rw)
diastolic:/systems/cs712a_dumpimages on /mnt type nfs (rw) <--- image
/dev/sd1a on /mnt2 type 4.2 (rw,noquota) <--- new disk

# cat /mnt/cs712a.sd0a.01-04-02.Z | uncompress | restore xf -

How to recover/reset root password in Sun solaris (SPARC)



There is always a chance that one loses, or rather forgets, the root password of a Sun Solaris
server. In the event this happens, there is a way out of it. Well, the way, and in fact the only way,
is to reset the password, as there is no way to recover it. Resetting the password
involves booting the server in single user mode and mounting the root file system.

Of course, it is recommended that physical access to the server is restricted so
as to ensure that there is no unauthorized access and that anyone who follows this routine is
authorized personnel.
Boot the server with a Sun Solaris Operating System CD (I'm using a Solaris 10 CD but it doesn't
really matter) or a network boot with a JumpStart server from the OBP OK prompt.
OK boot cdrom -s
or
OK boot net -s
This will boot the server from the CD or Jumpstart server and launch a single user mode (No
Password).
Mount the root file system (assume /dev/dsk/c0t0d0s0 here) onto /a
solaris# mount /dev/dsk/c0t0d0s0 /a
NOTE: /a is a temporary mount point that is available when you boot from CD or a JumpStart
server
Now, with the root file system mounted on /a, all you need to do is edit the shadow file and
remove the encrypted password for root (an illustrative before/after entry is shown below).
solaris# vi /a/etc/shadow
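
For illustration only (the hash and date fields below are made up), the root entry in
/a/etc/shadow changes from something like the first line to the second:

root:$1$x7EbK0q2$abcdefghijklmnopqrstu.:6445::::::
root::6445::::::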
Now, exit the mounted file system, unmount the root file system, and reboot the system to single-user mode, booting off the disk.
solaris# cd /
solaris# umount /a
solaris# init s
This should boot off the disk and take you to single-user mode. Press Enter at the prompt when
asked for the root password (it is now empty).
This should allow you to login to the system. Once in, set the password and change to multi-user
mode.

NOTE: Single-user mode is used only to ensure that the root user, now without a password, is not
exposed to others, as it would be if the system were started in multi-user mode before a new password is set.
solaris# passwd root
solaris# reboot
That should do it. You should now be able to log on with the new password set for root.

Solaris Cluster 3.X: Cluster node panicked with rgmd, rpc.fed, pmfd daemon
died some 30 or 35 seconds ago message. Resolution Path (Doc ID 1020514.1)


Applies to:
Solaris Cluster - Version 3.0 to 3.3 [Release 3.0 to 3.3]
All Platforms

Symptoms
This document provides the basic steps to resolve the following failfast panics.
Failfast: Aborting zone "global" (zone ID 0) because "pmfd" died 35 seconds ago.
Failfast: Aborting because "rgmd" died 30 seconds ago.
Failfast: Aborting because "rpc.fed" died 30 seconds ago.
Failfast: Aborting because "rpc.pmfd" died 30 seconds ago.
Failfast: Aborting because "clexecd" died 30 seconds ago.
Failfast: Aborting because "globalrgmd" died 30 seconds ago.

Cause
Solaris Cluster node panics due to a cluster daemon exiting.

Solution
Why does it happen?
The panic message indicates that the cluster-specific daemon shown in the message has died. The
panic is a recovery action taken by the failfast mechanism of the cluster when it detects a critical
problem. Because those processes are critical and cannot be restarted, the cluster shuts down the
node using the failfast driver. Critical daemons are registered with the failfast (ff) driver with a
certain time interval. If a daemon does not report back to the failfast driver within the registered
time interval (e.g., 30 seconds), the driver triggers a Solaris kernel panic.

Troubleshooting steps

To find the root cause of the problem, you need to find out why the cluster-specific daemon shown
in the messages died. The following steps describe how to identify the root cause.
1. Check the /var/adm/messages system log file for system or operating system error messages
indicating that memory resources may have been limited, such as in the following
example. If those messages appear before the panic messages, the root cause is likely
memory exhaustion, since a process may dump core and die when a system lacks memory
resources. If you find messages indicating a lack of memory resources, you will need to
find out why the system was low on memory and/or swap and address it to avoid this
panic. If a cluster daemon cannot fork a new process or cannot allocate memory (malloc()),
it will likely exit and trigger a panic.

Apr 2 18:05:13 sun-server1 cl_runtime: [ID 661778 kern.warning]


WARNING: clcomm: memory low: freemem 0xfb
Another indication is to check for messages reporting that swap space
was limited, such as in the following example.
Apr 2 18:05:12 sun-server1 genunix: [ID 470503 kern.warning] WARNING:
Sorry, no swap space to grow stack for pid 25825 (in.ftpd)
Apr 2 18:05:03 sun-server1 tmpfs: [ID 518458 kern.warning] WARNING:
/tmp: File system full, swap space limit exceeded
Example when a daemon or process cannot fork processes:

Apr 2 18:05:10 sun-server1 Cluster.PMF.pmfd: [ID 837760 daemon.error]


monitored processes forked failed(errno=12)

For additional information, from a kernel core file, you can see messages given before the
panic using the mdb. Check out those messages as well as the /var/adm/messages file.

# cd /var/crash/`uname -n`
# echo "::msgbuf -v" | mdb -k unix.0 vmcore.0

2. Some bugs that cause cluster daemons to exit (and thus this panic) were fixed in the
Solaris Cluster Core or Core/Sys Admin patches, so check whether your system still has an old
patch installed. Check the README of the patch installed on your machine to see whether any
relevant bugs were fixed between the installed patch and the latest one.
Solaris Cluster 3.x update releases and matching / including Solaris Cluster 3.x core
patches (Doc ID 1368494.1)
The following is a list of some bugs that could be causing this panic and their patches.
Note that this is not a comprehensive list; always check MOS for the most current bugs:

Bug 15529411: SUNBT6784007-OSC RUNNING SCSTAT(1M) CAUSES MEMORY


TO BE LEAKED IN RGMD
Bug 15507769: SUNBT6747452-3.2U2_FCS RU FAILED- FAILFAST: ABORTING
BECAUSE "GLOBALRGMD" DIED 3
Bug 15384429: SUNBT6535144 FAILFAST: ABORTING ZONE "GLOBAL" (ZONE
ID 0)BECAUSE "RGMD" DIED 30
Bug 15507579: SUNBT6739317-OSC DURING HA_FRMWK_FI, FAILFAST:
ABORTING ZONE "GLOBAL" (ZONE ID 0
Bug 15207664: SUNBT5035341-3.2_FCS FAILFAST: ABORTING BECAUSE
"CLEXECD" DIED 30 SECONDS AGO
Bug 15108998: SUNBT4690244 FAILFAST: ABORTING BECAUSE "RGMD" DIED
30 SECONDS AGO
Bug 15335430: SUNBT6438132 RGMD DUMPED CORE WHILE RESOURCES
WERE BEING DISABLED
Bug 15345802: SUNBT6460419-3.2_FCS SYNTAX ERROR IN SCSWITCH KILLS
RGMD
Bug 15282517: SUNBT6312828-3.2_FCS CLUSTER PANICS WITH 'RGMD DIED'
PANIC WHEN LD_PRELOAD IS SE
Bug 15263959: SUNBT6192133-3.1U4 RGMD CORE DUMPED DURING
FUNCTIONAL TESTS ON SC32/SC31U4 CLUST
Bug 15273568: SUNBT6290248-3.1U4 RGMD DUMPED CORE WHILE RS STOP
FAILED FLAG WAS BEING CLEARED
Bug 15335430: SUNBT6438132 RGMD DUMPED CORE WHILE RESOURCES
WERE BEING DISABLED
Bug 15345802: SUNBT6460419-3.2_FCS SYNTAX ERROR IN SCSWITCH KILLS
RGMD
Bug 15273568: SUNBT6290248-3.1U4 RGMD DUMPED CORE WHILE RS STOP
FAILED FLAG WAS BEING CLEARED
Bug 15126659: SUNBT4756973 RGMD USES IDL OBJECT AFTER FAILED IDL
CALL IN SCHA CONTROL GIVEOVER

1. The failfast panic will generate a kernel core file; however, in general, it does
not help you find the reason why a process died. In most cases, when this
panic happens, a process dies due to an application core dump, and this
application core file will help you find the root cause. To collect an application
core file, use the coreadm command to get core files that are uniquely named and
stored in a consistent place. Run the following commands on each cluster
node.
mkdir -p /var/cores
coreadm -g /var/cores/%f.%n.%p.%t.core \
-e global \
-e global-setid \
-e log \
-d process \
-d proc-setid

How to Use Pkgapp to Gather Libraries to Debug Core/Gcore/Crash Files in


Solaris and Linux (Doc ID 1274584.1)
How to use coreadm to name core files in Solaris (Doc ID 1001674.1) ("-e
process" option enables per process coredump for coreadm)

If you made modifications to coreadm, test to make sure that your settings are
working and that you can collect a core:
# ps -ef|grep rgmd
root 1829 1 0 Dec 24 ? 0:00 /usr/cluster/lib/sc/rgmd
root 1833 744 0 Dec 24 ? 3195:27 /usr/cluster/lib/sc/rgmd -z global
# gcore -g 1829
gcore: /var/cores/rgmd.clnode1.1829.1393620353.core dumped
This leaves rgmd running and only collects its core.
# ps -ef|grep rgmd
root 1829 1 0 Dec 24 ? 0:00 /usr/cluster/lib/sc/rgmd

root 1833 744 0 Dec 24 ? 3195:30 /usr/cluster/lib/sc/rgmd -z global


# ls -l /var/cores/rgmd.clnode1.1829.1393620353.core
-rw------- 1 root root 22978997 Feb 28 20:45
/var/cores/rgmd.emclu2n1.1829.1393620353.core
# file /var/cores/rgmd.clnode1.1829.1393620353.core
/var/cores/rgmd.clnode1.1829.1393620353.core: ELF 32-bit MSB core file
SPARC Version 1, from 'rgmd'
3. If you have an application core file of a cluster-specific daemon, you may want to
analyze it.
To analyze the core file, you can start with the document Solaris Application Core Analysis
Product Page (Doc ID 1384397.1). For quick analysis, use the pstack command. It
gives you the stack trace of the core file, and this can be used to search for existing bugs in
SunSolve[SM]. It is also a good idea to give your Sun[TM] Services representative the
output of the pstack command for further analysis.
Example:
# /usr/bin/pstack /var/cores/rgmd.halab1.7699.1242026038.core

How to Enable Method and/or System Coredumps for Cluster Methods That Timeout
(Doc ID 1019001.1)
