Type: OS Sheet 3-Appendix 3 Study topic: Taking Advantage of Linux Capabilities. Duration: 20 min. Author: Michael Bacarella Summary: Linux Capabilities
A common topic of discussion nowadays is security, and for good reason. Security is becoming more
important as the world becomes further networked. Like all good systems, Linux is evolving in order
to address increasingly important security concerns.
One aspect of security is user privileges. UNIX-style user privileges come in two varieties, user and
root. Regular users have very little power; they cannot modify any processes or files but their own.
Access to hardware and to most privileged network operations is also denied. Root, on the other hand,
can do anything from modifying all processes and files to having unrestricted network and hardware
access. In some cases root can even physically damage hardware.
Sometimes a middle ground is desired. A utility needs special privileges to perform its function, but
unquestionable god-like root access is overkill. The ping utility is setuid root simply so it can send and
receive ICMP messages. The danger lies in the fact that ping can be exploited before it has dropped its
root privileges, giving the attacker root access to your server.
Fortunately, such a middle ground now exists, and it's called POSIX capabilities. Capabilities divide
system access into logical groups that may be individually granted to, or removed from, different
processes. Capabilities allow system administrators to fine-tune what a process is allowed to do, which
may help them significantly reduce security risks to their system. The best part is that your system
already supports it. If you're lucky, no patching should be necessary.
A list of all the capabilities that your system is, well, capable of, is available in
/usr/include/linux/capability.h, starting with CAP_CHOWN. They're pretty self-explanatory and well
commented. Capability checks are sprinkled throughout the kernel source, and grepping for them can
make for some fun midnight reading.
Each capability is nothing more than a bit in a bitmap. With 32 bits in a capability set, and 28
capabilities currently defined, there are ongoing discussions as to how to expand this number. Some purists
believe that additional capabilities would be too confusing, while others argue that there should be
many more, even a capability for each system call. Time and Linus will ultimately decide how this
exciting feature develops.
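To make the bitmap idea concrete, here is a minimal Python sketch. The bit numbers are the ones assigned in include/linux/capability.h (CAP_CHOWN is bit 0), and only a handful of capabilities are listed. The helper names cap_to_mask and decode_caps are made up for this illustration and are not part of any library.

```python
# Bit numbers as assigned in include/linux/capability.h (subset only).
CAP_BITS = {
    "CAP_CHOWN": 0,
    "CAP_KILL": 5,
    "CAP_SETPCAP": 8,
    "CAP_NET_RAW": 13,
    "CAP_SYS_MODULE": 16,
    "CAP_SYS_BOOT": 22,
}


def cap_to_mask(name):
    """Return the single-bit mask for a capability (hypothetical helper)."""
    return 1 << CAP_BITS[name]


def decode_caps(mask):
    """List the known capability names whose bit is set in mask."""
    return sorted(n for n, bit in CAP_BITS.items() if mask & (1 << bit))


if __name__ == "__main__":
    # A set holding CAP_CHOWN and CAP_NET_RAW is just two bits OR-ed together.
    mask = cap_to_mask("CAP_CHOWN") | cap_to_mask("CAP_NET_RAW")
    print(hex(mask), decode_caps(mask))
```

A whole capability set is therefore a single integer, which is why a 32-bit word caps the number of capabilities at 32.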
As of kernel 2.4.17, the file /proc/sys/kernel/cap-bound contains a single 32-bit integer that defines the
current global capability set. The global capability set determines what every process on the system is
allowed to do. If a capability is stripped from the system, it is impossible for any process, even a root
process, to regain it.
For example, many crackers' rootkits (a set of tools that cover up their activities and install backdoors
into the system) will load kernel modules that hide illicit processes and files from the system
administrator. To counter this, the administrator could simply remove the CAP_SYS_MODULE
capability from the system as the last step in the system startup process. This step would prevent any
kernel modules from being loaded or unloaded. Once a capability has been removed, it cannot be
re-added. The system must be restarted (which means you might have to use the power button if you've
removed the CAP_SYS_BOOT capability) to regain the full capability set.
The one exception is init, which can re-add capabilities in theory, although to my knowledge no
implementation actually does so. The exception is there to facilitate capability-aware systems in the
event that init needs to change runlevels.
Editing cap-bound by hand is tedious. Fortunately, there's a utility called lcap that
provides a friendlier interface to cap-bound. Here's how one would remove CAP_CHOWN:
lcap CAP_CHOWN
Once done, it becomes impossible to change a file's owner:
chown nobody test.txt
chown: changing ownership of `test.txt': Operation not permitted
Here's how you would remove all capabilities except CAP_SYS_BOOT, CAP_KILL and
CAP_SYS_NICE:
lcap -z CAP_SYS_BOOT CAP_KILL CAP_SYS_NICE
One thing to note: modifying cap-bound restricts the capabilities of future processes only. Okay, not
exactly future processes but any process that calls exec(2) (see the function compute_creds in the
kernel source file fs/exec.c). Currently running processes keep the capabilities with which they started.
Modifying the capabilities of an existing process leads us into the next section, and here's the catch I
spoke about above. Running lcap with no arguments lists what your system is capable of. If you see
that CAP_SETPCAP is disabled, you need to make a change to your kernel. It's simple enough to
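describe here. In the kernel source tree, edit include/linux/capability.h. You're changing the lines:
#define CAP_INIT_EFF_SET to_cap_t(~0 & ~CAP_TO_MASK(CAP_SETPCAP))
#define CAP_INIT_INH_SET to_cap_t(0)
so that CAP_SETPCAP is no longer masked out (the ~CAP_TO_MASK(CAP_SETPCAP) term is what strips it
from the initial effective set), and then rebuilding the kernel.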
There's actually a reason that CAP_SETPCAP is disabled by default: it's deemed a security risk to
leave it enabled on a production system (a patch exists for this condition but has yet to be applied as of
this writing). To be on the safe side, make sure to remove this capability when you're done playing.
As of this writing, the syscalls capset and capget manipulate capabilities for a process. There are no
guarantees that this interface won't change. Portable applications are encouraged to use libcap
(www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.4) instead.
Both calls take a header argument, which carries the target pid, and a data argument. If pid is -1, you
modify the capabilities of all currently running processes. If it is less than -1, you modify the
process group whose ID is pid times -1. The semantics are similar to those of kill(2).
The DATA argument allows you to choose which capability sets you plan to modify. There are three:
typedef struct __user_cap_data_struct {
    __u32 effective;
    __u32 permitted;
    __u32 inheritable;
} *cap_user_data_t;
The permitted set contains all of the capabilities that a process is ultimately capable of realizing.
The effective set is the capabilities a process has elected to utilize from its permitted set. It's as if you
had a huge arsenal of poetry (permitted set) but chose only to arm yourself with Allen Ginsberg for the
task at hand (effective set).
The inheritable set defines which capabilities may be passed on to any programs that replace the
current process image via exec(2). Please note that fork(2) does nothing special with capabilities. The
child simply receives an exact copy of all three capability sets.
Only capabilities in the permitted set can be added to the effective or inheritable set. Capabilities
cannot be added to the permitted set of a process unless CAP_SETPCAP is set.
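The two rules above can be written down as a small check. This is only a sketch of the rules exactly as stated in this article, not the kernel's actual validation code. The function name capset_allowed and the dictionary layout are invented for the example, and each set is modelled as a plain integer bitmask.

```python
def capset_allowed(old, new, has_setpcap=False):
    """Check a requested change of capability sets against the article's rules.

    old and new are dicts of integer bitmasks keyed by 'permitted',
    'effective' and 'inheritable' (hypothetical layout for this sketch).
    """
    permitted = old["permitted"]
    # Bits being added are those present in the new set but not the old one.
    added_eff = new["effective"] & ~old["effective"]
    added_inh = new["inheritable"] & ~old["inheritable"]
    added_prm = new["permitted"] & ~old["permitted"]

    if added_eff & ~permitted:
        return False  # effective may only gain bits that are permitted
    if added_inh & ~permitted:
        return False  # inheritable may only gain bits that are permitted
    if added_prm and not has_setpcap:
        return False  # growing the permitted set needs CAP_SETPCAP
    return True


if __name__ == "__main__":
    CAP_KILL, CAP_NET_RAW = 1 << 5, 1 << 13
    old = {"permitted": CAP_KILL, "effective": 0, "inheritable": 0}
    raise_kill = {"permitted": CAP_KILL, "effective": CAP_KILL, "inheritable": 0}
    raise_raw = {"permitted": CAP_KILL, "effective": CAP_NET_RAW, "inheritable": 0}
    print(capset_allowed(old, raise_kill), capset_allowed(old, raise_raw))
```

Dropping bits is always accepted by this model; only additions are checked.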
Once capabilities attached to files are fully supported, permitting the ping utility to open raw
sockets could be as simple as a command along these lines:
chattr +CAP_NET_RAW /bin/ping
Unfortunately, more pressing kernel issues have delayed work in this area.
If you're so inclined, you can use libcap to hack your favorite services so that they are capability-aware
and drop the privileges they no longer need at startup. Several patches exist for xntpd that do just this;
some even provide their modified version as an RPM. Try a Google search if you're interested in a
capability-aware version of some root-level process you find yourself often shaking a fist at.
setpcaps can be used to modify the capability set of an existing process. For example, if the PID of a
regular user's shell is 4235, here's how you can give that user's shell the ability to send signals to any
process:
setpcaps 'cap_kill=ep' 4235
An example use of this would be to allow a friend who is using your machine to debug a CGI script to
kill any Apache processes that get stuck in infinite loops. You'd run it against their login shell once
and forget about them.
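The string 'cap_kill=ep' packs two things: a capability name and the sets it should be raised in (e for effective, p for permitted, i for inheritable). The Python sketch below pulls such a string apart. It handles only this simple name=flags form, which is an assumption made for illustration; libcap's real text format accepts more than this, and parse_cap_clause is a made-up helper, not a libcap function.

```python
SET_NAMES = {"e": "effective", "i": "inheritable", "p": "permitted"}


def parse_cap_clause(clause):
    """Split a simple 'name[,name]=flags' clause into names and target sets.

    Simplified sketch: only the '=' operator and the e/i/p flags are handled.
    """
    names, sep, flags = clause.partition("=")
    if not sep or not names:
        raise ValueError("expected something like 'cap_kill=ep'")
    unknown = set(flags) - set(SET_NAMES)
    if unknown:
        raise ValueError("unknown set flag(s): " + "".join(sorted(unknown)))
    caps = [name.strip().upper() for name in names.split(",")]
    sets = sorted(SET_NAMES[flag] for flag in set(flags))
    return caps, sets


if __name__ == "__main__":
    print(parse_cap_clause("cap_kill=ep"))
```

Read this way, the command above asks for CAP_KILL to be placed in both the permitted and the effective set of process 4235.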
Here's an example that uses execcap and sucap to run ping as the user “nobody”, with only the
CAP_NET_RAW capability. Our target of choice for ping is www.yahoo.com:
execcap 'cap_net_raw=ep' /sbin/sucap nobody nobody /bin/ping www.yahoo.com
This sample isn't terribly useful because you need to be root to execute it, but it does illustrate what is
possible. Despite some of these shortcomings, system administrators still can take measures to
increase the security of their system. A system without CAP_SYS_BOOT, CAP_SYS_RAWIO and
CAP_SYS_MODULE is extremely difficult for an intruder to modify. They cannot hack kernel
memory, install new modules or restart the system so that it runs a backdoored kernel.
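For the curious, here is roughly what that hardening does to the 32-bit value behind cap-bound. This is plain arithmetic in Python and writes nothing to /proc. It assumes the starting point is "everything except CAP_SETPCAP", as in the CAP_INIT_EFF_SET definition, and the bit numbers are those from include/linux/capability.h. The drop helper is invented for the example.

```python
# Bit numbers from include/linux/capability.h.
CAP_SETPCAP = 8
CAP_SYS_MODULE = 16
CAP_SYS_RAWIO = 17
CAP_SYS_BOOT = 22

MASK32 = 0xFFFFFFFF  # cap-bound is a single 32-bit integer


def drop(bounding, *bits):
    """Return bounding with the given capability bits cleared (sketch only)."""
    for bit in bits:
        bounding &= ~(1 << bit)
    return bounding & MASK32


# Assumed default: all capabilities except CAP_SETPCAP.
default_bound = drop(MASK32, CAP_SETPCAP)
hardened = drop(default_bound, CAP_SYS_BOOT, CAP_SYS_RAWIO, CAP_SYS_MODULE)

if __name__ == "__main__":
    print(hex(default_bound), hex(hardened))
```

Because bits can only be cleared, never set again, running the same computation twice gives the same value.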
If your system logs are append-only and your core system utilities immutable (see chattr(1) for
details), removing the CAP_LINUX_IMMUTABLE capability will make it virtually impossible for
intruders to erase their tracks or install compromised utilities. Traffic sniffers like tcpdump become
unusable once CAP_NET_RAW is removed. Remove CAP_SYS_PTRACE and you've turned off
program debugging. Such a hostile environment is a script kiddie's worst nightmare: the intruder has no
choice but to disconnect and wait for the intrusion to be discovered.
Conclusion
Capabilities can provide sophisticated, fine-grained access control over all aspects of a Linux system.
At last, security paranoids will have some tools they so desperately need in their endless fight against
“them”.
Resources
Michael Bacarella is president of Netgraft Corporation, a firm specializing in
web system development and information security analysis. He shares an apartment in New York with
his wonderful fiancée and a most fearsome green iguana (the iguana's name is Kang).
CAPABILITIES(7) Linux Programmer's Manual CAPABILITIES(7)
NAME
DESCRIPTION
Capabilities list
The following list shows the capabilities implemented on Linux, and
the operations or behaviors that each capability permits:
CAP_CHOWN
Make arbitrary changes to file UIDs and GIDs (see chown(2)).
CAP_DAC_OVERRIDE
Bypass file read, write, and execute permission checks. (DAC
is an abbreviation of "discretionary access control".)
CAP_DAC_READ_SEARCH
* Bypass file read permission checks and directory read and
execute permission checks;
* Invoke open_by_handle_at(2).
CAP_FOWNER
* Bypass permission checks on operations that normally require
the filesystem UID of the process to match the UID of the
file (e.g., chmod(2), utime(2)), excluding those operations
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH;
* set extended file attributes (see chattr(1)) on arbitrary
files;
* set Access Control Lists (ACLs) on arbitrary files;
* ignore directory sticky bit on file deletion;
* specify O_NOATIME for arbitrary files in open(2) and
fcntl(2).
CAP_FSETID
Don't clear set-user-ID and set-group-ID permission bits when
a file is modified; set the set-group-ID bit for a file whose
GID does not match the filesystem or any of the supplementary
GIDs of the calling process.
CAP_IPC_LOCK
Lock memory (mlock(2), mlockall(2), mmap(2), shmctl(2)).
CAP_IPC_OWNER
Bypass permission checks for operations on System V IPC
objects.
CAP_KILL
Bypass permission checks for sending signals (see kill(2)).
This includes use of the ioctl(2) KDSIGACCEPT operation.
CAP_LINUX_IMMUTABLE
Set the FS_APPEND_FL and FS_IMMUTABLE_FL inode flags (see
chattr(1)).
CAP_NET_ADMIN
Perform various network-related operations:
* interface configuration;
* administration of IP firewall, masquerading, and accounting;
* modify routing tables;
* bind to any address for transparent proxying;
* set type-of-service (TOS);
* clear driver statistics;
* set promiscuous mode;
* enable multicasting;
* use setsockopt(2) to set the following socket options:
SO_DEBUG, SO_MARK, SO_PRIORITY (for a priority outside the
range 0 to 6), SO_RCVBUFFORCE, and SO_SNDBUFFORCE.
CAP_NET_BIND_SERVICE
Bind a socket to Internet domain privileged ports (port
numbers less than 1024).
CAP_NET_BROADCAST
(Unused) Make socket broadcasts, and listen to multicasts.
CAP_NET_RAW
* use RAW and PACKET sockets;
* bind to any address for transparent proxying.
CAP_SETGID
Make arbitrary manipulations of process GIDs and supplementary
GID list; forge GID when passing socket credentials via UNIX
domain sockets; write a group ID mapping in a user namespace
(see user_namespaces(7)).
CAP_SETPCAP
If file capabilities are not supported: grant or remove any
capability in the caller's permitted capability set to or from
any other process. (This property of CAP_SETPCAP is not
available when the kernel is configured to support file
capabilities, since CAP_SETPCAP has entirely different
semantics for such kernels.)
CAP_SYS_ADMIN
* Perform a range of system administration operations
including: quotactl(2), mount(2), umount(2), swapon(2),
setdomainname(2);
* perform privileged syslog(2) operations (since Linux 2.6.37,
CAP_SYSLOG should be used to permit such operations);
* perform VM86_REQUEST_IRQ vm86(2) command;
* perform IPC_SET and IPC_RMID operations on arbitrary System
V IPC objects;
* override RLIMIT_NPROC resource limit;
* perform operations on trusted and security Extended
Attributes (see attr(5));
* use lookup_dcookie(2);
* use ioprio_set(2) to assign IOPRIO_CLASS_RT and (before
Linux 2.6.25) IOPRIO_CLASS_IDLE I/O scheduling classes;
* forge PID when passing socket credentials via UNIX domain
sockets;
* exceed /proc/sys/fs/file-max, the system-wide limit on the
number of open files, in system calls that open files (e.g.,
accept(2), execve(2), open(2), pipe(2));
* employ CLONE_* flags that create new namespaces with
clone(2) and unshare(2) (but, since Linux 3.8, creating user
namespaces does not require any capability);
* call perf_event_open(2);
* access privileged perf event information;
* call setns(2) (requires CAP_SYS_ADMIN in the target
namespace);
* call fanotify_init(2);
* perform KEYCTL_CHOWN and KEYCTL_SETPERM keyctl(2)
operations;
* perform madvise(2) MADV_HWPOISON operation;
* employ the TIOCSTI ioctl(2) to insert characters into the
input queue of a terminal other than the caller's
controlling terminal;
* employ the obsolete nfsservctl(2) system call;
* employ the obsolete bdflush(2) system call;
* perform various privileged block-device ioctl(2) operations;
* perform various privileged filesystem ioctl(2) operations;
* perform administrative operations on many device drivers.
CAP_SYS_BOOT
Use reboot(2) and kexec_load(2).
CAP_SYS_CHROOT
Use chroot(2).
CAP_SYS_MODULE
Load and unload kernel modules (see init_module(2) and
delete_module(2)); in kernels before 2.6.25: drop capabilities
from the system-wide capability bounding set.
CAP_SYS_NICE
* Raise process nice value (nice(2), setpriority(2)) and
change the nice value for arbitrary processes;
* set real-time scheduling policies for calling process, and
set scheduling policies and priorities for arbitrary
processes (sched_setscheduler(2), sched_setparam(2),
sched_setattr(2));
* set CPU affinity for arbitrary processes
(sched_setaffinity(2));
* set I/O scheduling class and priority for arbitrary
processes (ioprio_set(2));
* apply migrate_pages(2) to arbitrary processes and allow
processes to be migrated to arbitrary nodes;
* apply move_pages(2) to arbitrary processes;
* use the MPOL_MF_MOVE_ALL flag with mbind(2) and
move_pages(2).
CAP_SYS_PACCT
Use acct(2).
CAP_SYS_PTRACE
* Trace arbitrary processes using ptrace(2);
* apply get_robust_list(2) to arbitrary processes;
* transfer data to or from the memory of arbitrary processes
using process_vm_readv(2) and process_vm_writev(2);
* inspect processes using kcmp(2).
CAP_SYS_RAWIO
* Perform I/O port operations (iopl(2) and ioperm(2));
* access /proc/kcore;
* employ the FIBMAP ioctl(2) operation;
* open devices for accessing x86 model-specific registers
(MSRs, see msr(4));
* update /proc/sys/vm/mmap_min_addr;
* create memory mappings at addresses below the value
specified by /proc/sys/vm/mmap_min_addr;
* map files in /proc/bus/pci;
* open /dev/mem and /dev/kmem;
* perform various SCSI device commands;
* perform certain operations on hpsa(4) and cciss(4) devices;
* perform a range of device-specific operations on other
devices.
CAP_SYS_RESOURCE
* Use reserved space on ext2 filesystems;
* make ioctl(2) calls controlling ext3 journaling;
* override disk quota limits;
* increase resource limits (see setrlimit(2));
* override RLIMIT_NPROC resource limit;
* override maximum number of consoles on console allocation;
* override maximum number of keymaps;
* allow more than 64 Hz interrupts from the real-time clock;
* raise msg_qbytes limit for a System V message queue above
the limit in /proc/sys/kernel/msgmnb (see msgop(2) and
msgctl(2));
* use the F_SETPIPE_SZ fcntl(2) command to increase the
capacity of a pipe above the limit specified by
/proc/sys/fs/pipe-max-size;
* override /proc/sys/fs/mqueue/queues_max limit when creating
POSIX message queues (see mq_overview(7));
* employ prctl(2) PR_SET_MM operation;
* set /proc/PID/oom_score_adj to a value lower than the value
last set by a process with CAP_SYS_RESOURCE.
CAP_SYS_TIME
Set system clock (settimeofday(2), stime(2), adjtimex(2)); set
real-time (hardware) clock.
CAP_SYS_TTY_CONFIG
Use vhangup(2); employ various privileged ioctl(2) operations
on virtual terminals.
1. For all privileged operations, the kernel must check whether the
thread has the required capability in its effective set.
Before kernel 2.6.24, only the first two of these requirements are
met; since kernel 2.6.24, all three requirements are met.
Permitted:
This is a limiting superset for the effective capabilities
that the thread may assume. It is also a limiting superset
for the capabilities that may be added to the inheritable set
by a thread that does not have the CAP_SETPCAP capability in
its effective set.
Inheritable:
This is a set of capabilities preserved across an execve(2).
It provides a mechanism for a process to assign capabilities
to the permitted set of the new program during an execve(2).
Effective:
This is the set of capabilities used by the kernel to perform
permission checks for the thread.
Using capset(2), a thread may manipulate its own capability sets (see
below).
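The three thread sets can be inspected without any special library. The sketch below assumes a Linux system whose /proc/PID/status file exposes them as the hexadecimal fields CapInh, CapPrm and CapEff; that file is not described in this excerpt, and the helper name parse_cap_sets is hypothetical.

```python
import os

FIELDS = {"CapInh": "inheritable", "CapPrm": "permitted", "CapEff": "effective"}


def parse_cap_sets(status_text):
    """Extract the thread capability sets from /proc/PID/status contents."""
    sets = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in FIELDS:
            # Each field is a hexadecimal bitmask, one bit per capability.
            sets[FIELDS[key]] = int(value.strip(), 16)
    return sets


if __name__ == "__main__":
    path = "/proc/self/status"
    if os.path.exists(path):
        with open(path) as fh:
            for name, mask in parse_cap_sets(fh.read()).items():
                print(f"{name:12s} {mask:#018x}")
```

For an unprivileged process the permitted and effective masks are typically zero; a root shell usually shows them fully populated.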
File capabilities
Since kernel 2.6.24, the kernel supports associating capability sets
with an executable file using setcap(8). The file capability sets
are stored in an extended attribute (see setxattr(2)) named
security.capability. Writing to this extended attribute requires the
CAP_SETFCAP capability. The file capability sets, in conjunction
with the capability sets of the thread, determine the capabilities of
a thread after an execve(2).
Effective:
This is not a set, but rather just a single bit. If this bit
is set, then during an execve(2) all of the new permitted
capabilities for the thread are also raised in the effective
set. If this bit is not set, then after an execve(2), none of
the new permitted capabilities is in the new effective set.
Note that the bounding set masks the file permitted capabilities, but
not the inheritable capabilities. If a thread maintains a capability
in its inheritable set that is not in its bounding set, then it can
still gain that capability in its permitted set by executing a file
that has the capability in its inheritable set.
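The note above can be checked against the rule the kernel applies at execve(2) time. The formula itself is not reproduced in this excerpt; the sketch below assumes the one given in capabilities(7) of the same period (before ambient capabilities existed) and models every set as an integer bitmask. The function and parameter names are invented for the example.

```python
def caps_after_execve(p_inheritable, f_permitted, f_inheritable,
                      f_effective_bit, bounding):
    """Model the thread capability sets after execve(2) (assumed rule).

    p_* are the thread's sets before the call, f_* the file's sets,
    bounding the capability bounding set.
    """
    # The bounding set masks only the file permitted set, not the
    # capabilities arriving through the inheritable sets.
    new_permitted = (p_inheritable & f_inheritable) | (f_permitted & bounding)
    new_effective = new_permitted if f_effective_bit else 0
    return {
        "permitted": new_permitted,
        "effective": new_effective,
        "inheritable": p_inheritable,  # preserved across execve(2)
    }


if __name__ == "__main__":
    CAP_NET_RAW = 1 << 13
    no_net_raw = 0xFFFFFFFF & ~CAP_NET_RAW
    # Thread and file both carry CAP_NET_RAW as inheritable, bounding set lacks it.
    print(caps_after_execve(CAP_NET_RAW, 0, CAP_NET_RAW, True, no_net_raw))
    # Only the file permitted set carries it: the bounding set filters it out.
    print(caps_after_execve(0, CAP_NET_RAW, 0, True, no_net_raw))
```

The first call ends with CAP_NET_RAW permitted despite the bounding set, which is exactly the case the note describes.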
Only the init process may set capabilities in the capability bounding
set; other than that, the superuser (more precisely: programs with
the CAP_SYS_MODULE capability) may only clear capabilities from this
set.
On a standard system the capability bounding set always masks out the
CAP_SETPCAP capability. To remove this restriction (dangerous!),
modify the definition of CAP_INIT_EFF_SET in
include/linux/capability.h and rebuild the kernel.
Removing a capability from the bounding set does not remove it from
the thread's inheritable set. However it does prevent the capability
from being added back into the thread's inheritable set in the future.
1. If one or more of the real, effective or saved set user IDs was
previously 0, and as a result of the UID changes all of these IDs
have a nonzero value, then all capabilities are cleared from the
permitted and effective capability sets.
If a thread that has a 0 value for one or more of its user IDs wants
to prevent its permitted capability set being cleared when it resets
all of its user IDs to nonzero values, it can do so using the
prctl(2) PR_SET_KEEPCAPS operation.
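As a rough model of the rule quoted above and of what PR_SET_KEEPCAPS changes, consider the following Python sketch. It covers only the permitted set and only this one rule; other rules about UID changes are not modelled. The function name permitted_after_uid_change is hypothetical.

```python
def permitted_after_uid_change(old_uids, new_uids, permitted, keep_caps=False):
    """Return the permitted set after a UID change (single-rule sketch).

    old_uids and new_uids are (real, effective, saved) tuples; permitted
    is an integer bitmask; keep_caps models the PR_SET_KEEPCAPS setting.
    """
    had_root = any(uid == 0 for uid in old_uids)
    all_nonzero = all(uid != 0 for uid in new_uids)
    if had_root and all_nonzero and not keep_caps:
        return 0  # every UID left 0 behind: the permitted set is cleared
    return permitted


if __name__ == "__main__":
    CAP_NET_RAW = 1 << 13
    root, user = (0, 0, 0), (1000, 1000, 1000)
    print(permitted_after_uid_change(root, user, CAP_NET_RAW))
    print(permitted_after_uid_change(root, user, CAP_NET_RAW, keep_caps=True))
```

A daemon that wants to drop root but keep, say, CAP_NET_RAW would therefore set the keep-capabilities flag before changing its user IDs.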
1. If the caller does not have the CAP_SETPCAP capability, the new
inheritable set must be a subset of the combination of the
existing inheritable and permitted sets.
4. The new effective set must be a subset of the new permitted set.
SECBIT_KEEP_CAPS
Setting this flag allows a thread that has one or more 0 UIDs
to retain its capabilities when it switches all of its UIDs to
a nonzero value. If this flag is not set, then such a UID
switch causes the thread to lose all capabilities. This flag
is always cleared on an execve(2). (This flag provides the
same functionality as the older prctl(2) PR_SET_KEEPCAPS
operation.)
SECBIT_NO_SETUID_FIXUP
Setting this flag stops the kernel from adjusting capability
sets when the thread's effective and filesystem UIDs are
switched between zero and nonzero values. (See the subsection
Effect of User ID Changes on Capabilities.)
SECBIT_NOROOT
If this bit is set, then the kernel does not grant
capabilities when a set-user-ID-root program is executed, or
when a process with an effective or real UID of 0 calls
execve(2). (See the subsection Capabilities and execution of
programs by root.)
The securebits flags can be modified and retrieved using the prctl(2)
PR_SET_SECUREBITS and PR_GET_SECUREBITS operations. The CAP_SETPCAP
capability is required to modify the flags.
An application can use the following call to lock itself, and all of
its descendants, into an environment where the only way of gaining
capabilities is by executing a program with associated file
capabilities:
prctl(PR_SET_SECUREBITS,
      SECBIT_KEEP_CAPS_LOCKED |
      SECBIT_NO_SETUID_FIXUP |
      SECBIT_NO_SETUID_FIXUP_LOCKED |
      SECBIT_NOROOT |
      SECBIT_NOROOT_LOCKED);
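The flags in the call above are single bits, so the argument is one small integer. The sketch below recomputes it in Python using the bit positions from linux/securebits.h, which are not listed in this excerpt: each base flag is immediately followed by its _LOCKED companion. It performs arithmetic only and does not call prctl(2); the describe helper is invented for the example.

```python
# Bit positions as defined in linux/securebits.h.
SECBIT_NOROOT = 1 << 0
SECBIT_NOROOT_LOCKED = 1 << 1
SECBIT_NO_SETUID_FIXUP = 1 << 2
SECBIT_NO_SETUID_FIXUP_LOCKED = 1 << 3
SECBIT_KEEP_CAPS = 1 << 4
SECBIT_KEEP_CAPS_LOCKED = 1 << 5

# Same combination as in the prctl(2) call above.
LOCKDOWN = (SECBIT_KEEP_CAPS_LOCKED |
            SECBIT_NO_SETUID_FIXUP |
            SECBIT_NO_SETUID_FIXUP_LOCKED |
            SECBIT_NOROOT |
            SECBIT_NOROOT_LOCKED)

FLAG_NAMES = {
    SECBIT_NOROOT: "SECBIT_NOROOT",
    SECBIT_NOROOT_LOCKED: "SECBIT_NOROOT_LOCKED",
    SECBIT_NO_SETUID_FIXUP: "SECBIT_NO_SETUID_FIXUP",
    SECBIT_NO_SETUID_FIXUP_LOCKED: "SECBIT_NO_SETUID_FIXUP_LOCKED",
    SECBIT_KEEP_CAPS: "SECBIT_KEEP_CAPS",
    SECBIT_KEEP_CAPS_LOCKED: "SECBIT_KEEP_CAPS_LOCKED",
}


def describe(mask):
    """Name the securebits flags set in mask, lowest bit first."""
    return [name for bit, name in sorted(FLAG_NAMES.items()) if mask & bit]


if __name__ == "__main__":
    print(hex(LOCKDOWN), describe(LOCKDOWN))
```

Note that SECBIT_KEEP_CAPS itself is left clear while its lock bit is set, so the flag is pinned in the off state.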
CONFORMING TO
No standards govern capabilities, but the Linux capability
implementation is based on the withdrawn POSIX.1e draft standard; see
⟨http://wt.tuxomania.net/publications/posix.1e/⟩.
NOTES
COLOPHON