[Apparmor-dev] rlimit resource limit policies

Crispin Cowan crispin at novell.com
Tue Apr 17 00:17:29 MDT 2007


Sarah Smith wrote:
> On Monday 09 April 2007 19:00, Crispin Cowan wrote:
>   
>> Naturally, Linux already has resource limiting features in the form of
>> rlimit and ulimit. Unfortunately, similar to POSIX.1e capabilities, the
>> management interface sucks, to say the least. Also similar to
>> Capabilities, we can ease this situation by providing easy-to-manage
>> resource limit policies in AppArmor profiles.
>>
>> See man 2 getrlimit and setrlimit, and man 1 ulimit. Using the system
>> calls getrlimit() and setrlimit(), a process can voluntarily set its
>> resource limits for a bunch of attributes, and the kernel will enforce
>> these limits if the process is not privileged. The proposed design is
>> that you be able to write such resource limits into a profile, e.g.
>>     
> A process can lower its hard limit, raise it if it has CAP_SYS_RESOURCE |
> CAP_SYS_ADMIN, and raise or lower its soft limit (up to the hard limit). I
> guess the policy values would be setting the hard limit:
>   
Yes, hard limits.
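
To make the soft/hard distinction concrete, here is a minimal sketch using
Python's standard resource module, which wraps getrlimit() and setrlimit().
The helper name clamp_soft_limit is made up for illustration. It only moves
the soft limit, which an unprivileged process may do freely as long as the
value stays at or below the hard limit.

```python
import resource


def clamp_soft_limit(res, new_soft):
    """Set the soft limit for `res` without exceeding the hard limit.

    Returns the (soft, hard) pair in effect afterwards. The hard limit
    is passed back unchanged, so no privilege is needed.
    """
    _, hard = resource.getrlimit(res)
    wants_unlimited = new_soft == resource.RLIM_INFINITY
    above_hard = hard != resource.RLIM_INFINITY and new_soft > hard
    if wants_unlimited or above_hard:
        new_soft = hard  # the soft limit may never exceed the hard limit
    resource.setrlimit(res, (new_soft, hard))
    return resource.getrlimit(res)
```

Calling clamp_soft_limit(resource.RLIMIT_NOFILE, 32) and then calling it
again with the original soft value restores the starting state, since only
the soft limit was touched.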

>> /usr/bin/foo {
>>     capability chown, # a classic POSIX.1e Capability limit
>>     rlimit_as 1048576, # set address space limit to 1MB
>>     rlimit_nofile 6, # can have 6 file descriptors
>>     rlimit_nproc 10, # can have 10 instance processes in this profile
>> .
>> }
>>
>> The man page has the full set of attributes. Most of these attributes
>> are naturally applied per process, e.g. rlimit_as is the size of the
>> address space, and rlimit_nofile is the number of open file descriptors
>> a process may have. Limiting these resources is performed by the kernel:
>> all AppArmor has to do is set the limits when the process is instantiated.
>>     
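
As a rough illustration of the proposed syntax, the sketch below pulls the
rlimit_* rules out of profile text and maps them onto the RLIMIT_* constants
in Python's resource module. The rule grammar is only what the example
profile shows, and parse_rlimit_rules and apply_hard_limits are hypothetical
names, not part of any AppArmor tool.

```python
import re
import resource

# Matches rules such as "rlimit_nofile 6," as written in the example profile.
RLIMIT_RULE = re.compile(r"^\s*rlimit_([a-z_]+)\s+(\d+)\s*,")


def parse_rlimit_rules(profile_text):
    """Return {"RLIMIT_X": value} for every rlimit_x rule in the text."""
    limits = {}
    for line in profile_text.splitlines():
        rule = line.split("#", 1)[0]  # drop trailing comments
        match = RLIMIT_RULE.match(rule)
        if match is None:
            continue  # not an rlimit rule (capability, path, braces, ...)
        name = "RLIMIT_" + match.group(1).upper()
        if not hasattr(resource, name):
            raise ValueError("unknown resource: " + name)
        limits[name] = int(match.group(2))
    return limits


def apply_hard_limits(limits):
    """Set soft and hard limits together; meant to run in a child before exec.

    An unprivileged process cannot raise a hard limit again once lowered.
    """
    for name, value in limits.items():
        resource.setrlimit(getattr(resource, name), (value, value))
```

In user space, apply_hard_limits could be run between fork and exec, for
example as the preexec_fn of subprocess.Popen. Note that the kernel's own
RLIMIT_NPROC counts processes per real user ID, so on its own it would not
give the per-profile semantics proposed for rlimit_nproc.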
> This looks good and seems straightforward. 
>
> Interestingly it makes AppArmor a security system and a resource control 
> system - a subtle point which perhaps we don't have to worry about.
>   
I think of it as AppArmor providing security against certain kinds of
denial-of-service attacks. That is the only justification for adding
this kind of feature to AppArmor; without it, an attacker can fork-bomb
your machine to death.


> But for example with open(2) the return value would be -EMFILE rather 
> than -EPERM, so does it get logged as a breach of policy?  Perhaps it's a 
> breach of policy if the current value is set to the hard limit at the time of 
> the denied call.
>
> This preserves the current semantics, where a process can consume resources 
> up to the soft limit, be notified of such by the system call return value or 
> signal, and then act to curtail its resource use so it doesn't hit the hard 
> limit.
>   
Since you have the first actual use case, I bet you know better than I
do how this should work.
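
For reference, the current kernel behaviour at the soft limit can be observed
from user space. The sketch below (Python, standard library only;
errno_at_fd_limit is a made-up helper) temporarily lowers the soft
RLIMIT_NOFILE, opens /dev/null until open fails, and reports the errno it
saw. It assumes a Linux-like system where /dev/null exists.

```python
import errno
import os
import resource


def errno_at_fd_limit(limit=64):
    """Hit the soft RLIMIT_NOFILE on purpose and return the resulting errno."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Only ever lower the soft limit; the hard limit is left alone.
    new_soft = limit if soft == resource.RLIM_INFINITY else min(limit, soft)
    opened = []
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        try:
            while True:
                opened.append(os.open("/dev/null", os.O_RDONLY))
        except OSError as exc:
            return exc.errno
    finally:
        # Release the descriptors, then restore the original soft limit.
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```

On Linux the value returned is errno.EMFILE rather than EPERM, which is the
distinction a logging policy would have to take into account.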

>> rlimit_nproc is special: this is the number of processes possible. In
>> the classic parlance of rlimit, this is the number of processes in the
>> current real user_ID. I propose that for AppArmor policy purposes, we
>> change it to be the number of processes that can be instantiated under
>> this profile. If the limit is exceeded, then fork or exec will fail on
>> attempts to create another process via ix permissions. However, the
>> process is still free to launch Px, px, and Ux children, subject to the
>> rlimit_nproc policy of the corresponding profiles for Px and px.
>>     
> Here the aim would be to prevent either a malicious or malfunctioning piece of 
> code running in a confined process forking so much that the kernel can't 
> schedule the children/handle the task structs or otherwise grinds to a halt.
>   
Yes.

> The above looks good for a process which forks new copies of itself, or maybe 
> creates lots of threads.
>
> Just wondering about the inheritance for such a policy.  
>   
I would think inheritance should follow the current AppArmor inheritance
model, and the number of permitted instances is determined by the
profile the proposed new child would run in. So rlimit_nproc refers to
the number of threads that can run in the current profile, which becomes
a limit on how many times you can ix into this profile itself (plus
other instances that might have been started elsewhere). The process
limit for some other profile that you would access with Px is
determined by the rlimit_nproc specification in that other profile.

> The 90% case is a fork then an exec.
>   
... and because of that, you need at least one spare instance in your
current profile even to be able to Px to some other profile.
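
The fork-then-exec point can be illustrated with a toy model of the proposed
per-profile accounting. This is plain Python bookkeeping, not kernel code:
the class and method names are invented, and it assumes that fork charges
the parent's profile while a Px exec charges the target profile and releases
the slot in the old one.

```python
class NprocLimitExceeded(Exception):
    """Raised when a profile already has rlimit_nproc instances running."""


class ProfileAccounting:
    def __init__(self, limits):
        self.limits = dict(limits)  # profile name -> rlimit_nproc
        self.running = {name: 0 for name in self.limits}

    def _admit(self, profile):
        if self.running[profile] >= self.limits[profile]:
            raise NprocLimitExceeded(profile)
        self.running[profile] += 1

    def start(self, profile):
        """A first instance is launched into the profile."""
        self._admit(profile)

    def fork(self, profile):
        """fork(): the child stays in the parent's profile."""
        self._admit(profile)

    def exec_px(self, current, target):
        """exec via Px: the process leaves `current` and enters `target`."""
        self._admit(target)
        self.running[current] -= 1
```

With a limit of 1 on the parent's profile and one instance already running,
fork() fails in this model before the Px exec is ever attempted, which is
why a spare instance is needed.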

> If the inheritance policy is that the process limit is that of the parent 
> process, all is good - a badly written server /sbin/malsrv which fork-bombs 
> itself by calling binary /bin/foo a lot via system(3) will be controlled. 
>
> But if /bin/foo is always unlimited, and /sbin/malsrv is not prevented from 
> exec'ing it, then it can DoS by forking to somewhere under its limit and in 
> each copy doing
>    char *fork_lots_arg[] = { "/bin/foo", "--fork", "10000", NULL };
>    execve( "/bin/foo", fork_lots_arg, environ );
> Or if the shell is unlimited 'system( "bomb() { bomb | bomb; }; bomb" )'
>   
It breaks down by cases:

    * if /bin/foo is permitted to execute via ix, then the limit is
      determined by /sbin/malsrv
    * if /bin/foo is permitted to execute via Px, then the limit is
      determined by the /bin/foo profile
    * if you want a different limit on /bin/foo then you create a
      hardlink from /bin/foo to /bin/bar, and then create a separate
      profile for /bin/bar that has its own rlimit_nproc specification
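
The cases above reduce to a small decision function. The sketch below only
restates that list in Python; governing_profile is a hypothetical name, and
treating Ux as "no profile limit" is an assumption based on the earlier
statement that only Px and px children are subject to another profile's
rlimit_nproc.

```python
def governing_profile(exec_mode, parent_profile, target_profile):
    """Return the profile whose rlimit_nproc applies to the new process.

    None means no profile limit applies (assumed for Ux).
    """
    if exec_mode == "ix":
        return parent_profile  # child inherits the parent's profile
    if exec_mode in ("Px", "px"):
        return target_profile  # child runs under the target's own profile
    if exec_mode == "Ux":
        return None  # unconfined child
    raise ValueError("unknown exec mode: " + exec_mode)
```

The hardlink case is just the Px branch again with /bin/bar as the target
profile.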


> Anyway, maybe that can't be reasonably prevented.  Maybe limits should be 
> applied to anything the confined process can exec.
>   
That kind of fork bombing is my dominant concern in this project.

>> Does this design sound useful?
>>
>> So why am I posting this now? Two reasons:
>>
>>    1. A member of the AppArmor community privately expressed interest in
>>       building this feature, because the SUSE AppArmor team is doing
>>       other stuff right now.
>>    2. John Johansen (part of the SUSE team) has actually implemented a
>>       partial prototype of rlimit code. He is critical path on some
>>       other fun features we are planning for 10.3, so he can't work on
>>       it in time for 10.3.
>>
>> I've asked JJ to post his prototype work in response to this thread.
>>     
> Interested to see that code.
>   
JJ has been preoccupied with the upstreaming effort; hopefully he will
be able to post his rlimit prototype code some time soon.

Crispin

-- 
Crispin Cowan, Ph.D.               http://crispincowan.com/~crispin/
Director of Software Engineering   http://novell.com



