[Apparmor-dev] rlimit resource limit policies

John Johansen jjohansen at suse.de
Fri May 4 02:37:48 MDT 2007


On Thu, May 03, 2007 at 08:46:53PM -0700, Crispin Cowan wrote:
> Thanks JJ!
> 
> Comments inline.
> 
> John Johansen wrote:
> > the current code tracks the number of processes attached to a profile
> > at the clone/fork point.  So it is tracking the number of threads/processes
> > in a profile.  This could easily be modified to count only processes.
> >   
> I am indifferent to whether it is threads or processes.
> 
> > currently you need extra space for 1 process to do a px or ux transition.  A
> > fork/clone is accounted for before the exec is done.  Once the exec happens
> > the count will be increased for the profile being transitioned to, and
> > reduced for the profile the fork was done under.
> >   
> Doesn't this argue for having an implicit "+1" on num_proc? So that a
> user can intuitively say "I want 10 Apache daemons, so I will set this
> limit to 10"?
> 
perhaps - but then that would also allow 11 processes if nothing was
exec'ing.  I don't think it really matters either way.  I was mostly trying
to describe the current state of things.
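
To make the accounting concrete, here is a minimal sketch of the behaviour
described above. It is not the prototype's code: struct profile,
account_fork() and account_exec() are hypothetical names, and only the px
case (a transition into another counted profile) is modelled.

```c
#include <stdbool.h>

/* Hypothetical per-profile task counter, for illustration only. */
struct profile {
    const char *name;
    int count;   /* tasks currently attached to this profile */
    int limit;   /* maximum tasks allowed in this profile */
};

/* fork/clone: the child is charged to the parent's profile,
 * before any exec has happened. */
bool account_fork(struct profile *p)
{
    if (p->count >= p->limit)
        return false;        /* no spare slot, so the fork is refused */
    p->count++;
    return true;
}

/* px exec: move one task from the forking profile to the target profile. */
bool account_exec(struct profile *from, struct profile *to)
{
    if (to->count >= to->limit)
        return false;        /* target profile is already full */
    to->count++;
    from->count--;
    return true;
}
```

With a limit of 1 on the parent profile the fork itself is refused, so the
parent needs one spare slot even if the child immediately execs into another
profile.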

> >>> If the inheritance policy is that the process limit is that of the parent 
> >>> process, all is good - a badly written server /sbin/malsrv which fork-bombs 
> >>> itself by calling binary /bin/foo a lot via system (3) will be controlled. 
> >>>
> >>> But if /bin/foo always is unlimited, and /sbin/malsrv is not prevented from 
> >>> exec'ing it then it can DoS by forking to somewhere under its limit and in 
> >>> each copy doing
> >>>    char *fork_lots_arg[] = { "--fork", "10000", NULL };
> >>>    execve( "/bin/foo", fork_lots_arg, environ );
> >>> Or if the shell is unlimited 'system( "bomb() { bomb | bomb }; bomb" )'  
> >>>       
> >> It breaks down by cases:
> >>
> >>     * if /bin/foo is permitted to execute via ix, then the limit is
> >>       determined by /sbin/malsrv
> >>     * if /bin/foo is permitted to execute via Px, then the limit is
> >>       determined by the /bin/foo profile
> >>     * if you want a different limit on /bin/foo then you create a
> >>       hardlink from /bin/foo to /bin/bar, and then create a separate
> >>       profile for /bin/bar that has its own rlimit_nproc specification
> >>     
> > bleah, we seriously need to fix this.  making hardlinks to do things is
> > nifty and all but there are better solutions.
> >   
> That seems orthogonal to the ulimit question. If/when we add an explicit
> feature for running a program under a different profile, the ulimit code
> should pick it up along with everything else.
> 
yeah it's orthogonal, just ranting

> >>> Anyway, maybe that can't be reasonably prevented.  Maybe limits should be 
> >>> applied to anything the confined process can exec.  
> >>>       
> >> That kind of fork bombing is my dominant concern in this project.
> >>     
> > Another possibility (which the prototype does NOT do) is to inherit
> > limits across px if current profile has set limits and the profile
> > being transitioned to does not.
> >   
> I think I'd rather not do that.
> 
I'm not terribly fond of it either, but it may have its uses.  It is
certainly worth talking about, so that we can arrive at the exact semantics
we want.

> Each profile should describe how many units of that profile get to run.
> If you don't want to grant access to that, then specify px for a
> different profile (using the hardlink hack or something like it).
> 
> This makes it relatively easy to reason about what your system will do
> under load or under attack. If you want a given entry point to be
> num_proc limited, then ensure that every px leads to a profile that is
> also num_proc limited.
> 
> If you provide for inheritance across px, then the limits on a profile
> become a complex combination of the profile and its context.
> 
yeah that is what really makes me dislike it.

> > - resolve semantics around a task setting its rlimits
> >   - should CAP_SYS_RESOURCE allow a task to override the profile's HARD_LIMIT
> >   - should CAP_SYS_RESOURCE allow a task to override the profile's SOFT_LIMIT
> >   
> I say "yes" to both of these, and conversely, the process does not get
> to mess with profile limits without this capability.
> 
the soft limit I think is an obvious yes.  In fact I think the application
should be able to adjust its soft limit as normal, even without
CAP_SYS_RESOURCE.  Giving the profile the ability to set the soft limit
is just a convenience.
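
For reference, this is what adjusting the soft limit "as normal" looks like
from the application side, using the standard getrlimit()/setrlimit() calls.
It is only a sketch: adjust_soft_nofile() is a made-up helper name, and
RLIMIT_NOFILE is picked purely as an example resource.

```c
#include <sys/resource.h>

/* Lower the soft RLIMIT_NOFILE to at most `want`, then restore the
 * previous value.  Neither step needs CAP_SYS_RESOURCE because the
 * soft limit never goes above the hard limit.
 * Returns 0 on success, -1 if any call fails. */
int adjust_soft_nofile(rlim_t want)
{
    struct rlimit old, lim;

    if (getrlimit(RLIMIT_NOFILE, &old) != 0)
        return -1;

    lim = old;
    if (want < lim.rlim_cur)
        lim.rlim_cur = want;          /* lower the soft limit only */
    if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;

    /* Raise the soft limit back; it is still <= the hard limit. */
    if (setrlimit(RLIMIT_NOFILE, &old) != 0)
        return -1;

    return 0;
}
```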

The hard limit I am less sure about, but I am thinking it shouldn't be
able to.  Why would you be setting a hard limit in the profile if the
application can override it?

> >   - should a task be able to lower its limits if limits are set by profile
> >   - should a task be able to raise its limits as long as the limits
> >     are not > the values set by the profile
> >   
> I'd say no, mostly for simplicity and because I don't think it is necessary.
> 
Actually I think it's a definite yes for soft limits, up to the hard limit
value (no extra work).

And I think yes for the hard limit if the profile has CAP_SYS_RESOURCE.
This comes about because of the differences between = and <= (see below).

> > /foo {
> >
> >   rlimit nproc = 100,		# set maximum tasks using this profile
> >   rlimit nofile <= 50,		# set maximum files to min(current->limit,50)
> >
> >   soft_rlimit rss = 500K,	# set soft RSS limit to 500K
> > }
> >   
> You never specified what the difference is between = and <=. If there is
> no difference, then I suggest deleting the symbol entirely and going
> with "rlimit nproc 100,"
> 
well actually I do in the comment to the right, but yeah I should have
explained that better.

The prototype currently supports 2 forms for setting the limit.
= sets the limit to the specified value
<= sets the limit to be less than or equal (<=) to the specified value
   specifically it takes the minimum of the task's current limit and
   the profile specified limit.
   This allows a profile to specify a maximum value without raising
   an existing limit that could be lower.
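
As a rough sketch of the two forms (not the prototype's actual code; the
enum and function names here are invented for illustration):

```c
#include <sys/resource.h>

/* Hypothetical names for the two policy operators. */
enum limit_op {
    LIMIT_SET,   /* "="  : use the profile value unconditionally */
    LIMIT_CAP    /* "<=" : min(task's current limit, profile value) */
};

/* Compute the limit a task ends up with when the profile rule is applied. */
rlim_t apply_profile_limit(enum limit_op op, rlim_t current, rlim_t value)
{
    if (op == LIMIT_CAP)
        return current < value ? current : value;
    return value;
}
```

So with "rlimit nofile <= 50", a task already limited to 20 stays at 20,
while a task at 1024 drops to 50.  With "=", both end up at 50.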