kernel:control_groups
Differences
This shows you the differences between two versions of the page.
kernel:control_groups [2020/07/22 11:47] – old revision restored (2017/04/06 10:27) 204.12.220.106 | kernel:control_groups [2020/07/22 18:05] (current) – old revision restored (2020/07/20 16:09) 207.244.157.10 | ||
---|---|---|---|
Line 78: | Line 78: | ||
Control Groups extends the kernel as follows: | Control Groups extends the kernel as follows: | ||
- | * Each task in the system has a reference-counted pointer to a css_set. | + | * Each task in the system has a reference-counted pointer to a **css_set**. |
- | * A css_set contains a set of reference-counted pointers to cgroup_subsys_state objects, one for each cgroup subsystem registered in the system. | + | * A **css_set** contains a set of reference-counted pointers to **cgroup_subsys_state** objects, one for each cgroup subsystem registered in the system. |
* A cgroup hierarchy filesystem can be mounted | * A cgroup hierarchy filesystem can be mounted | ||
Line 95: | Line 95: | ||
If an active hierarchy with exactly the same set of subsystems already exists, it will be reused for the new mount. | If an active hierarchy with exactly the same set of subsystems already exists, it will be reused for the new mount. | ||
- | matches, and any of the requested subsystems are in use in an existing hierarchy, the mount will fail with -EBUSY. Otherwise, a new hierarchy | + | matches, and any of the requested subsystems are in use in an existing hierarchy, the mount will fail with **-EBUSY**. Otherwise, a new hierarchy is activated, associated with the requested subsystems. |
- | is activated, associated with the requested subsystems. | + | |
- | It's not currently possible to bind a new subsystem to an active cgroup hierarchy, or to unbind a subsystem from an active cgroup hierarchy. This may be possible in future, but is fraught with nasty error-recovery issues. | + | It's not currently possible to bind a new subsystem to an active cgroup hierarchy, or to unbind a subsystem from an active cgroup hierarchy. |
When a cgroup filesystem is unmounted, if there are any child cgroups created below the top-level cgroup, that hierarchy will remain active even though unmounted; if there are no child cgroups then the hierarchy will be deactivated. | When a cgroup filesystem is unmounted, if there are any child cgroups created below the top-level cgroup, that hierarchy will remain active even though unmounted; if there are no child cgroups then the hierarchy will be deactivated. | ||
Line 111: | Line 110: | ||
* **cgroup.procs**: | * **cgroup.procs**: | ||
* **notify_on_release** flag: run the release agent on exit? | * **notify_on_release** flag: run the release agent on exit? | ||
- | * **release_agent**: | + | * **release_agent**: |
- | Other subsystems such as cpusets may add additional files in each cgroup dir. | + | Other subsystems such as [[Kernel: |
New cgroups are created using the mkdir system call or shell command. | New cgroups are created using the mkdir system call or shell command. | ||
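
Since a new cgroup is just a directory inside a mounted hierarchy, the step can be sketched as a small shell helper. This is only an illustration: the function name `make_cgroup` is hypothetical, and the hierarchy root is whatever directory the cgroup filesystem happens to be mounted on. On a real hierarchy the kernel, not the script, populates the new directory with its control files.

```shell
#!/bin/sh
# Hypothetical helper: create a child cgroup under an already-mounted hierarchy.
# $1 = hierarchy root (assumed to be an existing cgroup mount point or cgroup dir)
# $2 = name of the new cgroup
make_cgroup() {
    root=$1
    name=$2
    # Refuse to run against a path that is not an existing directory.
    [ -d "$root" ] || { echo "no such hierarchy root: $root" >&2; return 1; }
    # mkdir fails (non-zero status) if the cgroup already exists.
    mkdir "$root/$name"
}
```

Calling `make_cgroup <mount point> <name>` returns the exit status of `mkdir`, so a caller can tell whether the cgroup was actually created.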
Line 121: | Line 120: | ||
The attachment of each task, automatically inherited at fork by any children of that task, to a cgroup allows organizing the workload on a system into related sets of tasks. | The attachment of each task, automatically inherited at fork by any children of that task, to a cgroup allows organizing the workload on a system into related sets of tasks. | ||
- | When a task is moved from one cgroup to another, it gets a new css_set pointer - if there' | + | When a task is moved from one cgroup to another, it gets a new **css_set** pointer - if there' |
- | To allow access from a cgroup to the css_sets (and hence tasks) that comprise it, a set of cg_cgroup_link objects form a lattice; each cg_cgroup_link is linked into a list of cg_cgroup_links for a single cgroup on its cgrp_link_list field, and a list of cg_cgroup_links for a single css_set on its cg_link_list. | + | To allow access from a cgroup to the css_sets (and hence tasks) that comprise it, a set of **cg_cgroup_link** objects form a lattice; each cg_cgroup_link is linked into a list of **cg_cgroup_links** for a single cgroup on its **cgrp_link_list** field, and a list of cg_cgroup_links for a single css_set on its **cg_link_list**. |
Thus the set of tasks in a cgroup can be listed by iterating over each css_set that references the cgroup, and sub-iterating over each css_set' | Thus the set of tasks in a cgroup can be listed by iterating over each css_set that references the cgroup, and sub-iterating over each css_set' | ||
Line 132: | Line 131: | ||
===== What does notify_on_release do? ===== | ===== What does notify_on_release do? ===== | ||
- | If the notify_on_release flag is enabled (1) in a cgroup, then whenever the last task in the cgroup leaves (exits or attaches to some other cgroup) and the last child cgroup of that cgroup is removed, then the kernel runs the command specified by the contents of the " | + | If the **notify_on_release** flag is enabled (1) in a cgroup, then whenever the last task in the cgroup leaves (exits or attaches to some other cgroup) and the last child cgroup of that cgroup is removed, then the kernel runs the command specified by the contents of the **" |
===== What does clone_children do? ===== | ===== What does clone_children do? ===== | ||
- | If the clone_children flag is enabled (1) in a cgroup, then all cgroups created beneath will call the post_clone callbacks for each subsystem of the newly created cgroup. Usually when this callback is implemented for a subsystem, it copies the values of the parent subsystem; this is the case for cpuset. | + | If the **clone_children** flag is enabled (1) in a cgroup, then all cgroups created beneath will call the post_clone callbacks for each subsystem of the newly created cgroup. Usually when this callback is implemented for a subsystem, it copies the values of the parent subsystem; this is the case for cpuset. |
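
Both flags described above are switched by writing 0 or 1 to a file in the cgroup directory. The following is a minimal sketch of doing that from the shell, under two assumptions: the hierarchy is already mounted, and the clone_children flag is exposed under the file name `cgroup.clone_children` (the `notify_on_release` file is named in the file list earlier on this page). The helper name `set_cgroup_flag` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write 0 or 1 to a boolean flag file in a cgroup directory.
# $1 = cgroup directory, $2 = flag file name, $3 = 0 or 1
set_cgroup_flag() {
    dir=$1
    flag=$2
    value=$3
    # Only the two values used in the text above are accepted.
    case $value in
        0|1) ;;
        *) echo "flag value must be 0 or 1, got: $value" >&2; return 2 ;;
    esac
    # On a real cgroup filesystem the kernel creates the flag files,
    # so a missing file means the path is wrong; do not create it.
    [ -e "$dir/$flag" ] || { echo "no such flag file: $dir/$flag" >&2; return 1; }
    echo "$value" > "$dir/$flag"
}
```

For example, `set_cgroup_flag <cgroup dir> notify_on_release 1` would enable the release notification for that cgroup, and the same call with `cgroup.clone_children` would enable the clone_children behaviour, assuming that file name is correct for the running kernel.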
Line 144: | Line 143: | ||
To start a new job that is to be contained within a cgroup, using the " | To start a new job that is to be contained within a cgroup, using the " | ||
- | | + | - mkdir / |
- | 2) mount -t cgroup -ocpuset cpuset / | + | |
- | 3) Create the new cgroup by doing mkdir' | + | |
- | | + | |
- | 4) Start a task that will be the " | + | |
- | 5) Attach that task to the new cgroup by writing its pid to the | + | |
- | | + | |
- | 6) fork, exec or clone the job tasks from this founding father task. | + | |
For example, the following sequence of commands will set up a cgroup named " | For example, the following sequence of commands will set up a cgroup named " | ||
Line 182: | Line 179: | ||
</ | </ | ||
- | The " | + | The " |
To mount a cgroup hierarchy with just the cpuset and memory subsystems, type: | To mount a cgroup hierarchy with just the cpuset and memory subsystems, type: | ||
Line 215: | Line 212: | ||
Note that changing the set of subsystems is currently only supported when the hierarchy consists of a single (root) cgroup. | Note that changing the set of subsystems is currently only supported when the hierarchy consists of a single (root) cgroup. | ||
- | Then under /dev/cgroup you can find a tree that corresponds to the tree of the cgroups in the system. For instance, /dev/cgroup is the cgroup that holds the whole system. | + | Then under **/dev/cgroup** you can find a tree that corresponds to the tree of the cgroups in the system. For instance, /dev/cgroup is the cgroup that holds the whole system. |
If you want to change the value of release_agent: | If you want to change the value of release_agent: | ||
Line 242: | Line 239: | ||
<code bash> | <code bash> | ||
ls | ls | ||
- | < | + | |
cgroup.procs notify_on_release tasks | cgroup.procs notify_on_release tasks | ||
(plus whatever files added by the attached subsystems) | (plus whatever files added by the attached subsystems) | ||
Line 276: | Line 273: | ||
</ | </ | ||
- | Note that it is PID, not PIDs. You can only attach ONE task at a time. If you have several tasks to attach, you have to do it one after another: | + | <WRAP info> |
+ | **NOTE: | ||
+ | </ | ||
<code bash> | <code bash> | ||
Line 300: | Line 299: | ||
When passing a name=< | When passing a name=< | ||
- | The name of the subsystem appears as part of the hierarchy description in / | + | The name of the subsystem appears as part of the hierarchy description in **/ |
==== Notification API ==== | ==== Notification API ==== | ||
Line 317: | Line 316: | ||
To unregister a notification handler, just close the eventfd. | To unregister a notification handler, just close the eventfd. | ||
- | NOTE: Support of notifications should be implemented for the control file. See documentation for the subsystem. | + | <WRAP info> |
+ | **NOTE**: Support of notifications should be implemented for the control file. See documentation for the subsystem. | ||
+ | </ | ||
===== Kernel API ===== | ===== Kernel API ===== | ||
Line 324: | Line 324: | ||
==== Overview ==== | ==== Overview ==== | ||
- | Each kernel subsystem that wants to hook into the generic cgroup system needs to create a cgroup_subsys object. | + | Each kernel subsystem that wants to hook into the generic cgroup system needs to create a **cgroup_subsys** object. |
Other fields in the cgroup_subsys object include: | Other fields in the cgroup_subsys object include: | ||
Line 337: | Line 337: | ||
==== Synchronization ==== | ==== Synchronization ==== | ||
- | There is a global mutex, cgroup_mutex, | + | There is a global mutex, |
See kernel/ | See kernel/ | ||
- | Subsystems can take/ | + | Subsystems can take/ |
Accessing a task's cgroup pointer may be done in the following ways: | Accessing a task's cgroup pointer may be done in the following ways: |
kernel/control_groups.1595418435.txt.gz · Last modified: 2020/07/22 11:47 by 204.12.220.106