|
| 1 | +# Control Plane Behaviour |
| 2 | + |
| 3 | +This document describes the behaviour that the control plane exhibits in various situations. It provides a |
| 4 | +high-level view intended to help the reader reason about the control plane more easily. \ |
| 5 | +<br> |
| 6 | + |
| 7 | +## REST API Idempotency |
| 8 | + |
| 9 | +Idempotency is a widely used term that is often misconstrued. The following definition is taken from |
| 10 | +the [Mozilla Glossary](https://developer.mozilla.org/en-US/docs/Glossary/Idempotent): |
| 11 | + |
| 12 | +> An [HTTP](https://developer.mozilla.org/en-US/docs/Web/HTTP) method is **idempotent** if an identical request can be |
| 13 | +> made once or several times in a row with the same effect while leaving the server in the same state. In other words, |
| 14 | +> an idempotent method should not have any side-effects (except for keeping statistics). Implemented correctly, the `GET`, |
| 15 | +> `HEAD`, `PUT`, and `DELETE` methods are idempotent, but not the `POST` method. |
| 16 | +> All [safe](https://developer.mozilla.org/en-US/docs/Glossary/Safe) methods are also ***idempotent***. |
| 17 | + |
| 18 | +OK, so making multiple identical requests should produce the same result ***without side effects***. Great, so does the |
| 19 | +return value for each request have to be the same? The article goes on to say: |
| 20 | + |
| 21 | +> To be idempotent, only the actual back-end state of the server is considered, the status code returned by each request |
| 22 | +> may differ: the first call of a `DELETE` will likely return a `200`, while successive ones will likely return a `404`. |
| 23 | + |
| 24 | +The control plane will behave exactly as described above. If, for example, multiple `create volume` calls are made for |
| 25 | +the same volume, the first will return success (`HTTP 200` code) while subsequent calls will return a failure status |
| 26 | +code (`HTTP 409` code) indicating that the resource already exists. \ |
| 27 | +<br> |
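The `create volume` behaviour described above can be illustrated with a minimal sketch. The `VolumeRegistry` class, its method names, and the volume fields are hypothetical stand-ins invented for illustration and are not the control plane's actual API. The point is only that the repeated call returns a different status code while the stored state is unchanged.

```python
from http import HTTPStatus


class VolumeRegistry:
    """Hypothetical in-memory stand-in for the control plane's volume state."""

    def __init__(self):
        self._volumes = {}

    def create_volume(self, uuid, size_bytes):
        # A repeated request must not alter existing state; it only reports
        # that the resource already exists.
        if uuid in self._volumes:
            return HTTPStatus.CONFLICT  # HTTP 409
        self._volumes[uuid] = {"size_bytes": size_bytes}
        return HTTPStatus.OK  # HTTP 200

    def snapshot(self):
        # Copy of the current state, used here to compare before and after.
        return dict(self._volumes)


registry = VolumeRegistry()
first = registry.create_volume("vol-1", 1024)
state_after_first = registry.snapshot()
second = registry.create_volume("vol-1", 1024)
state_after_second = registry.snapshot()
```

Here `first` and `second` differ, but `state_after_first` and `state_after_second` are equal, which is what idempotency requires.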
| 28 | + |
| 29 | +## Handling Failures |
| 30 | + |
| 31 | +There are various ways in which the control plane could fail to satisfy a `REST` request: |
| 32 | + |
| 33 | +- Control plane dies in the middle of an operation. |
| 34 | +- Control plane fails to update the persistent store. |
| 35 | +- A gRPC request to Mayastor fails to complete successfully. \ |
| 36 | + <br> |
| 37 | + |
| 38 | +Regardless of the type of failure, the control plane has to decide what it should do: |
| 39 | + |
| 40 | +1. Fail the operation back to the caller but leave any created resources alone. |
| 41 | + |
| 42 | +2. Fail the operation back to the caller but destroy any created resources. |
| 43 | + |
| 44 | +3. Act like Kubernetes and keep retrying in the hope that it will eventually succeed. \ |
| 45 | +<br> |
| 46 | + |
| 47 | +Approach 3 is discounted. If we never responded to the caller, it would eventually time out and probably retry itself. |
| 48 | +This would likely introduce even more issues and complexity in the control plane. |
| 49 | + |
| 50 | +So the decision becomes: should we destroy resources that have already been created as part of the operation? \ |
| 51 | +<br> |
| 52 | + |
| 53 | +### Keep Created Resources |
| 54 | + |
| 55 | +Preventing the control plane from having to unwind operations is convenient as it keeps the implementation simple. A |
| 56 | +separate asynchronous process could then periodically scan for unused resources and destroy them. |
| 57 | + |
| 58 | +There is a potential issue with the approach described above. If an operation fails, it would be reasonable to assume |
| 59 | +that the user would retry it. Is it possible for this subsequent request to fail as a result of the existing unused |
| 60 | +resources lingering (i.e. because they have not yet been destroyed)? If so, this would hamper any retry logic |
| 61 | +implemented in the upper layers. |
| 62 | + |
| 63 | +### Destroy Created Resources |
| 64 | + |
| 65 | +This is the optimal approach. For any given operation, failure results in newly created resources being destroyed. The |
| 66 | +control plane is responsible for tracking which resources have been created and destroying them in the event |
| 67 | +of a failure. |
| 68 | + |
| 69 | +However, what happens if destruction of a resource fails? It is possible for the control plane to retry the operation |
| 70 | +but at some point it will have to give up. In effect the control plane will do its best, but it cannot provide any |
| 71 | +guarantee. So does this mean that these resources are permanently leaked? Not necessarily. As in |
| 72 | +the [Keep Created Resources](#keep-created-resources) section, there could be a separate process which destroys unused |
| 73 | +resources. \ |
| 74 | +<br> |
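The destroy-on-failure strategy can be sketched as follows. This is an illustration under stated assumptions, not the control plane's implementation: `run_with_rollback`, `destroy_with_retries`, `OperationFailed`, and the retry count of 3 are all hypothetical. Created resources are tracked as the operation proceeds. On failure they are destroyed in reverse order on a best-effort basis, and anything that could not be destroyed is reported so that a separate clean-up process could pick it up later.

```python
class OperationFailed(Exception):
    """Raised to the caller; carries any resources that could not be destroyed."""

    def __init__(self, cause, leaked):
        super().__init__(f"operation failed: {cause}")
        self.leaked = leaked


def destroy_with_retries(destroy, resource, attempts):
    # Best effort only: retry a bounded number of times, then give up.
    for _ in range(attempts):
        try:
            destroy(resource)
            return True
        except Exception:
            continue
    return False


def run_with_rollback(steps, destroy_attempts=3):
    """Run (create, destroy) pairs in order, unwinding on the first failure."""
    created = []
    try:
        for create, destroy in steps:
            created.append((create(), destroy))
    except Exception as err:
        leaked = []
        # Destroy in reverse creation order.
        for resource, destroy in reversed(created):
            if not destroy_with_retries(destroy, resource, destroy_attempts):
                leaked.append(resource)
        raise OperationFailed(err, leaked) from err
    return [resource for resource, _ in created]


# Example: the replica is created, then the nexus creation fails.
destroyed = []


def create_replica():
    return "replica-1"


def create_nexus():
    raise RuntimeError("gRPC request failed")


try:
    run_with_rollback([(create_replica, destroyed.append),
                       (create_nexus, destroyed.append)])
except OperationFailed as failure:
    leaked = failure.leaked
```

In this example the replica is destroyed during unwinding, so `destroyed` holds `"replica-1"` and `leaked` is empty.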
| 75 | + |
| 76 | +## Use of the Persistent Store |
| 77 | + |
| 78 | +For a control plane to be effective it must maintain information about the system it is interacting with and take |
| 79 | +decisions accordingly. An in-memory registry is used to store such information. |
| 80 | + |
| 81 | +Because the registry is stored in memory, it is volatile, meaning all information is lost if the service is restarted. |
| 82 | +As a consequence, critical information must be backed up to a highly available persistent store (for more detailed |
| 83 | +information see [persistent-store.md](./persistent-store.md)). |
| 84 | + |
| 85 | +The types of data that need persisting broadly fall into 3 categories: |
| 86 | + |
| 87 | +1. Desired state |
| 88 | + |
| 89 | +2. Actual state |
| 90 | + |
| 91 | +3. Control plane specific information \ |
| 92 | + <br> |
| 93 | + |
| 94 | +### Desired State |
| 95 | + |
| 96 | +This is the declarative specification of a resource provided by the user. As an example, the user may request a new |
| 97 | +volume with the following requirements: |
| 98 | + |
| 99 | +- Replica count of 3 |
| 100 | + |
| 101 | +- Size |
| 102 | + |
| 103 | +- Preferred nodes |
| 104 | + |
| 105 | +- Number of nexuses |
| 106 | + |
| 107 | +Once the user has provided these constraints, the expectation is that the control plane should create a resource that |
| 108 | +meets the specification. How the control plane achieves this is of no concern. |
| 109 | + |
| 110 | +So what happens if the control plane is unable to meet these requirements? The operation is failed. This prevents any |
| 111 | +ambiguity. If an operation succeeds, the requirements have been met and the user has exactly what they asked for. If the |
| 112 | +operation fails, the requirements couldn’t be met. In this case the control plane should provide an appropriate means of |
| 113 | +diagnosing the issue, e.g. a log message. |
| 114 | + |
| 115 | +What happens to resources created before the operation failed? This will be dependent on the chosen failure strategy |
| 116 | +outlined in [Handling Failures](#handling-failures). |
| 117 | + |
| 118 | +### Actual State |
| 119 | + |
| 120 | +This is the runtime state of the system as provided by Mayastor. Whenever this changes, the control plane must reconcile |
| 121 | +this state against the desired state to ensure that we are still meeting the user's requirements. If not, the control |
| 122 | +plane will take action to try to rectify this. |
| 123 | + |
| 124 | +Whenever a user makes a request for state information, it will be this state that is returned (note: if necessary, an API |
| 125 | +may also be provided which returns the desired state). \ |
| 126 | +<br> |
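Reconciling actual state against desired state can be pictured with a small sketch. The `VolumeSpec` and `VolumeState` types, their fields, and the action names are assumptions made for illustration only. The document does not specify which corrective actions the control plane takes, so replica count is used here simply as an example requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeSpec:
    """Desired state (hypothetical): what the user asked for."""
    replica_count: int


@dataclass(frozen=True)
class VolumeState:
    """Actual state (hypothetical): what the system currently reports."""
    healthy_replicas: int


def reconcile(spec, state):
    """Return the corrective actions needed to bring actual in line with desired."""
    difference = spec.replica_count - state.healthy_replicas
    if difference > 0:
        return ["add-replica"] * difference
    if difference < 0:
        return ["remove-replica"] * -difference
    return []  # Requirements are already met; nothing to do.


# A volume that should have 3 replicas but currently has 2 healthy ones.
actions = reconcile(VolumeSpec(replica_count=3), VolumeState(healthy_replicas=2))
```

A function like this would be run whenever the actual state changes, with an empty result meaning the user's requirements are still being met.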
| 127 | + |
| 128 | +### Control Plane Information |
| 129 | + |
| 130 | +This information is required to aid the control plane across restarts. It will be used to store the state of a resource |
| 131 | +independent of the desired or actual state. |
| 132 | + |
| 133 | +The following sequence will be followed when creating a resource: |
| 134 | + |
| 135 | +1. Add resource specification to the store with a state of “creating” |
| 136 | + |
| 137 | +2. Create the resource |
| 138 | + |
| 139 | +3. Mark the state of the resource as “complete” |
| 140 | + |
| 141 | +If the control plane then crashes mid-operation, on restart it can query the state of each resource. Any resources not in |
| 142 | +the “complete” state can then be destroyed, as they will be remnants of a failed operation. The expectation here is |
| 143 | +that the user will reissue the operation if they wish to. |
| 144 | + |
| 145 | +Likewise, deleting a resource will look like: |
| 146 | + |
| 147 | +1. Mark the resource as “deleting” in the store |
| 148 | + |
| 149 | +2. Delete the resource |
| 150 | + |
| 151 | +3. Remove the resource from the store |
| 152 | + |
| 153 | +For complex operations like creating a volume, all resources that make up the volume will be marked as “creating”. Only |
| 154 | +when all resources have been successfully created will their corresponding states be changed to “complete”. This will |
| 155 | +look something like: |
| 156 | + |
| 157 | +1. Add volume specification to the store with a state of “creating” |
| 158 | + |
| 159 | +2. Add nexus specifications to the store with a state of “creating” |
| 160 | + |
| 161 | +3. Add replica specifications to the store with a state of “creating” |
| 162 | + |
| 163 | +4. Create replicas |
| 164 | + |
| 165 | +5. Create nexuses |
| 166 | + |
| 167 | +6. Mark replica states as “complete” |
| 168 | + |
| 169 | +7. Mark nexus states as “complete” |
| 170 | + |
| 171 | +8. Mark volume state as “complete” |
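The volume creation sequence and the restart clean-up described earlier can be sketched together. Everything here is a hypothetical illustration: the `Store` class is a dict-backed stand-in for the persistent store, and `create_volume`, `recover_after_restart`, and the resource identifiers are invented names. The sketch also assumes that destroying a resource which was never actually created is harmless.

```python
CREATING, COMPLETE = "creating", "complete"


class Store:
    """Hypothetical dict-backed stand-in for the persistent store."""

    def __init__(self):
        self.specs = {}

    def put(self, resource_id, state):
        self.specs[resource_id] = state

    def remove(self, resource_id):
        del self.specs[resource_id]


def create_volume(store, volume_id, nexus_ids, replica_ids, create_resource):
    # Steps 1-3: record every specification as "creating" before doing any work.
    for resource_id in [volume_id, *nexus_ids, *replica_ids]:
        store.put(resource_id, CREATING)
    # Steps 4-5: create the replicas first, then the nexuses.
    for resource_id in [*replica_ids, *nexus_ids]:
        create_resource(resource_id)
    # Steps 6-8: only once everything exists, mark replicas, nexuses, volume complete.
    for resource_id in [*replica_ids, *nexus_ids, volume_id]:
        store.put(resource_id, COMPLETE)


def recover_after_restart(store, destroy_resource):
    """Destroy anything not marked "complete"; it is a remnant of a failed operation."""
    leftovers = [rid for rid, state in store.specs.items() if state != COMPLETE]
    for resource_id in leftovers:
        destroy_resource(resource_id)  # assumed safe even if never created
        store.remove(resource_id)
    return leftovers


# Simulate a crash while creating the nexus, followed by a restart.
store = Store()


def crash_on_nexus(resource_id):
    if resource_id.startswith("nexus"):
        raise RuntimeError("simulated crash")


try:
    create_volume(store, "vol-1", ["nexus-1"], ["replica-1", "replica-2"], crash_on_nexus)
except RuntimeError:
    pass

cleaned = recover_after_restart(store, destroy_resource=lambda resource_id: None)
```

Because nothing reached the “complete” state before the simulated crash, the recovery pass removes the volume, the nexus, and both replicas from the store, leaving the user free to reissue the request.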