wip

2024-12-28 14:46:41 +00:00 · 2023-01-20 18:16:53 +01:00 · 2023-01-20 18:16:53 +01:00 · 91152d1dac
commit 91152d1dac
parent 3a1bec498e
24 changed files with 314 additions and 38 deletions
--- a/README.md
+++ b/README.md
@@ -1,23 +1,41 @@
-# Use Cases
+# libre.sh

-This is a repo to list use cases that we try to achieve with libre.sh
+libre.sh is a platform to manage many instances of different applications at scale.

-## Glossaire
+## Use Cases

-Application: an application is a web application that is usable by an end user (For instance: HedgeDoc, Discourse, ...).
+The use cases directory lists things we try to achieve with libre.sh.
+
+## Glossary
+
+Application: a web application that is usable by an end user (for instance HedgeDoc, Discourse, …).
+Object Store (S3 API “standard”): an HTTP API to store and retrieve objects.
+PITR: Point-in-Time Recovery.

 ## Personas

 ### Cluster Operator

-A Cluser Operator is a System Administrator, or Site Reliability Engineer that is transforming raw machines (physical, virtual) into a production kubernetes cluster.
-This person is typically root on servers and on kubernetes API.
+A Cluster Operator is a System Administrator or Site Reliability Engineer who turns raw machines (physical or virtual) into a production Kubernetes cluster.
+This person typically has root access on the servers and on the Kubernetes API.

 ### Application Operator

-An Application Operator is a person that is less technical than Cluster Operator, and doesn't necesssarly understand command line interface.
-But this person, through nice User interface is able to manipulate high level objects that represents application.
+An Application Operator is a person who is less technical than a Cluster Operator and doesn’t necessarily understand the command line interface.
+Through a friendly user interface, however, this person is able to manipulate high-level objects that represent applications.

 ### End User

 A user that will interact only with an application.
+
+## Architecture decision records
+
+## Systems
+
+### libre.sh runtime
+
+A collection of controllers and services that are required to deploy application instances.
+
+### libre.sh runtime manager
+
+The controller in charge of installing/configuring/upgrading the runtime.
--- a/Lifecycle.md
+++ b/Lifecycle.md
@@ -1,15 +0,0 @@
-As an Application Operator, I want to be able to manage applications so that I can be autonomous in this task, without interrupting the technical team.
-
-Manage in this context means:
- - create (Create an HedgeDoc instance at this URL for this organization)
- - read/list (List all HedgeDoc instance, List all the different instances of this organization)
- - update (Change some high level/Infrastructure configuration that is accessible to Application Operator)
- - delete (An Organization doesn't need any more his instance, so we need to delete it)
-
-Other Benefits:
-If the operator manages the application with a standard system, it is less likely that there is a drift in the different applications instances deployed.
-
-## Solution
-
-Kubernetes API with the use of CRD and RBAC (authz) on these CRDs allows to expose a beautiful API to manage these applications.
-If you couple Kubernetes authn with an OIDC, you have what we consider the best API to build this system.
--- a/Lifecycle.md
+++ b/Lifecycle.md
@@ -1,14 +0,0 @@
-system: libre.sh runtime
-
-As most of applications need an ObjectStore bucket, and to accomplish UC1, the libre.sh runtime needs to be able to manage the lifecycle of the applications bucket.
-
-Requirements:
- be able to manage buckets on various cloud provider
-  - scaleway
-  - minio
- be able to manage bucket policies in high level fashion
- create an owner user for the application be able to interact with this bucket
-
-## Solution
-
-A CRD to describe the bucket object.
--- a/architecture-decision-records/0000-template.md
+++ b/architecture-decision-records/0000-template.md
@@ -0,0 +1,74 @@
+First decision: use this [template](https://raw.githubusercontent.com/joelparkerhenderson/architecture-decision-record/main/templates/decision-record-template-madr/index.md) for [ADR](https://adr.github.io/).
+
+# [short title of solved problem and solution]
+
+* Status: [proposed | rejected | accepted | deprecated | … | superseded by [ADR-0005](0005-example.md)] <!-- optional -->
+* Deciders: [list everyone involved in the decision] <!-- optional -->
+* Date: [YYYY-MM-DD when the decision was last updated] <!-- optional -->
+
+Technical Story: [description | ticket/issue URL] <!-- optional -->
+
+## Context and Problem Statement
+
+[Describe the context and problem statement, e.g., in free form using two to three sentences. You may want to articulate the problem in form of a question.]
+
+## Decision Drivers <!-- optional -->
+
+* [driver 1, e.g., a force, facing concern, …]
+* [driver 2, e.g., a force, facing concern, …]
+* … <!-- numbers of drivers can vary -->
+
+## Considered Options
+
+* [option 1]
+* [option 2]
+* [option 3]
+* … <!-- numbers of options can vary -->
+
+## Decision Outcome
+
+Chosen option: ”[option 1]”, because [justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force force | … | comes out best (see below)].
+
+### Positive Consequences <!-- optional -->
+
+* [e.g., improvement of quality attribute satisfaction, follow-up decisions required, …]
+* …
+
+### Negative Consequences <!-- optional -->
+
+* [e.g., compromising quality attribute, follow-up decisions required, …]
+* …
+
+## Pros and Cons of the Options <!-- optional -->
+
+### [option 1]
+
+[example | description | pointer to more information | …] <!-- optional -->
+
+* Good, because [argument a]
+* Good, because [argument b]
+* Bad, because [argument c]
+* … <!-- numbers of pros and cons can vary -->
+
+### [option 2]
+
+[example | description | pointer to more information | …] <!-- optional -->
+
+* Good, because [argument a]
+* Good, because [argument b]
+* Bad, because [argument c]
+* … <!-- numbers of pros and cons can vary -->
+
+### [option 3]
+
+[example | description | pointer to more information | …] <!-- optional -->
+
+* Good, because [argument a]
+* Good, because [argument b]
+* Bad, because [argument c]
+* … <!-- numbers of pros and cons can vary -->
+
+## Links <!-- optional -->
+
+* [Link type] [Link to ADR] <!-- example: Refined by [ADR-0005](0005-example.md) -->
+* … <!-- numbers of links can vary -->
--- a/architecture-decision-records/0001-kubernetes.md
+++ b/architecture-decision-records/0001-kubernetes.md
@@ -0,0 +1,34 @@
+# Kubernetes
+
+## Status
+
+Accepted
+
+## Context
+
+We are looking for a general solution to UC-0001.
+
+### Problem to solve
+
+Find a free/libre cloud API on which we can run our applications.
+
+### Alternatives considered
+
+#### Swarm
+
+This is not really supported anymore.
+
+#### Nomad
+
+This is not that standard.
+
+## Decision
+
+Our platform is an extension of Kubernetes built with CRDs and controllers.
+
+## Consequences
+
+The Kubernetes API, combined with CRDs and RBAC (authZ) on these CRDs, allows us to expose a clean API to manage these applications.
+If you couple Kubernetes authN with an OIDC provider, you have what we consider the best API to build this system.
+
+If the application operator manages applications with a standard system, the deployed application instances are less likely to drift apart.
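
Reviewer sketch (not part of the commit): to make the "RBAC on CRDs" consequence concrete, the snippet below builds a namespaced Kubernetes `Role` manifest as a plain Python dictionary. The `rbac.authorization.k8s.io/v1` Role layout is the standard one; the API group `apps.libre.sh`, the role name, the namespace and the resource name `hedgedocs` are hypothetical placeholders, since the commit does not define the actual CRDs.

```python
from typing import Dict, List


def application_operator_role(namespace: str, resources: List[str]) -> Dict:
    """Return a Role manifest granting lifecycle verbs on custom resources.

    The API group below is an assumption made for illustration only;
    the real CRD group used by libre.sh is not stated in this commit.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "application-operator", "namespace": namespace},
        "rules": [
            {
                "apiGroups": ["apps.libre.sh"],  # hypothetical CRD group
                "resources": resources,
                # create / read / list / update / delete, as in UC-0001
                "verbs": ["create", "get", "list", "watch", "update", "patch", "delete"],
            }
        ],
    }


if __name__ == "__main__":
    import json

    # Hypothetical organization namespace and custom resource name.
    print(json.dumps(application_operator_role("org-example", ["hedgedocs"]), indent=2))
```

Bound to an OIDC group through a `RoleBinding`, a role of this shape would restrict an Application Operator to the high-level objects of one organization, which is the intent described above.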
--- a/architecture-decision-records/0002-object-store.md
+++ b/architecture-decision-records/0002-object-store.md
@@ -0,0 +1,28 @@
+# Object Store
+
+## Status
+
+Accepted
+
+## Context
+
+In the context of UC-0001, applications need to store objects.
+As a system administrator, I want highly available, easily scalable storage.
+
+## Alternatives considered
+
+We believe that CSI or distributed file systems are complex systems to manage, whereas object stores tend to be lighter in terms of management.
+[Object Store alleviates POSIX constraints](https://blog.min.io/kubernetes-storage-patterns/).
+
+## Decision Outcome
+
+Use only applications that rely on S3.
+
+### Negative Consequences
+
+Not all applications support S3.
+We need to have an Object Store.
+
+### Positive Consequences
+
+Object stores implement versioning, which we can use for PITR.
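
Reviewer sketch (not part of the commit): the idea of using object versioning for PITR can be illustrated with a small, storage-agnostic model. Given the version history of one object key, restoring to a point in time means picking the newest version written at or before that instant, and treating a delete marker as "the object did not exist". The `ObjectVersion` type and `version_at` function are hypothetical names introduced here; a real implementation would read the history from the S3 versioning API instead of an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class ObjectVersion:
    """One entry in the version history of a single object key."""
    version_id: str
    last_modified: datetime
    is_delete_marker: bool = False


def version_at(versions: Iterable[ObjectVersion],
               point_in_time: datetime) -> Optional[ObjectVersion]:
    """Return the version that was current at `point_in_time`, if any."""
    candidates = [v for v in versions if v.last_modified <= point_in_time]
    if not candidates:
        return None  # the object had not been created yet
    latest = max(candidates, key=lambda v: v.last_modified)
    # A delete marker means the object was deleted at that instant.
    return None if latest.is_delete_marker else latest


if __name__ == "__main__":
    def at(hour: int) -> datetime:
        return datetime(2023, 1, 20, hour, tzinfo=timezone.utc)

    history = [
        ObjectVersion("v1", at(8)),
        ObjectVersion("v2", at(10)),
        ObjectVersion("v3", at(12), is_delete_marker=True),
    ]
    print(version_at(history, at(11)))
```

Applying this selection to every key in a bucket yields the set of versions to restore for a given recovery point.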
--- a/architecture-decision-records/0003-minio.md
+++ b/architecture-decision-records/0003-minio.md
@@ -0,0 +1,66 @@
+# Minio as Object Store
+
+* Status: accepted
+* Deciders: Hugo, Pierre, Tim
+* Date: 2023-01-20
+
+Technical Story: adr-0002
+
+## Context and Problem Statement
+
+We need an object store for our applications’ objects/files.
+
+## Decision Drivers
+
+* Simplicity
+* Self Hosting
+* Free Software
+* Cloud/IaaS agnostic
+
+## Considered Options
+
+* Ceph
+* Scaleway/OVH/Cloud Provider Object Store
+* SeaweedFS
+* Minio
+
+## Decision Outcome
+
+We decided to go with Minio because of its simplicity and cloud-native architecture. It is designed to run on Kubernetes.
+
+### Positive Consequences
+
+* Hyper-converged infrastructure, so resources are better utilized
+* Simplicity of management
+* Deployment integrates options such as monitoring
+
+### Negative Consequences
+
+* Doesn’t grow organically: to scale, you have to add one erasure zone at a time
+* Needs identical hardware, which is more difficult with second-hand hardware
+* Atomic unit is an erasure zone
+* If you want better performance at a reasonable cost, the infrastructure becomes complex
+  * add cache
+  * add tiering cluster
+
+## Pros and Cons of the Options
+
+### Ceph
+
+* Good, because you can easily recycle second-hand hardware
+* Good, because you can organically scale it up or down or change parts
+* Good, because atomic unit is the disk
+* Good, because as a byproduct of the object store, we can get a real CSI for k8s
+* Bad, because it is a complex piece of software that is not really cloud native
+* Bad, because hyper-converged mode is hard to achieve: it is compute-, network- and RAM-intensive and is a noisy neighbor
+
+### Scaleway
+
+* Good, because it is a managed service
+* Good, because it is not that expensive
+* Bad, because it is proprietary
+* Bad, because it is non-standard and requires extra complexity to manage standard S3 and Scaleway in a bucket controller
+
+### SeaweedFS
+
+* Bad, because it is not widely used
--- a/architecture-decision-records/0004-zalando-pg.md
+++ b/architecture-decision-records/0004-zalando-pg.md
--- a/architecture-decision-records/0005-velero.md
+++ b/architecture-decision-records/0005-velero.md
@@ -0,0 +1,8 @@
+
+
+
+
+
+Config in this context means:
+- application configuration
+- application version
--- a/architecture-decision-records/0006-naming-convention.md
+++ b/architecture-decision-records/0006-naming-convention.md
@@ -0,0 +1,11 @@
+fqdn
+
+resource kind (e.g. nextcloud)
+resource name (e.g. nuage)
+component (e.g. app)
+
+resource-kind-resource-name(if different)-component
+
+
+nextcloud-redis
+nextcloud-nuage-redis
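
Reviewer sketch (not part of the commit): the naming pattern above can be expressed as a small helper. It assumes that "(if different)" means "omit the resource name when it equals the resource kind", which matches the two examples `nextcloud-redis` and `nextcloud-nuage-redis` but is an interpretation of these notes. The function name `resource_name` is hypothetical.

```python
def resource_name(kind: str, name: str, component: str = "") -> str:
    """Build `<kind>[-<name>][-<component>]`.

    Assumption: the resource name is skipped when it is empty or
    identical to the resource kind.
    """
    parts = [kind]
    if name and name != kind:
        parts.append(name)
    if component:
        parts.append(component)
    return "-".join(parts)


if __name__ == "__main__":
    # The two examples listed in the ADR notes.
    print(resource_name("nextcloud", "nextcloud", "redis"))
    print(resource_name("nextcloud", "nuage", "redis"))
```

Keeping the rule in one function would let every controller derive the same names for dependent resources.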
--- a/architecture-decision-records/0007-flux-image-automation.md
+++ b/architecture-decision-records/0007-flux-image-automation.md
--- a/components/bucket-controller.md
+++ b/components/bucket-controller.md
--- a/components/flux.md
+++ b/components/flux.md
@@ -0,0 +1,3 @@
+helm release
+
+image automation
--- a/components/observability.md
+++ b/components/observability.md
--- a/components/organization-controller.md
+++ b/components/organization-controller.md
@@ -0,0 +1,11 @@
+name:
+default-domain-name:
+
+automation:
+sub domain: instance name
+
+creates ns and realm
+
+read only pull user for organization
+
+every resource that is org-dependent
--- a/components/pg-controller.md
+++ b/components/pg-controller.md
@@ -0,0 +1,2 @@
+
+
--- a/components/redis-controller.md
+++ b/components/redis-controller.md
--- a/components/runtime-manager.md
+++ b/components/runtime-manager.md
@@ -0,0 +1,23 @@
+# libre.sh runtime manager (foundation)
+
+## Minio operator
+
+## Observability
+
+### Loki
+
+### Prometheus Stack
+
+## Nginx ingress
+
+## Cert manager
+
+## Flux
+
+## bucket controller
+
+## pg controller
+
+### pg Zalando operator
+
+## redis controller
--- a/components/upgrade-controller.md
+++ b/components/upgrade-controller.md
--- a/use-cases/0001-applications-lifecycle.md
+++ b/use-cases/0001-applications-lifecycle.md
@@ -0,0 +1,9 @@
+As an Application Operator,
+I want to be able to manage applications
+so that I can be autonomous in this task, without interrupting the technical team.
+
+Manage in this context means:
+- create (Create a HedgeDoc instance at this URL for this organization)
+- read/list (List all HedgeDoc instances, list all the different instances of this organization)
+- update (Change some high-level/infrastructure configuration that is accessible to the Application Operator)
+- delete (An organization no longer needs its instance, so we need to delete it)
--- a/use-cases/0002-pitr.md
+++ b/use-cases/0002-pitr.md
@@ -0,0 +1,11 @@
+As an Application Operator,
+I want to be able to roll back data to any point in time,
+so that in case of error, I can go back to the last known healthy state.
+
+Data in this context means:
+- databases
+- files
+
+Examples:
+- An upgrade failed, and I want to roll back.
+- I made a mistake and deleted important data, and want to roll back.
--- a/use-cases/0003-disaster-recovery.md
+++ b/use-cases/0003-disaster-recovery.md
@@ -0,0 +1,4 @@
+As a Cluster Operator,
+in case of a major incident or disaster,
+I want to be able to restore an application,
+so that end users stop complaining about non-availability of their application.
--- a/use-cases/0004-import.md
+++ b/use-cases/0004-import.md
@@ -0,0 +1,3 @@
+As an Application Operator,
+I want to be able to import a full instance from an archive,
+so that I can import my standard application into the wonderful world of Kubernetes \o/
--- a/use-cases/0005-application-upgrade.md
+++ b/use-cases/0005-application-upgrade.md