Commit Graph

131 Commits

Author SHA1 Message Date
8e1f5eb682 chore(cert-manager): upgrade to 1.15.3 2024-09-03 03:22:18 +07:00
c17aa9e165 chore(argocd): upgrade Helm chart to v7.5.2 2024-09-03 02:24:23 +07:00
a65ecc2a43 chore(nginx): upgrade Helm chart to v4.11.2 2024-09-02 14:34:28 +07:00
3fbe47be79 feat: deploy VolSync and external snapshotter 2024-04-20 02:28:36 +07:00
22312e1606 refactor(cloudflared)!: use app-template instead of custom chart 2024-04-18 17:52:11 +07:00
8d00d55eb1 refactor(argocd)!: merge bootstrap and system
This is a breaking change and requires a cluster rebuild (carefully
replacing the ApplicationSets should probably work, but I didn't bother
at the current alpha stage):

- ApplicationSets are merged into a single root one
  to use the progressive sync feature when it's ready.
- Switched to server-side apply to avoid CRD-not-ready issues.

Also replace the apply script with Ansible, since the Ansible Helm
dependency update feature was released.
2024-04-17 15:21:11 +07:00
9438fe32d7 feat(alertmanager): add more info in notifications
Map status, priority, tags and runbook.
2024-03-28 17:07:26 +07:00
1dc01c2a82 refactor!: remove k8up-operator
Upcoming rewrite for backups
2024-03-28 14:52:57 +07:00
5dc86c77c6 fix(rook-ceph): auto remove OSD if safe to remove
Useful when replacing nodes.
2024-03-13 12:16:38 +07:00
32a9aa94d5 refactor(rook-ceph): customize configuration 2024-03-13 10:11:25 +07:00
b4ba7ea6e2 refactor!: replace Longhorn with Rook Ceph
Longhorn is too unreliable for some reason.
2024-03-12 07:55:24 +07:00
4a4828f20b chore(deps): update all non-major dependencies 2024-03-03 00:20:54 +07:00
a7cdb00550 refactor!: move alert setup from Grafana to Alertmanager 2024-03-02 14:32:55 +07:00
169f24fed3 chore: update kube-prometheus-stack and grafana 2024-01-21 15:50:02 +07:00
77c5fe2113 refactor: remove descheduler
It's kinda... unnecessary for a home cluster?
2024-01-06 22:35:30 +07:00
65af4ff8e6 refactor!: remove MetalLB
Replaced by Cilium L2 Aware LB.

Additionally, the default Zerotier route was changed to match the
LB IP pool rather than the entire home subnet. This makes it easier
to manage in the configure script and can be updated to any value
later if needed.
2023-12-22 00:34:23 +07:00
de22314b0a perf(external-dns): trigger DNS update based on k8s events
- Reduce polling frequency from every 1m (default) to every 5m
- More responsive updates
2023-12-21 12:11:42 +07:00
54e071e0f2 refactor(k3s): remove system upgrade controller
More trouble than it's worth.
Update Ansible to upgrade k3s instead.
2023-11-19 12:50:36 +07:00
a5ecaafe50 refactor(metallb)!: use CRD instead of ConfigMap
Deprecated https://metallb.universe.tf/configuration/migration_to_crds
2023-05-19 11:54:07 +07:00
177bac6345 Revert "fix(system): downgrade MetalLB to 0.12"
This reverts commit 084942ab84.
2023-05-19 11:09:35 +07:00
084942ab84 fix(system): downgrade MetalLB to 0.12
ConfigMap is deprecated; need to migrate first: https://metallb.universe.tf/configuration/migration_to_crds
2023-05-19 02:59:04 +07:00
4d904592c4 fix(system): downgrade kube-prometheus-stack to 45.28.0
Due to an issue on 45.28.1
2023-05-19 02:38:41 +07:00
cc1d4ab2f7 chore(system): upgrade charts to latest 2023-05-19 02:27:37 +07:00
99651ecb2f fix: sync k3s version in system upgrade controller and k3d 2023-05-19 02:17:05 +07:00
b1a716dae9 refactor!: move Grafana to platform
Grafana depends on a secret created by ExternalSecret, with the values
pulled from Vault, causing a circular dependency problem: system requires
platform components but platform requires system components.
2023-05-19 01:36:47 +07:00
6f7bff689a fix(k3s): go back to v1.24
Longhorn does not support v1.25 yet
2022-12-29 10:32:42 +07:00
18bee6dd0a refactor(system-upgrade): pin k3s version
Only use it for rolling upgrades; the automatic upgrade is a little
annoying for now.
2022-12-24 14:23:45 +07:00
8391d54ca5 chore(kube-prometheus-stack): upgrade to latest version
Since ArgoCD server side apply is enabled
2022-12-24 13:25:16 +07:00
5a3aabbbbb chore(longhorn): upgrade to latest v1.3.0 for bug fixes 2022-07-20 23:44:41 +07:00
c52c439fac refactor(cert-manager): remove email
- Use Prometheus to monitor the certs instead of mail
- Cloudflare API token doesn't require email like API key
2022-07-07 13:44:21 +07:00
cd41343580 refactor(docs): migrate to mkdocs (#68)
* refactor(docs): migrate to mkdocs

* More markdown

* Admonitions
2022-07-06 12:33:35 +07:00
a7f91505a5 feat(external-dns)!: add cluster name as owner ID
Need to replace DNS records
2022-06-29 08:42:41 +07:00
c726a0ae20 style: fix YAML lint 2022-05-14 21:36:41 +07:00
e710e5814b fix(dex): remove hard coded values 2022-05-14 12:20:16 +07:00
5b410ceb1d refactor(platform): replace Authentik with Dex 2022-05-07 11:55:29 +07:00
71b0217a54 feat: add app name and icon for all ingress 2022-05-04 09:17:42 +07:00
86807062b2 chore(deps): update all non-major dependencies helm releases 2022-03-23 15:07:13 +00:00
46fe72cfe4 refactor(grafana): use random admin password 2022-02-26 01:01:03 +07:00
12e5a55bb9 refactor(kured): annotate nodes and change timezone 2022-02-24 21:56:29 +07:00
20731cdcda style: YAML format 2022-02-23 20:50:30 +07:00
9302deb7b5 refactor(system): remove Rocky upgrade from system upgrade controller
Use kured instead
2022-02-23 20:16:07 +07:00
463a36e251 fix(kured): try another sentinel command 2022-02-23 02:47:15 +07:00
0835658730 fix(kured): fix quotes
'--reboot-sentinel-command=! needs-restarting -r' to '--reboot-sentinel-command="! needs-restarting -r"'
2022-02-23 02:21:45 +07:00
dc92c4d8fd fix(kured): update sentinel command 2022-02-23 02:16:09 +07:00
a5f0c70b5c fix(kured): update sentinel command 2022-02-23 01:27:59 +07:00
3ad371d475 feat(kured): add sentinel command 2022-02-23 00:38:44 +07:00
e3d5943c1a Revert "refactor(system): remove Kured"
This reverts commit 88ab559806.
2022-02-23 00:30:31 +07:00
4eee1a5e12 Revert "feat(system): install Rook Ceph for testing"
This reverts commit 868e6bf7ae.

- Uses too many resources (or needs more tweaks)
- Needs raw partition/disk
2022-02-22 21:54:52 +07:00
868e6bf7ae feat(system): install Rook Ceph for testing
Potential replacement for Longhorn
2022-02-21 02:17:47 +07:00
6fd1ba1a6c fix(loki): fix value ref 2022-02-13 08:50:52 +07:00