Commit Graph

132 Commits

Author SHA1 Message Date
Khue Doan
8d4f52cff4 fix(volsync): enable privileged movers by default
This might not be the best approach, but for now, it's the option with
the least hassle. I may refactor it in the future for greater
granularity.
2024-11-24 20:17:27 +07:00
Khue Doan
8e1f5eb682 chore(cert-manager): upgrade to 1.15.3 2024-09-03 03:22:18 +07:00
Khue Doan
c17aa9e165 chore(argocd): upgrade Helm chart to v7.5.2 2024-09-03 02:24:23 +07:00
Khue Doan
a65ecc2a43 chore(nginx): upgrade Helm chart to v4.11.2 2024-09-02 14:34:28 +07:00
Khue Doan
3fbe47be79 feat: deploy VolSync and external snapshotter 2024-04-20 02:28:36 +07:00
Khue Doan
22312e1606 refactor(cloudflared)!: use app-template instead of custom chart 2024-04-18 17:52:11 +07:00
Khue Doan
8d00d55eb1 refactor(argocd)!: merge bootstrap and system
This is a breaking change and requires a cluster rebuild (carefully
replacing the ApplicationSets might work, but I didn't bother at the
current alpha stage):

- ApplicationSets are merged into a single root one
  to use the progressive sync feature when it's ready.
- Switched to server-side apply to avoid CRD-not-ready issues.

Also replace the apply script with Ansible, since the Ansible Helm
dependency update feature was released.
2024-04-17 15:21:11 +07:00
Khue Doan
9438fe32d7 feat(alertmanager): add more info in notifications
Map status, priority, tags and runbook.
2024-03-28 17:07:26 +07:00
Khue Doan
1dc01c2a82 refactor!: remove k8up-operator
Upcoming rewrite for backups
2024-03-28 14:52:57 +07:00
Khue Doan
5dc86c77c6 fix(rook-ceph): auto remove OSD if safe to remove
Useful when replacing nodes.
2024-03-13 12:16:38 +07:00
Khue Doan
32a9aa94d5 refactor(rook-ceph): customize configuration 2024-03-13 10:11:25 +07:00
Khue Doan
b4ba7ea6e2 refactor!: replace Longhorn with Rook Ceph
Longhorn is too unreliable for some reason.
2024-03-12 07:55:24 +07:00
Khue Doan
4a4828f20b chore(deps): update all non-major dependencies 2024-03-03 00:20:54 +07:00
Khue Doan
a7cdb00550 refactor!: move alert setup from Grafana to Alertmanager 2024-03-02 14:32:55 +07:00
Khue Doan
169f24fed3 chore: update kube-prometheus-stack and grafana 2024-01-21 15:50:02 +07:00
Khue Doan
77c5fe2113 refactor: remove descheduler
It's kinda... unnecessary for a home cluster?
2024-01-06 22:35:30 +07:00
Khue Doan
65af4ff8e6 refactor!: remove MetalLB
Replaced by Cilium L2 Aware LB.

Additionally, the default Zerotier route was changed to match the
LB IP pool rather than the entire home subnet. This makes it easier
to manage in the configure script and can be updated to any value
later if needed.
2023-12-22 00:34:23 +07:00
Khue Doan
de22314b0a perf(external-dns): trigger DNS update based on k8s events
- Reduce polling frequency (interval raised from the 1m default to 5m)
- More responsive updates
2023-12-21 12:11:42 +07:00
Khue Doan
54e071e0f2 refactor(k3s): remove system upgrade controller
More trouble than it's worth.
Update Ansible to upgrade k3s instead.
2023-11-19 12:50:36 +07:00
Khue Doan
a5ecaafe50 refactor(metallb)!: use CRD instead of ConfigMap
Deprecated https://metallb.universe.tf/configuration/migration_to_crds
2023-05-19 11:54:07 +07:00
Khue Doan
177bac6345 Revert "fix(system): downgrade MetalLB to 0.12"
This reverts commit 084942ab84.
2023-05-19 11:09:35 +07:00
Khue Doan
084942ab84 fix(system): downgrade MetalLB to 0.12
ConfigMap is deprecated, need to migrate first https://metallb.universe.tf/configuration/migration_to_crds
2023-05-19 02:59:04 +07:00
Khue Doan
4d904592c4 fix(system): downgrade kube-prometheus-stack to 45.28.0
Due to an issue on 45.28.1
2023-05-19 02:38:41 +07:00
Khue Doan
cc1d4ab2f7 chore(system): upgrade charts to latest 2023-05-19 02:27:37 +07:00
Khue Doan
99651ecb2f fix: sync k3s version in system upgrade controller and k3d 2023-05-19 02:17:05 +07:00
Khue Doan
b1a716dae9 refactor!: move Grafana to platform
Grafana depends on a secret created by ExternalSecret, with the values
pulled from Vault, causing a circular dependency problem: system requires
platform components but platform requires system components.
2023-05-19 01:36:47 +07:00
Khue Doan
6f7bff689a fix(k3s): go back to v1.24
Longhorn does not support v1.25 yet
2022-12-29 10:32:42 +07:00
Khue Doan
18bee6dd0a refactor(system-upgrade): pin k3s version
Only use it for rolling upgrade, the automatic upgrade is a little
annoying for now.
2022-12-24 14:23:45 +07:00
Khue Doan
8391d54ca5 chore(kube-prometheus-stack): upgrade to latest version
Since ArgoCD server side apply is enabled
2022-12-24 13:25:16 +07:00
Khue Doan
5a3aabbbbb chore(longhorn): upgrade to latest v1.3.0 for bug fixes 2022-07-20 23:44:41 +07:00
Khue Doan
c52c439fac refactor(cert-manager): remove email
- Use Prometheus to monitor the certs instead of mail
- Cloudflare API token doesn't require email like API key
2022-07-07 13:44:21 +07:00
Khue Doan
cd41343580 refactor(docs): migrate to mkdocs (#68)
* refactor(docs): migrate to mkdocs

* More markdown

* Admonitions
2022-07-06 12:33:35 +07:00
Khue Doan
a7f91505a5 feat(external-dns)!: add cluster name as owner ID
Need to replace DNS records
2022-06-29 08:42:41 +07:00
Khue Doan
c726a0ae20 style: fix YAML lint 2022-05-14 21:36:41 +07:00
Khue Doan
e710e5814b fix(dex): remove hard coded values 2022-05-14 12:20:16 +07:00
Khue Doan
5b410ceb1d refactor(platform): replace Authentik with Dex 2022-05-07 11:55:29 +07:00
Khue Doan
71b0217a54 feat: add app name and icon for all ingress 2022-05-04 09:17:42 +07:00
Renovate Bot
86807062b2 chore(deps): update all non-major dependencies helm releases 2022-03-23 15:07:13 +00:00
Khue Doan
46fe72cfe4 refactor(grafana): use random admin password 2022-02-26 01:01:03 +07:00
Khue Doan
12e5a55bb9 refactor(kured): annotate nodes and change timezone 2022-02-24 21:56:29 +07:00
Khue Doan
20731cdcda style: YAML format 2022-02-23 20:50:30 +07:00
Khue Doan
9302deb7b5 refactor(system): remove Rocky upgrade from system upgrade controller
Use kured instead
2022-02-23 20:16:07 +07:00
Khue Doan
463a36e251 fix(kured): try another sentinel command 2022-02-23 02:47:15 +07:00
Khue Doan
0835658730 fix(kured): fix quotes
'--reboot-sentinel-command=! needs-restarting -r' to '--reboot-sentinel-command="! needs-restarting -r"'
2022-02-23 02:21:45 +07:00
Khue Doan
dc92c4d8fd fix(kured): update sentinel command 2022-02-23 02:16:09 +07:00
Khue Doan
a5f0c70b5c fix(kured): update sentinel command 2022-02-23 01:27:59 +07:00
Khue Doan
3ad371d475 feat(kured): add sentinel command 2022-02-23 00:38:44 +07:00
Khue Doan
e3d5943c1a Revert "refactor(system): remove Kured"
This reverts commit 88ab559806.
2022-02-23 00:30:31 +07:00
Khue Doan
4eee1a5e12 Revert "feat(system): install Rook Ceph for testing"
This reverts commit 868e6bf7ae.

- Uses too many resources (or needs more tweaks)
- Needs raw partition/disk
2022-02-22 21:54:52 +07:00
Khue Doan
868e6bf7ae feat(system): install Rook Ceph for testing
Potential replacement for Longhorn
2022-02-21 02:17:47 +07:00