8e1f5eb682
chore(cert-manager): upgrade to 1.15.3
2024-09-03 03:22:18 +07:00
c17aa9e165
chore(argocd): upgrade Helm chart to v7.5.2
2024-09-03 02:24:23 +07:00
a65ecc2a43
chore(nginx): upgrade Helm chart to v4.11.2
2024-09-02 14:34:28 +07:00
3fbe47be79
feat: deploy VolSync and external snapshotter
2024-04-20 02:28:36 +07:00
22312e1606
refactor(cloudflared)!: use app-template instead of custom chart
2024-04-18 17:52:11 +07:00
8d00d55eb1
refactor(argocd)!: merge bootstrap and system
...
This is a breaking change and requires a cluster rebuild (carefully
replacing the ApplicationSets might work, but I didn't bother at the
current alpha stage):
- ApplicationSets are merged into a single root one
to use the progressive sync feature when it's ready.
- Switched to server-side apply to avoid CRD-not-ready issues.
Also replace the apply script with Ansible, since the Ansible Helm
dependency update feature was released.
2024-04-17 15:21:11 +07:00
9438fe32d7
feat(alertmanager): add more info in notifications
...
Map status, priority, tags and runbook.
2024-03-28 17:07:26 +07:00
1dc01c2a82
refactor!: remove k8up-operator
...
Upcoming rewrite for backups
2024-03-28 14:52:57 +07:00
5dc86c77c6
fix(rook-ceph): auto remove OSD if safe to remove
...
Useful when replacing nodes.
2024-03-13 12:16:38 +07:00
32a9aa94d5
refactor(rook-ceph): customize configuration
2024-03-13 10:11:25 +07:00
b4ba7ea6e2
refactor!: replace Longhorn with Rook Ceph
...
Longhorn is too unreliable for some reason.
2024-03-12 07:55:24 +07:00
4a4828f20b
chore(deps): update all non-major dependencies
2024-03-03 00:20:54 +07:00
a7cdb00550
refactor!: move alert setup from Grafana to Alertmanager
2024-03-02 14:32:55 +07:00
169f24fed3
chore: update kube-prometheus-stack and grafana
2024-01-21 15:50:02 +07:00
77c5fe2113
refactor: remove descheduler
...
It's kinda... unnecessary for a home cluster?
2024-01-06 22:35:30 +07:00
65af4ff8e6
refactor!: remove MetalLB
...
Replaced by Cilium L2 Aware LB.
Additionally, the default Zerotier route was changed to match the
LB IP pool rather than the entire home subnet. This makes it easier
to manage in the configure script and can be updated to any value
later if needed.
2023-12-22 00:34:23 +07:00
de22314b0a
perf(external-dns): trigger DNS update based on k8s events
...
- Reduce polling frequency from every 1m (default) to every 5m
- More responsive updates
2023-12-21 12:11:42 +07:00
54e071e0f2
refactor(k3s): remove system upgrade controller
...
More trouble than it's worth.
Update Ansible to upgrade k3s instead.
2023-11-19 12:50:36 +07:00
a5ecaafe50
refactor(metallb)!: use CRD instead of ConfigMap
...
ConfigMap configuration is deprecated: https://metallb.universe.tf/configuration/migration_to_crds
2023-05-19 11:54:07 +07:00
177bac6345
Revert "fix(system): downgrade MetalLB to 0.12"
...
This reverts commit 084942ab84.
2023-05-19 11:09:35 +07:00
084942ab84
fix(system): downgrade MetalLB to 0.12
...
ConfigMap is deprecated; need to migrate first: https://metallb.universe.tf/configuration/migration_to_crds
2023-05-19 02:59:04 +07:00
4d904592c4
fix(system): downgrade kube-prometheus-stack to 45.28.0
...
Due to an issue on 45.28.1
2023-05-19 02:38:41 +07:00
cc1d4ab2f7
chore(system): upgrade charts to latest
2023-05-19 02:27:37 +07:00
99651ecb2f
fix: sync k3s version in system upgrade controller and k3d
2023-05-19 02:17:05 +07:00
b1a716dae9
refactor!: move Grafana to platform
...
Grafana depends on a secret created by ExternalSecret, with the values
pulled from Vault, causing a circular dependency: system requires
platform components but platform requires system components.
2023-05-19 01:36:47 +07:00
6f7bff689a
fix(k3s): go back to v1.24
...
Longhorn does not support v1.25 yet
2022-12-29 10:32:42 +07:00
18bee6dd0a
refactor(system-upgrade): pin k3s version
...
Only use it for rolling upgrades; the automatic upgrade is a little
annoying for now.
2022-12-24 14:23:45 +07:00
8391d54ca5
chore(kube-prometheus-stack): upgrade to latest version
...
Since ArgoCD server side apply is enabled
2022-12-24 13:25:16 +07:00
5a3aabbbbb
chore(longhorn): upgrade to latest v1.3.0 for bug fixes
2022-07-20 23:44:41 +07:00
c52c439fac
refactor(cert-manager): remove email
...
- Use Prometheus to monitor the certs instead of mail
- Cloudflare API token doesn't require email like API key
2022-07-07 13:44:21 +07:00
cd41343580
refactor(docs): migrate to mkdocs (#68)
...
* refactor(docs): migrate to mkdocs
* More markdown
* Admonitions
2022-07-06 12:33:35 +07:00
a7f91505a5
feat(external-dns)!: add cluster name as owner ID
...
Need to replace DNS records
2022-06-29 08:42:41 +07:00
c726a0ae20
style: fix YAML lint
2022-05-14 21:36:41 +07:00
e710e5814b
fix(dex): remove hard coded values
2022-05-14 12:20:16 +07:00
5b410ceb1d
refactor(platform): replace Authentik with Dex
2022-05-07 11:55:29 +07:00
71b0217a54
feat: add app name and icon for all ingress
2022-05-04 09:17:42 +07:00
86807062b2
chore(deps): update all non-major dependencies helm releases
2022-03-23 15:07:13 +00:00
46fe72cfe4
refactor(grafana): use random admin password
2022-02-26 01:01:03 +07:00
12e5a55bb9
refactor(kured): annotate nodes and change timezone
2022-02-24 21:56:29 +07:00
20731cdcda
style: YAML format
2022-02-23 20:50:30 +07:00
9302deb7b5
refactor(system): remove Rocky upgrade from system upgrade controller
...
Use kured instead
2022-02-23 20:16:07 +07:00
463a36e251
fix(kured): try another sentinel command
2022-02-23 02:47:15 +07:00
0835658730
fix(kured): fix quotes
...
'--reboot-sentinel-command=! needs-restarting -r' to '--reboot-sentinel-command="! needs-restarting -r"'
2022-02-23 02:21:45 +07:00
dc92c4d8fd
fix(kured): update sentinel command
2022-02-23 02:16:09 +07:00
a5f0c70b5c
fix(kured): update sentinel command
2022-02-23 01:27:59 +07:00
3ad371d475
feat(kured): add sentinel command
2022-02-23 00:38:44 +07:00
e3d5943c1a
Revert "refactor(system): remove Kured"
...
This reverts commit 88ab559806.
2022-02-23 00:30:31 +07:00
4eee1a5e12
Revert "feat(system): install Rook Ceph for testing"
...
This reverts commit 868e6bf7ae.
- Uses too many resources (or needs more tweaks)
- Needs raw partition/disk
2022-02-22 21:54:52 +07:00
868e6bf7ae
feat(system): install Rook Ceph for testing
...
Potential replacement for Longhorn
2022-02-21 02:17:47 +07:00
6fd1ba1a6c
fix(loki): fix value ref
2022-02-13 08:50:52 +07:00