This article records how to install and configure a log system based on Loki, developed by Grafana.
Kafka
Kafka on Kubernetes
It's complex; please refer to the Kafka config: kafka.zip
Kafka without ZooKeeper (KRaft mode):
- https://learnk8s.io/kafka-ha-kubernetes#deploying-a-3-node-kafka-cluster-on-kubernetes
- https://stackoverflow.com/questions/73380791/kafka-kraft-replication-factor-of-3
- https://github.com/IBM/kraft-mode-kafka-on-kubernetes
Dockerfile:
FROM openjdk:17-bullseye
entrypoint.sh:
Docker build:
docker build -t "registry.zerofinance.net/xpayappimage/kafka:3.3.2" .
kafka-kraft.yml:
# Deploy a headless Service, used for communication between Kafka brokers
kafka-ui.yml:
apiVersion: v1
Test:
> kubectl -n zero-logs run kafka-client --rm -ti --image bitnami/kafka:3.1.0 -- bash
Zookeeper
https://www.qikqiak.com/k8strain/controller/statefulset/
https://www.jianshu.com/p/f0b0fc3d192f
https://itopic.org/kafka-in-k8s.html
https://itopic.org/zookeeper-in-k8s.html
The resources need to be modified from: https://github.com/31z4/zookeeper-docker/tree/master/3.8.1
docker-entrypoint.sh
Build the image:
docker build -t "registry.zerofinance.net/xpayappimage/zookeeper:3.8.1" .
Loki
What is Grafana Loki?
Loki is a log aggregation system designed to store and query logs from all your applications and infrastructure.
Documentation: https://grafana.com/docs/loki/latest/
Loki Kubernetes configurations: loki-k8s.zip
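Loki only indexes label metadata, not the log text itself: a query first selects streams by label matchers, then filters or aggregates the lines. As a rough illustration (the label names mirror ones used later in this article, e.g. app_name and env):

```logql
{app_name="hkcash-server"} |= "ERROR"
sum(rate({env="uat"} |= "ERROR" [5m])) by (app_name)
```

The first query grep-filters one stream; the second computes a per-application error rate, which is the shape of expression used later for alerting rules.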
Installation
https://grafana.com/docs/loki/latest/fundamentals/overview/#overview
There are lots of ways to install Loki; here it is shown with Docker. For the other methods, please refer to: https://grafana.com/docs/loki/latest/installation/
Docker
If your clients are distributed across individual machines, you can use Docker:
Installing:
#loki
Reload Alertmanager: curl -XPOST http://am-test.zerofinance.net/-/reload
Cluster Installation:
#Loki (multiple machines):
Uninstalling:
#loki
Configuration:
loki-config.yaml:
For local file:
auth_enabled: false
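The file itself is truncated in this export. For orientation, a minimal single-process Loki configuration backed by the local filesystem typically looks roughly like the following; the port, paths, and schema date are assumptions to adapt, not the exact values from loki.zip:

```yaml
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory        # single node; a cluster would use memberlist/etcd
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules

schema_config:
  configs:
    - from: 2022-01-01       # any date before your first ingested log
      store: boltdb-shipper
      object_store: filesystem
      schema: v12
      index:
        prefix: index_
        period: 24h
```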
For Aliyun OSS:
auth_enabled: false
promtail-config.yaml:
For log files:
server:
Recovering local files automatically:
...
config/pipeline_stages.yaml
- targets:
For Kafka:
server:
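This file is also truncated. A Kafka-driven Promtail config usually pairs a kafka scrape block with the push client; the broker address and topic below are placeholders taken from the test commands elsewhere in this article:

```yaml
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: kafka
    kafka:
      brokers:
        - 192.168.80.99:9192
      topics:
        - uat
      group_id: promtail
      labels:
        job: kafka-logs
    relabel_configs:
      # expose the source topic as a queryable label
      - source_labels: [__meta_kafka_topic]
        target_label: topic
```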
/etc/grafana/grafana.ini
...
fluent-bit/fluent-bit.conf
[SERVICE]
fluent-bit/parsers_multiline.conf (if needed)
[MULTILINE_PARSER]
Kafka docker-compose.yml
#https://segmentfault.com/a/1190000021746086
alertmanager-config.yaml
https://yunlzheng.gitbook.io/prometheus-book/parti-prometheus-ji-chu/alert/alert-manager-config
global:
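Only the first line of the Alertmanager config survived. A sketch of the usual shape, wired to the Email templates defined below; the receiver address and timings are assumptions:

```yaml
global:
  resolve_timeout: 5m

templates:
  - /etc/alertmanager/config/*.tmpl

route:
  receiver: default-email
  group_by: [alertname]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: default-email
    email_configs:
      - to: ops@example.com                      # placeholder address
        html: '{{ template "email.to.html" . }}' # template from config/Email.tmpl
```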
loki/rules/fake/rules.yaml
groups:
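The ruler file follows the Prometheus rule format, but with LogQL expressions. A hedged sketch of what such a group can contain; the threshold and label selector are illustrative, not the values from the original rules.yaml:

```yaml
groups:
  - name: fake
    rules:
      - alert: ApplicationErrors
        # fires when any application logs more than 10 ERROR lines/s over 5m
        expr: sum(rate({app_name=~".+"} |= "ERROR" [5m])) by (app_name) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: '{{ $labels.app_name }} is logging errors'
```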
config/WebCom-resolved.tmpl
{{ define "wechat.default.message" }}
config/WebCom.tmpl
{{ define "wechat.default.message" }}
config/Email-resolved.tmpl
{{ define "email.to.html" }}
config/Email.tmpl
{{ define "email.to.html" }}
Migrating Grafana:
https://www.jianshu.com/p/bc37e2fc15e7
Collection methods
There are two ways to collect logs:
1: fluent bit ---> kafka ---> promtail ---> loki
Recommended: fluent bit ---> kafka ---> promtail ---> loki
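In the recommended chain, Fluent Bit tails the files and publishes to Kafka, and Promtail consumes the topic. A minimal sketch of the Fluent Bit side; the path, broker, and topic are placeholders based on values appearing elsewhere in this article:

```
[INPUT]
    Name    tail
    Path    /works/log/*/*.log
    Tag     applogs

[OUTPUT]
    Name    kafka
    Match   applogs
    Brokers 192.168.80.99:9192
    Topics  uat
```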
docker-compose
Not recommended; for local study only.
#depend on Linux: https://grafana.com/docs/loki/latest/installation/docker/
The modified docker-compose.yaml is as follows:
docker-compose.yaml:
version: "3"
Starting:
#Starting:
Once it has started, you can check its status at the following URL:
http://localhost:3100/ready
The Grafana URL is http://localhost:3000/; the default account is admin/admin.
Grafana Configuration
env:
promtail
Promtail is an agent which ships the contents of local logs to a private Grafana Loki instance or Grafana Cloud. It is usually deployed to every machine that runs applications which need to be monitored.
More details: https://grafana.com/docs/loki/latest/clients/promtail/
Configuration
Promtail Config
All of the log-collection rules are configured in promtail-config.yaml:
scrape_configs:
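For file-based collection, the core of scrape_configs is a static_configs entry whose reserved __path__ label drives file discovery; Promtail then attaches a filename label to every line automatically. A sketch using the label scheme seen in this article's dashboard variables (belongs="company", paths under /works/log) as assumptions:

```yaml
scrape_configs:
  - job_name: applogs
    static_configs:
      - targets: [localhost]
        labels:
          belongs: company
          env: uat
          __path__: /works/log/**/*.log   # glob; matched files are tailed
```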
Notice:
If a multiline stage is configured, the joined entries won't appear as expected under the search labels; it conflicts with the regex stage. For example: suppose a "loglevel" field is extracted in a regex stage and an error entry spans multiple lines. Searching with {loglevel="ERROR"} will display only a single log line, not the whole multiline entry, even though the "loglevel" entry contains multiple lines.
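In practice the stage order matters: the multiline stage must run before the regex stage so that continuation lines are joined first. A sketch of such a pipeline; the timestamp pattern and field layout are assumptions about the log format:

```yaml
pipeline_stages:
  - multiline:
      # a new entry starts with a date; anything else is a continuation line
      firstline: '^\d{4}-\d{2}-\d{2}'
      max_wait_time: 3s
  - regex:
      # extract the level from the (already joined) first line
      expression: '^\S+ \S+\s+(?P<loglevel>\w+)'
  - labels:
      loglevel:
```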
Grafana Config
Grafana 6.0 and more recent versions have built-in support for Grafana Loki. Use Grafana 6.3 or a more recent version to take advantage of LogQL functionality.
Log into your Grafana instance. If this is your first time running Grafana, the username and password are both defaulted to admin.
In Grafana, go to Configuration > Data Sources via the cog icon on the left sidebar.
Click the big + Add data source button.
Choose Loki from the list.
The http URL field should be the address of your Loki server. For example, when running locally or with Docker using port mapping, the address is likely http://localhost:3100. When running with docker-compose or Kubernetes, the address is likely http://loki:3100.
To see the logs, click Explore on the sidebar, select the Loki datasource in the top-left dropdown, and then choose a log stream using the Log labels button.
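Instead of clicking through the UI, the same data source can be provisioned from a file under /etc/grafana/provisioning/datasources/; the in-cluster URL below assumes a service named loki:

```yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    isDefault: true
```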
Variables
Create a new dashboard named "Loki" (first time only), then open "Dashboard settings" (gear icon):
Env:
Query: label_values(env)
System:
Query: label_values({belongs="company", filename=~".*${env}.*"}, filename)
Regex: /works\/log\/.+?\/.+?\/(.+?)\/.*/
Hostname:
Query: label_values({belongs="company", filename=~".*${env}/${system}.*"}, hostname)
Filename:
Query: label_values({belongs="company", filename=~".*${env}/${system}.*", filename!~".*(?:error|tmlog).*"}, filename)
Regex: /.*\/(.+\.log)/
Search:
Log Panel
Log browser:
{env="${env}", app_name="${system}", hostname=~".*${hostname}.*", filename=~".*$(unknown).*"}|~"(?i)$search"
Kubernetes
Helm makes it easy to install Loki in a Kubernetes environment, but a customized configuration is recommended:
Notice: Strangely, the Kubernetes deployment could not collect the logs completely, so in the end Docker was used for the deployment.
Helm
Installing Helm:
#Linux:
Pulling repositories:
#https://grafana.com/docs/loki/latest/installation/helm/
Configure:
#Create PersistentVolume
Installation
Installing the relevant components:
cd /works/loki/
Uninstallation
Uninstalling the relevant components:
helm uninstall loki -n loki
Optimize
Troubleshooting
error: code = ResourceExhausted desc = trying to send message larger than max
429 Too Many Requests Ingestion rate limit exceeded
Maximum active stream limit exceeded
- https://izsk.me/2021/03/18/Loki-Prombles/
- https://www.bboy.app/2020/07/08/%E4%BD%BF%E7%94%A8loki%E8%BF%9B%E8%A1%8C%E6%97%A5%E5%BF%97%E6%94%B6%E9%9B%86/
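The three errors above usually point at server and ingestion limits. The options below are the ones that commonly need raising; the values are illustrative starting points, not tuned recommendations:

```yaml
# loki-config.yaml
limits_config:
  ingestion_rate_mb: 16              # "429 ... Ingestion rate limit exceeded"
  ingestion_burst_size_mb: 32
  max_global_streams_per_user: 10000 # "Maximum active stream limit exceeded"

server:
  grpc_server_max_recv_msg_size: 104857600  # "message larger than max"
  grpc_server_max_send_msg_size: 104857600
```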
Loki: Bad Request. 400. invalid query, through
Excessive number of files in the chunks directory
Slow data searches
This may be caused by inappropriately configured labels; use the following command to diagnose:
logcli series --analyze-labels '{app_name="hkcash-server"}'
You can read this article to see how to avoid the issue:
https://grafana.com/docs/loki/latest/best-practices/
Alertmanager
https://www.bilibili.com/read/cv17329220
Configuration backup
Loki Config: loki.zip
AlertManager Config: AlertManager.zip
Grafana Config: grafana.tgz
Reference
- https://grafana.com/docs/loki/latest/getting-started/get-logs-into-loki/
- https://grafana.com/docs/loki/latest/fundamentals/labels/
- https://grafana.com/docs/loki/latest/logql/log_queries/
- https://grafana.com/docs/loki/latest/clients/promtail/stages/multiline/
- https://grafana.com/docs/loki/latest/clients/promtail/stages/regex/
- https://github.com/google/re2/wiki/Syntax
- https://grafana.com/docs/grafana/latest/variables/
- https://grafana.com/docs/grafana/latest/datasources/loki/
- https://www.jianshu.com/p/474a5034a501
- https://www.jianshu.com/p/259a1d656745
- https://www.jianshu.com/p/672173b609f7
- https://www.cnblogs.com/ssgeek/p/11584870.html
- https://grafana.com/docs/loki/latest/installation/helm/
- https://blog.csdn.net/weixin_49366475/article/details/114384817
- https://blog.luxifan.com/blog/post/lucifer/1.%E5%88%9D%E8%AF%86Loki-%E4%B8%80
- https://blog.csdn.net/bluuusea/article/details/104619235
- https://blog.51cto.com/u_14205795/4561323
- https://www.cnblogs.com/punchlinux/p/17035742.html
- https://kebingzao.com/2022/11/29/prometheus-4-alertmanager/
- https://blog.csdn.net/wang7531838/article/details/107809870
- https://blog.51cto.com/u_12965094/2690336
- https://blog.csdn.net/qq_42883074/article/details/115544031
- http://www.mydlq.club/article/126/
- https://www.orchome.com/10106
- https://blog.51cto.com/u_14320361/2461666
- https://chenzhonzhou.github.io/2020/07/17/alertmanager-de-gao-jing-mo-ban/
- https://blog.csdn.net/weixin_44911287/article/details/124149964
- https://blog.csdn.net/easylife206/article/details/127581630
kubectl create ns zero-loki
kubectl -n zero-loki create configmap --from-file configmap/loki-config-cluster.yaml loki-config
kubectl -n zero-loki create configmap --from-file configmap/rules.yaml loki-rules
kubectl -n zero-loki describe configmap loki-config
kubectl -n zero-loki describe configmap loki-rules
kubectl -n zero-loki apply -f zero-loki.yml
kubectl -n zero-loki get po,svc -owide
#kubectl -n zero-loki logs -f loki-cluster-57777d6d6-vkbc5
#kubectl -n zero-loki describe po loki-cluster-57777d6d6-8tfgd
kubectl.exe -n zero-loki exec -it kafka-0 -- bash
kafka-topics.sh --create --zookeeper "zookeeper-headless:2181" --replication-factor 2 --partitions 3 --topic uat
kafka-console-producer.sh --broker-list "192.168.80.99:9192,192.168.80.99:9292,192.168.80.99:9392" --topic uat
kafka-console-consumer.sh --bootstrap-server "192.168.80.99:9192,192.168.80.99:9292,192.168.80.99:9392" --topic uat --from-beginning
kafka-topics.sh --list --zookeeper "zookeeper-headless:2181"
kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "192.168.80.99:9192,192.168.80.99:9292,192.168.80.99:9392" --topic uat
kubectl -n xpay-logs run -ti --rm centos-test --image=centos:7 --overrides='{"spec": { "nodeSelector": {"xpay-env": "logs"}}}'