Deploying an Elasticsearch Cluster with Ansible
1. Cluster Planning
1.1 Server List
IP | Hostname | RAM (GB) | CPU Cores | Disk | OS | CPU Arch | Notes |
---|---|---|---|---|---|---|---|
11.0.0.11 | arc-dev-dc01 | 8 | 1 | 500GB | CentOS 7.9.2009 | x86_64 | Ansible control node, no Internet access |
11.0.0.14 | arc-dev-dc04 | 16 | 2 | 500GB | CentOS 7.9.2009 | x86_64 | Managed node, no Internet access |
11.0.0.15 | arc-dev-dc05 | 16 | 2 | 500GB | CentOS 7.9.2009 | x86_64 | Managed node, no Internet access |
11.0.0.16 | arc-dev-dc06 | 16 | 2 | 500GB | CentOS 7.9.2009 | x86_64 | Managed node, no Internet access |
Notes:
- Every server has a static IP;
- The firewall is disabled on every server;
- SELinux is disabled on every server;
- Every server has an administrator user admin that can run sudo without a password;
- From arc-dev-dc01, the admin user can SSH to the other machines without a password;
- Ansible is already installed on arc-dev-dc01;
- All operations are performed as the admin user, and the ES cluster files are owned by admin;
- Time is synchronized across the servers.
To bring the machines up to these requirements, see the following articles:
- Installing a CentOS 7 virtual machine with VMware Workstation
- Baseline configuration of CentOS 7 in bulk with Ansible
1.2 Cluster Layout
IP | Hostname | Deployed Service/Role | ES/Kibana Version |
---|---|---|---|
11.0.0.11 | arc-dev-dc01 | Kibana | 7.17.29 |
11.0.0.14 | arc-dev-dc04 | ES, no dedicated roles | 7.17.29 |
11.0.0.15 | arc-dev-dc05 | ES, no dedicated roles | 7.17.29 |
11.0.0.16 | arc-dev-dc06 | ES, no dedicated roles | 7.17.29 |
2. Prepare the Packages
Elasticsearch downloads: https://www.elastic.co/downloads/past-releases#elasticsearch
Kibana downloads: https://www.elastic.co/downloads/past-releases#kibana
IK analyzer downloads: https://release.infinilabs.com/analysis-ik/stable/
Packages:
- elasticsearch-7.17.29-linux-x86_64.tar.gz
- kibana-7.17.29-linux-x86_64.tar.gz
- elasticsearch-analysis-ik-7.17.29.zip
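Elastic publishes a `.sha512` companion file next to each release artifact, so the downloads can be verified before they are carried into the offline environment. A sketch of the mechanism, demonstrated on a throwaway file (substitute the real tarball and its published `.sha512` file):

```shell
# Create a stand-in "tarball" and its checksum file, then verify it --
# the same two sha512sum calls apply to the real downloads above.
tmpd=$(mktemp -d)
cd "$tmpd"
echo 'pretend-this-is-the-tarball' > elasticsearch-7.17.29-linux-x86_64.tar.gz
sha512sum elasticsearch-7.17.29-linux-x86_64.tar.gz \
  > elasticsearch-7.17.29-linux-x86_64.tar.gz.sha512
# Verification: exits non-zero if the file was corrupted in transfer
sha512sum -c elasticsearch-7.17.29-linux-x86_64.tar.gz.sha512
```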
3. Ansible Files
3.1 Ansible Directory Layout
Note: on arc-dev-dc01, the base directory for running ansible commands is /home/admin/ansible. Download the three files (elasticsearch-7.17.29-linux-x86_64.tar.gz, kibana-7.17.29-linux-x86_64.tar.gz, elasticsearch-analysis-ik-7.17.29.zip) and upload them to the /home/admin/ansible/elastic/packages directory on arc-dev-dc01.
$ tree /home/admin/ansible
/home/admin/ansible
├── ansible.cfg
├── elastic
│ ├── packages
│ │ ├── elasticsearch-7.17.29-linux-x86_64.tar.gz
│ │ ├── elasticsearch-analysis-ik-7.17.29.zip
│ │ └── kibana-7.17.29-linux-x86_64.tar.gz
│ ├── playbook
│ │ ├── disable_swap.yml
│ │ ├── install_elasticsearch.yml
│ │ ├── install_kibana.yml
│ │ ├── limits.yml
│ │ └── sysctl.yml
│ └── templates
│ ├── elasticsearch.yml.j2
│ ├── es-manage.sh
│ ├── kibana.service.j2
│ ├── kibana.yml.j2
│ └── log4j2.properties
└── hosts
3.2 ansible.cfg
[defaults]
inventory=./hosts
host_key_checking=False
3.3 hosts
[kibana]
arc-dev-dc01

[elasticsearch]
arc-dev-dc04
arc-dev-dc05
arc-dev-dc06
3.4 disable_swap.yml
---
- name: Disable swap on all cluster nodes
  hosts: elasticsearch
  gather_facts: false
  become: yes
  tasks:
    - name: Turn off all swap immediately
      command: swapoff -a
      register: swapoff_result
      changed_when: swapoff_result.rc == 0

    - name: Backup fstab before modifying
      copy:
        src: /etc/fstab
        dest: "/etc/fstab.backup"
        remote_src: yes   # copy the remote host's fstab, not the control node's
        owner: root
        group: root
        mode: 0644

    - name: Comment out swap entries in fstab to disable on boot
      lineinfile:
        path: /etc/fstab
        regexp: '^([^#].*\s+swap\s+.*)$'
        line: '# \1'
        backrefs: yes

    - name: Check runtime swap status
      command: swapon --summary
      register: swap_status
      changed_when: false
      failed_when: false

    - name: Show runtime swap status
      debug:
        msg: >
          Runtime swap is {{ 'disabled' if swap_status.stdout == '' else 'ENABLED' }}

    - name: Check if /etc/fstab contains swap entry
      command: awk '!/^#/ && $3=="swap" {print}' /etc/fstab
      register: fstab_swap
      changed_when: false
      failed_when: false

    - name: Show fstab swap status
      debug:
        msg: >
          fstab swap entry is {{ '(NOT disabled permanently)' if fstab_swap.stdout != '' else '(disabled permanently)' }}
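The lineinfile task comments out fstab swap lines via a backreference; the equivalent edit as a one-off sed, shown against a throwaway sample fstab (the sample content is illustrative, not from a real machine):

```shell
# Build a two-line sample fstab: one root mount, one swap entry
sample_fstab=$(mktemp)
cat > "$sample_fstab" <<'EOF'
/dev/mapper/centos-root /     xfs  defaults 0 0
/dev/mapper/centos-swap swap  swap defaults 0 0
EOF
# Same pattern as the playbook's regexp: non-comment lines containing " swap "
sed -i -E 's/^([^#].*[[:space:]]swap[[:space:]].*)$/# \1/' "$sample_fstab"
cat "$sample_fstab"
```

Only the swap line is commented; the root mount is left untouched, and re-running the sed is a no-op because commented lines no longer match.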
3.5 limits.yml
---
- hosts: elasticsearch
  gather_facts: false
  become: yes
  vars:
    target_user: admin
    limits_file: "/etc/security/limits.d/99-limits-elasticsearch.conf"
    limits_config:
      - { domain: "{{ target_user }}", type: "soft", item: "nofile", value: "65535" }
      - { domain: "{{ target_user }}", type: "hard", item: "nofile", value: "65535" }
      - { domain: "{{ target_user }}", type: "soft", item: "nproc", value: "4096" }
      - { domain: "{{ target_user }}", type: "hard", item: "nproc", value: "4096" }
      - { domain: "{{ target_user }}", type: "soft", item: "fsize", value: "unlimited" }
      - { domain: "{{ target_user }}", type: "hard", item: "fsize", value: "unlimited" }
      - { domain: "{{ target_user }}", type: "soft", item: "as", value: "unlimited" }
      - { domain: "{{ target_user }}", type: "hard", item: "as", value: "unlimited" }
      - { domain: "{{ target_user }}", type: "soft", item: "memlock", value: "unlimited" }
      - { domain: "{{ target_user }}", type: "hard", item: "memlock", value: "unlimited" }
  tasks:
    - name: Ensure limits file exists
      file:
        path: "{{ limits_file }}"
        state: touch
        mode: '0644'

    - name: Truncate limits file
      copy:
        content: ""
        dest: "{{ limits_file }}"
        owner: root
        group: root
        mode: '0644'

    - name: Configure limits (idempotent replace or append)
      lineinfile:
        path: "{{ limits_file }}"
        regexp: "^{{ item.domain }}\\s+{{ item.type }}\\s+{{ item.item }}\\s+.*$"
        line: "{{ item.domain }} {{ item.type }} {{ item.item }} {{ item.value }}"
        state: present
      loop: "{{ limits_config }}"

    - name: Show the contents of {{ limits_file }}
      command: cat {{ limits_file }}
      register: limits_content
      changed_when: false

    - name: Display {{ limits_file }} content
      debug:
        msg: "{{ limits_content.stdout_lines }}"

    - name: Verify ulimit for {{ target_user }}
      shell: su - {{ target_user }} -c "ulimit -a | egrep '\-n|\-u|\-f|\-v|\-l'"
      register: ulimit_output
      changed_when: false

    - name: Display ulimit for {{ target_user }}
      debug:
        msg: "{{ ulimit_output.stdout_lines }}"
3.6 sysctl.yml
---
- hosts: elasticsearch
  gather_facts: false
  become: yes
  vars:
    sysctl_config_file: /etc/sysctl.conf
    sysctl_params:
      vm.max_map_count: 262144
      net.ipv4.tcp_retries2: 5
  tasks:
    - name: Ensure sysctl.conf parameters
      lineinfile:
        path: '{{ sysctl_config_file }}'
        regexp: '^{{ item.key }}\s*='
        line: '{{ item.key }}={{ item.value }}'
        state: present
      loop: "{{ sysctl_params | dict2items }}"

    - name: Apply sysctl params
      command: sysctl -p

    - name: Show modified sysctl.conf lines
      shell: "grep -E '^({{ sysctl_params.keys() | join('|') }})' {{ sysctl_config_file }}"
      register: sysctl_conf_check

    - name: Print modified sysctl.conf lines
      debug:
        msg: "{{ sysctl_conf_check.stdout_lines }}"
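The lineinfile loop replaces a `key = value` line if the key already exists and appends it otherwise. The same logic sketched in plain shell against a temp copy, so the behavior can be checked without touching /etc/sysctl.conf:

```shell
# Temp stand-in for /etc/sysctl.conf with one pre-existing, unrelated setting
tmpconf=$(mktemp)
echo 'vm.swappiness = 30' > "$tmpconf"
for kv in 'vm.max_map_count=262144' 'net.ipv4.tcp_retries2=5'; do
  key=${kv%%=*}
  if grep -q "^${key}[[:space:]]*=" "$tmpconf"; then
    # key present: rewrite the whole line (what lineinfile's regexp does)
    sed -i "s|^${key}[[:space:]]*=.*|${kv}|" "$tmpconf"
  else
    # key absent: append
    echo "$kv" >> "$tmpconf"
  fi
done
cat "$tmpconf"
```

Running the loop a second time changes nothing, which is the idempotency the playbook relies on.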
3.7 install_elasticsearch.yml
---
- name: Deploy Elasticsearch Cluster
  hosts: elasticsearch
  become: yes
  gather_facts: true
  vars:
    es_version: "7.17.29"
    es_user: "admin"
    es_group: "admin"
    es_install_dir: "/opt/app/elasticsearch-7.17.29"
    es_data_dir: "/data/elasticsearch/data"
    es_log_dir: "/data/elasticsearch/log"
    es_heap_size: "4g"
    es_http_port: 9200
    es_pid_file: "/opt/app/elasticsearch-7.17.29/pid"
    es_tarball: "/home/admin/ansible/elastic/packages/elasticsearch-7.17.29-linux-x86_64.tar.gz"
    ik_plugin_tarball: "/home/admin/ansible/elastic/packages/elasticsearch-analysis-ik-7.17.29.zip"
    ik_plugin_name: "elasticsearch-analysis-ik-7.17.29.zip"
    cluster_name: "mycluster"
  tasks:
    - name: Kill Elasticsearch process
      shell: ps -ef | grep -i elasticsearch | grep -v grep | awk '{print $2}' | xargs kill -9
      ignore_errors: yes   # no-op on a clean host where nothing matches

    - name: Clean Elasticsearch directories
      file:
        path: "{{ item }}"
        state: absent
      loop:
        - "{{ es_install_dir }}"
        - "{{ es_data_dir }}"
        - "{{ es_log_dir }}"

    - name: Ensure Elasticsearch directory exists
      file:
        path: "{{ item }}"
        state: directory
        owner: "{{ es_user }}"
        group: "{{ es_group }}"
        mode: '0755'
      loop:
        - "{{ es_install_dir }}"
        - "{{ es_data_dir }}"
        - "{{ es_log_dir }}"

    - name: Extract Elasticsearch tarball
      unarchive:
        src: "{{ es_tarball }}"
        dest: "{{ es_install_dir }}"
        remote_src: no
        owner: "{{ es_user }}"
        group: "{{ es_group }}"
        extra_opts: [--strip-components=1]

    - name: Ensure IK plugin directory exists on ES nodes
      file:
        path: "{{ es_install_dir }}/ik_plugin"
        state: directory
        owner: "{{ es_user }}"
        group: "{{ es_group }}"
        mode: '0755'

    - name: Copy IK plugin to ES node
      copy:
        src: "{{ ik_plugin_tarball }}"
        dest: "{{ es_install_dir }}/ik_plugin/"
        owner: "{{ es_user }}"
        group: "{{ es_group }}"
        mode: '0644'

    - name: Install IK plugin (offline)
      command: "{{ es_install_dir }}/bin/elasticsearch-plugin install file://{{ es_install_dir }}/ik_plugin/{{ ik_plugin_name }} --batch"
      become: false

    - name: Clean IK plugin tmp directories
      file:
        path: "{{ es_install_dir }}/ik_plugin"
        state: absent

    - name: Generate Elasticsearch CA certificate
      command: >
        {{ es_install_dir }}/bin/elasticsearch-certutil ca
        --out {{ es_install_dir }}/config/elastic-stack-ca.p12
        --pass ""
      delegate_to: "{{ groups['elasticsearch'][0] }}"
      run_once: true
      become: false

    - name: Generate node certificate
      command: >
        {{ es_install_dir }}/bin/elasticsearch-certutil cert
        --ca {{ es_install_dir }}/config/elastic-stack-ca.p12
        --out {{ es_install_dir }}/config/elastic-certificates.p12
        --pass "" --ca-pass ""
      delegate_to: "{{ groups['elasticsearch'][0] }}"
      run_once: true
      become: false

    - name: Fetch certificates from elasticsearch[0] to control node
      fetch:
        src: "{{ es_install_dir }}/config/{{ item }}"
        dest: "/tmp/es_certs/"
        flat: yes
      loop:
        - "elastic-stack-ca.p12"
        - "elastic-certificates.p12"
      delegate_to: "{{ groups['elasticsearch'][0] }}"
      run_once: true

    - name: Copy certificates from control node to all ES nodes
      copy:
        src: "/tmp/es_certs/{{ item }}"
        dest: "{{ es_install_dir }}/config/{{ item }}"
        owner: "{{ es_user }}"
        group: "{{ es_group }}"
        mode: '0644'
      loop:
        - "elastic-stack-ca.p12"
        - "elastic-certificates.p12"

    - name: Delete certificates from control node
      file:
        path: "/tmp/es_certs/{{ item }}"
        state: absent
      delegate_to: localhost
      loop:
        - "elastic-stack-ca.p12"
        - "elastic-certificates.p12"

    - name: Backup config file
      copy:
        src: "{{ es_install_dir }}/config/{{ item }}"
        dest: "{{ es_install_dir }}/config/{{ item }}.{{ ansible_date_time.iso8601 }}"
        remote_src: yes
        owner: "{{ es_user }}"
        group: "{{ es_group }}"
        mode: '0644'
      loop:
        - "elasticsearch.yml"
        - "jvm.options"
        - "log4j2.properties"

    - name: Config jvm.options (heap min)
      lineinfile:
        path: "{{ es_install_dir }}/config/jvm.options"
        regexp: '^-Xms.*'
        line: "-Xms{{ es_heap_size }}"
        insertafter: EOF

    - name: Config jvm.options (heap max)
      lineinfile:
        path: "{{ es_install_dir }}/config/jvm.options"
        regexp: '^-Xmx.*'
        line: "-Xmx{{ es_heap_size }}"
        insertafter: EOF

    - name: Replace error log path in jvm.options
      replace:
        path: "{{ es_install_dir }}/config/jvm.options"
        regexp: '^-XX:ErrorFile=logs/hs_err_pid%p\.log'
        replace: "-XX:ErrorFile={{ es_log_dir }}/hs_err_pid%p.log"

    - name: Replace JDK 8 GC log path in jvm.options
      replace:
        path: "{{ es_install_dir }}/config/jvm.options"
        regexp: '^8:-Xloggc:logs/gc\.log'
        replace: "8:-Xloggc:{{ es_log_dir }}/gc.log"

    - name: Replace JDK 9+ GC log path in jvm.options
      replace:
        path: "{{ es_install_dir }}/config/jvm.options"
        regexp: '^9-:-Xlog:gc\*,gc\+age=trace,safepoint:file=logs/gc\.log:utctime,pid,tags:filecount=32,filesize=64m'
        replace: "9-:-Xlog:gc*,gc+age=trace,safepoint:file={{ es_log_dir }}/gc.log:utctime,pid,tags:filecount=32,filesize=64m"

    - name: Configure elasticsearch.yml
      template:
        src: ../templates/elasticsearch.yml.j2
        dest: "{{ es_install_dir }}/config/elasticsearch.yml"
        owner: "{{ es_user }}"
        group: "{{ es_group }}"
        mode: '0644'

    - name: Configure log4j2.properties
      copy:
        src: "../templates/log4j2.properties"
        dest: "{{ es_install_dir }}/config/log4j2.properties"
        remote_src: no
        owner: "{{ es_user }}"
        group: "{{ es_group }}"
        mode: '0644'

    - name: Start Elasticsearch
      shell: "nohup bin/elasticsearch -p {{ es_pid_file }} &"
      args:
        chdir: "{{ es_install_dir }}"
      become: false
      ignore_errors: yes

    - name: Wait for ES HTTP port to be available
      wait_for:
        port: "{{ es_http_port }}"
        host: "{{ inventory_hostname }}"
        delay: 5
        timeout: 60
3.8 elasticsearch.yml.j2
cluster.name: {{ cluster_name }}
node.name: {{ inventory_hostname }}
#node.attr.rack: r1
path.data: {{ es_data_dir }}
path.logs: {{ es_log_dir }}
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: {{ es_http_port }}
discovery.seed_hosts: {{ groups['elasticsearch'] | list | to_json }}
cluster.initial_master_nodes: {{ groups['elasticsearch'] | list | to_json }}
#action.destructive_requires_name: true
ingest.geoip.downloader.enabled: false
xpack.monitoring.collection.enabled: true
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
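With the inventory in section 3.3 (`[elasticsearch]` containing arc-dev-dc04/05/06), the two discovery lines in this template render to:

```yaml
discovery.seed_hosts: ["arc-dev-dc04", "arc-dev-dc05", "arc-dev-dc06"]
cluster.initial_master_nodes: ["arc-dev-dc04", "arc-dev-dc05", "arc-dev-dc06"]
```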
3.9 log4j2.properties
status = error

appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %m%n

######## Server - old style pattern ###########
appender.rolling_old.type = RollingFile
appender.rolling_old.name = rolling_old
appender.rolling_old.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}.log
appender.rolling_old.layout.type = PatternLayout
appender.rolling_old.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %m%n
appender.rolling_old.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}-%d{yyyy-MM-dd}-%i.log.gz
appender.rolling_old.policies.type = Policies
appender.rolling_old.policies.time.type = TimeBasedTriggeringPolicy
appender.rolling_old.policies.time.interval = 1
appender.rolling_old.policies.time.modulate = true
appender.rolling_old.strategy.type = DefaultRolloverStrategy
appender.rolling_old.strategy.max = 7
################################################

rootLogger.level = info
rootLogger.appenderRef.console.ref = console
rootLogger.appenderRef.rolling_old.ref = rolling_old

appender.header_warning.type = HeaderWarningAppender
appender.header_warning.name = header_warning
######## Deprecation - old style pattern #######
appender.deprecation_rolling_old.type = RollingFile
appender.deprecation_rolling_old.name = deprecation_rolling_old
appender.deprecation_rolling_old.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_deprecation.log
appender.deprecation_rolling_old.layout.type = PatternLayout
appender.deprecation_rolling_old.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name] [%product_origin]%marker %m%n
appender.deprecation_rolling_old.filter.rate_limit.type = RateLimitingFilter
appender.deprecation_rolling_old.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_deprecation-%i.log.gz
appender.deprecation_rolling_old.policies.type = Policies
appender.deprecation_rolling_old.policies.size.type = SizeBasedTriggeringPolicy
appender.deprecation_rolling_old.policies.size.size = 1GB
appender.deprecation_rolling_old.strategy.type = DefaultRolloverStrategy
appender.deprecation_rolling_old.strategy.max = 4
#################################################
logger.deprecation.name = org.elasticsearch.deprecation
logger.deprecation.level = WARN
logger.deprecation.appenderRef.deprecation_rolling_old.ref = deprecation_rolling_old
logger.deprecation.appenderRef.header_warning.ref = header_warning
logger.deprecation.additivity = false

######## Search slowlog - old style pattern ####
appender.index_search_slowlog_rolling_old.type = RollingFile
appender.index_search_slowlog_rolling_old.name = index_search_slowlog_rolling_old
appender.index_search_slowlog_rolling_old.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_search_slowlog.log
appender.index_search_slowlog_rolling_old.layout.type = PatternLayout
appender.index_search_slowlog_rolling_old.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %m%n
appender.index_search_slowlog_rolling_old.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_search_slowlog-%i.log.gz
appender.index_search_slowlog_rolling_old.policies.type = Policies
appender.index_search_slowlog_rolling_old.policies.size.type = SizeBasedTriggeringPolicy
appender.index_search_slowlog_rolling_old.policies.size.size = 1GB
appender.index_search_slowlog_rolling_old.strategy.type = DefaultRolloverStrategy
appender.index_search_slowlog_rolling_old.strategy.max = 4
#################################################
logger.index_search_slowlog_rolling.name = index.search.slowlog
logger.index_search_slowlog_rolling.level = trace
logger.index_search_slowlog_rolling.appenderRef.index_search_slowlog_rolling_old.ref = index_search_slowlog_rolling_old
logger.index_search_slowlog_rolling.additivity = false

######## Indexing slowlog - old style pattern ##
appender.index_indexing_slowlog_rolling_old.type = RollingFile
appender.index_indexing_slowlog_rolling_old.name = index_indexing_slowlog_rolling_old
appender.index_indexing_slowlog_rolling_old.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_indexing_slowlog.log
appender.index_indexing_slowlog_rolling_old.layout.type = PatternLayout
appender.index_indexing_slowlog_rolling_old.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] [%node_name]%marker %m%n
appender.index_indexing_slowlog_rolling_old.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_indexing_slowlog-%i.log.gz
appender.index_indexing_slowlog_rolling_old.policies.type = Policies
appender.index_indexing_slowlog_rolling_old.policies.size.type = SizeBasedTriggeringPolicy
appender.index_indexing_slowlog_rolling_old.policies.size.size = 1GB
appender.index_indexing_slowlog_rolling_old.strategy.type = DefaultRolloverStrategy
appender.index_indexing_slowlog_rolling_old.strategy.max = 4
#################################################

logger.index_indexing_slowlog.name = index.indexing.slowlog.index
logger.index_indexing_slowlog.level = trace
logger.index_indexing_slowlog.appenderRef.index_indexing_slowlog_rolling_old.ref = index_indexing_slowlog_rolling_old
logger.index_indexing_slowlog.additivity = false

appender.audit_rolling.type = RollingFile
appender.audit_rolling.name = audit_rolling
appender.audit_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_audit.json
appender.audit_rolling.layout.type = PatternLayout
appender.audit_rolling.layout.pattern = {\
                "type":"audit", \
                "timestamp":"%d{yyyy-MM-dd'T'HH:mm:ss,SSSZ}"\
                %varsNotEmpty{, "node.name":"%enc{%map{node.name}}{JSON}"}\
                %varsNotEmpty{, "node.id":"%enc{%map{node.id}}{JSON}"}\
                %varsNotEmpty{, "host.name":"%enc{%map{host.name}}{JSON}"}\
                %varsNotEmpty{, "host.ip":"%enc{%map{host.ip}}{JSON}"}\
                %varsNotEmpty{, "event.type":"%enc{%map{event.type}}{JSON}"}\
                %varsNotEmpty{, "event.action":"%enc{%map{event.action}}{JSON}"}\
                %varsNotEmpty{, "authentication.type":"%enc{%map{authentication.type}}{JSON}"}\
                %varsNotEmpty{, "user.name":"%enc{%map{user.name}}{JSON}"}\
                %varsNotEmpty{, "user.run_by.name":"%enc{%map{user.run_by.name}}{JSON}"}\
                %varsNotEmpty{, "user.run_as.name":"%enc{%map{user.run_as.name}}{JSON}"}\
                %varsNotEmpty{, "user.realm":"%enc{%map{user.realm}}{JSON}"}\
                %varsNotEmpty{, "user.run_by.realm":"%enc{%map{user.run_by.realm}}{JSON}"}\
                %varsNotEmpty{, "user.run_as.realm":"%enc{%map{user.run_as.realm}}{JSON}"}\
                %varsNotEmpty{, "user.roles":%map{user.roles}}\
                %varsNotEmpty{, "apikey.id":"%enc{%map{apikey.id}}{JSON}"}\
                %varsNotEmpty{, "apikey.name":"%enc{%map{apikey.name}}{JSON}"}\
                %varsNotEmpty{, "authentication.token.name":"%enc{%map{authentication.token.name}}{JSON}"}\
                %varsNotEmpty{, "authentication.token.type":"%enc{%map{authentication.token.type}}{JSON}"}\
                %varsNotEmpty{, "origin.type":"%enc{%map{origin.type}}{JSON}"}\
                %varsNotEmpty{, "origin.address":"%enc{%map{origin.address}}{JSON}"}\
                %varsNotEmpty{, "realm":"%enc{%map{realm}}{JSON}"}\
                %varsNotEmpty{, "url.path":"%enc{%map{url.path}}{JSON}"}\
                %varsNotEmpty{, "url.query":"%enc{%map{url.query}}{JSON}"}\
                %varsNotEmpty{, "request.method":"%enc{%map{request.method}}{JSON}"}\
                %varsNotEmpty{, "request.body":"%enc{%map{request.body}}{JSON}"}\
                %varsNotEmpty{, "request.id":"%enc{%map{request.id}}{JSON}"}\
                %varsNotEmpty{, "action":"%enc{%map{action}}{JSON}"}\
                %varsNotEmpty{, "request.name":"%enc{%map{request.name}}{JSON}"}\
                %varsNotEmpty{, "indices":%map{indices}}\
                %varsNotEmpty{, "opaque_id":"%enc{%map{opaque_id}}{JSON}"}\
                %varsNotEmpty{, "trace.id":"%enc{%map{trace.id}}{JSON}"}\
                %varsNotEmpty{, "x_forwarded_for":"%enc{%map{x_forwarded_for}}{JSON}"}\
                %varsNotEmpty{, "transport.profile":"%enc{%map{transport.profile}}{JSON}"}\
                %varsNotEmpty{, "rule":"%enc{%map{rule}}{JSON}"}\
                %varsNotEmpty{, "put":%map{put}}\
                %varsNotEmpty{, "delete":%map{delete}}\
                %varsNotEmpty{, "change":%map{change}}\
                %varsNotEmpty{, "create":%map{create}}\
                %varsNotEmpty{, "invalidate":%map{invalidate}}\
                }%n
appender.audit_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_audit-%d{yyyy-MM-dd}.json
appender.audit_rolling.policies.type = Policies
appender.audit_rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.audit_rolling.policies.time.interval = 1
appender.audit_rolling.policies.time.modulate = true
appender.audit_rolling.strategy.type = DefaultRolloverStrategy
appender.audit_rolling.strategy.max = 30

logger.xpack_security_audit_logfile.name = org.elasticsearch.xpack.security.audit.logfile.LoggingAuditTrail
logger.xpack_security_audit_logfile.level = info
logger.xpack_security_audit_logfile.appenderRef.audit_rolling.ref = audit_rolling
logger.xpack_security_audit_logfile.additivity = false

logger.xmlsig.name = org.apache.xml.security.signature.XMLSignature
logger.xmlsig.level = error
logger.samlxml_decrypt.name = org.opensaml.xmlsec.encryption.support.Decrypter
logger.samlxml_decrypt.level = fatal
logger.saml2_decrypt.name = org.opensaml.saml.saml2.encryption.Decrypter
logger.saml2_decrypt.level = fatal
3.10 kibana.service.j2
[Unit]
Description=Kibana
Documentation=https://www.elastic.co
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User={{ kibana_user }}
Group={{ kibana_group }}

Environment=KBN_HOME={{ kibana_install_dir }}
Environment=KBN_PATH_CONF={{ kibana_install_dir }}/config
Environment=RESTART_ON_UPGRADE=false

ExecStart={{ kibana_install_dir }}/bin/kibana --logging.dest="{{ kibana_log_file }}" --pid.file="{{ kibana_install_dir }}/kibana.pid" --deprecation.skip_deprecated_settings[0]="logging.dest"

Restart=on-failure
RestartSec=3
StartLimitBurst=3
StartLimitInterval=60

WorkingDirectory={{ kibana_install_dir }}

StandardOutput=journal
StandardError=inherit

[Install]
WantedBy=multi-user.target
3.11 kibana.yml.j2
server.port: {{ kibana_port }}
server.host: "{{ inventory_hostname }}"
server.name: "{{ inventory_hostname }}"
#server.basePath: "/kibana"
#server.rewriteBasePath: true
#server.publicBaseUrl: "http://{{ inventory_hostname }}:{{ kibana_port }}/kibana"
elasticsearch.hosts: {{ elasticsearch_hosts | list | to_json }}
elasticsearch.username: "kibana_system"
elasticsearch.password: "{{ kibana_system_password }}"
i18n.locale: "zh-CN"
pid.file: "{{ kibana_install_dir }}/kibana.pid"
path.data: "{{ kibana_data_dir }}"
monitoring.ui.enabled: true
logging:
  root:
    appenders: [default, file]
    level: info
  appenders:
    file:
      type: file
      fileName: {{ kibana_log_file }}
      layout:
        type: pattern
        pattern: "[%date][%level] %message"
3.12 install_kibana.yml
---
- name: Deploy Kibana
  hosts: kibana
  become: yes
  gather_facts: true
  vars:
    kibana_user: "admin"
    kibana_group: "admin"
    kibana_port: 5601
    kibana_data_dir: "/data/kibana/data"
    kibana_log_dir: "/data/kibana/log"
    kibana_log_file: "/data/kibana/log/kibana.log"
    kibana_install_dir: "/opt/app/kibana-7.17.29"
    kibana_tarball: "/home/admin/ansible/elastic/packages/kibana-7.17.29-linux-x86_64.tar.gz"
    logrotate_max_size: 100M
    logrotate_keep: 10
    kibana_system_password: "your_password"
    elasticsearch_hosts:
      - "http://arc-dev-dc04:9200"
      - "http://arc-dev-dc05:9200"
      - "http://arc-dev-dc06:9200"
  tasks:
    - name: Stop Kibana and ignore errors
      systemd:
        name: kibana
        state: stopped
      ignore_errors: yes

    - name: Clean Kibana directories
      file:
        path: "{{ item }}"
        state: absent
      loop:
        - "{{ kibana_install_dir }}"
        - "{{ kibana_data_dir }}"
        - "{{ kibana_log_dir }}"

    - name: Ensure Kibana directory exists
      file:
        path: "{{ item }}"
        state: directory
        owner: "{{ kibana_user }}"
        group: "{{ kibana_group }}"
        mode: '0755'
      loop:
        - "{{ kibana_install_dir }}"
        - "{{ kibana_data_dir }}"
        - "{{ kibana_log_dir }}"

    - name: Extract Kibana tarball
      unarchive:
        src: "{{ kibana_tarball }}"
        dest: "{{ kibana_install_dir }}"
        remote_src: no
        owner: "{{ kibana_user }}"
        group: "{{ kibana_group }}"
        extra_opts: [--strip-components=1]

    - name: Backup kibana.yml
      copy:
        src: "{{ kibana_install_dir }}/config/kibana.yml"
        dest: "{{ kibana_install_dir }}/config/kibana.yml.{{ ansible_date_time.date }}.{{ ansible_date_time.hour }}{{ ansible_date_time.minute }}"
        remote_src: true
        owner: "{{ kibana_user }}"
        group: "{{ kibana_group }}"
        mode: '0644'

    - name: Configure kibana.yml
      template:
        src: ../templates/kibana.yml.j2
        dest: "{{ kibana_install_dir }}/config/kibana.yml"
        owner: "{{ kibana_user }}"
        group: "{{ kibana_group }}"
        mode: '0644'

    - name: Install kibana.service unit
      template:
        src: ../templates/kibana.service.j2
        dest: "/etc/systemd/system/kibana.service"
        mode: '0664'

    - name: Ensure Kibana service is started and enabled
      systemd:
        name: kibana
        state: started
        enabled: yes
        daemon_reload: yes   # pick up the freshly written unit file

    - name: Create logrotate configuration for Kibana
      copy:
        dest: /etc/logrotate.d/kibana
        content: |
          {{ kibana_log_file }} {
              size {{ logrotate_max_size }}
              rotate {{ logrotate_keep }}
              compress
              missingok
              notifempty
              copytruncate
              create 0640 {{ kibana_user }} {{ kibana_group }}
          }
        owner: root
        group: root
        mode: '0644'
3.13 es-manage.sh
#!/bin/bash
# es-manage.sh - manage the Elasticsearch cluster
# Deployed on arc-dev-dc01
# Usage: sh es-manage.sh {start|stop|restart|status}

# ---------------- Configuration ----------------
NODES=("arc-dev-dc04" "arc-dev-dc05" "arc-dev-dc06")
ES_PORT=9200
ES_HOME="/opt/app/elasticsearch-7.17.29"
PID_FILE="$ES_HOME/pid"
ES_USER="admin"
# elastic superuser credentials -- replace with the password from section 5.2
ES_AUTH="elastic:ytFfhVIoQTgTWAePiICX"

# ---------------- Colors and symbols ----------------
GREEN="\e[32m✔\e[0m"
RED="\e[31m✘\e[0m"
YELLOW="\e[33m⚠\e[0m"

# ---------------- Helper functions ----------------
check_node_up() {
    local node=$1
    curl -s -u "$ES_AUTH" "http://$node:$ES_PORT/_cluster/health" | grep -q '"status"'
}

wait_for_node_up() {
    local node=$1
    local retry=30
    for i in $(seq 1 $retry); do
        if check_node_up "$node"; then
            echo -e "$GREEN $node is up and available"
            return 0
        fi
        sleep 2
    done
    echo -e "$RED $node failed to start or is unavailable"
    return 1
}

wait_for_node_down() {
    local node=$1
    local retry=30
    for i in $(seq 1 $retry); do
        if ! check_node_up "$node"; then
            echo -e "$GREEN $node is stopped"
            return 0
        fi
        sleep 2
    done
    echo -e "$RED $node failed to stop"
    return 1
}

# ---------------- Actions ----------------
start_cluster() {
    for node in "${NODES[@]}"; do
        if check_node_up "$node"; then
            echo -e "$YELLOW $node is already running, nothing to do"
        else
            echo -e "Starting $node ..."
            ssh ${ES_USER}@$node "nohup $ES_HOME/bin/elasticsearch -d -p $PID_FILE > /dev/null 2>&1"
        fi
    done
    for node in "${NODES[@]}"; do
        wait_for_node_up "$node"
    done
}

stop_cluster() {
    for node in "${NODES[@]}"; do
        if ! check_node_up "$node"; then
            echo -e "$YELLOW $node is already stopped"
        else
            echo -e "Stopping $node ..."
            ssh ${ES_USER}@$node "pkill -F $PID_FILE || true"
            wait_for_node_down "$node"
        fi
    done
}

restart_cluster() {
    for node in "${NODES[@]}"; do
        if check_node_up "$node"; then
            echo -e "Stopping $node ..."
            ssh ${ES_USER}@$node "pkill -F $PID_FILE || true"
            wait_for_node_down "$node"
        else
            echo -e "$YELLOW $node is already stopped"
        fi
        echo -e "Starting $node ..."
        ssh ${ES_USER}@$node "nohup $ES_HOME/bin/elasticsearch -d -p $PID_FILE > /dev/null 2>&1"
        wait_for_node_up "$node"
    done
}

status_cluster() {
    for node in "${NODES[@]}"; do
        if check_node_up "$node"; then
            echo -e "$GREEN $node is running"
        else
            echo -e "$RED $node is NOT running"
        fi
    done
}

# ---------------- Main ----------------
case "$1" in
    start)   start_cluster ;;
    stop)    stop_cluster ;;
    restart) restart_cluster ;;
    status)  status_cluster ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        exit 1
        ;;
esac
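check_node_up() treats any health response containing `"status"` as "up". That parsing can be exercised locally with a canned response body (the sample JSON is illustrative):

```shell
# A healthy _cluster/health response contains a "status" field
sample='{"cluster_name":"mycluster","status":"green","number_of_nodes":3}'
if echo "$sample" | grep -q '"status"'; then
  echo "node considered up"
fi
# A connection failure produces no JSON at all, so the grep fails
echo 'curl: (7) Failed to connect' | grep -q '"status"' || echo "node considered down"
```

One caveat worth knowing: an Elasticsearch error body (e.g. a 401 on wrong credentials) can also contain a `"status"` field, so this check is a liveness probe, not an authentication check.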
4. System Configuration
4.1 Disable the Swap Partition
Run on arc-dev-dc01.
Run:
$ pwd
/home/admin/ansible
$ ansible-playbook elastic/playbook/disable_swap.yml

# If the two debug tasks report as below, swap is disabled
TASK [Show runtime swap status] ***********************...
ok: [arc-dev-dc04] => {"msg": "Runtime swap is disabled\n"
}
ok: [arc-dev-dc05] => {"msg": "Runtime swap is disabled\n"
}
ok: [arc-dev-dc06] => {"msg": "Runtime swap is disabled\n"
}

TASK [Show fstab swap status] *************************...
ok: [arc-dev-dc04] => {"msg": "fstab swap entry is (disabled permanently)\n"
}
ok: [arc-dev-dc05] => {"msg": "fstab swap entry is (disabled permanently)\n"
}
ok: [arc-dev-dc06] => {"msg": "fstab swap entry is (disabled permanently)\n"
}
4.2 Adjust User Resource Limits
Run on arc-dev-dc01.
$ pwd
/home/admin/ansible
$ ansible-playbook elastic/playbook/limits.yml

# The final tasks print the verification result
TASK [Display ulimit for admin] ***************************...
ok: [arc-dev-dc01] => {"msg": ["file size (blocks, -f) unlimited", "max locked memory (kbytes, -l) unlimited", "open files (-n) 65535", "max user processes (-u) 4096", "virtual memory (kbytes, -v) unlimited"]
}
ok: [arc-dev-dc04] => {"msg": ["file size (blocks, -f) unlimited", "max locked memory (kbytes, -l) unlimited", "open files (-n) 65535", "max user processes (-u) 4096", "virtual memory (kbytes, -v) unlimited"]
}
ok: [arc-dev-dc05] => {"msg": ["file size (blocks, -f) unlimited", "max locked memory (kbytes, -l) unlimited", "open files (-n) 65535", "max user processes (-u) 4096", "virtual memory (kbytes, -v) unlimited"]
}
ok: [arc-dev-dc06] => {"msg": ["file size (blocks, -f) unlimited", "max locked memory (kbytes, -l) unlimited", "open files (-n) 65535", "max user processes (-u) 4096", "virtual memory (kbytes, -v) unlimited"]
}
After the playbook completes, log out of all remote sessions and reconnect so the new limits take effect.
Then run:
$ pwd
/home/admin/ansible
$ ansible-playbook elastic/playbook/sysctl.yml

# The final task prints the verification result
TASK [Print modified sysctl.conf lines] ***************************...
ok: [arc-dev-dc01] => {"msg": ["vm.max_map_count=262144", "net.ipv4.tcp_retries2=5"]
}
ok: [arc-dev-dc05] => {"msg": ["vm.max_map_count=262144", "net.ipv4.tcp_retries2=5"]
}
ok: [arc-dev-dc04] => {"msg": ["vm.max_map_count=262144", "net.ipv4.tcp_retries2=5"]
}
ok: [arc-dev-dc06] => {"msg": ["vm.max_map_count=262144", "net.ipv4.tcp_retries2=5"]
}
5. Deploy the ES Cluster
5.1 Deploy the Cluster
Run on arc-dev-dc01.
$ pwd
/home/admin/ansible
$ ansible-playbook elastic/playbook/install_elasticsearch.yml

# Final output
TASK [Wait for ES HTTP port to be available] **********************...
ok: [arc-dev-dc04]
ok: [arc-dev-dc06]
ok: [arc-dev-dc05]

PLAY RECAP ****************************************************...
arc-dev-dc04 : ok=20 changed=19 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
arc-dev-dc05 : ok=17 changed=15 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
arc-dev-dc06 : ok=17 changed=15 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
5.2 Create Passwords
Run on any one of the ES nodes:
[admin@arc-dev-dc04 ~]$ cd /opt/app/elasticsearch-7.17.29/
[admin@arc-dev-dc04 elasticsearch-7.17.29]$ ./bin/elasticsearch-setup-passwords auto
Initiating the setup of passwords for reserved users elastic,apm_system,kibana,kibana_system,logstash_system,beats_system,remote_monitoring_user.
The passwords will be randomly generated and printed to the console.
Please confirm that you would like to continue [y/N]y   # enter y

# Save the passwords printed below!!!
Changed password for user apm_system
PASSWORD apm_system = 3kDtQ2p1ObxojvA0noD4

Changed password for user kibana_system
PASSWORD kibana_system = TUpBOwU24DyIdbv42cW3   ## this password is needed in the next step

Changed password for user kibana
PASSWORD kibana = TUpBOwU24DyIdbv42cW3

Changed password for user logstash_system
PASSWORD logstash_system = R15Nv0TPPJeCyNGnSqxI

Changed password for user beats_system
PASSWORD beats_system = rPQe2le0JQzby7S598b5

Changed password for user remote_monitoring_user
PASSWORD remote_monitoring_user = Vtp4Cq7fTrdY5opXKhFt

Changed password for user elastic
PASSWORD elastic = 5IIBwRpOrEgJjU5IDqtK
Note: keep these passwords somewhere safe.
User | Password |
---|---|
apm_system | 3kDtQ2p1ObxojvA0noD4 |
kibana_system | TUpBOwU24DyIdbv42cW3 |
kibana | TUpBOwU24DyIdbv42cW3 |
logstash_system | R15Nv0TPPJeCyNGnSqxI |
beats_system | rPQe2le0JQzby7S598b5 |
remote_monitoring_user | Vtp4Cq7fTrdY5opXKhFt |
elastic | 5IIBwRpOrEgJjU5IDqtK |
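Once the passwords are set, every node should answer an authenticated health request. A sketch (the password below is the example value from the table; use your own):

```shell
ES_PASS='5IIBwRpOrEgJjU5IDqtK'   # example value from the table above
# Authenticated health check against any ES node (run inside the cluster):
# curl -s -u "elastic:$ES_PASS" 'http://arc-dev-dc04:9200/_cluster/health?pretty'
# The same credentials expressed as an explicit Basic auth header:
AUTH=$(printf '%s:%s' elastic "$ES_PASS" | base64 -w0)
echo "Authorization: Basic $AUTH"
```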
6. Deploy Kibana
Run on arc-dev-dc01.
$ pwd
/home/admin/ansible

# Substitute the kibana_system password
$ kibana_system_password=TUpBOwU24DyIdbv42cW3
$ sed -i "s#kibana_system_password: \"your_password\"#kibana_system_password: \"${kibana_system_password}\"#" elastic/playbook/install_kibana.yml

$ ansible-playbook elastic/playbook/install_kibana.yml
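The sed command rewrites the placeholder inside install_kibana.yml in place; the same substitution can be rehearsed safely against a throwaway copy first:

```shell
# Stand-in for the vars line inside install_kibana.yml
tmp_play=$(mktemp)
echo 'kibana_system_password: "your_password"' > "$tmp_play"
kibana_system_password=TUpBOwU24DyIdbv42cW3
# Identical sed expression to the one used on the real playbook
sed -i "s#kibana_system_password: \"your_password\"#kibana_system_password: \"${kibana_system_password}\"#" "$tmp_play"
cat "$tmp_play"
```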
7. Cluster Start/Stop Scripts
Kibana:
sudo systemctl start kibana
sudo systemctl stop kibana
sudo systemctl restart kibana
sudo systemctl status kibana
ES:
Run on arc-dev-dc01.
$ pwd
/home/admin/ansible
$ sudo cp elastic/templates/es-manage.sh /usr/local/bin/
$ sudo chmod +x /usr/local/bin/es-manage.sh

# Usage: /usr/local/bin/es-manage.sh {start|stop|restart|status}