Euiyoung Hwang's Profile

Euiyoung Hwang ([email protected]) / LinkedIn (https://www.linkedin.com/in/euiyoung-hwang/): Search, Data & AI Software Engineer

I have ten years of experience working with modern search platforms (Elasticsearch, Google Search Appliance with Google Apps) and building data pipelines (e.g. https://github.com/euiyounghwang/python-search) and REST API services around them as a search engineer / senior software engineer. In particular, I am an expert in Elasticsearch and its many APIs in Python web-stack environments with Docker, having handled every ES version from 1.7 up to 7.9 (building ES clusters from scratch, designing index mappings with many analyzers, and writing complex queries driven by domain needs).

At FiscalNote (2022.07 ~ 2023.07), I contributed to improving search relevance with complex queries, such as function_score to adjust the weight of search results, and to query performance across clusters (a sketch of such a query follows the list below). In more detail, I did the following:

  • Improved search performance by reconsidering the sharding strategy and optimizing large Elasticsearch clusters through configuration fine-tuning; measured search quality across multiple platforms using my Python-based performance-tuning scripts (performance metrics example: https://github.com/euiyounghwang/euiyounghwang.github.io/blob/master/screenshot/performance_results_example.png)
  • Experience building Elasticsearch cluster index configuration options, sharding, percolation, ILM configuration, and Elastic API integration for FN services (low-latency queries, index mapping changes for multilingual content, redesigned index mappings with dynamic templates)
  • Designed and built an Elasticsearch-powered search service that lets users search accurately on the Omnisearch service (Python, OAS API, Flask, Conda, Vagrant, Docker, RabbitMQ producer: https://github.com/euiyounghwang/python-fastapi-vector-search/blob/master/rmq_message_send.sh, Postgres, Elasticsearch/Percolator cluster, Git, CircleCI)
  • Monitored the ES cluster and our production services using Grafana, Datadog, and Kibana, and participated in the on-call rotation (providing timely response and resolution to production issues)
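As a hedged illustration of the relevance work above (not the actual FiscalNote query), a function_score query that boosts recent documents and a curated source might look like the sketch below; the index name, field names, and weights are hypothetical:

```python
from elasticsearch import Elasticsearch  # pip install elasticsearch

es = Elasticsearch("http://localhost:9200")

# function_score wraps a base query and rescores its hits; all values here
# are illustrative, not a production configuration.
query = {
    "query": {
        "function_score": {
            "query": {"match": {"title": "cryptocurrency"}},
            "functions": [
                # Decay the score of older documents.
                {"gauss": {"published_at": {"origin": "now", "scale": "30d"}}},
                # Flat boost for a hypothetical curated source field.
                {"filter": {"term": {"source": "press_release"}}, "weight": 2.0},
            ],
            "score_mode": "sum",        # how the function scores combine
            "boost_mode": "multiply",   # how they combine with the query score
        }
    }
}

response = es.search(index="documents", body=query, size=20)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```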

In particular, I spent seven years building and implementing enterprise search services based on Elasticsearch in South Korea. As a result of those contributions, I gave a success story interview at the Elastic on Seoul conference (https://www.youtube.com/watch?v=qu0IXwi3Fq0). At that time, I participated in the Google search engine replacement project (https://www.linkedin.com/pulse/elastic-tour-seoul-posco-ict-euiyoung-hwang/) as project leader and senior software engineer.

The screenshots attached below are from the Elasticsearch success story interview at the Elastic on Seoul conference, 2018 (https://www.elastic.co/customers/posco), when I worked as a senior software engineer and search/data engineer at POSCO ICT, South Korea (received an award at POSCO ICT, 2016, https://media.licdn.com/dms/image/C512DAQGqaGMRMAXk9w/profile-treasury-image-shrink_1920_1920/0/1597560813194?e=1694487600&v=beta&t=sYbj3Kip8j_opHS_GB2ECOQ0FVhoiv16Jgsb2dxHp1M)

*(screenshot)*

  • Worked with Elasticsearch 1.7.3 ~ 7.9.0 (implemented search services on every version of Elasticsearch, gathered all logs with Grok patterns in Logstash & Beats, and deployed Search Guard to the ES cluster instead of X-Pack Shield)
  • Developed and deployed the first 24-node Elasticsearch cluster in South Korea (3 master, 2 client, 19 data nodes), monitored with Spring Boot (https://github.com/euiyounghwang/Spring_Boot_Monitoring) instead of Cerebro
  • The Korean analyzer Analysis-Nori (https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-nori.html) was developed by Elastic after our requests, because the custom Korean Arirang and Mecab analyzers caused a serious memory issue in our cluster
  • Designed and developed a Java library based on Apache Tika to extract full text from various document formats such as MS Office, HWP, PDF, and plain text (https://github.com/euiyounghwang/ES_Python_Project, https://github.com/euiyounghwang/DocumentsTextExtract); imported the Java library into a Python environment to extract text from unstructured documents and index it with metadata into Elasticsearch
  • Improved search relevance for client requirements with ranking weights
  • Designed and created more than 4,000 indices with settings, mappings, and index templates for client requirements (a hedged index-creation sketch follows this list)
  • Implemented appropriate queries with the Query DSL on the Elasticsearch cluster (https://github.com/euiyounghwang/GitHub_Guide)
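As a hedged sketch of the index design work above (the index, analyzer, and field names are hypothetical), creating a Korean-language index with the Analysis-Nori plugin might look like this:

```python
from elasticsearch import Elasticsearch  # pip install elasticsearch

es = Elasticsearch("http://localhost:9200")

# Requires the analysis-nori plugin to be installed on the cluster.
es.indices.create(
    index="documents_ko",
    body={
        "settings": {
            "number_of_shards": 3,
            "number_of_replicas": 1,
            "analysis": {
                "analyzer": {
                    "korean": {
                        "type": "custom",
                        "tokenizer": "nori_tokenizer",
                        "filter": ["nori_readingform", "lowercase"],
                    }
                }
            },
        },
        "mappings": {
            "properties": {
                "title": {"type": "text", "analyzer": "korean"},
                "body": {"type": "text", "analyzer": "korean"},
                "published_at": {"type": "date"},
            }
        },
    },
)
```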

If you want to watch the video, please go to this URL (https://www.youtube.com/watch?v=qu0IXwi3Fq0) with the subtitles set to English (I am the one in the middle).

*(screenshot)*

Recently, I have been implementing REST API endpoints as personal test projects using Python with Flask/FastAPI (https://github.com/euiyounghwang/python-fastapi-vector-search, https://github.com/euiyounghwang/python-flask-connexion-example-openapi3-master) and NestJS (https://github.com/euiyounghwang/nest-js-rest-api). The service lets you search both a search engine (Elasticsearch) and Postgres; it is built, run, and tested on Docker. I am also interested in similarity search, such as Hugging Face embeddings and vectorized search using FAISS (https://github.com/euiyounghwang/semantic-search-elasticsearch-openai-langchain); a small sketch follows.
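A minimal sketch of the vector-search idea with FAISS; the embeddings below are random placeholders, where a real pipeline would produce them with a Hugging Face embedding model:

```python
import faiss  # pip install faiss-cpu
import numpy as np

dim = 384  # a typical sentence-embedding dimensionality (assumption)

# Stand-in embeddings; in practice these come from an embedding model.
doc_vectors = np.random.rand(1000, dim).astype("float32")
query_vector = np.random.rand(1, dim).astype("float32")

# L2-normalize so that inner product equals cosine similarity.
faiss.normalize_L2(doc_vectors)
faiss.normalize_L2(query_vector)

index = faiss.IndexFlatIP(dim)  # exact inner-product (cosine) search
index.add(doc_vectors)

scores, ids = index.search(query_vector, 5)  # top-5 nearest documents
print(ids[0], scores[0])
```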

REST API with OpenAPI Specification (Swagger)

```yaml
components:
  schemas:
    ..
    Search:
      type: object
      properties:
        query_string:
          type: string
          description: Full text search
          default: "Cryptocurrency"
          nullable: true
        start_date:
          type: string
          format: date
          description: Start date
          default: "2021-01-01 00:00:00"
        size:
          type: integer
          description: The number of results to return
          default: 20
        sort_order:
          type: string
          enum:
            - DESC
            - ASC
        include_basic_aggs:
          type: boolean
          description: Flag to enable/disable aggregations, which can slow down queries
        pit_id:
          type: string
          description: Point-in-time ID for deep pagination
          example: ""
        ids_filter:
          type: array
          items:
            type: string
          default: ["*"]
    ..
```
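A hedged example call against this schema; the endpoint path and port are assumptions taken from the Flask metrics shown further below (POST /v1/basic/search on port 8081):

```python
import requests  # pip install requests

# Payload mirrors the Search schema above; the path/port are assumptions.
payload = {
    "query_string": "Cryptocurrency",
    "start_date": "2021-01-01 00:00:00",
    "size": 20,
    "sort_order": "DESC",
    "include_basic_aggs": False,
    "pit_id": "",
    "ids_filter": ["*"],
}

response = requests.post("http://localhost:8081/v1/basic/search", json=payload)
response.raise_for_status()
print(response.json())
```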

*(screenshot)*

Docker in my local environment

*(screenshot)*

I have set up and tested monitoring with Slack alerts for search engines, REST API endpoints, and other applications' metrics using Prometheus, Alertmanager, and Grafana. The Elasticsearch Prometheus Exporter is an Elasticsearch plugin that exports metrics to Prometheus: it collects all the relevant metrics and makes them available via the Elasticsearch REST API. It is an open source project and covers cluster status and node status such as the JVM, indices, and the circuit breaker (the feature that prevents OOM errors).

*(screenshot)*

```yaml
# prometheus.yml: Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['host.docker.internal:9093']

# Load rule files once and evaluate them periodically based on 'evaluation_interval'
rule_files:
  - "/alertmanager/alert.rules"
```

```yaml
# docker-compose.yml: node-exporter service in my local environment
# (equivalent to: docker run --rm -p 9100:9100 prom/node-exporter)
# docker compose up -d node-exporter
  node_exporter:
    # http://localhost:9100/metrics
    image: prom/node-exporter
    container_name: node_exporter
    depends_on:
      - prometheus
    restart: always
    ports:
      - 9100:9100
```
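For completeness, a minimal sketch of what an /alertmanager/alert.rules file might contain; this is a generic instance-down rule, not my actual rules file:

```yaml
groups:
  - name: basic-alerts
    rules:
      - alert: InstanceDown
        expr: up == 0          # the target failed its last scrape
        for: 1m                # must stay down for 1 minute before firing
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```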

*(screenshot)*

```bash
# Install the RabbitMQ Prometheus plugin
/opt/homebrew/opt/rabbitmq/sbin/rabbitmq-plugins enable rabbitmq_prometheus
brew services restart rabbitmq
```

```yaml
# prometheus.yml: scrape job for the RabbitMQ exporter
- job_name: rabbitmq-exporter
  scrape_interval: 10s
  metrics_path: "/metrics"
  static_configs:
    - targets: ['host.docker.internal:15692']
```

Prometheus (built with Docker in my local environment, with the Elasticsearch exporter, Python exporter, and FastAPI plugin to gather all relevant metrics)

Elasticsearch Cluster monitoring

  • Monitor all nodes in the cluster using the metrics produced by the elasticsearch_exporter Docker instance
  • View the metrics from the Elasticsearch exporter plugin after installing it (http://localhost:9200/_prometheus/metrics); sample output below
Sample output from the elasticsearch_exporter plugin:

```
...
# HELP es_jvm_mem_heap_max_bytes Maximum used memory in heap
# TYPE es_jvm_mem_heap_max_bytes gauge
es_jvm_mem_heap_max_bytes{cluster="es-docker-cluster",node="es01",nodeid="ENbXGy5ASPevQ3A5MPnZJg",} 1.073741824E9
# HELP es_index_indexing_delete_current_number Current rate of documents deleted
# TYPE es_index_indexing_delete_current_number gauge
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".kibana_1",context="total",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".security-7",context="total",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".apm-custom-link",context="primaries",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".security-7",context="primaries",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".kibana_task_manager_1",context="primaries",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index="test_omnisearch_v2",context="total",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".kibana_1",context="primaries",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".kibana-event-log-7.9.0-000001",context="primaries",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".apm-agent-configuration",context="total",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".apm-custom-link",context="total",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".kibana-event-log-7.9.0-000001",context="total",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".apm-agent-configuration",context="primaries",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index=".kibana_task_manager_1",context="total",} 0.0
es_index_indexing_delete_current_number{cluster="es-docker-cluster",index="test_omnisearch_v2",context="primaries",} 0.0
# HELP es_index_recovery_current_number Current number of recoveries
...
```

*(screenshot)*

Python web service monitoring

Sample output from http://localhost:8081/metrics:

```
...
# TYPE flask_exporter_info gauge
flask_exporter_info{version="0.22.4"} 1.0
# HELP flask_http_request_duration_seconds Flask HTTP request duration in seconds
# TYPE flask_http_request_duration_seconds histogram
flask_http_request_duration_seconds_bucket{le="0.005",method="POST",path="/v1/basic/search",status="200"} 0.0
flask_http_request_duration_seconds_bucket{le="0.01",method="POST",path="/v1/basic/search",status="200"} 0.0
flask_http_request_duration_seconds_bucket{le="0.025",method="POST",path="/v1/basic/search",status="200"} 0.0
flask_http_request_duration_seconds_bucket{le="0.05",method="POST",path="/v1/basic/search",status="200"} 0.0
flask_http_request_duration_seconds_bucket{le="0.075",method="POST",path="/v1/basic/search",status="200"} 0.0
flask_http_request_duration_seconds_bucket{le="0.1",method="POST",path="/v1/basic/search",status="200"} 0.0
flask_http_request_duration_seconds_bucket{le="0.25",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.5",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="0.75",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="1.0",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="2.5",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="5.0",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="7.5",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="10.0",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_bucket{le="+Inf",method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_count{method="POST",path="/v1/basic/search",status="200"} 1.0
flask_http_request_duration_seconds_sum{method="POST",path="/v1/basic/search",status="200"} 0.11475991699990118
...
```
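These histograms come from the prometheus-flask-exporter library (the flask_exporter_info gauge above is its version marker). Wiring it into a Flask app is roughly the sketch below; the endpoint body is a placeholder:

```python
from flask import Flask
from prometheus_flask_exporter import PrometheusMetrics  # pip install prometheus-flask-exporter

app = Flask(__name__)
metrics = PrometheusMetrics(app)  # serves /metrics and records per-request duration histograms

@app.route("/v1/basic/search", methods=["POST"])
def search():
    # ... query Elasticsearch/Postgres here and return real results ...
    return {"hits": []}

if __name__ == "__main__":
    app.run(port=8081)
```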

*(screenshot)*

Elastic Stack Monitoring

  • Metricbeat is a lightweight shipper that you can install on your servers to periodically collect metrics from the operating system and from services running on the server
  • Monitor system metrics in Prometheus or Kibana using Metricbeat; a minimal metricbeat.yml sketch follows the commands below

```bash
wget https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-8.8.0-linux-arm64.tar.gz
tar -zxvf metricbeat-8.8.0-linux-arm64.tar.gz
cd metricbeat-8.8.0-linux-arm64/
./metricbeat setup -e
./metricbeat -e
```
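Before running ./metricbeat setup, metricbeat.yml needs a module and an output configured. A minimal sketch, assuming a local single-node Elasticsearch and Kibana (the hosts are assumptions):

```yaml
# metricbeat.yml (minimal sketch; hosts are assumptions)
metricbeat.modules:
  - module: system
    metricsets: ["cpu", "memory", "network", "filesystem"]
    period: 10s

output.elasticsearch:
  hosts: ["http://localhost:9200"]

setup.kibana:
  host: "http://localhost:5601"
```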

*(screenshot)*

  • Add a variable to the Prometheus dashboard

```json
{"find": "terms", "field": "host.name"}
```

*(screenshot)*

  • Build a dashboard using Grafana or Kibana like the one above (Metricbeat ships all relevant metrics to Elasticsearch)

Metricbeat service registry:

```bash
sudo cp /home/devuser/ES/metricbeat-8.8.0-linux-arm64/metricbeat /usr/local/bin/metricbeat
sudo chown devuser metricbeat.yml
sudo chown devuser /usr/local/bin/

/usr/local/bin/metricbeat -e --path.home=/home/devuser/ES/metricbeat-8.8.0-linux-arm64
```

Create a systemd unit for Metricbeat:

```bash
sudo vi /etc/systemd/system/metricbeat.service
```

```ini
[Unit]
Description=Metricbeat Service
After=multi-user.target

[Service]
Type=simple
User=devuser
Group=devuser
WorkingDirectory=/home/devuser/ES/metricbeat-8.8.0-linux-arm64
#ExecStart=/home/devuser/ES/metricbeat-8.8.0-linux-arm64/start_metricbeat.sh
ExecStart=/usr/local/bin/metricbeat -e --path.home=/home/devuser/ES/metricbeat-8.8.0-linux-arm64
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Register and start the service:

```bash
sudo systemctl daemon-reload
# Autostart when rebooting
sudo systemctl enable metricbeat.service
# Start the service
sudo systemctl start metricbeat.service
sudo systemctl status metricbeat.service
```

Check the service logs:

```bash
journalctl -u metricbeat.service
```

*(screenshot)*