[Java × distributed processing] Create your own log analysis platform! Realize high-speed data processing with Kafka + Elasticsearch


"I don't know what to do with the large amount of logs."
"I'm not sure if I can create a log analysis platform using Java."

For those with such concerns, this article explains how to build a log analysis platform that combines Java with distributed processing technology.

By connecting Apache Kafka and Elasticsearch, you will be able to collect, search, and visualize large-scale log data in real time.

Along with sample code, it covers configuration, implementation, and common errors, so anyone can start building with confidence.


The role and necessity of log analysis infrastructure

Learn why distributed processing is needed

Bottom line: log analysis needs to be real-time and scalable.

Modern web applications and IoT services generate millions of logs per day.
To efficiently collect, search, and visualize this vast amount of data, the following elements are required:

  • High-speed log reception (Apache Kafka, etc.)
  • Efficient storage and retrieval (Elasticsearch)
  • Real-time visualization (e.g. Kibana)

The Ministry of Economy, Trade and Industry, as part of its push for DX, also highlights the importance of utilizing data and analyzing it in real time.
(Source: Ministry of Economy, Trade and Industry, DX Report)


Overview of the overall configuration and technology stack

Design your configuration around Java and Kafka

Conclusion: Make sure you understand the process from log collection → transfer → search → visualization.

Overall configuration:

[Application(Java)] → [Kafka] → [Log Consumer(Java)] → [Elasticsearch] → [Kibana]

Technology used:

  • Java: Used for log generation, transfer, and reception processing
  • Apache Kafka: Distributed log reception and buffering
  • Elasticsearch: Full-text search engine
  • Kibana: Data visualization dashboard
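
To make the flow concrete, it helps to fix the shape of the event that travels through each stage. Below is a minimal sketch of such a log event as a Java class; the field names are illustrative assumptions (this article's samples actually send plain strings, and Kafka and Elasticsearch accept any payload):

LogEvent.java (illustrative)

public class LogEvent {
    // Field names here are assumptions for illustration only
    private final String level;     // e.g. "INFO", "ERROR"
    private final String message;   // the raw log line
    private final long timestamp;   // epoch milliseconds

    public LogEvent(String level, String message, long timestamp) {
        this.level = level;
        this.message = message;
        this.timestamp = timestamp;
    }

    public String getLevel() { return level; }
    public String getMessage() { return message; }
    public long getTimestamp() { return timestamp; }
}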

Implementing Kafka Producer in Java

Send logs from your application

Conclusion: Implement a Producer in Java that sends logs to Kafka.

Add the Kafka client dependency to pom.xml

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>3.0.0</version>
</dependency>
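
The Consumer in a later section also uses the Elasticsearch high-level REST client. If you follow that example, you would likely add a dependency along these lines (the version shown is an assumption; pick one matching your Elasticsearch server):

<!-- Assumed dependency for the RestHighLevelClient used by the Consumer below -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.17.10</version>
</dependency>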

LogProducer.java

import org.apache.kafka.clients.producer.*;
import java.util.Properties;

public class LogProducer {
    public static void main(String[] args) {
        // Kafka connection settings
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Send ten sample log messages to the "logs" topic
        Producer<String, String> producer = new KafkaProducer<>(props);
        for (int i = 0; i < 10; i++) {
            producer.send(new ProducerRecord<>("logs", "log-" + i, "Access log " + i));
        }
        producer.close();
    }
}
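
In production, you usually want to know when a send fails; Kafka's send() also accepts an optional callback. A minimal sketch (the error handling shown is just an assumption about what you might do):

// Attach a callback so failed sends are reported instead of silently dropped
producer.send(new ProducerRecord<>("logs", "key", "value"), (metadata, exception) -> {
    if (exception != null) {
        System.err.println("Send failed: " + exception.getMessage());
    }
});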

Implementing Kafka Consumer in Java and connecting to Elasticsearch

Receive from Kafka and save to Elasticsearch

Bottom line: The Consumer receives the logs and stores them in Elasticsearch.

LogConsumer.java

import org.apache.kafka.clients.consumer.*;
import org.apache.http.HttpHost;
import org.elasticsearch.client.*;
import org.elasticsearch.action.index.*;

import java.time.Duration;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class LogConsumer {
    public static void main(String[] args) throws Exception {
        // Kafka configuration
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "log-consumers");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("logs"));

        // Elasticsearch configuration
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));

        // Poll Kafka and index each record into Elasticsearch
        while (true) {
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
                IndexRequest request = new IndexRequest("log-index")
                        .source(Map.of("message", record.value()));
                client.index(request, RequestOptions.DEFAULT);
            }
        }
    }
}
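
The loop above indexes one document per call, which is fine for an example. At higher volumes, a common pattern is to batch each poll's records into a single BulkRequest; a minimal sketch using the same client (requires an additional import of org.elasticsearch.action.bulk.BulkRequest):

// Batch one poll's worth of records into a single bulk request
BulkRequest bulk = new BulkRequest();
for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
    bulk.add(new IndexRequest("log-index").source(Map.of("message", record.value())));
}
if (bulk.numberOfActions() > 0) {
    client.bulk(bulk, RequestOptions.DEFAULT);
}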

Common errors and solutions

Error: Connection refused: localhost:9092
Cause: Kafka is not running
Solution: confirm the broker was started with kafka-server-start.sh

Error: IndexNotFoundException
Cause: no index has been created in Elasticsearch
Solution: allow automatic index creation, or create the index with PUT /log-index

Error: Java dependency resolution error
Cause: incorrect Maven settings
Solution: run mvn clean install after updating pom.xml
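
If you would rather create the index from Java than issue a manual PUT, the high-level client exposes an indices API; a minimal sketch reusing the client from the Consumer (requires an import of org.elasticsearch.client.indices.CreateIndexRequest):

// Create "log-index" up front so IndexNotFoundException cannot occur
CreateIndexRequest createRequest = new CreateIndexRequest("log-index");
client.indices().create(createRequest, RequestOptions.DEFAULT);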

Visualization and Applications

Create a dashboard in Kibana

Conclusion: Kibana allows you to intuitively display log contents in graphs and tables.

  • Display changes in log volume over time
  • Filter for specific error messages
  • Show a heatmap of response times

Application ideas:

  • Alert function (notification when a certain number of errors occur; see the sketch after this list)
  • Utilizing user behavior logs for marketing
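
As a starting point for the alert idea above, a simple threshold check can sit alongside the Consumer's poll loop. A minimal sketch; the threshold, the "ERROR" marker, and the console notification are all assumptions:

// Hypothetical helper: raise an alert when error logs in one batch exceed a threshold
static void checkForAlert(ConsumerRecords<String, String> records, int threshold) {
    int errorCount = 0;
    for (ConsumerRecord<String, String> record : records) {
        if (record.value().contains("ERROR")) {
            errorCount++;
        }
    }
    if (errorCount >= threshold) {
        // Replace with a real notification (mail, Slack, etc.) as needed
        System.err.println("ALERT: " + errorCount + " error logs in this batch");
    }
}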

Completed configuration and summary

src/
├─ LogProducer.java
├─ LogConsumer.java
├─ pom.xml
├─ Elasticsearch + Kibana
└─ Kafka Server

Boot order:

  1. Start Kafka
  2. Start Elasticsearch
  3. Start Kibana
  4. Run Producer
  5. Run Consumer

Summary: A distributed log infrastructure can be realized in Java

In this article, we explained the steps to build a log analysis platform that supports distributed processing, using Java and Kafka.

What we covered:

  • How to send logs to Kafka from Java
  • Transferring data from Kafka to Elasticsearch
  • Common errors and solutions
  • How to visualize logs in Kibana

Going forward, applying this technology to areas such as security and access log monitoring will let you build even more practical systems.
