"I don't know what to do with the large amount of logs."
"I'm not sure if I can create a log analysis platform using Java."
For those with such concerns, this article explains how to build a log analysis platform that combines Java with distributed processing technology.
By integrating with Apache Kafka and Elasticsearch, you can collect, search, and visualize large-scale log data in real time.
Along with sample code, it covers configuration, implementation, and error prevention, so anyone can start building with confidence.
- The role and necessity of log analysis infrastructure
- Overview of the overall configuration and technology stack
- Implementing Kafka Producer in Java
- Implementing Kafka Consumer in Java and connecting to Elasticsearch
- Common errors and solutions
- Visualization and Applications
- Completed configuration and summary
- Summary: A distributed log infrastructure can be realized in Java
The role and necessity of log analysis infrastructure
Learn why distributed processing is needed
Bottom line: log analysis needs to be real-time and scalable.
Modern web applications and IoT services generate millions of logs per day.
To efficiently collect, search, and visualize this vast amount of data, the following elements are required:
- High-speed log reception (Apache Kafka, etc.)
- Efficient storage and retrieval (Elasticsearch)
- Real-time visualization (e.g. Kibana)
The Ministry of Economy, Trade and Industry, which promotes DX, also highlights the importance of utilizing data and analyzing it in real time.
(Source: Ministry of Economy, Trade and Industry DX Report)
Overview of the overall configuration and technology stack
Design your configuration around Java and Kafka
Bottom line: understand the full flow from log collection → transfer → search → visualization.
Overall configuration:
```
[Application (Java)] → [Kafka] → [Log Consumer (Java)] → [Elasticsearch] → [Kibana]
```
Technology used:
- Java: Used for log generation, transfer, and reception processing
- Apache Kafka: Distributed log reception and buffering
- Elasticsearch: Full-text search engine
- Kibana: Data visualization dashboard
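Each component in this stack passes log records along as strings, so it helps to agree on a structured record format before wiring anything up. The sketch below builds a minimal JSON log line by hand to stay dependency-free; the field names (`timestamp`, `level`, `message`) are illustrative assumptions, and a real project would use a JSON library such as Jackson.

```java
import java.time.Instant;

public class LogRecordFormat {
    // Builds a minimal JSON log line by hand; field names are examples only.
    static String toJson(String level, String message) {
        return String.format("{\"timestamp\":\"%s\",\"level\":\"%s\",\"message\":\"%s\"}",
                Instant.now(), level, message);
    }

    public static void main(String[] args) {
        System.out.println(toJson("INFO", "user logged in"));
    }
}
```

Keeping every stage on one agreed format means Kibana can later filter and aggregate on `level` without any re-parsing.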
Implementing Kafka Producer in Java
Send logs from your application
Bottom line: implement a Producer in Java that sends logs to Kafka.
Add dependency to pom.xml
```xml
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>3.0.0</version>
</dependency>
```
LogProducer.java
```java
import org.apache.kafka.clients.producer.*;
import java.util.Properties;

public class LogProducer {
    public static void main(String[] args) {
        // Kafka connection and serialization settings
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Send ten sample log records to the "logs" topic
        Producer<String, String> producer = new KafkaProducer<>(props);
        for (int i = 0; i < 10; i++) {
            producer.send(new ProducerRecord<>("logs", "log-" + i, "Access log " + i));
        }
        producer.close();
    }
}
```
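The record key (`"log-" + i` above) decides which partition a record lands in: Kafka's default partitioner hashes the key (with murmur2) modulo the partition count, so records sharing a key always go to the same partition and keep their relative order. The idea can be illustrated in plain Java; note this is a simplification using `hashCode`, not Kafka's actual hash function.

```java
public class PartitionSketch {
    // Maps a record key to a partition index the way hash partitioning works
    // in principle (Kafka itself uses murmur2, not String.hashCode).
    static int partitionFor(String key, int numPartitions) {
        // floorMod avoids a negative index when hashCode is negative
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

Practically, this means that if you key records by, say, user ID, all logs for one user arrive in order on a single partition.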
Implementing Kafka Consumer in Java and connecting to Elasticsearch
Receive from Kafka and save to Elasticsearch
Bottom line: The Consumer receives the logs and stores them in Elasticsearch.
LogConsumer.java (excerpt)
```java
import org.apache.kafka.clients.consumer.*;
import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

import java.time.Duration;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class LogConsumer {
    public static void main(String[] args) throws Exception {
        // Kafka configuration (group.id is required for a consumer; any name works)
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "log-consumer-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("logs"));

        // Elasticsearch configuration
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));

        // Poll Kafka and index each record into the "log-index" index
        while (true) {
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
                IndexRequest request = new IndexRequest("log-index")
                        .source(Map.of("message", record.value()));
                client.index(request, RequestOptions.DEFAULT);
            }
        }
    }
}
```
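Indexing one document per record, as above, issues one HTTP request per log line, which quickly becomes a bottleneck at high volume. Elasticsearch offers a bulk API for batching; the buffering logic around it can be sketched in plain Java. The `flushAction` below is a hypothetical stand-in for a bulk-index call, not a real Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BulkBuffer {
    private final List<String> buffer = new ArrayList<>();
    private final int batchSize;
    private final Consumer<List<String>> flushAction; // stand-in for a bulk-index call

    public BulkBuffer(int batchSize, Consumer<List<String>> flushAction) {
        this.batchSize = batchSize;
        this.flushAction = flushAction;
    }

    // Collects records and flushes them as one batch once batchSize is reached.
    public void add(String record) {
        buffer.add(record);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Flushes whatever is buffered, e.g. on shutdown or on a timer.
    public void flush() {
        if (!buffer.isEmpty()) {
            flushAction.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

In the consumer loop you would call `add` per record and `flush` after each poll, trading a little latency for far fewer round trips.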
Common errors and solutions
| Error | Cause | Solution |
|---|---|---|
| `Connection refused: localhost:9092` | Kafka is not running | Start the broker with `kafka-server-start.sh` and verify it is up |
| `IndexNotFoundException` | The index does not exist in Elasticsearch | Allow auto-creation, or create it with `PUT /log-index` |
| Java dependency resolution error | Incorrect Maven settings | Run `mvn clean install` after updating pom.xml |
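Connection errors like the first row often simply mean a dependency started later than the client. Rather than restarting by hand, a generic retry-with-backoff wrapper makes the startup order less brittle. This is a sketch of the pattern, not code from any library:

```java
import java.util.concurrent.Callable;

public class Retry {
    // Retries the action up to maxAttempts times, sleeping backoffMillis
    // between attempts; rethrows the last failure if every attempt fails.
    public static <T> T withRetry(Callable<T> action, int maxAttempts, long backoffMillis)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis);
                }
            }
        }
        throw last;
    }
}
```

Wrapping the Elasticsearch client construction or the first Kafka poll in `withRetry` lets the consumer survive a broker that is still booting.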
Visualization and Applications
Create a dashboard in Kibana
Bottom line: Kibana lets you display log contents intuitively in graphs and tables.
- Display the change in the number of logs by time
- Filter for specific error messages
- Response time heatmap
Application ideas:
- Alert function (notification when a certain number of errors occur)
- Utilizing user behavior logs for marketing
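The alert idea above reduces to a counting problem: raise an alert once the number of errors within a time window crosses a threshold. A minimal sliding-window sketch in plain Java follows; the threshold and window values are arbitrary examples, and a real deployment would more likely use a Kibana/Elasticsearch alerting rule.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ErrorRateAlert {
    private final Deque<Long> timestamps = new ArrayDeque<>();
    private final int threshold;
    private final long windowMillis;

    public ErrorRateAlert(int threshold, long windowMillis) {
        this.threshold = threshold;
        this.windowMillis = windowMillis;
    }

    // Records one error at the given time; returns true when the number of
    // errors inside the sliding window reaches the threshold.
    public boolean recordError(long nowMillis) {
        timestamps.addLast(nowMillis);
        // Drop errors that have aged out of the window
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() > windowMillis) {
            timestamps.removeFirst();
        }
        return timestamps.size() >= threshold;
    }
}
```

Hooked into the consumer loop, `recordError` would fire a notification (mail, Slack, etc.) whenever it returns true.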
Completed configuration and summary
```
src/
├─ LogProducer.java
├─ LogConsumer.java
├─ pom.xml
├─ Elasticsearch + Kibana
└─ Kafka Server
```
Startup order:
- Start Kafka
- Start Elasticsearch
- Start Kibana
- Run the Producer
- Run the Consumer
Summary: A distributed log infrastructure can be realized in Java
In this article, we explained the steps to build a distributed-processing-capable log analysis platform using Java and Kafka.
What we covered:
- How to send logs to Kafka from Java
- Transferring data from Kafka to Elasticsearch
- Common errors and solutions
- How to visualize logs in Kibana
Going forward, applying this technology to security and access-log monitoring systems will allow you to develop even more practical systems.