I am using Kafka to aggregate some data. To start with, I have one topic that is continuously loaded with data in the form of JSON objects. Each JSON object represents a Java Bean.
I want to group all objects in the topic by one of their attributes for some calculation.
Example:
I have a topic called "activity" containing records like this:
{
"id" : 2,
"name" : "Facebook",
"category" : "social",
"duration" : 10
}
There will be millions of records like the one above. I have kept it short; eventually there will be many more attributes.
From this topic, "activity", which contains a bunch of JSON records, I want to group all records by the attribute "category" and calculate the sum of the attribute "duration".
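To make the expected result concrete, here is a minimal sketch in plain Java collections. It is not Kafka Streams code; it only illustrates the group-and-sum I am after. The class names ActivityTotals and Activity, the method sumDurationByCategory, and every sample record other than the Facebook one are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class ActivityTotals {

    // Hypothetical POJO mirroring one JSON record from the "activity" topic.
    static class Activity {
        final int id;
        final String name;
        final String category;
        final long duration;

        Activity(int id, String name, String category, long duration) {
            this.id = id;
            this.name = name;
            this.category = category;
            this.duration = duration;
        }
    }

    // Groups the records by category and sums duration within each group.
    // A TreeMap is used only so the keys come out in sorted order.
    static Map<String, Long> sumDurationByCategory(List<Activity> activities) {
        return activities.stream()
                .collect(Collectors.groupingBy(
                        a -> a.category,
                        TreeMap::new,
                        Collectors.summingLong(a -> a.duration)));
    }

    public static void main(String[] args) {
        // Sample data: only the first record comes from the example above;
        // the other two are invented to show more than one group.
        List<Activity> sample = Arrays.asList(
                new Activity(2, "Facebook", "social", 10),
                new Activity(3, "Twitter", "social", 5),
                new Activity(4, "Gmail", "email", 7));

        // Prints {email=7, social=15}
        System.out.println(sumDurationByCategory(sample));
    }
}
```

What I need is the same per-category total, but computed continuously over the records arriving on the topic rather than over an in-memory list.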
I have tried using Kafka Streams, but I have not been able to get this working for my objects, whether they are stored as JSON or mapped to a POJO class like the one described above.