0

I am working on Kafka to get aggregate of some data. To start with I have one topic which is loaded with continuous data in form of JSON objects. These JSON objects are representing a Java Bean.

I want to group all objects using one of the attribute, under a topic for some calculation.

Example: I have a topic called "activity"

{
    "id" : 2,
    "name" : "Facebook",
    "category" : "social",
    "duration" : 10
}

There will be million of records/objects like the above mentioned. I have kept it short, eventually there will be many attributes. From this topic activity which contains bunch of records in JSON, I want to group all by attribute category and calculate sum of attribute duration.

I have tried using streams but not able to get this working for my object stored in JSON or a POJO class as mentioned above.

1
  • Please edit your question to include what you've tried so far. What kind of JSON errors are you getting? Kafka Streams will work fine... But ksqlDB or Spark, Flink, etc. will too Commented Feb 21, 2023 at 14:31

1 Answer 1

1

Based on what you've presented you need to do a KStream.groupBy followed by a reduce. Something similar to this:

stream.groupBy((key, value)-> KeyValue.pair(value.getCategory(), value))
      .reduce((currentValue, newValue) -> currentValue.getDuration() + newValue.getDuration())...

HTH

2
  • thanks @bbejeck, I tried the same, but somehow when I do KStream<String, Activity> leftSource = builder.stream("activity"), I try to print the stream. It doesn't show up anything on loggers or console, debugger also moves through this code without printing anything.
    – gpsingh
    Commented Feb 27, 2023 at 12:26
  • By default, Kafka Streams has a caching layer for stateful operations. The results of an aggregation or reduce wouldn't be seen until the cache is flushed. You can try setting statestore.cache.max.bytes=0 and you can see stateful results immediately.
    – bbejeck
    Commented Feb 27, 2023 at 22:46

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.