How to remove unicode in fluentd tail/s3 plugin

Question

I have a Fluentd configuration with a tail source and AWS S3 as the destination. I am able to store the logs in S3.

We have enabled coloring in the application logs, based on log level, in the winston logger. When the logs are stored in S3, the colors show up as Unicode escape sequences such as \u001b[34mdebug\u001b[39m. The same happens for special characters (\u003c for <, \u003e for >).

Fluentd Config
--------------
    <source>
      @type tail
      path /var/log/containers/abc-*.log
      pos_file /var/log/abc.log.pos
      tag abc.**
      <parse>
        @type none
      </parse>
      read_from_head true
    </source>
    
    <match abc.**>
       @type s3
    
       aws_key_id "#{ENV['AWS_ACCESS_KEY']}"
       aws_sec_key "#{ENV['AWS_SECRET_ACCESS_KEY']}"
       s3_bucket "#{ENV['S3_LOGS_BUCKET_NAME']}"
       s3_region "#{ENV['S3_LOGS_BUCKET_REGION']}"
       path "#{ENV['S3_LOGS_BUCKET_PREFIX']}"
       s3_object_key_format %{path}/abc/%Y%m%d/%{index}.json
    
       buffer_chunk_limit 20m
       buffer_path /var/log/fluentd-buffer
       store_as json
       flush_interval 600s
       time_slice_format %Y/%m/%d
       utc
    
       <format>
          @type single_value
       </format>
    
       <instance_profile_credentials>
         ip_address 169.254.169.254
         port       80
       </instance_profile_credentials>
    </match>

Current log stored in S3:

    {"log":"2021-04-10T12:34:51.050Z - \u001b[34mdebug\u001b[39m: \u003e\u003e\u003e\u003e testlog1 from app \n","stream":"stdout","time":"2021-04-10T12:34:51.050571552Z"}
    {"log":"2021-04-10T12:34:51.067Z - \u001b[34mdebug\u001b[39m: \u003c\u003c\u003c\u003c testlog2 from app\n","stream":"stdout","time":"2021-04-10T12:34:51.068105637Z"}

Expected:

    {"log":"2021-04-10T12:34:51.050Z - debug: <<<< exec start from app \n","stream":"stdout","time":"2021-04-10T12:34:51.050571552Z"}
    {"log":"2021-04-10T12:34:51.067Z - debug: <<<< exec end from app\n","stream":"stdout","time":"2021-04-10T12:34:51.068105637Z"}

I need help storing the original characters in S3 instead of the escape sequences.
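
For reference, the \uXXXX sequences in the stored lines look like ordinary JSON string escapes: \u001b is the ESC character that starts an ANSI color code, and \u003c / \u003e are < and >. The Ruby sketch below, which assumes each tailed line is valid JSON like the samples above (the variable names are only illustrative), shows that decoding the JSON restores the characters, after which the color codes can be stripped from the decoded text:

```ruby
require 'json'

# Single quotes keep the backslash escapes literal, as they appear in the file.
raw = '{"log":"2021-04-10T12:34:51.067Z - \u001b[34mdebug\u001b[39m: \u003c\u003c\u003c\u003c testlog2 from app\n","stream":"stdout"}'

# Decoding turns \u001b into ESC, \u003c into "<" and \n into a newline.
record = JSON.parse(raw)

# Strip the now-decoded ANSI color sequences (ESC [ ... m) from the log field.
record['log'] = record['log'].gsub(/\e\[[0-9;]*m/, '')

# Re-encode the record as a single JSON line.
cleaned = JSON.generate(record)
puts cleaned
```

This only illustrates where the escapes come from; it is not a Fluentd configuration.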

Azeem · Accepted Answer · 2021-04-17 07:17:48Z


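Try the Fluentd record_transformer filter plugin. With the none parser, the whole raw line is stored under the message key, which is why the filter rewrites record["message"]. Place the filter between the <source> and <match> blocks so that it runs before the S3 output:

    <filter abc.**>
      @type record_transformer
      enable_ruby true
      <record>
        message ${ record["message"].gsub(/(\\u\d+b\[\d+m)|(\\u\d+e)/, '') }
      </record>
    </filter>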
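
To see what that pattern does to text like the sample lines in the question, here is a minimal Ruby sketch (Ruby being what enable_ruby evaluates). The constant and method names are illustrative and not part of Fluentd. It suggests that the pattern removes the color codes and \u003e, but leaves \u003c untouched. The decode variant is an assumption about the desired result: it replaces the two escapes with the literal characters rather than deleting them.

```ruby
# Pattern copied from the record_transformer filter above.
ANSWER_PATTERN = /(\\u\d+b\[\d+m)|(\\u\d+e)/

# Hypothetical variant: match only the escaped ANSI color codes.
ESCAPED_ANSI = /\\u001b\[[0-9;]*m/

# Strip the color codes, then map the escaped angle brackets back to
# the literal characters instead of deleting them.
def decode(line)
  line.gsub(ESCAPED_ANSI, '')
      .gsub('\u003c', '<')
      .gsub('\u003e', '>')
end

# Single quotes keep the backslashes literal, as they are in the tailed file.
opening = '\u001b[34mdebug\u001b[39m: \u003e\u003e testlog1'
closing = '\u001b[34mdebug\u001b[39m: \u003c\u003c testlog2'

puts opening.gsub(ANSWER_PATTERN, '')  # color codes and \u003e removed
puts closing.gsub(ANSWER_PATTERN, '')  # color codes removed, \u003c left as-is
puts decode(opening)                   # color codes removed, ">" restored
puts decode(closing)                   # color codes removed, "<" restored
```

If the decode behaviour is what you want, the same chained gsub calls could in principle go inside the ${ ... } placeholder, though that has not been tested here.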


Have you tested it?
– Chen A.
Commented Dec 14, 2021 at 18:02
@ChenA.: Yes, it was tested for the above use-case. If your use-case is similar, you can expand on this one.
– Azeem
Commented Dec 15, 2021 at 5:47
My use case is to drop (filter out) events that contain non-ASCII characters. I went with the grep filter and an exclude rule.
– Chen A.
Commented Dec 15, 2021 at 7:33
@ChenA.: Right. That should work too. 👍
– Azeem
Commented Dec 15, 2021 at 7:49
