To make both ad-hoc and daily analytics easier, Cirrus server-side logs should be available in HDFS. Bob Flagg would be a great person to do this: he has an analytics background, and the work would introduce him to both our analytics infrastructure and our search infrastructure.
Tasks:
- Sit down with Oliver and work out which fields we want to log;
- Create a streaming format in Cirrus that outputs logs containing these fields in a form that can be loaded into HDFS;
- Work with Ottomata to get this stream ingested into HDFS.
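To give the format discussion something concrete to react to, here is a rough sketch of what one log record could look like as a single JSON line. Everything in it is an assumption: the field names (`ts`, `query`, `index`, `hits`, `took_ms`) are placeholders until the field list is agreed with Oliver, `format_log_line` is a hypothetical helper, and one-JSON-object-per-line is only one candidate encoding. It is written in Python purely for illustration and is not Cirrus code.

```python
import json
import time


def format_log_line(query, index, hits, took_ms, ts=None):
    """Serialize one search request as a single-line JSON record.

    All field names here are hypothetical placeholders; the real set
    still has to be agreed on (see the first task above).
    """
    record = {
        "ts": int(time.time()) if ts is None else ts,  # Unix timestamp, seconds
        "query": query,      # raw search string
        "index": index,      # index the query ran against
        "hits": hits,        # number of results returned
        "took_ms": took_ms,  # server-side latency in milliseconds
    }
    # json.dumps escapes any newline inside a value, so each record stays
    # on exactly one line and the stream can be split on "\n".
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


# Example values are made up for illustration.
line = format_log_line("kittens\nand cats", "enwiki_content", 42, 17, ts=1430000000)
print(line)
```

One self-describing record per line keeps the stream easy to split and lets fields be added later without breaking older consumers, but whether JSON is the right choice here is for the discussion with Ottomata to settle.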
Definition of "done" for this task:
- Cirrus logs are available in Hadoop.