Now that we have the raw Action API request data produced by T108618: Publish detailed Action API request information to Hadoop, we need to design the ETL (extract, transform, load) process that will populate the aggregate reporting tables from T116065: Design aggregate tables to drive Action API reports. Populating the action_ua_hourly and action_action_hourly tables is straightforward, but action_param_hourly requires some decisions.
There are many request parameters whose distinct values we do not want to count at all (e.g. maxlag, smaxage, maxage, requestid, origin, centralauthtoken, titles, pageids). We need to design either a whitelist or a blacklist of parameter names. Whitelisting is probably the safer approach, since any new or unrecognized parameter is then excluded by default.
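As a rough illustration of the whitelist approach, the sketch below counts (parameter, value) pairs only for parameter names on an allow list. The whitelist contents, the function name, and the representation of a request as a plain dict are all assumptions made for this example; the real list and the real ETL job (presumably running against the Hadoop data) still need to be decided.

```python
from collections import Counter

# Hypothetical whitelist: the names here are placeholders for illustration,
# not a decided list of parameters to report on.
PARAM_WHITELIST = frozenset({"action", "format", "prop", "list", "meta"})


def count_whitelisted_params(requests, whitelist=PARAM_WHITELIST):
    """Count (name, value) pairs, keeping only whitelisted parameter names.

    `requests` is assumed to be an iterable of dicts mapping parameter
    names to their raw string values, one dict per API request.
    """
    counts = Counter()
    for params in requests:
        for name, value in params.items():
            # Anything not explicitly whitelisted (maxlag, requestid,
            # titles, ...) is dropped by default.
            if name in whitelist:
                counts[(name, value)] += 1
    return counts


sample_requests = [
    {"action": "query", "prop": "info", "maxlag": "5", "titles": "Foo"},
    {"action": "query", "requestid": "abc123"},
]
param_counts = count_whitelisted_params(sample_requests)
```

With a blacklist, the membership test would be inverted, and any parameter nobody thought to list would end up being counted.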