Question

I am very new to BigQuery by google

I want to parse time stamp (yyyy/mm/dd:hh:mm:ss) based on the day and the month wish to bucket days into weeks. I didn't find any BigQuery function which does this.

Hence, I was wondering if there was a way in which I can write a UDF and then access it in a BigQuery query

Was it helpful?

Solution

There are two questions here, so two answers:

  • BigQuery does support UDFs: docs. (It didn't when I first answered this.)

  • Even without UDFs, the date bucketing is still doable. BigQuery has one time parsing function, PARSE_UTC_USEC, which is expecting input in the form YYYY-MM-DD hh:mm:ss. You'll need to use REGEXP_REPLACE to get your date into the right format. Once you've done that, UTC_USEC_TO_WEEK will block things into weeks, and you can group by that. So tying all that together, if your table has a column called timestamp, you could get counts by week via something like

    SELECT week, COUNT(week)
    FROM (SELECT UTC_USEC_TO_WEEK(
                   PARSE_UTC_USEC(
                     REGEXP_REPLACE(
                       timestamp, 
                       r"(\d{4})/(\d{2})/(\d{2}):(\d{2}):(\d{2}):(\d{2})", 
                       r"\1-\2-\3 \4:\5:\6")), 0) AS week
          FROM mytable) 
    GROUP BY week;
    

    Note that the 0 here is the argument for which day of the week to use as the "beginning"; I've used Sunday, but for "business"-y things using 1 (i.e. Monday) would likely make more sense.

Just in case you need it, the section on timestamp functions in the docs is helpful.

OTHER TIPS

UDF support in BigQuery is now here! https://cloud.google.com/bigquery/user-defined-functions

Here is some code that will convert a string time specifier into a JavaScript Date object, and extract some properties from it; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date for information on properties available for JS dates.

QUERY (replace the nested select with your table):

SELECT day_of_week, month_date
FROM parseDate(select '2015/08/01 12:00:00' as date_string);

CODE:

function parsedate(row, emit) {
  var d = new Date(row.date_string);
  emit({day_of_week: d.getDay(),
        month_date: d.getDate()});
}

bigquery.defineFunction(
  'parseDate',                           // Name of the function exported to SQL
  ['date_string'],                    // Names of input columns
  [{'name': 'day_of_week', 'type': 'integer'},
   {'name': 'month_date', 'type': 'integer'}],
  parsedate
);
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top