Question
I am very new to BigQuery by google
I want to parse time stamp (yyyy/mm/dd:hh:mm:ss
) based on the day and the month wish to bucket days into weeks.
I didn't find any BigQuery function which does this.
Hence, I was wondering if there was a way in which I can write a UDF
and then access it in a BigQuery query
Solution
There are two questions here, so two answers:
BigQuery does support UDFs: docs. (It didn't when I first answered this.)
Even without UDFs, the date bucketing is still doable. BigQuery has one time parsing function,
PARSE_UTC_USEC
, which is expecting input in the formYYYY-MM-DD hh:mm:ss
. You'll need to useREGEXP_REPLACE
to get your date into the right format. Once you've done that,UTC_USEC_TO_WEEK
will block things into weeks, and you can group by that. So tying all that together, if your table has a column calledtimestamp
, you could get counts by week via something likeSELECT week, COUNT(week) FROM (SELECT UTC_USEC_TO_WEEK( PARSE_UTC_USEC( REGEXP_REPLACE( timestamp, r"(\d{4})/(\d{2})/(\d{2}):(\d{2}):(\d{2}):(\d{2})", r"\1-\2-\3 \4:\5:\6")), 0) AS week FROM mytable) GROUP BY week;
Note that the
0
here is the argument for which day of the week to use as the "beginning"; I've used Sunday, but for "business"-y things using1
(i.e. Monday) would likely make more sense.
Just in case you need it, the section on timestamp functions in the docs is helpful.
OTHER TIPS
UDF support in BigQuery is now here! https://cloud.google.com/bigquery/user-defined-functions
Here is some code that will convert a string time specifier into a JavaScript Date object, and extract some properties from it; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date for information on properties available for JS dates.
QUERY (replace the nested select with your table):
SELECT day_of_week, month_date
FROM parseDate(select '2015/08/01 12:00:00' as date_string);
CODE:
function parsedate(row, emit) {
var d = new Date(row.date_string);
emit({day_of_week: d.getDay(),
month_date: d.getDate()});
}
bigquery.defineFunction(
'parseDate', // Name of the function exported to SQL
['date_string'], // Names of input columns
[{'name': 'day_of_week', 'type': 'integer'},
{'name': 'month_date', 'type': 'integer'}],
parsedate
);