There is no `MEDIAN()`

function in Google BigQuery, but we can still calculate the MEDIAN
with the
`PERCENTILE_CONT(x, 0.5)`

or
`PERCENTILE_DISC(x, 0.5)`

functions. The difference between those two functions is the linear interpolation that is applied
when using `PERCENTILE_CONT(x, 0.5)`

- so that's probably what you want when dealing with numeric
values. Take the numbers `1,2,3,4`

for example:

`PERCENTILE_CONT(x, 0.5)`

yields`2.5`

(as the 50% percentile is exactly between 2 and 3)`PERCENTILE_DISC(x, 0.5)`

yields`2`

(as the 50% percentile is >= 2)

## Example

```
SELECT
PERCENTILE_CONT(x, 0.5) OVER() AS median_cont,
PERCENTILE_DISC(x, 0.5) OVER() AS median_disc
FROM
UNNEST([1,2,3,4]) as x
LIMIT 1
```

**Result**

median_cont | median_disc |
---|---|

2.5 | 2 |

**Caution**: As of today (2020-06-20), BigQuery only supports `PERCENTILE_CONT`

and `PERCENTILE_DISC`

for window functions
(hence the `OVER()`

clause and the `LIMIT 1`

in the example above):

PERCENTILE_CONT is under development, and we will publish the documentation once it is GA. We will support it as analytic function first, and we plan to support it as aggregate function (allowing GROUP BY) later.

Source: SO: percentile functions with GROUPBY in BigQuery

The more common use case is probably to calculate the median as a result of a `GROUP BY`

statement.
I.e. I would *like* to write something like this to get the median of `quantity`

per `product_id`

.

```
SELECT
product_id,
PERCENTILE_CONT(quantity, 0.5) AS median
GROUP BY
product_id
```

Right now, that is only possible for the average via
`AVG()`

but not for the median. But we can still work around that limitation by using the
`PERCENTILE_CONT`

function on a window partitioned by `product_id`

, then group by the `product_id`

(to get only one row per `product_id`

)
and resolve a single median value via `ANY_VALUE()`

.

## Working Example

## Run on BigQuery

## Links

## Notes

There is also the `APPROX_QUANTILES()`

function (mentioned here) that *can* by applied to a `GROUP BY`

. I didn't have a practical use case for approximate functions
yet, though. Thus, I don't know the implications of "not using an exact calculation" and rather mention
this for the sake of completeness. Example:

```
SELECT
product_id,
APPROX_QUANTILES(quantity, 100)[OFFSET(50)] as approx_median
GROUP BY
product_id
```

### Wanna stay in touch?

Since you ended up on this blog, chances are pretty high that you're into Software Development (probably PHP, Laravel, Docker or Google Big Query) and I'm a big fan of feedback and networking.

So - if you'd like to stay in touch, feel free to shoot me an email with a couple of words about yourself and/or connect with me on LinkedIn or Twitter or simply subscribe to my RSS feed or go the crazy route and subscribe via mail and don't forget to leave a comment :)