Countif in Power Query or Powerbi using M Language

You know how to countif in Excel so why is Power Query so difficult?

me@jaykilleen.com wrote this about 7 years ago and it was last updated about 7 years ago.


← Back to the Posts

If you are familiar with the Excel COUNTIF formula then you might find it frustrating to try and replicate this in Power Query or PowerBI in M Language.

I've put together a simple Excel 2016 file to illustrate the Excel way vs the Power Query / PowerBi M Language equilvalent.

You can download this data in the 2016 Excel file here on Dropbox.

Let's create two tables:

  1. City
  2. Person

The two tables are on WorkSheet countif and I have inserted the seed data below as tables named City and People. You can see the table design by clicking the design tab and inspecting the Table Name field.

City is made up of two columns and an additional four of the alternatives for counting the population:

  1. Id
  2. Name
  3. Population(x) -> represents countif using excel
  4. Population(m) -> countif using M Language (option 1)
  5. Population(so) -> countif using M Language (option 2)
  6. Population(f) -> countif using M Language Customer Function

People is made up of three columns:

  1. Id
  2. Name
  3. CityId

Let's fill these tables with some seed data for our examples.


CITY

Id Name Population(x) Population(m) Population(so) Population(f)
1 Brisbane 2 2 2 2
2 Sydney 1 1 1 1
3 Melbourne 1 1 1 1

PEOPLE

Id Name CityId
1 Jay 1
2 Sam 2
3 Sarah 1
3 Holly 3

I asked a few questions about this over on StackOverflow with [Add Column - CountIF across two tables][1] and [Add Function CountIF][2].

After the great help and answers I received by Mike Honey, Imke Feldman, Carl Walsh, Alejandro Lopez-Lago.

I tried to break this question down as simple as possible. Two very basic tables and wanting to add a column that counts the occurences of the first tables Primary Keys in the second tables reference field.

This is pretty basic if you are across Excel Table Structured References or even just a basic Excel formulas.

To complete the field Population(x) we can simply add an extra column to the City table in Excel (as I have already done) and copy the formula =COUNTIF(People[CityId],[@Id]). Make sure this is pasted down the entire column and bam! it is done. How simple and easy :)

Now to replicate this in Power Query using the M Language.

I have already created the queries by adding the data from Excel Tables (Click in the Table -> Click the Data Tab -> Click From Table). We will then open the Advanced Editor and build out our Power Query M Language code.

Query Editor Group Directory Structure

  .  
  2016-02-05-easy-excel-countif-in-power-query-or-powerbi-m-language.xlsx  
  ├── worksheets  
  │   ├── countif  
  │   ├─  ...  
  │  
  ├── vba  
  └───queries
      ├── README.md
      ├───models
      │   ├── City.m
      │   ├── People.m
      │   ├─  ...
      │
      ├───loading
      │   ├─  ...
      │  
      ├───functions
      │   ├── f_countif.m
      │   ├─  ... 
      │
      └───Other Queries
          └── blank.m

All the magic of the Power Query M is happening in the City.m and the f_countif.m. First we'll inspect the City.m query.

Here you can see the 3 additional columns add through the M Table.AddColumn method. I'll add commmentary to the code to explain each step.

let
  Source = Excel.CurrentWorkbook(){[Name="City"]}[Content],
  change_type = Table.TransformColumnTypes(
    Source,
    {
      {"Id", Int64.Type}, 
      {"Name", type text}
    }
  ),
  
  add_pop_m = Table.AddColumn(         
    change_type,                       
    "Population(m)",                   
    (CityRow) =>                      
      Table.RowCount(
        Table.SelectRows( 
          People,                        
          each 
            [CityId] = CityRow[Id]))),

  add_pop_so = Table.AddColumn(
    add_pop_m, 
    "Population(so)", 
      each 
        let Id=[Id] 
        in Table.RowCount(
          Table.SelectRows(
            People, 
            each 
              [CityId] = Id))),
  
  add_pop_fm = Table.AddColumn(
    add_pop_so,
    "Population(f)", 
    each
      f_countif(People, "CityId",[Id]))

in
   add_pop_fm

add_pop_m

The steps of this portion of the M Query goes someting like this:

  1. add a new column to the previous table and name it "Population(m)"
  2. each row of the name column gets assigned the values of the current row of the City
  3. a function is then invoked on that variable
  4. this function counts rows within the People table
  5. the People table is filtered where each row of the CityId column matches the CityRow[Id] of the CityRow variable

add_pop_so

This is a similar alternative version from the StackOverflow answer given by Imke Feldman. It starts off the same way but with slightly different syntax that I don't yet full understand. But it works :)

add_pop_fm

This is another alternative that utilised a Custom Function. The benefits of this is that it is much shorter within the City query but required an additional Customer Function query to be created. This query can therefore be used in other queries as a Custom Function which can keep our code DRY.

let 
  countif = (tbl as table, col as text, value as any) as number =>
    let
      select_rows = Table.SelectRows(tbl, each Record.Field(_, col) = value),
      count_rows = Table.RowCount(select_rows)
    in
      count_rows
in
    countif

This function has a table, a column and a value passed in. It works much more like the traditional Excel formula.

Closing Thoughts

There is another alternative way where a table is created that groups the People table by the CityId and sums this into a new field called City.Count. This table can then be merged with the City table. This adds additional steps but could be completed with less queries on the database. What is unclear at this stage is which is the most efficient alternative. One issue I would like to investigate is whether running a function within an each block would trigger multiple queries to the database for each iteration.

This is a problem I have had in the past in MVC frameworks such as Ruby on Rails.

That is all for now. If you have anything you can add please post your comment below. Other sources of good information that I have used for this post include include:

Satalyst

Also thanks to the people that helped talk through this scenario on StackOverflow.

Mike Honey
Imke Feldman
Carl Walsh
Alejandro Lopez-Lago

Similar