June 14, 2020
Dynamic Removal of Rows in Power Query
In this article we’ll see how to remove appropriate number of irrelevant rows in Power Query editor.
We often work with data that starts with description like time of creation, location, author, etc. This metadata usually takes up first few rows in our data. We can remove it by opening the data in Power Query, selecting Remove Rows > Remove Top Rows and entering the desired number of rows to remove. This becomes time consuming when we have a lot of similar files to clean. Fortunately, we can create a query that detects and removes all the irrelevant rows.
Example: removing rows from a text file
In our example we’ll use a .csv list of all the Titanic passengers. We open the data in Power BI with Get Data and selecting the file. We get the following preview.
We choose Comma as Delimiter and open the Power Query editor with Transform Data. We duplicate the first step in Applied Steps and name it Base. We’ll refer to this step later.
We add an index column: we select Add Column > Index Column > From 1. We move the new column to the left end of the table, where we can easily see it.
We need the first cell where our data starts, in our case it’s the cell with the text PassengerId. We right-click on the second column and select filter, Text Filters > Equals, enter the text PassengerId and confirm with Ok. Table is filtered and we’re left with a single row. We right click on the cell in the first column and select Drill Down. We get the number value, in our case 5.
We found the row number where our data starts. We now call the original table: we add a new step with by right-clicking on the last step and selecting Insert Step After. We enter the name Base in the formula bar of this new step.
We have our Base table again. We remove couple of first rows of the table by selecting Remove Rows > Remove Top Rows. We enter any number, for example 7, and confirm with Ok. We then change the formula of the step, instead of 7 we type in Index-1 (since our data starts in row numbered Index, we want to remove Index-1 rows).
The step is now dynamic: it will always remove all the rows that are above the cell with the text PassengerId. We finish preparing our data with selecting Use First Row As Headers.
Function for Dynamic Removal of Data
We usually want to remove unnecessary rows from all the files. We can save all the steps we made in this article to a function and use the function on all the files we need. We’ll go through the process of making the function in our next article, Dynamic Removal of Rows in Power Query for All the Files in a Folder.