June 14, 2020
Dynamic Removal of Rows in Power Query for All the Files in a Folder
In this article we’ll remove unnecessary rows for all the files inside a folder and merge them in one single table.
Data files often contain metadata like time of creation, location, author, etc. Metadata usually takes up the first few rows of data. In Dynamic Removal of Rows in Power Query we showed how to create a query that detects and removes the unwanted rows. Now we can use the query we made and change it into a function. We’ll call this function on all the files in our folder. This is the way to clean and prepare data when we have a lot of similar files on hand.
Function that removes the unwanted rows
We’ll start with a query from the previous article. Query detects the number of unnecessary rows and removes them. Our query is called titanic. It removes redundant upper rows and returns a clean data table.
We’ll transform the query into a function. We first duplicate the query by right-clicking it and selecting Duplicate. We select the duplicate and open the Advanced Editor. Here we have all the steps of the query written in M language. We change the query into a function: we add the symbol ()=> to the start of the query. We add parameters inside the brackets. In our case we add parameter (FilePath), that stands for path of the input file to be cleaned. The function therefore starts with (FilePath) =>. We also change the actual file path in the function Source with the parameter FilePath, as seen on example below.
We confirm with Done. Query is now written in italic, meaning it’s now a function.
We rename the function to fnTXT. We can now enter filepath of any file we’d like to edit. Invoke will call the function and edit the file. We’ll get an edited table without the redundant rows. Since we don’t want to do this for each file manually, we’ll create a list of all the files in a folder and invoke the function on a list.
List of all the files inside a folder
We create a list of all the files inside a folder by selecting New Source > Folder and finding a directory path. We confirm with Ok and then select Transform Data. We get a table that describes one file in each row.
We’re only interested in columns containing file path and file name. We select the columns in this order, first Folder Path and then Name. We right-click and select Remove Other Columns. We’re left with two columns. We merge them by selecting Transform > Merge Columns. We leave the separator set to None and enter a new column name, for example Merged. We now have a column of all the file paths in a directory.
We can now invoke the function on this column. We select Add Column > Invoke Custom Function. We enter the name of the new column, we select the function fnTXT under Function Query and confirm with Ok. We get a new column that contains all the edited tables.
We can now merge all the tables: we click the Expand icon in fnTXT column and select the option Expand. We select all the tables and confirm with Ok.
We get one big merged table.