Combine Sheets Using Power Query
A common problem in Excel is the need to combine data from different sheets to create a report based on the combined data.
This post will demonstrate how we can append (combine) data from multiple sheets into a single table or directly into a Pivot Table.
We will do this in a dynamic way. If we add additional sheets, we need only refresh the final report and the newly added sheet’s data will appear in the report.
The Data Structure and Issues
Our data is stored where each sheet represents a different month’s sales.
To incorporate new sheets as they are added to the file, we can capitalize on the table names.
For our example, we have six sheets (Jan thru Jun) that are formatted as proper Excel Tables.
The tables are named “Data_Jan” thru “Data_Jun”.
The tables all have the same columns (number and headings).
Issues / Considerations
We have an issue where some columns have empty cells on certain rows. The implication is that if we are selling different items to the same Customer or on the same Sales Document, the repetitive information will not be repeated.
This is something that will need to be taken care of when processing the query.
Creating a Query from the Ground Up
Because we don’t want to create a separate query for each sheet/table, and we don’t want to be burdened with updates every month, we will solve this with a bit of M code.
- Select Data (tab) -> Get & Transform (group) -> Get Data -> From Other Sources -> Blank Query.
- In the Formula Bar, type the following (watch your capitalization):
We are presented with a table that lists all detected tables and their respective names.
- Rename the query “AllData”.
Because Document Date information is part of the table data, we do not need to keep the Name column in the preview table.
- Remove the Date column by selecting the column heading and pressing the Delete
- Expand the Content column to include all columns (deselect the “Use original column name as prefix” option).
We are delivered all the data appended to one another in a single table.
NOTE: Because we are working with Excel Tables, we do not retain the heading rows from the subsequent tables. Duplicate headings are removed automatically.
- We don’t need the “Grand Total” rows, so we will filter the rows containing the words “Grand Total” from the Sales Document.
- Fill in the gaps in the data by replacing the nulls with the information directly above the nulls. This is done by selecting the Sales Document and Customer Name columns, then select Transform (tab) -> Any Column (group) -> Fill -> Fill Down.
- Select all columns (CTRL-A) and assign proper data types; Transform (tab) -> Any Column (group) -> Detect Data Type. Adjust as necessary (ex: Document Date to Date.)
- To make our reports easier to read, we want to see the month names as words instead of numbers. Select the Document Date column and click Add Column (tab) -> From Date & Time (group) -> Date -> Month -> Name of Month.
A Subversive Little Problem
Once we send the data to an Excel Table, we see that there are 283 rows in the query output.
Without changing ANYTHING, we refresh the query and observe what happens to our row count.
Each time we refresh, the row count increases to an unexpected value.
The problem is that the output query is storing its results in the same file as the source material, and it is storing it as a proper Excel Table.
This means that each time the query refreshes, it will read EVERY table in the workbook, including the output table. We refer to this as “the snake eating its tail.”
We can see from the Source step in the query that the query output table is seen as a viable input table.
To exclude the output sheet from the Refresh operation, we can insert a step in the query after the Source step that filters the table to only include tables whose table names begin with “Data_”.
This ensures the result table is not included in further updates.
Creating a Report from the Data
The above result takes the form of a table that must ignore its own result when gathering data.
We can bypass that result and send the consolidated data directly to a Pivot Table report, thus eliminating the need for the filtering of non-“Data_” sheets.
- From the Queries & Connections panel, right-click the query and select “Load to…”
- In the Import Data dialog box, select PivotTable Report and click OK.
- We are informed that performing this change will eliminate the original consolidated table and replace it with a Pivot Table. The data will be consolidated “behind the scenes” and fed directly to the Pivot Table.
- Next, we will place the “Month Name” field in the Rows section and the “Sales Value” field in the Values.
We see a report showing sales by month.
Feel free to Download the Workbook HERE.
Get the ULTIMATE Excel Power Query Course
Use Power Query Like an EXPERT From the Start
Learn anytime that fits your schedule.
Download files. Practice. Apply.