Removing “Stubborn” Spaces in Excel Data
When dealing with “other people’s” data, it’s not uncommon for users to create what appears to be an easy, straight-forward formula that should have no problem working, but instead fails in the most baffling way.
A typical example is where a user writes something like a lookup function to find a part number in a table to return a description. The number being searched appears to be in the table, yet the lookup function can’t locate it and returns an error.
Although this error can be caused by many factors, it is often attributed to extra spaces before and/or after one of the values (the value being searched for or the corresponding value in the table.)
It is these leading or trailing spaces that Excel sees quite easily that our eyes don’t notice without special investigative steps.
One of Excel’s versatile text manipulation functions, the TRIM function, can assist with the lookup function to “see” the data without the presence of the extra leading/training spaces.
This works great… until it doesn’t.
You copy and paste new data into your worksheet and apply the same function’s logic…and it fails.
Why did it fail? Let’s see what the reason behind the failure is along with a more robust solution.
Digging Deeper Into the Data
Why does this newly copied and pasted dataset fail?
The formula appears to be correct.
=VLOOKUP(TRIM(A9), $G$3:$H$7, 2, FALSE)
If we highlight the TRIM function portions of the formula and press F9, we can clearly see a space at the end of the search value.
Why did the TRIM function fail to remove this trailing space?
Although the TRIM function removes spaces, there are a variety of types of spaces, and TRIM can’t deal with all types.
The TRIM function removes spaces which are generated using the ASCII code of 32.
SIDEBAR: The American Standard Code for Information Interchange (ASCII) code table is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices as numbers. For example, the capital letter “A” is represented in the ASCII table as the number 65, “B” as 66, and “C” as 67, etc. The “space” character is represented as the number 32.
There is, however, another type of “space” character known as a non-breaking space that is represented by the ASCII code number 160.
An Interesting Test
If you want to see the code number that hides behind a character, type a character in a cell (like the capital letter “A” in cell A1), then select another cell and write a formula like the following
The CODE function displays the ASCII code for the first character in a string detected in the target cell.
If we were to highlight and copy the trailing space from the value in cell A9 and paste it into our test cell, the CODE function sees it as ASCII code 160.
This non-breaking space is common amongst data that is copied from sources such as email readers, word processors, and web browsers.
Seeing Things from the Opposite Side
Another Excel text function is the CHAR function. This function takes an ASCII code number and represents it as its “human-friendly” version.
=CHAR(32) or =CHAR(160)
The above formulas will generate a traditional “space” and a non-breaking “space”.
To our eyes, these both look the same, but Excel sees them as completely different characters, no different than seeing “A” and “B” as different characters.
The TRIM function can remove ASCII 32 characters but not ASCII 160 characters.
Improving the TRIM Function
Until Microsoft releases a version of the TRIM function that can deal with all types of spaces (TRIMALL sounds like a catchy name), we will have to take matters into our own hands and develop a more robust formula that can work with both types of characters.
If we ask ourselves, “How can we get TRIM to see the character 160s as character 32?” This sounds like the job of a Find/Replace action.
Excel has a function that can replace text with different text: the SUBSTITUTE function.
The SUBSTITUTE Function
The SUBSTITUTE function will search a cell for a defined text string and replace it with another text string. An example would be as follows:
=SUBSTITUTE(A1, “Microsoft”, “XelPlus”)
There is an optional 4th argument that allows you to specify exactly which occurrence in the text to replace, such as “only the 3rd occurrence”.
=SUBSTITUTE(A1, “Microsoft”, “XelPlus”, 3)
If you omit this 4th argument, all occurrences are replaced.
Updating Our Formula
Remember our original lookup function?
=VLOOKUP(TRIM(A9), $G$3:$H$7, 2, FALSE)
Let’s begin by taking out the TRIM function and putting in its place a SUBSTITUTE function.
=VLOOKUP(SUBSTITUTE(A9, CHAR(160),), $G$3:$H$7, 2, FALSE)
“But wait! You forgot to replace the non-breaking space with a normal space.”
It’s easy to think that would be the required step, but why replace a non-breaking space with something we ultimately want to throw away?
=VLOOKUP(SUBSTITUTE(A9, CHAR(160), CHAR(32)), $G$3:$H$7, 2, FALSE)
We got a bit crafty here and replaced the non-breaking space with nothing, which is a legal move and more efficient because it achieves what would have been a 2-step process in a single step.
If we fill the updated function down the list, we see that it works in most cases, but fails in a few.
This is because the SUBSTITUTE function is not searching for regular spaces to change to nothing.
Our solution is to give the SUBSTITUTE function a “first pass” at extracting non-breaking spaces, but afterward, give the TRIM function a “second pass” to catch any normal spaces.
=VLOOKUP(TRIM(SUBSTITUTE(A9, CHAR(160),)), $G$3:$H$7, 2, FALSE)
Aggregating Values with Spaces
Suppose we wanted to SUM all the values for “Product A” in our table.
The SUM function returns 0 (zero) because it sees all the data as text.
The below image uses a series of ISTEXT functions to test the “Product A” cells to see if the data is perceived as text by Excel. As you can see, Excel returns TRUE responses for all the tests.
Another Excel text function called VALUE can read a number stored as text as a ‘normal’ number.
Although the VALUE function ignores “space” characters, it fails to ignore non-breaking spaces.
If we nested the VALUE function within a TRIM function, it wouldn’t solve our problem.
We need to use the SUBSTITUTE function.
But the result is still not a usable number because the original text is still perceived as text. We need to convert the text to a real number using the VALUE function.
When to Use
Whenever you encounter data where…
- your numbers are not being perceived as numbers,
- you use the TRIM function and it fails,
- you use the VALUE function and it fails,
…use the SUBSTITUTE function to search for character code number 160 and replace it with “nothing”.
Get Fancy with a Single Step Solution
Instead of creating a pre-processed version of the data to then aggregate, we can perform the conversion and aggregation in a single step.
If you are working with the new Dynamic Array version of Excel, this formula works perfectly.
If you are working with an older (pre-Office 365) version of Excel, you will need to write the formula the same way as before,
..but you will need to commit the formula using the CTRL-Shift-Enter key combination instead of a normal Enter key.
This creates the formula as an array formula to enable the SUBSTITUTE function the ability to process multiple cells worth of data.
Feel free to Download the Workbook HERE.
Learn Excel from a Microsoft MVP
Check out my bestselling Excel Courses
Learn anytime that fits your schedule.
Download files. Practice. Apply.