Imagine taking a dataset like the following…
…and charting the yearly salaries for all employees. You are likely to end up with something that looks like the following.
Although this chart does display the information and given enough time you could decipher relevant information about the salaries, the time it would take is likely more than you care to invest.
The Story We Want to Tell
We want to answer the following question:
“Given five salary ranges, how many employees receive a salary within each salary range?”
This is the perfect opportunity for a Histogram to shine like the star it is.
Creating the Histogram Chart
To create the Histogram chart, perform the following steps:
- Select a cell in the desired data range.
- Click Insert (tab) -> Charts (group) -> Insert Statistics Chart -> Histogram.
It’s that easy.
Close, but Not Quite There
The result is technically a Histogram chart, but it doesn’t really tell the story in the way we need.
We will improve the chart with the following modifications:
Setting the Bin Width
The groupings of data points are known as “bins”. We can define the bin logic in two ways: either by the width of the bins (i.e. size) or the number of bins.
NOTE: “Automatic” is another option, but it rarely proves useful.
To decide on the strategy and set the parameters, double-click the values along the X-Axis (or single-click the values and press CTRL-1) to reveal the Format Axis panel.
If we set the Bin Width to 40,000, the result is 6 bins ranging from approximately 21K to 261K in increments of 40K.
Not great but getting better.
Creating an Overflow Bin
The purpose of the Overflow bin is to capture all values over a certain range.
For our chart, we will define the Overflow bin to hold all values over 200K.
Creating an Underflow Bin
The purpose of the Underflow bin is to capture all values below a certain range.
For our chart, we will define the Underflow bin to hold all values under 30K.
The final touches to the Histogram chart are as follows:
- Add data labels
- Increase the font size of the data labels for better readability
- Add a chart title
- Remove the vertical axis
- Remove the gridlines
Bonus Final Touch
With number ranges as large as we’re working with, we spend a large amount of real estate currency on the excessive zeroes in the X-Axis labels.
To make more efficient use of the X-Axis label space, double-click the X-Axis to open the Format Axis panel.
Open the Number category. In the Format Code field, enter the following code and press the Add button.
The result is as follows.
The way this works is that if you place a single comma after the zero instead of a period (decimal point), the number will be rounded to the nearest thousand. If you added 2 commas, the number would be rounded to the nearest million; 3 commas would round to the nearest billion, etc.
The “K” is simply a tacked-on suffix used to help the reading understand the true value of the axis.
Interpreting Excel Bin Labels
Excel uses standard statistical notation on labels for the bin ranges on the horizontal axis.
Numbers that are preceded or followed by parenthesis are excluded from the bin, while those adjacent to a bracket are included.
For example, the label “[10, 20]” denotes that both 10 and 20 are within the bin range. The label “(20, 40]” shows that 20 is not within the bin range but 40 is.
In English, if you have the value “20”, the value would be included in the first bin “[10, 20]”, but would not be included in the second bin “(20, 30]”. The second bin effectively starts at a fraction above “20”.
Feel free to Download the Workbook HERE.
I'm a 6x Microsoft MVP with over 15 years of experience implementing and professionals on Management Information Systems of different sizes and nature.
My background is Masters in Economics, Economist, Consultant, Oracle HFM Accounting Systems Expert, SAP BW Project Manager. My passion is teaching, experimenting and sharing. I am also addicted to learning and enjoy taking online courses on a variety of topics.