Data Breach Patterns across Industries and Time

QualiTest explores the data breach trends of today and the past

There are many statistics on the cyber insecurity caused by data leaks and breaches. Where are things getting better, and where are they getting worse? To understand the problem, it helps to see how it continues to evolve. Being aware of the cyber risks lets you build a proactive solution by testing against these software and network challenges.

Breach Level Index is one of the public data breach databases available for research. Its collection began in 2013; it produces an annual report based on each year’s findings, and the data is available on its website. The breakdown below was generated by pulling each year’s data from there. Each data breach has the following fields: organization breached, records breached, date of breach (a single date field), breach type, breach source, country, industry, risk score (severity, based on multiple fields), and news URL. All columns are sortable, but you can view only one year at a time and cannot export your filtered data. Also, breaches (like Yahoo’s) are not revisited if the record count is later revised upwards. If the number of records breached is unknown, the database says so in the records-breached field.

Clearly, malicious outsider (hacker), accidental loss and malicious insider are the most common forms of breach, in that order. Penetration from the outside (hacking) continues to grow in popularity, causing the other types, accidental loss and malicious insider included, to drop.

For malicious outsider attacks (hacking), Healthcare is the leader (though not by a lot), followed by Financial and Retail. When going through the list of the breached, the first year of recording (2013) shows many schools, colleges and universities classified as Other (not shown), skewing the placement in the chart. This was rectified in later years. For 2013, the true percentages should be closer to those of the later years, and the misclassification may account for incongruities in other business types.

Healthcare once again is the clear winner (or perhaps loser) in terms of frequency, which is not a surprise since health records are very valuable on the dark web for identity theft. Government, which had been fourth or fifth for malicious outsider, is now second to Healthcare, with Retail dropping down to fourth. Financial drops from second to third. Note that Financial, which had about double Government’s count for malicious outsider, is now at about half of Government’s. We will later see why hackers target financial institutions.

If you want to see a larger spread of years (but smaller concentrations of breaches per year), you can turn to another public data breach database. It lets you search on any combination of occurrence years, organization types (industries), and breach types (causes). Additionally, database entries reveal: date made public, company, location, number of records breached (0 is displayed if the count is unknown), an incident summary (which often includes the type of data in the records), the information source, and a supporting URL. Data can be exported to CSV or Excel. A variety of charts can be generated from later tabs.

And if you like infographics, there is one based on their data. But let’s begin by looking at different industries and methods of exposure.

Over the long term, medical is the most popular area to breach (remember, these records are also worth a lot on the dark web), by any method, even accidental. Retail is safest in terms of accidental or physical loss, government safest from hacking, and education safest from insiders. Education is second most likely to be hacked or accidentally exposed, government second most likely to involve a physical data loss, and financial second most likely to involve an inside job. Let’s look at the 412 cases currently (2/15/2018) being investigated at U.S. Health and Human Services.

In terms of records breached per incident, financial, both by insider and outsider, revealed the most records. In case you’re wondering why Accidental does not stand out in the areas listed: River City Media was the big accidental breach, with 1.37B, followed by Deep Root Analytics with 198M, The Hill with 191M and Facebook with 80M.

The giant bars above hide the detail in the areas where all of the bars are much smaller, so let’s look at some breakdowns more closely, such as the Physical loss bar chart, which looks rather flat compared to hack attacks.

Things physically lost that hold financial and retail records sure don’t seem to hold a lot of records!

The U.S. government’s Department of Health and Human Services lists U.S. healthcare-related data breaches. The current list under investigation (viewed 2/8/18) has incidents ranging from 2/19/2016 to 2/5/2018, consisting of 406 events with an average breach size of 42,839 records. It should be noted that this list does not include incidents where fewer than 500 records were breached, and also does not list incidents where the record count of the breach is unknown. As a result, we might expect the average record count here to lean higher than in the other two databases studied above. Also keep in mind that an accidental exposure may not have been seen by anyone else before being discovered and corrected.
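To see why a 500-record reporting floor pushes the average upward, here is a minimal Python sketch. The incident sizes are made up for illustration and are not taken from the HHS list; only the 500-record threshold comes from the description above, and the function name is hypothetical.

```python
from statistics import mean

# Threshold described above: incidents with fewer than 500 records are not listed.
REPORTING_THRESHOLD = 500


def mean_reported(sizes, threshold=REPORTING_THRESHOLD):
    """Average breach size over only the incidents that meet the reporting floor."""
    reported = [s for s in sizes if s >= threshold]
    return mean(reported)


# Hypothetical incident sizes, for illustration only.
sizes = [12, 80, 450, 500, 2_000, 40_000]

print(f"mean of all incidents:      {mean(sizes):,.0f}")
print(f"mean of reported incidents: {mean_reported(sizes):,.0f}")
```

Dropping the small incidents removes only low values, so the reported average can only stay the same or rise. As a side note, 406 events at an average of 42,839 records implies roughly 17.4 million records in total.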

Observe how the frequency of each breach type strongly correlates with the number of records breached, the one exception being the internal hack (unauthorized access/disclosure), which statistically yields fewer records.

These health figures also include the state they occurred in, so we can calculate (records exposed) / (state population) to determine the probability by state of having one’s records exposed.  Naturally, a big exposure ruins the odds for the state it occurs in.  Those over 10% were: Georgia (10.2%), Maryland (11.3%), Florida (11.4%), New York (18.5%), Kentucky (also 18.5%), Arizona (52%, thanks to Banner Health’s 3.62M record exposure).
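The per-state calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the Banner Health record count is the 3.62M quoted above, and the Arizona population of 7,000,000 is a round assumption rather than an official census figure.

```python
from collections import defaultdict


def exposure_rate_by_state(breaches, populations):
    """Return {state: total records exposed / state population}.

    breaches    -- iterable of (state, records_exposed) pairs
    populations -- dict mapping state to population
    States without a known population are skipped.
    """
    totals = defaultdict(int)
    for state, records in breaches:
        totals[state] += records
    return {s: totals[s] / populations[s] for s in totals if s in populations}


# Banner Health's 3.62M records come from the text above;
# the population is an assumed round number, not census data.
breaches = [("AZ", 3_620_000)]
populations = {"AZ": 7_000_000}

rates = exposure_rate_by_state(breaches, populations)
for state, rate in rates.items():
    print(f"{state}: {rate:.1%}")
```

Treat the result as a rough upper bound rather than a true probability: the same person can appear in more than one breach, and a breach attributed to a state may include records of people who live elsewhere.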

Risk Based Security seems to have the most breach incidents recorded (3,930) for 2015. Each incident is tracked by a large number of fields, but the data itself is not available for public review, although there is a licensable API. They’ve reported that the first half of 2017 had 2,227 breaches, which exposed over 6 billion records. As with the other breach databases, the U.S. was the leading country, and hacking (malicious outsider) was the leading method. They have detailed reports analyzing various breakdowns (including data types exposed: name was most common, phone number was least common, and Social Security Number was somewhere in the middle), and cite that a single insider incident exposed 2 billion records.

Many U.S. states have an attorney general’s office that tracks data breaches and maintains the letter templates sent out to alert victims of an exposure. In fact, Alabama and South Dakota are the only two states that do not have state data breach notification laws (New Mexico enacted one in 2017). Note the wide variance in collected data. They all specify the entity breached, exposure date(s), notification date, and type of data exposed. Some also include a record count, the means of access, and/or the state attorney general letter as a PDF. You may even get an annual state report. States with such laws may, in the interest of transparency, publicly list breaches or breach notification letters.

These lists are often organized by organization name, date(s) of breach, and reported date. Each incident may also reveal the number of records affected, the method of access, or the industry. It may also include a copy of the notification letter sent out to those affected, often specifying the types of data exposed. There is often a lag time of about a month between the reported date and the inclusion of a letter template. Likewise, there is lag time between the breach, the discovery of the breach, and the public acknowledgment of the breach.

Let’s look at one of the states and analyze the data more closely. Massachusetts residents experienced 1,890 data breaches, not necessarily originating there, which exposed 3,321,352 records. Of these, 97,928 records (just under 3%) were exposed on paper, not electronically. We will look at the method of exposure and whether free credit monitoring was offered. From 2007 to 2017, only two years have stood out from a continual climb in the number of notifications: 2013 and 2016, both of which were higher than the following year’s number. However, it should be noted that there are many same-day bank breaches that are recorded as multiple single-person events, skewing the data. The number of state residents affected, however, seems to swing wildly each year between under 400,000 and over a million, with only two exceptions. All it takes is one really bad breach to distort the statistics, like Equifax in 2017.
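As a quick check of the Massachusetts totals quoted above, the paper share and the average records per breach can be recomputed directly. The three input numbers come from the paragraph above; the variable names are just for illustration.

```python
# Figures quoted in the text above.
total_breaches = 1_890
total_records = 3_321_352
paper_records = 97_928

# Share of records exposed on paper rather than electronically.
paper_share_pct = 100 * paper_records / total_records

# Average records exposed per breach notification.
mean_records_per_breach = total_records / total_breaches

print(f"paper share: {paper_share_pct:.2f}%")
print(f"average records per breach: {mean_records_per_breach:,.0f}")
```

The paper share works out to a little under 3%, matching the figure above. The average is best read with caution, since a single very large breach can dominate it.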

Next, let’s look at the average number of records exposed per method in the grid below. Why is driver’s license so high, you might ask, especially when it is more popular to steal charge cards, Social Security numbers and account numbers? Because it was a field exposed by Equifax, and it appears in a relatively small number of incidents (see above grid). The average approximates 3,000,000 people divided by the number of incidents: that one big number (the Equifax records) skews the statistics, overwhelming all other incidents.
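The skew described above is easy to reproduce. The sketch below uses invented incident sizes plus one 3,000,000-record outlier standing in for Equifax; it is not the Massachusetts data, and the function name is hypothetical. It compares the mean with the median, which is far less sensitive to a single extreme value.

```python
from statistics import mean, median


def summarize(sizes):
    """Return the mean and median of a list of per-incident record counts."""
    return {"mean": mean(sizes), "median": median(sizes)}


# Hypothetical driver's-license incidents: a few tiny ones and one huge outlier.
incident_sizes = [1, 1, 2, 3, 3_000_000]

summary = summarize(incident_sizes)
print(f"mean:   {summary['mean']:,.1f}")
print(f"median: {summary['median']:,}")
```

With so few incidents, the mean is close to the outlier divided by the incident count, which is exactly the effect seen in the grid. Reporting the median alongside the mean is one way to keep a single breach from dominating the picture.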

The chart above, once again, correlates heavily to Equifax breach exposures.

What can we learn from all of the data from all of the studies? Breaches are becoming more frequent and are increasing in the number of records affected, even with Gartner’s forecast of 8% growth in worldwide security spending between 2016 and 2017. Breaches are increasingly caused by malicious outsiders, but still occur rather often through people with inside access, by accident, and by physical theft, each of which can be made less likely by taking preventative cyber security measures. Companies issuing reports on data breaches (Ponemon, ITRC/CyberScout, Verizon, etc.) use different methodologies and groupings, and as such their statistical conclusions, while often similar in direction, can differ greatly in detail, often disagreeing on whether Healthcare or BFSI is in greater jeopardy and by how much.

Data breach penalties (GDPR’s deadline for Europe is May 25, 2018) and publicity will only increase with time, with a single incident being enough to wipe out a company. According to the National Cyber Security Alliance, as many as 60% of hacked small and medium-sized businesses go out of business within six months of a breach. Meanwhile, malicious outsiders, easily the most effective and fastest-growing threat, will continue to grow smarter and cause the most damage, which is why we recommend an outsourced cyber security solution that includes penetration testing.