The Decision Maker

5 Reasons to Pool your Data

Posted by Mark Portz on Mar 7, 2017 1:04:00 PM


Data continues to prove itself a necessity for decision-making in financial institutions. For years, major banks and innovative companies such as Google and Amazon have used “Big Data” to gain better insight into their customer bases and to make business decisions that position them for the future. The credit union industry is finally beginning to take advantage of its data and adopt new technologies. However, credit unions are much smaller than major banks and simply don’t have the same quantity of data that banks are able to collect from their customers. Fortunately, data pooling is a great solution to this problem. Here are 5 reasons your credit union should participate in data pooling:

  1. Access to Diverse Data

“Why do I care about the data collected from a credit union on the other side of the country?” This is a frequently asked question when discussing data pools. Of course, it is a valid question. The economy may be different in December in Alaska compared to Florida. However, it is important to recognize that this diversity can actually be a major advantage that should not be overlooked. 

As Joe Breeden of Deep Future Analytics explains in a podcast with Best Innovation Group, titled The CECL Effect – How the New Credit Loss Rule will alter Financial Analytics, data diversity is healthy for pooling and advanced analytics. In the podcast he states, “If we get folks spread around the country, in a shared blind repository, then it gives us a better overall view of the scaling of the risk versus economics and other things.” He continues to explain that “We leverage that pool to learn aspects that are in common, like economic sensitivities, but then also to calibrate to the individual… so you get the benefit of the whole, but specific to the individual institution.”

  2. Affordable Access to Data Scientists

Data scientists are highly skilled, in high demand, and expensive. They play a major role in turning raw data into predictive insights (such as ALLL forecasting for CECL), which is why they often earn $175k+ per year.

Credit unions simply don’t have the same assets and hiring power as Google, Microsoft, or the large banks, which makes hiring even a single data scientist a non-option. This is where the power of the data pool comes into play. If a data scientist works on a pool of data from, say, 50 credit unions, those 50 credit unions split the cost, making advanced analytics much more affordable. At a $175k salary, an even split works out to $3,500 per credit union per year in salary alone.

  3. Encrypted and Secure

Another common concern about data pooling is access to private information. In a proper data pool, all personally identifiable information (PII) is encrypted before it leaves the credit union’s firewall, and it remains unreadable while in the pool. Only after the data comes back inside the firewall is it decrypted, using a key that only the credit union holds.

Data scientists don’t need to know your individual members’ contact information, SSNs, etc., but all contributing organizations benefit from sharing data that provides insight into loan risk, for example. After the analysis, the only sign that your data was pooled is the increased accuracy of your results.
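To make the idea concrete, here is a minimal Python sketch of how PII could be kept behind the firewall while loan data goes to a pool. It is an illustration only, not a description of any particular data pool's implementation. It uses keyed tokenization (HMAC-SHA256) with a lookup table that stays in-house, which is a stand-in for the reversible encryption described above. The field names (`member_id`, `ssn`, `loan_balance`, `days_past_due`) and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def tokenize(value: str, key: bytes) -> str:
    """Return a keyed token for one PII value; it cannot be reversed without the lookup table."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_pool(records, key, pii_fields=("member_id", "ssn")):
    """Split records into pool-safe rows and a lookup table that never leaves the credit union."""
    outbound, lookup = [], {}
    for record in records:
        token = tokenize(record["member_id"], key)
        # The lookup table maps each token back to the real PII (kept in-house).
        lookup[token] = {field: record[field] for field in pii_fields}
        # The outbound row carries only non-PII fields plus the token.
        safe = {k: v for k, v in record.items() if k not in pii_fields}
        safe["member_token"] = token
        outbound.append(safe)
    return outbound, lookup


def reattach(results, lookup):
    """Back inside the firewall: join pooled results to the real member records."""
    return [{**row, **lookup[row["member_token"]]} for row in results]


# The key is generated and stored only at the credit union (assumption).
key = secrets.token_bytes(32)

# Made-up example records.
records = [
    {"member_id": "M-1001", "ssn": "000-00-0001", "loan_balance": 12500.0, "days_past_due": 0},
    {"member_id": "M-1002", "ssn": "000-00-0002", "loan_balance": 8300.0, "days_past_due": 45},
]

outbound, lookup = prepare_for_pool(records, key)   # `outbound` is what the pool sees
restored = reattach(outbound, lookup)               # done only after results return
```

In this sketch the pool sees balances and delinquency data tied to an opaque token, while the key and the lookup table stay with the credit union.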

  4. Quantity of Data for Predictive Analytics

Predictive analytics is no longer a luxury but a requirement under upcoming regulations such as CECL. As a general rule, a model built on more relevant data produces more reliable estimates. Credit unions hold data that could reveal a great deal about their members, but only when it is analyzed collectively with data from the rest of the industry. A single credit union’s data set is simply not large enough for accurate predictive analytics: 95% of the credit unions in the United States are below $3.0 billion in assets and do not have enough data to build accurate predictive models on their own.

Fortunately, data pooling is coming to the rescue. Pooling provides an opportunity to analyze a much larger data set. With a good model, each additional credit union in the pool helps reduce your margin of error and gives you more confidence in your data-driven decisions.
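As a rough illustration of why pool size matters, the Python sketch below estimates the margin of error on an observed default rate as more credit unions contribute loans. It relies on simplifying assumptions that are not from the article: a textbook normal approximation, independent and comparable loans across institutions, a 2% default rate, and 5,000 loans per credit union. Real pooled models are more involved than this.

```python
import math


def margin_of_error(rate: float, n_loans: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an observed rate (normal approximation)."""
    return z * math.sqrt(rate * (1.0 - rate) / n_loans)


default_rate = 0.02     # assumed default rate, for illustration only
loans_per_cu = 5_000    # assumed number of loans per credit union

for n_credit_unions in (1, 10, 50):
    moe = margin_of_error(default_rate, n_credit_unions * loans_per_cu)
    print(f"{n_credit_unions:>2} credit unions: +/- {moe * 100:.2f} percentage points")
```

Under these assumptions the margin shrinks with the square root of the sample size, so a pool of 50 similar institutions narrows it by a factor of about seven compared with one institution alone.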

  5. Near-Real-Time Industry Data for Peer-to-Peer Analysis

Peer-to-peer analysis is highly valuable, but it is currently very difficult for credit unions to perform in near real time. Typically, the best option is to compare data captured in 5300 Call Reports. However, this data is collected only once a quarter and is likely published at least a month after collection. Valuable insights can be gained from this type of analysis, and credit unions would benefit from access to the data before it is 4-5 months old. For example, if you realize your credit union is behind on loan origination, what changes could you make today rather than 5 months from now?

A proper data pool makes it possible for credit unions to access industry data and perform analysis on data that is updated daily. This makes it possible to stay on top of industry trends before they have passed.
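A simple peer comparison of this kind might look like the Python sketch below. The figures are made up, and the function name and metric (month-to-date loan originations in thousands of dollars) are assumptions for illustration. In practice the peer values would come from the pool's daily refresh.

```python
from statistics import median
from typing import Sequence


def peer_position(own_value: float, peer_values: Sequence[float]) -> dict:
    """Compare one institution's metric with the pool's values for the same period."""
    peer_median = median(peer_values)
    below = sum(1 for value in peer_values if value < own_value)
    return {
        "peer_median": peer_median,
        "gap_to_median": own_value - peer_median,
        # Share of peers with a lower value than this institution.
        "percentile": 100.0 * below / len(peer_values),
    }


# Hypothetical month-to-date loan originations ($ thousands) for pooled peers.
peer_originations = [420.0, 510.0, 380.0, 640.0, 470.0, 555.0, 600.0, 495.0]

snapshot = peer_position(410.0, peer_originations)
print(snapshot)
```

A negative `gap_to_median` would flag that originations are trailing the peer group now, rather than months later when the Call Report data is published.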

To learn more, listen to the Joe Breeden BIGcast about data pooling and CECL at 

Check out the entire podcast now!
