The Trellance Data Blog

Data Pooling: Leveraging Your Neighbor’s Data

Posted by Mitch Nelson on Oct 8, 2015 1:27:48 PM


The trend of data-driven decision making is exploding within the credit union space.  Pressures to increase revenue, reduce risky assets, and efficiently identify qualified sales leads have all contributed to the growing trend.  But as the push for data-driven decision making has gained popularity, the need for a wider breadth of data has become apparent. 

For most decisions, credit unions need only leverage the data within their own walls. However, some types of decisions require a larger volume of data. Analyses such as credit risk forecasts require immense amounts of underlying data to be accurate, and most credit unions do not hold that critical mass of data by themselves. However, in the collaborative spirit of the industry, credit unions can join forces to amass an adequate amount of data through the process of data pooling.

What is Data Pooling?

The process of data pooling involves multiple credit unions securely transmitting their data to a data pool provider. The data is compiled into a single data set, against which algorithms created by the data pool provider generate analytic results. The results are then sent back to the originating credit unions for their own use. A credit union only receives results for the records it originally sent.

The beauty of data pools, however, is that the results sent back are based on all the data within the pool. In effect, a credit union can leverage another credit union’s data. That being said, pulling multiple credit unions’ data sets together into one common data set is not an easy feat.

The challenge in data pooling arises from the fact that credit unions tend to store their data in varying systems.  Different core and ancillary data systems sometimes do not mix well when creating a data pool.  In order for multiple credit unions’ data to be compatible, the data needs to be stored under the same standard.  Once the data is housed under a uniform standard, creating a data pool becomes more manageable.
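As a rough illustration of what a uniform standard means in practice, the sketch below renames fields from two differently structured source systems into one shared layout. The system names (core_a, core_b) and every field name are hypothetical, chosen only for the example; a real data pool provider would publish its own standard.

```python
# Hypothetical field names: each core system labels the same facts differently.
FIELD_MAPS = {
    "core_a": {"MemberNo": "member_id", "LoanBal": "loan_balance", "FICO": "credit_score"},
    "core_b": {"mbr_id": "member_id", "balance_amt": "loan_balance", "score": "credit_score"},
}


def to_common_schema(record, source_system):
    """Rename a record's fields to the shared standard, dropping unmapped fields."""
    field_map = FIELD_MAPS[source_system]
    return {common: record[local] for local, common in field_map.items() if local in record}


row_a = to_common_schema({"MemberNo": "1001", "LoanBal": 5200.0, "FICO": 712}, "core_a")
row_b = to_common_schema(
    {"mbr_id": "77", "balance_amt": 900.0, "score": 655, "branch": "N"}, "core_b"
)
```

Both rows end up with the same three field names, so they can sit side by side in one data set even though they came from different systems.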

The key to successful data pooling is establishing the connection.  Connection to a data pool consists of five main phases: data extraction, data transmission, data pool presence, data retrieval, and data storage. 

Data extraction begins with identifying the relevant data that will be sent to the data pool. Not all of a credit union’s data needs to be sent. Only the data needed for analysis and identification is sent, and it is arranged to fit the data pool format.
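A minimal sketch of the extraction step, assuming the pool accepts a CSV file with a fixed set of columns. The column list and the sample records are invented for illustration; the actual pool format is whatever the provider defines.

```python
import csv
import io

# Assumed pool layout: the column names and their order are illustrative only.
POOL_COLUMNS = ["member_id", "loan_balance", "credit_score"]


def extract_for_pool(records):
    """Keep only the pool's columns, in the pool's order, and return CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=POOL_COLUMNS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)  # fields not in POOL_COLUMNS (e.g. email) are left out
    return buffer.getvalue()


members = [
    {"member_id": "1001", "loan_balance": 5200.0, "credit_score": 712, "email": "a@example.com"},
    {"member_id": "1002", "loan_balance": 300.0, "credit_score": 640, "email": "b@example.com"},
]
pool_csv = extract_for_pool(members)
```

Fields the pool does not need, such as the email address here, never leave the credit union.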

Data transmission to the data pool requires a few security preparations. First, the data is masked, meaning that any data identifying an individual member or the credit union is scrambled into random characters. Only the credit union holds the unique key to unmask its own data. The transmission is also packaged in an encrypted file, and the credit union and the receiving pool hold the unique key to decrypt the package. With these security measures in place, the data is sent to the data pool via a secure file transfer protocol.
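One way to picture masking is as a lookup table of random tokens that only the credit union keeps. The sketch below is an assumption about how masking could work, not a description of any particular provider's method; the Masker class is hypothetical. File encryption and secure transfer are deliberately left out, since those would be handled by separate, vetted tools.

```python
import secrets


class Masker:
    """Swap identifiers for random tokens; the lookup table plays the role of the masking key."""

    def __init__(self):
        self._to_token = {}
        self._to_value = {}

    def mask(self, value):
        # Reuse the existing token so the same member always masks the same way.
        if value not in self._to_token:
            token = secrets.token_hex(8)
            while token in self._to_value:  # regenerate on the unlikely chance of a clash
                token = secrets.token_hex(8)
            self._to_token[value] = token
            self._to_value[token] = value
        return self._to_token[value]

    def unmask(self, token):
        # Only the holder of this table can turn a token back into an identifier.
        return self._to_value[token]


masker = Masker()
masked_id = masker.mask("1001")
original_id = masker.unmask(masked_id)
```

Because the token is random, the pool sees nothing that identifies the member, while the credit union can still match returned results to the right record.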

In the data pool (data pool presence), the data pool provider runs its analytics model against the pool’s data and assigns each result to the corresponding credit union record. The records themselves are not identifiable. Only their origin is known, so that results can be packaged for retrieval by the appropriate credit union. The new package of data is then prepared for retrieval.
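The toy example below shows the idea of computing a result from the whole pool and then handing each credit union only its own rows. The "model" here is just a default rate per credit score band, a deliberately simple stand-in for whatever analytics a provider actually runs; the origins, member tokens, and bands are made up.

```python
from collections import defaultdict


def score_pool(rows):
    """Compute a default rate per score band from all rows, then group results by origin."""
    totals = defaultdict(lambda: [0, 0])  # band -> [defaults, loans]
    for row in rows:
        totals[row["band"]][0] += row["defaulted"]
        totals[row["band"]][1] += 1
    rates = {band: defaults / loans for band, (defaults, loans) in totals.items()}

    by_origin = defaultdict(list)
    for row in rows:
        by_origin[row["origin"]].append(
            {"member": row["member"], "pool_default_rate": rates[row["band"]]}
        )
    return dict(by_origin)


pool = [
    {"origin": "cu_x", "member": "t1", "band": "600-649", "defaulted": 1},
    {"origin": "cu_x", "member": "t2", "band": "700-749", "defaulted": 0},
    {"origin": "cu_y", "member": "t3", "band": "600-649", "defaulted": 0},
    {"origin": "cu_y", "member": "t4", "band": "600-649", "defaulted": 0},
]
results = score_pool(pool)
```

Looking only at its own data, cu_x would see one default out of one loan in the 600-649 band, and cu_y would see none out of two. The pooled rate of one in three draws on both, and each credit union receives that figure only for its own masked members.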

For data retrieval, the results are posted to an origin-specific directory. A credit union-based scheduler pings the credit union-specific mailboxes and retrieves data when it is present. Once returned, the data is decrypted using the credit union’s unique password and unmasked using its unique masking key.
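The retrieval check can be pictured as a small job that looks in a directory and collects whatever has been posted. In this sketch a temporary local directory stands in for the credit union's mailbox, and the file name and contents are invented; a real setup would poll a secure remote location on a schedule and then decrypt and unmask what it finds.

```python
import tempfile
from pathlib import Path


def retrieve_results(mailbox):
    """Return the contents of any result files waiting in the mailbox, then remove them."""
    retrieved = {}
    for path in sorted(Path(mailbox).glob("*.csv")):
        retrieved[path.name] = path.read_text()
        path.unlink()  # clear the file so the next scheduled check does not re-read it
    return retrieved


# Simulate three scheduled checks against a temporary directory.
with tempfile.TemporaryDirectory() as mailbox:
    first_check = retrieve_results(mailbox)  # nothing posted yet
    Path(mailbox, "results_001.csv").write_text("member,pool_default_rate\nt1,0.33\n")
    second_check = retrieve_results(mailbox)  # the posted file is picked up
    third_check = retrieve_results(mailbox)  # the mailbox is empty again
```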

Data storage is the last phase in a data pool connection. After the data is retrieved, the analytic results are linked to the corresponding member data and integrated back into the credit union’s data set. The newly generated data is stored in the credit union’s data warehouse as another data element and can now be used for analysis and reporting.
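Linking results back to members might look like the sketch below, which assumes results arrive keyed by masked token and that the credit union can unmask each token. The token table, member records, and the pool_default_rate field name are all hypothetical.

```python
def integrate_results(members, results, unmask):
    """Attach each returned score to its member record as a new data element."""
    scores = {unmask(r["member"]): r["pool_default_rate"] for r in results}
    # Members with no returned result get None rather than being dropped.
    return [{**m, "pool_default_rate": scores.get(m["member_id"])} for m in members]


token_to_id = {"t1": "1001", "t2": "1002"}  # stand-in for the credit union's masking key
members = [
    {"member_id": "1001", "name": "A"},
    {"member_id": "1002", "name": "B"},
    {"member_id": "1003", "name": "C"},
]
results = [
    {"member": "t1", "pool_default_rate": 0.33},
    {"member": "t2", "pool_default_rate": 0.0},
]
enriched = integrate_results(members, results, token_to_id.__getitem__)
```

Each member record keeps its existing fields and gains one new element, which is the form in which it would be loaded into the data warehouse.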

Data pooling is an innovative process that can expand a credit union’s view of useful (and profitable) information.  Utilization of analytics from pooled data can give your credit union the extra validation needed for important decision making.  Imagine knowing that each decision made has measurable and tangible proof behind it.  Integrating data from your organization into a data pool may provide the answers your credit union is missing out on.   


Topics: Data Pool, Data Storage