What startups need to take advantage of AI – And how data sharing can help

Published on May 26, 2020

It’s the data

Data is driving innovation. The rise of data science techniques, not least machine learning and artificial intelligence, has further increased the value and role of data as an asset. But what to do if the data needed to innovate is not readily available? Open data is well and good but comes with certain drawbacks. While a great deal of open data is freely available, it is often of limited quality, unstructured, or inconsistent, and thus insufficient for reliable and consistent development. Higher quality data may exist, but not be open – nor should it be. So what are innovators to do if they need access to specific datasets?

Much high-quality data is held within private organizations and is not intended for public consumption. This data could be of public interest, such as transaction data from mobile telecom operators, sensor data from personal communication devices, or data from smart electricity consumption meters (sharing such data is sometimes known as ‘business to government’ or ‘b2g’ data sharing). It could also be privately held data that is of economic interest if skilled technology companies were to use it to develop innovative new products, services, and markets.

Data shared is value-added

These data are only accessible if they are shared. Sharing data means granting third parties specifically permissioned access to datasets in order to generate value, and it is one of the cornerstones of the data economy.

Data that is currently closed is projected to be worth €739bn by 2020. This is one reason why building a European data economy is among the strategic goals of the European Commission, which named data-driven innovation as “a key enabler of growth and jobs in Europe.” Both the European Commission and the OECD consider data sharing a key driver of innovation, and a way to maximize economic, social, and environmental value.

The data need not even be a complete, static dataset; it may be metadata or synthetic samples and, with the growth of the Internet of Things, could also be streamed sensor data. Making this data available for specified purposes can unlock value for the organization that holds it, for innovators working with the data, or for the general public.

A variety of benefits

Access to otherwise closed data is a competitive advantage, and access to the vast, high-quality datasets common in the industry allows innovative startups to strengthen their machine learning, deep learning, and artificial intelligence capabilities. They can use this shared data to develop new insights, products, or services.

Users of shared data are typically innovative startups or small to medium enterprises, but the value is not limited to them. Users could equally be other organizations, neighbouring departments in the same organization, researchers, or individuals. They all benefit from data sharing in that they get access to data which they or their competitors would not otherwise have. This allows them to generate new insights, develop new products or services or improve existing ones, and establish themselves in the market.

Beyond simple access to the data, data sharing is also about the relationship. Working with other organizations’ data is conducive to improved business relationships, and can allow data users to work with and understand their business partners, or new market segments. This can in turn allow them to break into new markets and expand their reach.

Combining a startup’s own data with others’ in a reciprocal data-sharing relationship may be even more valuable. Rather than simply sharing data with others or having data shared with them, organizations can swap their data, pool it for mutual benefit, or supplement data that is shared with them with their own internal data. This would allow innovators to combine proprietary data sources to produce a solution that benefits all involved.

Where to start?

That said, setting up a data-sharing relationship can be a lengthy and complicated process. Fortunately, there are resources to help make it manageable, such as the Data Sharing Toolkit.