FAQ: the Data Catalog in 7 questions and answers

Jul 17, 2020 | Data Catalog

The Data Catalog is a natural starting point for effective data governance. It makes the company's metadata accessible to all employees.

A collaborative Data Catalog pools everyone's efforts in defining data, so that everyone works from the same shared data assets. This FAQ covers the essentials of the tool in seven questions.

Data Catalog FAQs

What is a Data Catalog?

A Data Catalog is a central repository for the metadata that describes the data stored across the enterprise.

This metadata is critical to understanding the context of the data: its structure, quality, definition and use are all accessible from one place.

As data volumes keep growing, companies have to adapt and adopt new tools. The Data Catalog helps them control all this data and make better use of it.

What is the Data Catalog for?

The main purpose of the Data Catalog is to make data sources accessible and understandable to all users on a self-service basis.

When everyone can access and understand the metadata, the number of data silos within the data environment can be greatly reduced.

The Data Catalog also helps data teams speed up data analysis and make it more accurate.


Who uses the Data Catalog?

The Data Catalog is used by all Data Bakers, the everyday users of data. Whether Data Governor, Data Manager, Data Craftsman or Data Consumer, the profiles that use the Data Catalog are varied.

The Data Catalog allows its users to discover and use data sources, but above all to understand them in order to move forward with their projects. 

What are the essential features of a Data Catalog?

There are several types of Data Catalog, and they do not all offer the same functionality. Whichever one you choose, make sure it offers the following features.

A good Data Catalog should be able to import metadata automatically from different data sources. Without this capability, everything has to be entered manually, which is long and tedious work.

A Data Catalog that can scan hundreds of different data sources and load their metadata automatically is a definite asset.
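To make the idea of automatic metadata import concrete, here is a minimal sketch in Python. It assumes a single SQLite database as the data source; the function name `harvest_metadata` and the demo tables are hypothetical, and a real Data Catalog would use its own connectors for each source type.

```python
import sqlite3


def harvest_metadata(conn):
    """Return {table_name: [column descriptions]} for one SQLite connection.

    Hypothetical helper for illustration only, not a vendor API.
    """
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [
            {"column": name, "type": col_type, "primary_key": bool(pk)}
            for _cid, name, col_type, _notnull, _default, pk in columns
        ]
    return catalog


# Demo source: an in-memory database with two made-up tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

catalog = harvest_metadata(conn)
for table, columns in catalog.items():
    print(table, [c["column"] for c in columns])
```

The scan reads only the schema, never the rows themselves, which is what lets this kind of import run across many sources without copying the data.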

A good Data Catalog should include a collaborative layer so that users can build data knowledge together. This layer typically offers comments, additional information, access rights, a notification center, labels, and so on.

Finally, a good Data Catalog must offer a search engine that is both easy to use and powerful.

Search is the entry point for most users: it is how they reach the Data Catalog when they need to understand a piece of data.
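As an illustration of what searching metadata means in practice, here is a minimal keyword search sketch in Python. The entry fields (`name`, `description`, `tags`) and the sample entries are assumptions made for the example; a production search engine would add stemming, typo tolerance and relevance ranking well beyond this.

```python
def search(entries, query):
    """Rank entries by how many query terms appear in their metadata."""
    terms = query.lower().split()
    results = []
    for entry in entries:
        # Search across the name, the description and the tags together.
        haystack = " ".join(
            [entry["name"], entry["description"], *entry["tags"]]
        ).lower()
        score = sum(1 for term in terms if term in haystack)
        if score:
            results.append((score, entry["name"]))
    # Highest score first; ties are broken alphabetically by name.
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _score, name in results]


# Hypothetical catalog entries.
entries = [
    {
        "name": "customers",
        "description": "One row per registered customer",
        "tags": ["crm", "personal data"],
    },
    {
        "name": "orders",
        "description": "Customer orders with totals",
        "tags": ["sales"],
    },
    {
        "name": "web_logs",
        "description": "Raw HTTP access logs",
        "tags": ["technical"],
    },
]

hits = search(entries, "customer sales")
print(hits)  # ['orders', 'customers']
```

Because the search runs over business descriptions and tags as well as technical names, a user can find `orders` without knowing what the table is called.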

What are the objectives of the Data Catalog?

A Data Catalog has several objectives, in particular:

Implement agile data governance

The Data Catalog is a good tool for starting to map and highlight the data life cycle. Data Bakers will know where their data is located, who uses it, how it is used and for what purpose.

Data knowledge is accessible to everyone.

Ensure real-time data documentation

The Data Catalog lets you set up a directory of both technical and business metadata. This stored information is easily accessible, which accelerates collaboration around data across projects.

Give data context to make it intelligent

When the context of the data is accessible, with its metadata, uses and processing known and documented, there is far less room for error.

Productivity improves, which supports projects and encourages innovation.

Data becomes accessible to more people, more quickly.

What are the use cases for a Data Catalog?

The Data Catalog serves several use cases.

As part of a data governance initiative, the Data Catalog can be used to:

  • Organize your data better
  • Improve access to information for data users
  • Identify who is responsible for a data item and manage access rights
  • Assess the quality of data
  • Begin your GDPR compliance work
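The governance use cases above can be sketched as simple checks over catalog entries. The following Python example is illustrative only: the `CatalogEntry` fields, the `governance_gaps` function and the 0.9 completeness threshold are all assumptions, not features of any particular product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CatalogEntry:
    """Hypothetical catalog entry carrying governance metadata."""
    name: str
    owner: Optional[str] = None            # person responsible for the data item
    contains_personal_data: bool = False   # flag relevant to GDPR reviews
    allowed_roles: List[str] = field(default_factory=list)  # access rights
    completeness: Optional[float] = None   # share of non-empty values, 0.0 to 1.0


def governance_gaps(entries, min_completeness=0.9):
    """Return {entry name: [issues]} for entries that need attention."""
    gaps = {}
    for entry in entries:
        issues = []
        if entry.owner is None:
            issues.append("no owner")
        if entry.contains_personal_data and not entry.allowed_roles:
            issues.append("personal data without access rules")
        if entry.completeness is not None and entry.completeness < min_completeness:
            issues.append("low completeness")
        if issues:
            gaps[entry.name] = issues
    return gaps


# Made-up entries for the demo.
entries = [
    CatalogEntry("customers", owner="alice", contains_personal_data=True,
                 allowed_roles=["crm_team"], completeness=0.98),
    CatalogEntry("orders", completeness=0.95),
    CatalogEntry("newsletter_signups", owner="bob",
                 contains_personal_data=True, completeness=0.6),
]

gaps = governance_gaps(entries)
for name, issues in gaps.items():
    print(name, "->", ", ".join(issues))
```

A report like this gives a governance team a concrete list to work through: assign missing owners first, then restrict access to personal data, then address quality.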

What are the qualities of a Data Catalog?

The Data Catalog gives access to information and so democratizes the company's data knowledge. It is a massively collaborative tool that allows data assets to be shared.

The first expected quality is therefore inclusion. Access to information should not be restricted to a single group of users without a valid reason. Where access is controlled, the restriction must be explained to, and understood by, all users of the Data Catalog.

The second quality is that the Data Catalog must be a two-way repository, as interactivity is often much more powerful than one-way dissemination. This is why the Data Catalog must be collaborative.

All users must be able to provide feedback, specify information, notify managers, ask questions, etc.

The last quality is that the Data Catalog must be an enabler: it should speed up the inventory of data and make the company's data assets easier to navigate.

A powerful search engine is obviously a definite advantage.