
In the next couple of articles, we look at the organizational structure of the EDW development team. This needs to be considered in the context of the development life cycle and areas of governance that I have discussed elsewhere, with particular reference to Release Management.
A few things to note while reading this list:
- Technical infrastructure is not included; this is taken as a given. In my experience, the hardware and software are set up and maintained by a different group, who are not usually involved directly in any design or development decisions. They are, of course, involved when it comes to the underlying technical architecture of the overall solution, since the infrastructure needs to support the flow of data through the warehouse layers.
- The organization tree structure diagram above is conceptual, and intended to represent a generic organization. Some organizations are larger and have many resources, internal and external, within any branch. Others have a small core team that does all the development, with a single person handling many roles.
- The main thing to take away from the diagram is that the EDW Program team analyse, data model, ETL and QA their work, while project streams gather requirements, spec and build reports. The EDW Council is a body of higher authority that can arbitrate design decisions and provide governance of the process.
Program/Release Manager
The program manager is responsible for the overall management of the EDW release. This includes the fulfillment of all data requirements, as adjusted after identification and analysis of the sources, and the delivery of validated, repeatable processes to populate the target databases.
Throughout the release life cycle, the program manager is accountable for all the major milestone deliverables that sit within the responsibility of the EDW development team. This excludes only the detailing of requirements for each project stream and the development of the business intelligence solution that is built on top of the data mart structures.
The program manager must be part of the intake process to determine the scope of each release. To this end, he/she is consulted on the definition of each project’s requirements.
While the program manager is not directly involved in, or responsible for, business intelligence, he/she is kept informed of BI progress to remain responsive to needs that may arise.
Project Manager
The project manager of each stream is directly accountable for the detailed definition of requirements feeding into the EDW, as well as the documentation of BI specifications and the delivery of BI reporting that will draw from the data mart.
The project manager will not be involved in the development of the EDW, but will remain informed of its progress. Early in the process, it is particularly important that the project manager be apprised of data quality issues that arise through identification and analysis of sources. The system of record is not relevant to an individual project, but the data mart is critical to the project’s success, and so the project manager is informed of major milestones in the development of the data mart.
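The split between the two manager roles can be written down as a small responsibility matrix. The following is a minimal sketch in Python, not a prescribed tool: the activity names and the `Involvement` levels are hypothetical labels chosen to paraphrase the descriptions above, and a real matrix would cover every role and milestone.

```python
from enum import Enum


class Involvement(Enum):
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


# Hypothetical activity names, paraphrasing the role descriptions above.
ASSIGNMENTS = {
    "program_manager": {
        "define_requirements": Involvement.CONSULTED,
        "identify_and_analyse_sources": Involvement.ACCOUNTABLE,
        "build_data_mart": Involvement.ACCOUNTABLE,
        "deliver_bi_reporting": Involvement.INFORMED,
    },
    "project_manager": {
        "define_requirements": Involvement.ACCOUNTABLE,
        "document_bi_specifications": Involvement.ACCOUNTABLE,
        "deliver_bi_reporting": Involvement.ACCOUNTABLE,
        "identify_and_analyse_sources": Involvement.INFORMED,
        "build_data_mart": Involvement.INFORMED,
    },
}


def roles_with(activity, involvement):
    """Return the roles that hold the given involvement for an activity."""
    return sorted(
        role
        for role, activities in ASSIGNMENTS.items()
        if activities.get(activity) is involvement
    )


def activities_without_single_owner():
    """List activities that do not have exactly one accountable role."""
    activities = {a for acts in ASSIGNMENTS.values() for a in acts}
    return sorted(
        a for a in activities
        if len(roles_with(a, Involvement.ACCOUNTABLE)) != 1
    )


print(roles_with("define_requirements", Involvement.ACCOUNTABLE))
print(activities_without_single_owner())
```

Keeping the matrix as data makes it easy to check that no milestone is left without a single accountable owner when roles are merged in a smaller team.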
Business Analyst
The business analyst is responsible for compiling the detailed definition of requirements. It is also the business analyst’s responsibility to identify the sources that will fulfill those requirements. It is to be expected that not all analysts will have both the business knowledge to complete the first task and the technical knowledge to complete the second. This is a case where a single role could be performed by different people.
The business analyst supports processes that involve, or require input from, analysis. This means that the analyst may be actively involved in the analysis of profile results and in writing SQL queries to interrogate the source data. He/she will find, amend and compose business definitions of fields. When it comes time to map sources to targets, the analyst will be engaged to communicate his/her understanding to the person writing the document. It may even be the case that the analyst’s skills in documentation mean that he/she does the writing.
Data content governance involves insight into the nature and quality of the data. To this end, the business analyst supports this governance process. The analyst is also consulted on the information and reporting requirements to determine suitable structures for the data mart.
Data Steward
The data steward is responsible for governing what data will be brought into the EDW, the governance of definitions for each of the data elements, and the identification and definition of business rules to ensure data quality. This means that, while the business analyst will see that definitions are attached to each field, the data steward must ensure that the definitions are complete and accurate.
The data steward supports processes that depend on an understanding of the business meaning of data. This includes the definition of requirements, the identification and analysis of sources, and the mapping of those sources to the system of record.
The data steward also supports naming standards. However, because each steward is attached to a given subject area domain, or set of domains, he/she is not responsible for setting or maintaining these standards.
Data Quality Analyst
The data quality analyst is responsible for assessing the quality of data at various layers. Initially, the quality analyst is engaged in running the data profiling functions and any interrogation queries. While the data analyst may draft queries, it is ultimately the responsibility of the quality analyst to ensure that the analyst’s questions get answered. Later in the development life cycle, the data quality analyst is responsible for testing the data populating the system of record and the data mart.
The quality analyst enables data analysis, and so supports the governance of data content in assessing the integrity and suitability of the sources.
The data quality analyst may be consulted on questions arising from mapping sources to the system of record. They may offer technical insights or be able to perform additional quality tests to support the mapping exercise.
The requirements and source fields are an input to the quality analyst’s activities, and so he/she is informed of the results of these processes.
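To make the profiling activity described above more concrete, here is a minimal sketch of a column profile in Python. It is an illustration only: the `profile_column` function, the sample rows and the column names are all hypothetical, and in practice this work is usually done with a dedicated profiling tool or with SQL against the source.

```python
def profile_column(rows, column):
    """Summarize one column: row count, empty values, distinct values, range."""
    values = [row.get(column) for row in rows]
    # Treat both None and the empty string as missing values.
    present = [v for v in values if v not in (None, "")]
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Hypothetical extract from a source system.
source_rows = [
    {"customer_id": "C001", "country": "CA"},
    {"customer_id": "C002", "country": ""},
    {"customer_id": "C003", "country": "US"},
    {"customer_id": "C003", "country": "CA"},
]

for name in ("customer_id", "country"):
    print(name, profile_column(source_rows, name))
```

In this sample, the `customer_id` profile shows fewer distinct values than rows, which hints at a duplicate key, and the `country` profile shows one missing value. Findings of this kind are what the quality analyst brings back to the analyst and the data steward.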
Enterprise Data Architect
The enterprise data architect is responsible for the governance of data architecture, including the establishment of principles and guidelines, as well as reviews and final sign-off of the data models. He/she is also responsible for establishing guidelines for logical and physical naming standards.
The enterprise data architect is consulted on all data modelling activities, relating to both the system of record and the data mart. Data models for any other component of the architecture will also pass through enterprise architecture for consultation.
The governance of data content relates to the placement of relevant data within the context of the reference architecture. The governance process involves the consultation of the enterprise data architect.
The requirements and source fields are an input to the enterprise data architect’s activities, and so he/she is informed of the results of these processes.
Data Modeller
The data modeller is responsible for all processes that are directly related to data model artifacts, including the logical mapping of the source to the system of record, logical modelling and physical modelling.
The data modeller supports the processes that modelling depends on, including the analysis of the sources through profiling and interrogation. He/she also supports the processes that depend on the models, including the documentation of the physical mappings to populate both the system of record and the data mart.
The governance of business definitions, naming standards and data architecture all pertain directly to the data models. The data modeller supports these processes by ensuring that all fields contain complete, well-formed and accurate definitions and conform to standard naming conventions, and that the structure aligns with the design standards of the given component within the reference architecture.
The data modeller is informed of requirements and source fields, as these are inputs to the process of designing data structures.
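The checks on definitions and naming conventions that the data modeller supports lend themselves to simple automation. The sketch below is one possible approach in Python, under stated assumptions: the upper-case, underscore-separated naming pattern is a hypothetical convention (the actual standard is set by the enterprise data architect), and the field names are invented for illustration. It can only test that a definition is present, not that it is accurate, which remains a matter for the data steward.

```python
import re

# Hypothetical convention: upper-case words separated by single underscores.
NAME_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$")


def review_fields(fields):
    """Return (field name, finding) pairs for a mapping of name -> definition."""
    findings = []
    for name, definition in fields.items():
        if not NAME_PATTERN.match(name):
            findings.append((name, "name does not follow convention"))
        if not definition or not definition.strip():
            findings.append((name, "definition is missing"))
    return findings


# Hypothetical fields from a physical model.
model_fields = {
    "CUSTOMER_ID": "Unique identifier assigned to a customer.",
    "acctOpenDt": "Date the account was opened.",
    "BRANCH_CD": "",
}

for field, finding in review_fields(model_fields):
    print(field, "-", finding)
```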
Part II will complete the list of EDW Roles and Responsibilities.