Introduction

Our consultant was brought in to take over an existing Data Vault project for a Government Agency.

Project Assessment and Challenges

The Data Vault solution was hosted in an Azure SQL Database environment, with data movement handled by an SSIS runtime in Azure Data Factory. The choice of SSIS-based ETL pipelines was influenced by the internal team’s familiarity with SQL Server. The internal team was struggling with motivation: they had been marginalised, and their ideas were not being considered.

Ownership Transition

During transition of ownership, our consultant:

  • Held workshops with the internal team to understand their perspectives and give them an opportunity to showcase their ideas.
  • Established a clear project vision and objectives to guide the team’s efforts.
  • Aligned the project with Data Vault 2.0 delivery principles, ensuring adherence to industry standards and best practices.

Review and Corrections

The review led to the following design revisions:

  • Meta-driven ETL: A metadata-driven approach to data ingestion was designed in Data Factory, moving the project towards a more cloud-centric solution (see the sketch after this list).
  • Iterative Approach: The delivery process was streamlined so that different phases of the project could be delivered in parallel, improving efficiency and productivity.
  • Continuous Monitoring: A system for continuous monitoring and feedback was implemented to promptly identify and address issues arising during the project lifecycle.
  • Learning: “Lunch and learn” sessions were introduced to showcase new developments. “Pair programming” sessions were also introduced to encourage collaboration and knowledge sharing.
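
To make the meta-driven approach concrete, here is a minimal sketch of the kind of control table that can drive such a pipeline: Data Factory reads the enabled rows (for example with a Lookup activity feeding a ForEach loop) and runs one parameterised copy per row. All object and column names here are hypothetical, not the agency’s actual schema.

```sql
-- Hypothetical control table driving a metadata-driven Data Factory pipeline.
-- A Lookup activity reads the enabled rows; a ForEach activity then runs one
-- parameterised copy per row, so new sources are onboarded with data, not code.
CREATE TABLE etl.IngestionControl (
    SourceSystem    VARCHAR(50)  NOT NULL,  -- logical source name
    SourceObject    VARCHAR(128) NOT NULL,  -- table / file / endpoint to pull
    TargetTable     VARCHAR(128) NOT NULL,  -- landing-zone destination
    WatermarkColumn VARCHAR(128) NULL,      -- column used for incremental loads
    IsEnabled       BIT          NOT NULL DEFAULT 1,
    CONSTRAINT PK_IngestionControl PRIMARY KEY (SourceSystem, SourceObject)
);

INSERT INTO etl.IngestionControl
    (SourceSystem, SourceObject, TargetTable, WatermarkColumn)
VALUES
    ('CRM',     'dbo.Customer', 'landing.Customer', 'ModifiedDate'),
    ('Finance', 'dbo.Invoice',  'landing.Invoice',  'ModifiedDate');

-- Query issued by the pipeline's Lookup activity:
SELECT SourceSystem, SourceObject, TargetTable, WatermarkColumn
FROM etl.IngestionControl
WHERE IsEnabled = 1;
```

Onboarding a new source then becomes an INSERT into the control table rather than a new pipeline.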

Revised Architecture Implementation

The revised architecture led to significant enhancements in the project:

  • ETL Pipelines: Priority was given to rewriting the ETL pipelines in Data Factory. This step improved the efficiency and reliability of data transformation processes.
  • Landing Zone: A landing zone was established to handle both structured and semi-structured data, allowing data to be received and processed in isolation before being loaded into the Data Vault.
  • Data Privacy: Encryption and pseudonymisation were integrated into the ingestion process to enhance data privacy, ensuring compliance with data privacy requirements and safeguarding sensitive information (a minimal pattern is sketched after this list).
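
To illustrate the landing-zone and privacy points above, the sketch below stages a semi-structured payload with OPENJSON and pseudonymises the business key with a salted SHA-256 hash before anything reaches the vault. The tables, columns, and salt are hypothetical stand-ins; in practice the salt would be held as a protected secret outside the database.

```sql
-- Hypothetical landing table: raw JSON payloads arrive here and are processed
-- in isolation before anything is loaded into the Data Vault.
CREATE TABLE landing.CustomerRaw (
    LoadId  INT IDENTITY(1,1) PRIMARY KEY,
    Payload NVARCHAR(MAX) NOT NULL,  -- semi-structured source document
    LoadDts DATETIME2     NOT NULL DEFAULT SYSUTCDATETIME()
);

-- Shred the JSON and pseudonymise the natural key on the way into staging.
-- @Salt stands in for a secret managed outside the database (e.g. a key vault).
DECLARE @Salt NVARCHAR(64) = N'<secret-salt>';

SELECT
    -- A salted SHA-256 hash replaces the direct identifier downstream.
    CONVERT(CHAR(64),
            HASHBYTES('SHA2_256', CONCAT(@Salt, '|', j.CustomerId)), 2)
        AS CustomerHashKey,
    j.Email,        -- candidate for column-level encryption at rest
    j.Country,
    r.LoadDts
FROM landing.CustomerRaw AS r
CROSS APPLY OPENJSON(r.Payload)
WITH (
    CustomerId NVARCHAR(50)  '$.customerId',
    Email      NVARCHAR(256) '$.email',
    Country    NVARCHAR(2)   '$.country'
) AS j;
```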

Project Recovery Strategies

To address the delays in delivery, the team implemented the following changes:

  • Decoupling Data Layers: Decoupling the data ingestion layer from the data storage layer allowed the team to deliver content regularly, enhancing the project’s agility and adaptability.
  • Data Sources and Presentation Layer: The team was intentionally split across separate deliveries, allowing additional data sources to be consumed while the presentation layer was modelled.

These strategies not only helped in recovering the project timeline but also set a robust foundation for future scalability and adaptability.

Lessons Learned and Best Practices

Here are some of the key lessons learned and best practices:

  • Early Architectural Alignment: It is crucial to establish a clear architectural vision at the outset of a project. This helps in avoiding costly redesigns and rework later in the project lifecycle.
  • Cloud-Native Design: When working with cloud implementations, it is beneficial to leverage cloud-native services. This approach can enhance scalability, resilience, and performance.
  • Inclusive Team Culture: In a blended team, it is important to create an environment where everyone’s ideas are heard and considered. This can lead to more innovative solutions and a more engaged team.
  • Continuous Learning and Knowledge Sharing: Fostering a culture of continuous learning and knowledge sharing encourages innovation and collaboration within the team.
  • Data Vault Methodology: While Data Vault is a robust methodology, it requires significant planning and design to implement effectively. Having the right automation tools in place can greatly enhance its delivery (a typical generated load pattern is sketched below).
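
As an illustration of the repeatable patterns that such automation tools generate, here is a minimal Data Vault hub-load sketch: insert-only, hash-keyed, and safe to re-run. All names are hypothetical.

```sql
-- Hypothetical Data Vault hub for the Customer business key.
CREATE TABLE dv.HubCustomer (
    CustomerHashKey CHAR(64)     NOT NULL PRIMARY KEY, -- hash of business key
    CustomerId      NVARCHAR(50) NOT NULL,             -- business key
    LoadDts         DATETIME2    NOT NULL,
    RecordSource    VARCHAR(50)  NOT NULL
);

-- Insert-only, idempotent hub load: only keys not yet present are added,
-- so the same staging batch can be replayed safely.
INSERT INTO dv.HubCustomer (CustomerHashKey, CustomerId, LoadDts, RecordSource)
SELECT DISTINCT
    s.CustomerHashKey,
    s.CustomerId,
    SYSUTCDATETIME(),
    'CRM'
FROM staging.Customer AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM dv.HubCustomer AS h
    WHERE h.CustomerHashKey = s.CustomerHashKey
);
```

Because the load is idempotent, a failed batch can simply be replayed, which is one of the properties that makes Data Vault loads straightforward to automate.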

Results and Impact

Here are some of the key results and impacts of the project:

  • Cost Reduction: The migration from SSIS to a native Data Factory solution led to significant cost reductions.
  • Improved Efficiency: The implementation of batch processing and micro-batching improved overall efficiency (a watermark-based micro-batch pattern is sketched after this list).
  • Scalability and Maintainability: The implementation of Data Vault provided a more scalable and maintainable solution. This underscores the importance of choosing robust and flexible methodologies for data management.
  • Enhanced Data Privacy: The implementation of a zero-trust security model enhanced data privacy.
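
As a sketch of the micro-batching mentioned above: each run picks up only the rows changed since the last recorded high-water mark, keeping batches small and frequent. Table and column names are hypothetical.

```sql
-- Hypothetical watermark table recording how far each source has been loaded.
CREATE TABLE etl.Watermark (
    SourceObject VARCHAR(128) NOT NULL PRIMARY KEY,
    LastValue    DATETIME2    NOT NULL
);

-- One micro-batch: read only rows modified since the last run, then advance
-- the watermark. A scheduled pipeline repeats this at a short interval.
DECLARE @From DATETIME2 =
    (SELECT LastValue FROM etl.Watermark WHERE SourceObject = 'dbo.Customer');
DECLARE @To   DATETIME2 = SYSUTCDATETIME();

INSERT INTO landing.Customer (CustomerId, Email, Country, LoadDts)
SELECT CustomerId, Email, Country, @To
FROM src.Customer
WHERE ModifiedDate > @From AND ModifiedDate <= @To;

UPDATE etl.Watermark
SET LastValue = @To
WHERE SourceObject = 'dbo.Customer';
```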

Conclusion

The strategic realignment of the project has yielded significant benefits. The migration to a native Data Factory solution has not only reduced costs but also improved efficiency. The Data Vault implementation has proven to be a scalable and maintainable solution, demonstrating the importance of robust data management methodologies. Furthermore, the adoption of a zero-trust security model has significantly enhanced data privacy, underscoring the critical role of stringent security measures in today’s data-driven world. These collective improvements have greatly enhanced the overall performance and security of the system, paving the way for future growth and innovation.

Get Started

If you are interested in discussing your needs, feel free to book some time with us.