Québec Blue Cross

Big Data Solutions Architect

Québec Blue Cross is a registered trademark in health insurance. It is used by the Canassurance Hospital Service Association, a member of the Canadian Association of Blue Cross Plans.

The Chief Architects Manager was in charge of a project to build a data hub, a kind of data lake that centralizes data events and shares them among business applications and services.

In this context, I had to:

  • Design a decoupled, event-driven architecture (Data Hub platform) based on Microsoft Azure Data Lake Storage Gen2, Microsoft Azure Databricks (Spark, Scala), Kafka, NiFi, MongoDB, MySQL, Atlas, and Ranger
  • Define a security strategy (RBAC and TBAC)
  • Act as a technical advisor and mentor
  • Carry out a comparative study of data governance tools (Collibra, Informatica, etc.)
  • Implement a POC to illustrate Apache Spark -> MarkLogic interaction
  • Implement a POC to demonstrate real-time event ingestion, processing, and storage using NiFi, Kafka, Spark, MongoDB, and MySQL

Methodology

Implementation of a MarkLogic environment to validate that it meets the business needs.

Setting up a data analysis environment in the MS Azure cloud.

Key Details

Role: Big Data Solutions Architect

Project Date: 2020

Project duration: 6 months

Location: Montréal, Canada

Technologies: MS Azure, Spark (Scala), Kafka, NiFi, MongoDB, MySQL, Atlas, Ranger, Hadoop, Hive, MarkLogic

Main Steps

Defining access and security strategy

Definition of the data access strategy, using Atlas and Ranger to define two levels of access control: Role-Based Access Control (RBAC) and Tag-Based Access Control (TBAC).
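To make the difference between the two models concrete, here is a minimal sketch in plain Python. It is a simplified illustration, not Ranger's policy engine and not the configuration used on the project: the role names, resource names, the "PII" tag, and the rule that a governed tag takes precedence are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical resource model: a named data set carrying zero or more tags.
@dataclass(frozen=True)
class Resource:
    name: str
    tags: frozenset = frozenset()

# RBAC (assumed policy): each role maps to the resource names it may read.
ROLE_POLICIES = {"claims_analyst": {"claims_events"}}

# TBAC (assumed policy): each tag maps to the roles allowed to read
# any resource that carries this tag.
TAG_POLICIES = {"PII": {"privacy_officer"}}

def can_read(role: str, resource: Resource) -> bool:
    """Return True if `role` may read `resource` under the sketch policies."""
    # Tags that have a tag-based policy attached.
    governed = [tag for tag in resource.tags if tag in TAG_POLICIES]
    if governed:
        # In this sketch a governed tag takes precedence: the role must be
        # allowed for every governed tag on the resource.
        return all(role in TAG_POLICIES[tag] for tag in governed)
    # Otherwise fall back to the role-based (resource name) policy.
    return resource.name in ROLE_POLICIES.get(role, set())
```

With these assumed policies, a `claims_analyst` can read an untagged `claims_events` resource but not a resource tagged `PII`, whatever its name. This is the practical appeal of tag-based control: access follows the classification of the data rather than its location.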

Preparing the environment on MS Azure

Designing the architecture of the data platform.

Data flow ingestion with NiFi and Kafka

Analysis with Spark on Azure Databricks

Storage with MongoDB and MySQL
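The three stages above (ingestion, analysis, storage) can be sketched as a decoupled flow in which each stage communicates only through a topic. The example below uses in-memory stand-ins written in plain Python. It does not call the Kafka, NiFi, Spark, MongoDB, or MySQL APIs, and the `Topic` class, the function names, and the event fields `type` and `amount` are hypothetical.

```python
import json
from collections import defaultdict, deque

class Topic:
    """In-memory stand-in for a message topic: producers append, consumers poll."""

    def __init__(self):
        self._queue = deque()

    def publish(self, message: str) -> None:
        self._queue.append(message)

    def poll(self):
        # Yield and remove every message currently waiting in the topic.
        while self._queue:
            yield self._queue.popleft()

def ingest(raw_events, topic: Topic) -> None:
    """Ingestion stage: serialize each event and publish it to the topic."""
    for event in raw_events:
        topic.publish(json.dumps(event))

def process(topic: Topic) -> dict:
    """Analysis stage: consume events and sum the amounts per event type."""
    totals = defaultdict(float)
    for message in topic.poll():
        event = json.loads(message)
        totals[event["type"]] += event["amount"]
    return dict(totals)

def store(totals: dict, documents: dict, rows: list) -> None:
    """Storage stage: write each aggregate to a document-style store (dict)
    and to a relational-style store (list of tuples)."""
    for event_type, total in totals.items():
        documents[event_type] = {"type": event_type, "total": total}
        rows.append((event_type, total))
```

Because the producer and the consumer share only the topic, either side can be replaced or scaled without changing the other, which is the property the event-driven design aims for.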

Preparing POC Infrastructure

Implementing an environment based on MarkLogic to validate business needs