VENCESLAS OSEE NGASSAM KATE

Data Engineer | Databricks | Snowflake | Azure | Fabric
Montsoult, Ile de France

Professional profile

Certified Data Engineer (Databricks, Snowflake, Microsoft Fabric DP-700) with over 4 years of experience designing, industrializing, and operating data platforms in complex cloud environments. I have solid expertise in Big Data architectures (batch and streaming), the productionization of analytical pipelines and Machine Learning models, and supporting business teams in high-stakes business and regulatory contexts.

Accustomed to large-scale enterprise environments (Société Générale, Orange), I bring a strong understanding of business challenges, the ability to simplify complex technical topics, and a value- and impact-oriented approach.

Overview

3 years of professional experience
2 years of post-secondary education
1 certification

Experience

Data Engineer

Société Générale
Paris
2024.07

Context & Challenges

  • Société Générale managed massive volumes of sensitive financial and customer data for credit scoring and decision support.
  • Existing scoring pipelines required modernization and industrialization to enhance model performance, data traceability, and compliance with regulatory requirements.
  • The primary challenge was to ensure faster, more reliable, and explainable credit decisions within a highly constrained banking environment (security, auditability, data governance).

Role & Key Achievements

  • Redesigned and industrialized credit scoring pipelines using Azure Databricks and Azure Machine Learning.
  • Designed robust batch pipelines for the preparation, transformation, and validation of sensitive financial data.
  • Implemented workflow orchestration via Apache Airflow, including monitoring, alerting, and incident management.
  • Collaborated closely with Risk, Business, and IT teams to translate functional and regulatory requirements into operational data solutions.
  • Contributed to the deployment and monitoring of Machine Learning models (feature engineering, performance monitoring).
  • Participated in data documentation, traceability, and governance initiatives essential to the banking sector.

Results & Impact

  • Improved credit scoring model accuracy by approximately 15%.
  • Reduced delays in credit decision-making processes.
  • Strengthened the reliability and auditability of data pipelines.

Data Analyst

Institut National de la Statistique
Yaoundé
2022.09 - 2023.09

Context & Challenges

  • The National Institute of Statistics was responsible for processing national census data, involving massive volumes of demographic, socio-economic, and geographic data.
  • Data originated from multiple, often heterogeneous sources, presenting significant challenges regarding data quality, reliability, and statistical consistency.
  • Insights needed to be clear and actionable for non-technical decision-makers within a high-stakes institutional and public policy context.

Role & Key Achievements

  • Analyzed, cleaned, and structured massive demographic datasets (over 10 million records) using Python, SQL, and R.
  • Performed exploratory data analysis (EDA) and advanced statistics to identify trends, anomalies, and key indicators.
  • Developed statistical and predictive models to analyze demographic and territorial evolution.
  • Designed and automated decision-support dashboards (Power BI, Tableau) for institutional leaders.
  • Implemented spatial analysis (GIS) using QGIS and Python, cross-referencing geographic data with demographic insights (density, distribution, territorial dynamics).
  • Communicated and simplified complex findings for non-technical stakeholders (statisticians, public officials, and decision-makers).

Results & Impact

  • Improved the reliability and readability of the demographic indicators produced.
  • Accelerated the production of statistical reports and decision-support analyses.
  • Enhanced the integration of geographic data into national-level analyses.

Data Engineer

Orange
Douala
2020.09 - 2022.09

Context & Challenges

  • Orange managed large volumes of heterogeneous network data (telecom equipment, logs, real-time events) used for quality of service monitoring and incident management.
  • Existing processes were insufficiently industrialized, resulting in high processing times and limited reactivity from network teams when facing critical incidents.
  • The challenge was to implement a reliable and high-performance Big Data platform capable of processing massive real-time and batch streams while improving operational visibility.

Role & Key Achievements

  • Joined the Data team responsible for the supervision and performance analysis of telecommunications networks.
  • Designed, developed, and maintained Big Data ETL pipelines for the collection, transformation, and analysis of data from network equipment.
  • Processed millions of real-time network events using Apache Spark and the Hadoop ecosystem.
  • Implemented batch and streaming processes for incident monitoring, anomaly detection, and Quality of Service (QoS) analysis.
  • Optimized pipeline performance, leading to reduced processing times, improved resource management, and increased reliability.
  • Collaborated closely with network and IT teams to translate operational needs into actionable data solutions.
  • Authored technical documentation for pipelines and data flows to facilitate maintenance and future solution evolution.

Results & Impact

  • Reduced processing time for critical tasks by approximately 30%.
  • Significantly improved the reactivity of network teams regarding incidents and anomalies.

Education

Master's degree in Statistics & Big Data

ESSFAR
Yaoundé
2021.09 - 2023.04

Skills

  • Multi-Cloud Expertise & Modern Platforms: Advanced mastery of Azure (Expert), with functional knowledge of GCP and AWS
  • Microsoft Fabric Architectures: Certified Expert (DP-700) in the design and implementation of end-to-end data solutions on the Fabric platform
  • Data Engineering (Batch & Streaming): Design of complex and real-time pipelines using Spark (PySpark/Scala), Event Hub, and Dataflow
  • Modern Data Stack & Transformation: Expert in data modeling (Data Vault, Star Schema) and industrialization using dbt (Expert) and Airflow
  • Data Science & MLOps: Development and productionalization of Machine Learning models using Azure ML, TensorFlow, and Scikit-Learn
  • Data Quality & DevOps: Industrialization through CI/CD (GitHub Actions, Docker) and ensuring data reliability with Great Expectations
  • Leadership & Communication: Agile project management, business requirements gathering, and the ability to simplify and explain complex architectures to diverse stakeholders

Languages

French
Native language
English
Professional working proficiency

Certifications

  • Microsoft Certified: Fabric Data Engineer Associate (DP-700)
  • Databricks Certified Data Engineer Associate
  • Snowflake SnowPro® Core Certification

Projects

End-to-End Data Architecture on Microsoft Fabric

  • Lakehouse Design: Implemented a Medallion architecture (Bronze, Silver, Gold) centralized on OneLake.
  • Ingestion & Engineering: Orchestrated multi-source data flows using Data Factory and performed complex data transformations via Spark notebooks (PySpark).
  • Modeling & Performance: Created optimized semantic models leveraging Direct Lake mode for high-performance analytics without data duplication.
  • Analytics & Governance: Deployed integrated Power BI reports featuring data quality monitoring and comprehensive technical documentation.

Modern Data Stack - Snowflake & dbt

  • Multi-source Ingestion: Automated data collection from MongoDB and various APIs using Airbyte.
  • Transformation & Modeling: Led the transformation layer using dbt (Expert level), applying Data Vault and Star Schema methodologies.
  • Analytical Storage: Optimized and secured the data warehouse environment on Snowflake.
  • Quality & CI/CD: Implemented automated quality testing with Great Expectations and established continuous integration via GitHub Actions.

Contact

Email: venceslasngassam@gmail.com
Phone: +33745168946
LinkedIn: www.linkedin.com/in/venceslas-osee-ngassam-kate-data-engineer
Address: 22 rue de Montmorency, 95560 Montsoult, Ile de France
