With my experience and expertise in information technologies, and especially in Big Data, I can offer you advice and solutions to maximize the use, performance, and security of your Big Data platform.
- Big Data Strategy
I am proficient in Big Data and in the technologies needed to make it meet your needs.
Whatever the format of your data, structured or unstructured, I can define and develop a plan for you to prepare, cleanse, analyze, and store it, whether in a cloud environment (cloud computing) or on premises.
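As a small illustration of the prepare-and-cleanse step of such a plan, the sketch below cleans a raw extract in plain Python (the field names and cleaning rules are invented for the example; at scale the same logic would run in a distributed framework):

```python
import csv
import io

# Hypothetical raw extract: inconsistent casing, blank amounts, stray whitespace.
raw = """customer,amount
 Alice ,100
BOB,
carol,250
"""

def cleanse(rows):
    """Prepare and cleanse records: trim whitespace, normalize names,
    and drop rows with missing amounts."""
    for row in rows:
        name = row["customer"].strip().title()
        amount = row["amount"].strip()
        if not amount:
            continue  # discard incomplete records
        yield {"customer": name, "amount": int(amount)}

cleaned = list(cleanse(csv.DictReader(io.StringIO(raw))))
print(cleaned)  # two valid rows remain: Alice (100) and Carol (250)
```

The cleansed rows are then ready to be stored and analyzed downstream.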
- Big Data Technologies
In any discussion about big data, Hadoop or Spark inevitably comes up. While the two tools are sometimes seen as competitors, they often work best when used together.
Both are big data frameworks, but they do not serve the same purpose. Hadoop is essentially a distributed data infrastructure: this free Java framework spreads large volumes of collected data across the nodes of a cluster of servers, so there is no need to acquire and maintain specialized, expensive hardware. Hadoop can also index and track this big data, making it much easier to process and analyze than was possible before. Spark, by comparison, knows how to compute over distributed data, but it does not provide distributed storage of its own: it must rely on an external distributed storage system.
For storing, processing, and analyzing large volumes of structured or unstructured data, the combination of Apache Hadoop and Apache Spark is typically hard to beat.
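To make this division of labor concrete, here is the classic word-count job expressed as the map, shuffle, and reduce stages that Hadoop MapReduce or Spark would run in parallel across a cluster. This single-process Python sketch only illustrates the model; the input "splits" are invented stand-ins for blocks stored on different HDFS nodes:

```python
from collections import defaultdict

# Each string stands in for one input split stored on a different HDFS node.
splits = ["big data needs big tools", "spark and hadoop", "big clusters"]

# Map: each mapper emits (word, 1) pairs for its local split.
mapped = [(word, 1) for split in splits for word in split.split()]

# Shuffle: pairs are grouped by key, as if exchanged across the cluster.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce: each reducer sums the counts for its keys.
counts = {word: sum(values) for word, values in groups.items()}
print(counts["big"])  # "big" appears three times across the splits
```

In a real deployment, the storage layer (HDFS) keeps the splits close to the workers, and the framework handles the shuffle over the network.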
I support you in understanding and mastering these technologies and in setting up a personalized Hadoop solution that matches your use case, specifications, and requirements.
- Analyzing structured, non-structured, IoT data …
Both structured and unstructured data offer many possibilities for businesses. But to benefit from them, you must be able to analyze them.
To anticipate business demand, new tools are constantly being developed. The most popular of these is the Hadoop ecosystem, combined (or not) with the Spark ecosystem. This set of technologies makes it possible to analyze data of any format, structured or unstructured, quickly and at a reasonable cost, for a wide range of purposes.
No matter how much data you have, it is wasted until the different business areas across the enterprise can put it to use.
I put in place the tools needed to analyze your data and offer you solutions for storing it. I also specialize in developing NoSQL solutions: solutions that transform your unstructured data into a usable payload and make it available to your users and your business intelligence infrastructure.
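As a small illustration of that kind of transformation, the snippet below flattens semi-structured JSON events into the flat rows a business intelligence tool expects. The event schema is invented for the example; at scale, a NoSQL store would hold the raw documents:

```python
import json

# Hypothetical raw events as they might arrive from an application log.
raw_events = [
    '{"user": {"id": 1, "name": "alice"}, "action": "login"}',
    '{"user": {"id": 2, "name": "bob"}, "action": "purchase", "amount": 30}',
]

def flatten(doc):
    """Turn one nested JSON document into a flat, BI-friendly row."""
    event = json.loads(doc)
    return {
        "user_id": event["user"]["id"],
        "user_name": event["user"]["name"],
        "action": event["action"],
        "amount": event.get("amount", 0),  # default when the field is absent
    }

rows = [flatten(doc) for doc in raw_events]
print(rows[1]["amount"])  # 30
```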
- Proof of Concept
First of all, let us define what a Proof of Concept (PoC) is. A PoC is the realization of a small part of a larger project. It may or may not deliver a working prototype. In addition, a PoC must have a specific end date and a well-defined scope.
It is therefore the ideal setting in which to validate the dependencies, assumptions, and sequencing of a project's stages. It is also a good time to refine the less precise requirements. Because the scope is small, the impact of anything unforeseen or beyond the production team's control is lessened, making it much easier to react and adjust along the way.
A PoC can also be an excellent way to demonstrate the feasibility of a big data project. Being able to present concrete results very quickly gives management better visibility, demonstrates the project's advantages concretely, and builds enthusiasm for it!
Would you like to produce a PoC that demonstrates the feasibility of your project, or one that involves statistical learning such as face detection or sentiment analysis? I am here to guide and support you in setting up such a Proof of Concept.
- Big Data Training
The concept of Big Data was born out of the quantitative explosion of digital data. Training explains the challenges of big data, from capture and storage to analysis, as well as how big data architecture differs from traditional business intelligence.
Whether you are an engineer, a technician, another professional, or simply interested in the phenomenon, training in Big Data is always useful for mastering its concepts, technologies, and architecture.
I can train you to better understand these concepts and master these technologies (non-exhaustive list):
- Data volume
- Data formats
- Data sources
- Hadoop/Spark ecosystems
- Main components of Hadoop
- HDFS, MapReduce, Pig, Hive, Impala, Sqoop, NiFi, Kafka, Solr, HBase, …
- Distributions (Cloudera, EMR, …), other Big Data solutions
- Cloud solutions
- Apache Spark (Spark SQL, MLlib, Streaming, Scala, Python, …)
- Data lake and NoSQL database
- Data ingestion
- Real-time or batch analysis
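To illustrate the last topic on the list: batch analysis computes over a complete dataset once it has landed, while real-time (streaming) analysis updates results as events arrive. A minimal sketch of both modes in plain Python, with invented metric readings (frameworks such as Spark Structured Streaming apply the same idea at scale):

```python
from collections import deque

events = [3, 5, 2, 8, 4]  # hypothetical metric readings

# Batch: one pass over the full dataset after it has all arrived.
batch_total = sum(events)

# Streaming: a running aggregate over a sliding window of the last 3 events,
# updated incrementally as each event arrives.
window = deque(maxlen=3)
window_sums = []
for value in events:
    window.append(value)
    window_sums.append(sum(window))

print(batch_total)      # 22
print(window_sums[-1])  # 2 + 8 + 4 = 14
```

The batch result reflects the whole history, while each windowed sum reflects only the most recent events, which is the trade-off at the heart of choosing between the two modes.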