📊 Big Data

✅ Big Data Prescriptive Analytics Engine using Apache Spark

In this project, I contributed to the development of a Big Data prescriptive analytics engine built on Apache Spark for IRI, a market research firm. The goal was to use large-scale data processing and analytics to produce insights and recommendations that support decision-making.

Key Contributions:

  1. Big Data Processing with Apache Spark: I played a significant role in implementing data processing workflows on Apache Spark, a distributed computing framework. Spark's distributed execution let us handle large volumes of data and run complex computations at scale.

  2. Prescriptive Analytics: The project aimed to go beyond descriptive and predictive analytics into prescriptive analytics. By combining historical data with real-time information, the engine produced actionable recommendations for optimizing business processes and improving outcomes.
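To illustrate what "prescriptive" means in practice, here is a minimal sketch in plain Python. It is not the engine's actual code: the pricing scenario, the constant-elasticity demand model, and every name in it (`Action`, `predicted_units`, `recommend`) are assumptions made up for this example. The idea is only that a prescriptive step scores each candidate action with a predictive model and recommends the one with the best predicted outcome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical candidate decision: sell at a given price."""
    name: str
    price: float


def predicted_units(price, base_units, base_price, elasticity):
    """Toy predictive model (assumed): constant price elasticity of demand."""
    return base_units * (price / base_price) ** elasticity


def recommend(actions, base_units, base_price, elasticity, unit_cost):
    """Prescriptive step: return the action with the highest predicted profit."""
    def predicted_profit(action):
        units = predicted_units(action.price, base_units, base_price, elasticity)
        return (action.price - unit_cost) * units

    return max(actions, key=predicted_profit)


candidates = [
    Action("discount", 8.0),
    Action("hold", 10.0),
    Action("increase", 12.0),
]
best = recommend(candidates, base_units=100, base_price=10.0,
                 elasticity=-2.0, unit_cost=5.0)
print(best.name)
```

In a real system the predictive model would be fitted on historical data and refreshed with incoming data; the selection step stays the same.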

✅ Distributed Systems for Offline Video Processing

In this study, I explored the feasibility of using distributed systems for offline video processing. The objective was to identify the benefits and challenges of distributed architectures for handling video data at scale.

Key Study Points:

  • Scalable Video Processing: I assessed the performance and scalability of distributed systems for processing video data. The aim was to achieve faster, more efficient processing by distributing the computational workload across multiple nodes.
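The workload-distribution idea can be sketched as follows. This is a minimal, single-machine illustration under stated assumptions, not the setup used in the study: a thread pool stands in for cluster nodes, `process_range` is a placeholder that only counts frames, and all function names are hypothetical. It shows the basic pattern of splitting a video's frames into contiguous chunks and processing the chunks in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def split_frames(total_frames, num_workers):
    """Split [0, total_frames) into contiguous, near-equal (start, end) ranges."""
    base, extra = divmod(total_frames, num_workers)
    ranges, start = [], 0
    for i in range(num_workers):
        # The first `extra` workers each take one additional frame.
        size = base + (1 if i < extra else 0)
        if size:
            ranges.append((start, start + size))
        start += size
    return ranges


def process_range(frame_range):
    """Placeholder for per-chunk work (decode, filter, encode); counts frames."""
    start, end = frame_range
    return end - start


def process_video(total_frames, num_workers):
    """Fan the chunks out to workers and combine the per-chunk results."""
    ranges = split_frames(total_frames, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return sum(pool.map(process_range, ranges))


print(split_frames(10, 3))
print(process_video(10, 3))
```

On a real cluster the chunks would be dispatched to separate nodes, and chunk boundaries would usually need to align with keyframes so each chunk can be decoded independently.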

🦾 Technology Stack:

During these projects, I gained hands-on experience with a range of Big Data technologies, including:

  • Apache Spark: Leveraged for distributed data processing and advanced analytics.
  • Apache Kafka: Utilized for real-time data streaming and messaging.
  • Apache Hadoop: Employed for distributed storage and processing of large datasets.
  • Apache Zeppelin: Used for interactive data exploration and visualization.

Working with these tools gave me practical insight into Big Data and distributed computing. The projects strengthened my technical skills and deepened my understanding of how these technologies can be applied to real-world problems.