
Big Data Project

₹5000-5500 INR

Closed
Posted 6 months ago

Paid on delivery
Store Sales Data Analysis: A Data Engineering Capstone Project

Project Overview
The project aims to analyze global sales data to offer actionable insights into regional sales trends, item popularity, and profitability.

Real-World Implications
• Optimizing Inventory: Know what items sell well in which regions.
• Sales Strategy: Develop targeted sales strategies for different markets.

Target Audience
• Sales Managers
• Business Analysts
• Data Scientists

Technologies and Tools
• Data Processing: Pandas, Spark
• Query Language: Hive
• Data Visualization: Matplotlib, Seaborn
• Big Data Technologies: HDFS, YARN

Data Source
The dataset includes:
• Transaction Information: Region, Country, Item Type, Sales Channel, Order Priority, Order Date, Order ID, Ship Date
• Sales Data: Units Sold, Unit Price, Unit Cost, Total Revenue, Total Cost, Total Profit

Problem Statements

Data Preprocessing
1. Null Value Elimination
2. Date Data Cleaning
3. Categorize Items
4. Sales Data Cleanup
5. Data Type Conversion
6. Seasonal Decomposition: Break down sales data into seasonal, trend, and residual components.
7. Feature Engineering: Create new features like Profit Margin, Sales Velocity.

Data Analytics (Big Data Analysis with Visualization)
1. Number of Countries (Using Hive)
2. Units Sold by Region (Using Hive)
3. Most Recent Sales (Using Hive)
4. Products with Specific Letters (Using Spark)
5. Top Selling Countries (Using Spark)
6. Item Costs (Using Spark)
7. Sales Yearwise (Using PySpark)
8. Orders per Item (Using PySpark)
9. Country with Highest Sales (Using PySpark)
10. Customer Segmentation: Use clustering algorithms to identify different customer segments.
11. Time Series Forecasting: Predict future sales using ARIMA or LSTM.
12. Anomaly Detection: Identify any anomalies or outliers that could indicate fraudulent activity.
13. Association Rule Mining: Find associations between different products in the data (Using Spark).
14. Price Elasticity: Understand how the demand for a product changes with a change in its price (Using PySpark).
15. Correlation Between Priority and Profit: Analyze if 'Order Priority' has any correlation with 'Total Profit'.

Data Visualization
1. Regional Sales Distribution
2. Top 10 Items Pie Chart
3. Sales Time Series
4. Profit Distribution
5. Sales by Item
6. Heatmap: Show the correlation between different numerical features like Unit Price, Unit Cost, and Total Profit.
7. Interactive Dashboard: Create an interactive dashboard where users can filter data by year, region, or item.
8. Geographic Heatmap with Time Slider: Show how sales in different regions have evolved over time.
9. Cohort Analysis: Visualize customer retention over time.
10. Bubble Chart: Display Units Sold, Total Revenue, and Total Profit in a three-dimensional bubble chart.

Performance Metrics
1. Spark Job Metrics
2. Query Latency in Hive
3. HDFS Storage Utilization
4. Data Skew Detection
5. Resource Utilization with YARN
6. Task Failure Rates: Monitor and minimize the failure rates of tasks in Spark or Hive jobs.
7. Data Replication Metrics in HDFS: Track and optimize data replication times and success rates.
8. Data Ingestion Latency: Measure the latency of data ingestion from different sources into HDFS.
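To make a few of the Data Preprocessing items concrete, here is a minimal Pandas sketch covering null value elimination, date cleaning, data type conversion, and the Profit Margin feature, followed by a units-sold-by-region total. Column names come from the dataset description above. Everything else is an assumption for illustration: the sample rows are hypothetical, the `%m/%d/%Y` date format is a guess at how Order Date is stored, Profit Margin is taken to mean Total Profit divided by Total Revenue, and `preprocess` is a hypothetical helper name. The brief asks for the regional total in Hive; Pandas is used here only to show the logic on a small frame.

```python
import pandas as pd

# Hypothetical sample rows; column names follow the dataset description.
raw = pd.DataFrame({
    "Region": ["Europe", "Asia", "Europe", None],
    "Item Type": ["Snacks", "Cereal", "Snacks", "Fruits"],
    "Order Date": ["1/15/2017", "2/20/2017", "not a date", "3/1/2017"],
    "Units Sold": ["100", "50", "10", "5"],
    "Total Revenue": ["1000.0", "400.0", "100.0", "50.0"],
    "Total Profit": ["250.0", "100.0", "20.0", "10.0"],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Clean dates and numeric types, drop incomplete rows, add Profit Margin."""
    out = df.copy()
    # Date cleaning: values that do not match the assumed format become NaT.
    out["Order Date"] = pd.to_datetime(
        out["Order Date"], format="%m/%d/%Y", errors="coerce"
    )
    # Data type conversion: numeric columns arrive as strings in this sample.
    for col in ["Units Sold", "Total Revenue", "Total Profit"]:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Null value elimination: drop any row with a missing or unparseable field.
    out = out.dropna().reset_index(drop=True)
    # Feature engineering: assumed definition, profit as a share of revenue.
    out["Profit Margin"] = out["Total Profit"] / out["Total Revenue"]
    return out


clean = preprocess(raw)
# Pandas equivalent of the "Units Sold by Region" question.
units_by_region = clean.groupby("Region")["Units Sold"].sum()
print(clean)
print(units_by_region)
```

Dropping every row with any missing field is the bluntest reading of "Null Value Elimination"; on the real dataset, imputing or dropping only on key columns may be more appropriate.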
Project ID: 37404545

About the project

4 proposals
Remote project
Active 5 months ago

4 freelancers are bidding an average of ₹5,938 INR for this job
I can do the Store Sales Data Analysis. I am a freelancer with 7 years of experience in Python development. I have the following skills in Python: ◈ Object-Oriented Programming (OOP) in Python ◈ R programming ◈ Jupyter Notebook and Google Colab ◈ Data structures and algorithms ◈ Web development with frameworks such as Django, Flask, Streamlit, and others ◈ Machine learning, deep learning, and artificial intelligence ◈ Database integration, including SQL and NoSQL ◈ Data analysis and visualization ◈ Text processing and natural language processing, tokenization ◈ Debugging and troubleshooting ◈ Functional programming. I hope I'm a good candidate for your project. I will deliver your project on time, with quality assurance, at an affordable price. Please message me so that we can discuss your project further.
₹5,250 INR in 2 days
4.8 (3 reviews)
I have extensive experience in MLOps and have worked on various projects and data formats. I consider myself a good candidate for this problem.
₹5,250 INR in 60 days
0.0 (0 reviews)

About the client

Nadia, India
0.0 (0 reviews)
Member since Sep 10, 2023

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)