Data Analytics
Projects
Studywalk assists research scholars in the field of data analytics. It also provides business solutions across industries by analyzing their data, and helps upcoming startups estimate their sales and customers for future years. We work at a 95% confidence level. The software used for data analysis is R, SAS, SPSS, and Python. A glimpse of our projects is given below.
Company Treating Depressed Patients
A company, “Colours”, treats patients suffering from depression. Colours has a team of psychiatrists who counsel individuals about the ups and downs of life and the advantages of patience and hard work. These psychiatrists help individuals with depression to refresh and restart their lives with the aim of finding happiness. This report uses techniques of business analytics to guide Colours toward the right direction of treatment for patients suffering from depression. The dataset “masters”, obtained from the secondary source kaggle.com, is used. The purpose of the report is to give recommendations to Colours on the basis of this suicide dataset.
It is recommended that Colours open branches in the United States and Japan, with more clinics in Japan than in the United States: 55% of the total clinics in Japan and 45% in the United States.
It is recommended to increase treatment fees in the United States every 6 months. In Japan, however, discount schemes based on the number of treatment sessions should be implemented.
Colours is recommended to focus more on male clients in both Japan and the United States. It should focus on clients aged 35 to 54 years in the United States and 35 to 74 years in Japan. Providing a free counseling session on the identified reason for depression in these age groups can be beneficial for the company.
Car Repair Shop in Barcelona
The data set of accidents in Barcelona for the year 2017 was obtained from kaggle.com. The analysis is useful for a company that repairs damaged cars. It identifies the day, month, part of the day, and district in which the average number of victims and the total number of vehicles involved are highest.
It is suggested that the company emphasize promotions on weekdays in October, as they have the highest average number of victims. The company should increase the number of employees on duty on Tuesday and Friday, as more than 27000 vehicles met with an accident on these days. Fewer than 10000 vehicles met with an accident in August, so the company should carry out its hiring in August. The company should create two shifts: Morning-Afternoon and Afternoon-Night. It is further recommended to place emphasis on the employees working the morning-afternoon shift, since the total number of vehicles that met with an accident is more than 80000 in the afternoon and more than 64000 in the morning. As a future recommendation, the company should open a service center in the district of Eixample, as it has the highest average number of victims and the highest total number of vehicles that met with an accident. The results of cross-tabulation suggest that the number of employees with a weekly off on Friday should be minimized for both shifts. Employees in the second shift (afternoon-night) should be given their weekly off on Monday and Tuesday, and employees in the first shift (morning-afternoon) on Saturday and Sunday.
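The cross-tabulation mentioned above can be illustrated with a small Python sketch. The records below are hypothetical and are not taken from the 2017 Barcelona data set; the function names are also only illustrative. The idea is simply to count accidents for every combination of day and part of the day.

```python
from collections import Counter

# Hypothetical accident records as (day of week, part of day) pairs.
# These rows are illustrative only, not the Kaggle data.
records = [
    ("Friday", "Afternoon"),
    ("Friday", "Morning"),
    ("Tuesday", "Afternoon"),
    ("Tuesday", "Afternoon"),
    ("Sunday", "Night"),
    ("Friday", "Afternoon"),
]


def cross_tabulate(rows):
    """Count how many records fall in each (day, part of day) cell."""
    return Counter(rows)


def print_table(table):
    """Print the counts as a day-by-part-of-day grid."""
    days = sorted({day for day, _ in table})
    parts = sorted({part for _, part in table})
    print("Day".ljust(10) + "".join(p.ljust(12) for p in parts))
    for day in days:
        # A Counter returns 0 for cells that never occurred.
        cells = "".join(str(table[(day, p)]).ljust(12) for p in parts)
        print(day.ljust(10) + cells)


table = cross_tabulate(records)
print_table(table)
```

Cells with high counts point to the day and shift combinations where a weekly off would be least convenient.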
Music Company
From the past records of a music company, an analysis was performed of Adverts, Sales, Airplay, Attract, Price, Gender, Streaming services, Piracy, Top 20, Age, etc.
Increased Usage of Cell Phones by Teenagers
The objectives of this analysis were:
1. To analyze the trend in the usage of cell phones by teenagers with the help of tables and graphs.
2. To detect and remove any outliers present in the data with the help of z-scores.
3. To determine the relationship between year and total time spent on cell phones (hours/day).
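The z-score screening in objective 2 could be sketched in Python as follows. This is a minimal illustration, not the procedure actually used in the project: the cutoff value, the function name, and the sample hours are all assumptions.

```python
from statistics import mean, stdev


def remove_outliers(values, threshold=3.0):
    """Keep only values whose absolute z-score is within the threshold.

    The threshold is an assumed cutoff; 3.0 is a common convention.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs((v - mu) / sigma) <= threshold]


# Hypothetical daily usage in hours; 20.0 is an obvious data-entry outlier.
hours = [2.0, 2.5, 3.0, 2.8, 3.2, 2.6, 2.9, 3.1, 2.7, 20.0]

# With only 10 observations no z-score can exceed about 2.85,
# so a lower cutoff of 2.5 is assumed for this small sample.
cleaned = remove_outliers(hours, threshold=2.5)
print(cleaned)
```

In a very small sample the usual cutoff of 3 can never be reached, which is why the example passes a lower threshold explicitly.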
The survey provided sufficient evidence of an increasing trend in money spent on internet packs per month, GB consumption per month, talking time, text time, social media time, and total time spent on cell phones by teenagers from 2009 until now. Currently, almost all teenagers prefer to carry their cell phones while on holiday, and they recharge their cell phones once a month. With each additional year, the time teenagers spend on cell phones increases by 48 minutes per day.
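A figure such as "48 minutes per day per year" is the slope of a simple linear regression of daily usage on year. The sketch below shows how such a slope is obtained by least squares. The yearly values are hypothetical, chosen only to make the arithmetic easy to follow; they are not the survey data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical mean usage (hours/day) per survey year, for illustration only.
years = [2009, 2010, 2011, 2012, 2013]
hours_per_day = [2.0, 2.8, 3.6, 4.4, 5.2]

slope, intercept = fit_line(years, hours_per_day)
minutes_per_year = slope * 60  # convert hours/day per year to minutes/day per year
print(f"Increase per year: {minutes_per_year:.1f} minutes/day")
```

The slope is in hours per day per year, so multiplying by 60 expresses it in the minutes-per-day form quoted in the summary.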
Real Estate
The objective was to estimate the sale price of properties for future years. The preliminary data screening process included identification, sorting, and removal of duplicate values and outliers in the data. The dependent variable “sales price” was tested for trend with the help of the Kruskal-Wallis test. The trend was removed after applying various transformations. Correlation analysis was conducted, and all highly correlated independent variables relating to the area were combined into one variable by applying appropriate linear transformations, using cluster sampling. I ran a stepwise regression analysis and selected the best model using adjusted R^2 and R^2. The significance of the model and of each coefficient was tested. The regression assumptions of homogeneity of variance and normal distribution of residuals were checked.
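Adjusted R^2 matters in model selection because plain R^2 never decreases when a predictor is added. A minimal sketch of the standard adjustment follows; the two candidate models and their R^2 values are hypothetical, not results from this project.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical candidates fitted on the same 50 observations.
small_model = adjusted_r2(r2=0.800, n_obs=50, n_predictors=3)
large_model = adjusted_r2(r2=0.805, n_obs=50, n_predictors=8)

print(f"3 predictors: adjusted R^2 = {small_model:.4f}")
print(f"8 predictors: adjusted R^2 = {large_model:.4f}")
```

Here the larger model has the higher raw R^2 but the lower adjusted R^2, so the smaller model would be preferred on this criterion.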
Chips Industry
The objective was to understand the buying pattern of **** chips in the organized snack food industry, as observed across various socio-economic segments of society, in terms of frequency, channel, and other behavioral aspects that affect buying in this category. I wanted to test whether Sales of Chips depends on Brand, Availability, Price, Quantity, Spending on Advertising (Ad), Occupation, and Flavor, and how these factors affect the buying pattern. The research helps to study consumer behavior towards the product (***) in the presence of other players who provide a similar product at the same price point.
Sleeping Times of Animals
The objective was to analyze the sleeping time of various species of animals. For each species, Body Weight, Brain Weight, Non-Dreaming time, Dreaming time, Total Sleep time, Life Span, Gestation, Predation, Exposure, and Danger were recorded. Descriptive statistics were used to get an overview of these variables for all animals in general. The correlation between total sleep and life span was examined. Regression analysis was applied to determine the effect of body weight and brain weight on total sleep. Hypothesis testing and confidence intervals were used to establish, with 95% confidence, that the mean sleeping time of animals is more than that of humans (8 hours on average).
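A test of this kind is a one-sided, one-sample t-test of the mean against 8 hours. The sketch below assumes a hypothetical sample of ten species and takes the critical value from a standard t table (1.833 for 9 degrees of freedom at the one-sided 5% level); it is not the project's actual data or code.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and standard error for H0: mean == mu0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)  # standard error of the mean
    return (mean(sample) - mu0) / se, se


# Hypothetical total sleep times (hours/day) for 10 species.
sleep_hours = [10.5, 12.0, 8.3, 9.8, 14.4, 11.0, 7.4, 13.2, 10.1, 9.3]
HUMAN_MEAN = 8.0
T_CRITICAL = 1.833  # tabulated t value: one-sided, alpha = 0.05, df = 9

t_stat, se = one_sample_t(sleep_hours, HUMAN_MEAN)
lower_bound = mean(sleep_hours) - T_CRITICAL * se  # one-sided 95% lower bound

print(f"t = {t_stat:.3f}, 95% lower bound = {lower_bound:.2f} hours")
if t_stat > T_CRITICAL:
    print("Reject H0: mean sleeping time exceeds 8 hours.")
else:
    print("Fail to reject H0.")
```

The hypothesis test and the confidence bound lead to the same decision: the null hypothesis is rejected exactly when the lower bound lies above 8 hours.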
Electronic Commerce
To set up an e-commerce business, a comparison of the e-commerce sales of various business types in the United States was carried out. The objective was to determine the most favorable field to invest in. Statistical techniques were used to determine whether e-commerce is spread across all types of business or whether the electronics segment is the most rapid in e-commerce sales. The analysis asked whether there is a significant difference in the e-commerce value of sales (in million dollars) among “Clothing and clothing accessories stores”, “Electronics and appliance stores”, and “Food and beverage stores”, and which type of business would contribute most to total e-commerce sales in the future. One-way ANOVA was applied to test whether there is any significant difference in the mean e-commerce sales of the three businesses, namely the clothing, electronics, and food industries. To test which pairs differ significantly, a post hoc test, Tukey-Kramer, was used. Regression analysis was then applied to predict the change in total e-commerce sales in the future for each type of business.
The objective of the current study is to examine the trend or pattern with which e-commerce companies have been approached in recent years in **** and ****. A further objective is to understand the trust of buyers in e-commerce and to analyze the proportion of customers who are willing to make an online payment.
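The one-way ANOVA step can be sketched as below. The sales figures are hypothetical and the critical value is the tabulated F for 2 and 9 degrees of freedom at the 5% level (about 4.26); neither comes from the project itself.

```python
def one_way_anova(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical e-commerce sales per period, illustrative values only.
clothing = [10, 12, 11, 13]
electronics = [20, 22, 21, 23]
food = [5, 6, 7, 6]

F_CRITICAL = 4.26  # tabulated F(0.05; 2, 9)
f_stat, df_b, df_w = one_way_anova([clothing, electronics, food])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
print("Means differ significantly" if f_stat > F_CRITICAL else "No significant difference")
```

A significant F only says that at least one mean differs; a post hoc comparison is still needed to identify which pairs differ.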
Clothing Industry
**** is a newly launched (fictitious) high-end fashion boutique operating an e-commerce-only business model. It specializes in formal clothing, shoes, and accessories from international designers and is seeking to expand its customer base through social media. The main task is to analyze how consumers interacted with its Instagram page in the first month after launch. Primary descriptive and causal research was conducted by collecting quantitative data from the Instagram page analytics.
Economics
The aim of my project is to analyze and understand the factors responsible for the slow economic growth in ****. The dependent variable is the growth rate of ***, measured by the Gross Domestic Product (GDP) rate and per capita Gross Domestic Product (PCGDP). The independent variables are various economic factors, including gross capital formation, labor force, foreign direct investment, and human capital formation. A model will be developed to predict the growth rate on the basis of these factors. In ***, the world price of oil and the exchange rate are the most important factors. Oil and natural gas are the main source of income for ***. The exchange rate is an important independent variable because the country depends on other nations for most items. Hence, for the regression model, I consider the two main economic factors, world price of oil and exchange rate, as the independent variables.
The research examines whether unemployment, total (percentage of the total labor force), depends on inflation based on consumer prices (annual %), Internet users (per 100 people), adult literacy rate (percentage of the population aged 15+), and total fertility rate of women (percentage).
Medical
There were 8 levels of treatment, namely DW (control), TWEEN, ARTH, WS 500, WS 1000, 6 OMEGA, 7WS +O, and 8 INDO. Ten factors, namely hemoglobin, paw volume, paw thickness, ankle diameter, arthritic index, body weight, ESR, RBC, TLC, and TNF Alpha, were measured in rats at an interval of 10 days. These factors were measured on day 0, day 10, and day 30. The objective was to determine the percentage increase in mean hemoglobin, paw volume, paw thickness, ankle diameter, arthritic index, body weight, ESR, RBC, TLC, and TNF Alpha from day 0 to day 30 for all significant groups. ANOVA, Dunnett's test, the Tukey-Kramer test, and growth rate calculations were used.
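The percentage increase in a group mean from day 0 to day 30 is a simple relative change. A small sketch follows; the group names are taken from the list above, but the mean values are hypothetical and the function name is illustrative.

```python
def percent_increase(day0_mean, day30_mean):
    """Percentage change from the day 0 mean to the day 30 mean."""
    if day0_mean == 0:
        raise ValueError("day 0 mean must be non-zero")
    return (day30_mean - day0_mean) / day0_mean * 100


# Hypothetical mean paw volume per treatment group: (day 0, day 30).
paw_volume = {
    "DW (control)": (1.20, 1.22),
    "ARTH": (1.20, 1.50),
}

for group, (day0, day30) in paw_volume.items():
    print(f"{group}: {percent_increase(day0, day30):.1f}% increase")
```

A negative result indicates a decrease over the period.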
Airlines Industry
The objective was to determine whether carbon dioxide (CO2) emissions (million tons) depend on Revenue by Passenger ($ billion), Freight tons (millions), World economic growth (%), and Flights (million).