Q1. Carefully read the “COMM Bank Retail Business Insights Report FY18” provided as an attachment and answer the questions below.
i. Comment on the overall features of the insights report, including the
quality of the visualisations, its presentability, and the information provided.
ii. List the key information you derive from this insights report and explain how
it will be useful in decision making.
iii. Write an abstract (one paragraph) summarising the insights report.
iv. Suggest improvements to this insights report.
Q2. Regression analysis is a commonly used technique for finding relationships among
variables. Answer the questions below based on regression analysis.
i. Provide an example where regression analysis can be effectively used.
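In finance, regression analysis is used to estimate the beta of a stock: the stock's returns are regressed on the market's returns, and the slope of the fitted line is the beta.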
ii. Collect height and weight data from 10 friends/relatives of yours and complete
the below table. Every student in class should have a unique set of values.
No.  Name      Height (cm)  Weight (kg)
1.   Ceci      170          51
2.   Krystle   165          48
3.   AL        173          56
4.   Nathan    190          71.5
5.   Mandy     160          60
6.   Chris     185          68
7.   Betty     155          47
8.   Sue       155          55
9.   Jackson   178          65
10.  Tina      158          55
iii. Draw a scatterplot based on the above data. Based on your plot, comment on the relationship between height and weight.
iv. Compute the equation of the regression line.
Mean of X (MX) = 168.9
Mean of Y (MY) = 57.65
Sum of squares: SSX = Σ(X - MX)² = 1404.9
Sum of products: SP = Σ(X - MX)(Y - MY) = 742.15
b = SP/SSX = 742.15/1404.9 = 0.52826
a = MY - b × MX = 57.65 - (0.52826 × 168.9) = -31.573
Regression line: ŷ = bX + a = 0.5283X - 31.573
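The hand calculation above can be checked with a short script. This is a minimal sketch in plain Python that uses the ten height and weight pairs from the table; the variable names are illustrative only.

```python
# Height (cm) and weight (kg) for the ten people in the table above.
heights = [170, 165, 173, 190, 160, 185, 155, 155, 178, 158]
weights = [51, 48, 56, 71.5, 60, 68, 47, 55, 65, 55]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# SSX: sum of squared deviations of X from its mean.
ssx = sum((x - mean_x) ** 2 for x in heights)
# SP: sum of the products of the X and Y deviations.
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))

b = sp / ssx             # slope of the regression line
a = mean_y - b * mean_x  # intercept of the regression line

print(f"Mean X = {mean_x:.2f}, Mean Y = {mean_y:.2f}")
print(f"SSX = {ssx:.2f}, SP = {sp:.2f}")
print(f"y-hat = {b:.4f} * X + ({a:.3f})")
```

Run as written, this gives a slope of about 0.5283 and an intercept of about -31.573, matching the values worked out by hand.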
v. Calculate the R² value and comment on the goodness of the fit.
R² = 0.6262. The correlation is positive: weight tends to increase as height increases. The R² value shows that about 63% of the variation in weight is explained by height, while the remaining 37% is not explained by the model, so the line is a moderately good fit.
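The R² value can be reproduced from the same data. The sketch below, in plain Python with illustrative variable names, computes it two ways: as SP² / (SSX × SSY), and as 1 minus the ratio of the residual sum of squares to the total sum of squares. For a simple linear regression the two should agree.

```python
# Height (cm) and weight (kg) for the ten people in the table above.
heights = [170, 165, 173, 190, 160, 185, 155, 155, 178, 158]
weights = [51, 48, 56, 71.5, 60, 68, 47, 55, 65, 55]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

ssx = sum((x - mean_x) ** 2 for x in heights)
ssy = sum((y - mean_y) ** 2 for y in weights)  # total variation in weight
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))

# R-squared as the squared correlation coefficient.
r_squared = sp ** 2 / (ssx * ssy)

# R-squared as the share of variation in weight explained by the fitted line.
b = sp / ssx
a = mean_y - b * mean_x
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(heights, weights))
r_squared_alt = 1 - sse / ssy

print(f"R^2 = {r_squared:.4f} (check: {r_squared_alt:.4f})")
```

Both routes give roughly 0.626, which is the share of the variation in weight that the fitted line accounts for.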
vi. Use an analytics tool of your choice to calculate the values for iv and v. Compare them with your answers.
Q3. Classification and regression are commonly used processes in business analytics.
i. Briefly explain the difference between classification and prediction.
Classification is the process of identifying the category (class label) to which a newly observed object belongs. Prediction is the process of estimating a missing or unavailable numerical value for a new observation.
ii. Give examples of classification methods you know.
Credit card companies typically receive thousands of applications for new cards. Each application contains information on several attributes, such as annual salary, age, and nature of employment. The task is to classify applicants as good credit, bad credit, or a grey area. Methods that can be used for this include decision trees, logistic regression, k-nearest neighbours, and neural networks.
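As an illustration of the credit card case, the sketch below uses k-nearest neighbours, one common classification method. The applicant records are invented for illustration, and the features are left unscaled for brevity; a real model would use many more records and would normally scale the attributes first.

```python
import math
from collections import Counter

# Hypothetical training data, invented for illustration:
# (annual salary in $1000s, age) -> credit class.
training = [
    ((95, 45), "good"), ((88, 39), "good"), ((100, 50), "good"),
    ((30, 22), "bad"), ((25, 27), "bad"), ((35, 24), "bad"),
    ((55, 31), "grey"), ((60, 35), "grey"), ((50, 29), "grey"),
]

def classify(applicant, k=3):
    """Return the majority class among the k nearest training records."""
    by_distance = sorted(training, key=lambda rec: math.dist(rec[0], applicant))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

for applicant in [(90, 42), (28, 25), (56, 32)]:
    print(applicant, "->", classify(applicant))
```

Each new applicant is assigned the class that most of its three closest known applicants belong to.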
iii. The following diagram shows a neural network with one hidden layer.
Write down the algebraic equation for y1 in terms of the input values i1, i2 and the weights w. Briefly explain how neural networks are used for classification.
iv. Give at least three examples of how clustering can be used in business analytics. In your answer, explain how each business case could be addressed using clustering.
Example: The management team of a large shopping centre wants to know what types of people visit, or may visit, the centre so that it can better design and target its services (such as event invitations and discounts). To make these decisions, the management team conducted market research on a number of potential customers.
Explain: Clustering can be applied to the survey data in eight steps: 1. confirm that the data are metric, 2. put the variables on a comparable range (scale the data), 3. select the segmentation variables, 4. define the similarity measure, 5. examine the pairwise distances, 6. choose the clustering method and the number of segments, 7. analyse and interpret the segments, 8. check the robustness of the result. The resulting segments show which types of visitors exist, so services can be tailored to each type.
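The scaling, similarity, and segmentation steps can be sketched with k-means, which is one possible clustering method and not necessarily the one intended here. The survey responses, the choice of three segments, and the starting centroids are all assumptions made for illustration.

```python
import math

# Hypothetical survey responses, invented for illustration:
# (visits per month, average spend per visit in $).
shoppers = [(2, 150), (1, 180), (3, 160),
            (12, 20), (10, 25), (11, 30),
            (5, 70), (6, 80), (4, 75)]

def scale(points):
    """Min-max scale every variable to the range 0 to 1 (step 2)."""
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [tuple((v - lo) / (hi - lo) for v, lo, hi in zip(p, lows, highs))
            for p in points]

def k_means(points, centroids, iterations=20):
    """Plain k-means using Euclidean distance (steps 4 and 6)."""
    labels = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Move each centroid to the mean of the points assigned to it.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:
            break  # assignments have stopped changing
        centroids = updated
    return labels, centroids

scaled = scale(shoppers)
# Start from three of the data points (an arbitrary initialisation).
labels, centroids = k_means(scaled, [scaled[0], scaled[3], scaled[6]])
for shopper, label in zip(shoppers, labels):
    print(shopper, "-> segment", label)
```

With this invented data the three segments correspond to infrequent high spenders, frequent low spenders, and a middle group, which is the kind of profile step 7 would then interpret.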
Example: Now that anyone can browse, write, edit, and upload content on the Internet, false information is easy to publish. Even news can be fake, especially around major events, and fake news is not easy to identify.
Explain: The algorithm takes in the content of a corpus of fake news articles, examines the words used, and clusters the articles. These clusters help the algorithm determine which articles are real and which are fake. Some words are more common in attention-grabbing clickbait articles, so when such a term makes up a high percentage of an article, the material is more likely to be fake news.
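The "high percentage of a particular term" idea can be sketched as a simple feature that a clustering algorithm could then group articles on. The term list and the sample headline below are invented for illustration; this shows only the feature calculation, not a full fake news detector.

```python
from collections import Counter

# Hypothetical clickbait terms, invented for illustration only.
CLICKBAIT_TERMS = {"shocking", "unbelievable", "secret", "miracle"}

def clickbait_share(text):
    """Return the fraction of words in the text that are clickbait terms."""
    words = [w.strip(".,!?:;\"'").lower() for w in text.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    if not words:
        return 0.0
    counts = Counter(words)
    return sum(counts[term] for term in CLICKBAIT_TERMS) / len(words)

headline = "Shocking secret: doctors hate this miracle cure!"
print(f"{clickbait_share(headline):.2f}")
```

Articles with similar term shares would end up in the same cluster, and clusters dominated by such terms would be flagged for closer inspection.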
Example: You have limited time and need to quickly organise the information stored in a set of documents. To do this, you need to understand the subject of each text, compare it with the other documents, and classify it.
Explain: Hierarchical clustering has been used to solve this problem. The algorithm reads the text and groups it into different topics, so similar documents can be quickly clustered and organised using features extracted from their content.
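A minimal sketch of hierarchical (agglomerative) clustering on documents follows. It assumes single linkage and Jaccard distance on word sets, which are illustrative choices, and the four short documents are invented for the example.

```python
# Hypothetical short documents, invented for illustration.
documents = {
    "doc_a": "bank loan interest rate mortgage",
    "doc_b": "mortgage interest rate bank deposit",
    "doc_c": "football match goal league",
    "doc_d": "league football goal striker",
}

def jaccard_distance(words_a, words_b):
    """Return 1 minus the share of words the two sets have in common."""
    return 1 - len(words_a & words_b) / len(words_a | words_b)

def agglomerate(docs, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    words = {name: set(text.split()) for name, text in docs.items()}
    clusters = [[name] for name in docs]  # start with one cluster per document
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(jaccard_distance(words[a], words[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

print(agglomerate(documents, 2))
```

Here the two finance documents merge first, then the two football documents, leaving one cluster per topic.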