1. Discover Data Analysis

Lesson 1: Get started with Microsoft data analytics

Introduction

  • Data analysis is about storytelling, and the data must be clean for the story to be right. The foundation of data analytics is that a business needs to trust its data. It is equally important that the data is captured from trusted sources and shaped (wrangled) into something that is readily consumable, meaningful, and easily understood, so that it helps the decision-making process. Data analysis gives business owners confidence that they are making the right decisions.
  • As the amount of data grows, so does the need for data analysts. A data analyst knows how to organize information and distill it into something relevant and comprehensible. A data analyst knows how to gather the right data and what to do with it; in other words, how to make sense of the data in your data overload.

Overview Of Data Analysis

  • Data analysis is the process of identifying, cleaning, transforming, and modelling data to discover meaningful and useful information. The data is then crafted into a story through reports for analysis to support the critical decision-making process.
  • Because so much data is now available, gaining insight from it is one of the top priorities for organisations. Storytelling from data is therefore a very important skill for data scientists.
  • With the right interpretation of data, hidden facts about consumers can be uncovered, giving the business the ability to make the right decisions and commit the right resources to its long-term goals.
  • When we analyse data, we are performing one of the following types of analytics:
    1. Descriptive
    2. Diagnostic
    3. Predictive
    4. Prescriptive
    5. Cognitive
  • Descriptive Analytics: This helps answer questions about what has happened, based on historical data. Descriptive analytics techniques summarize large semantic models to describe outcomes to stakeholders. One example is developing key performance indicators (KPIs) from a report; the insights from the KPIs can be used to track the success or failure of key project objectives. Metrics such as return on investment (ROI) are used in many industries, and specialized metrics are developed to track performance in specific industries. Another example of descriptive analytics is generating reports that provide a view of an organisation's sales and financial data.
  • Diagnostic Analytics: This is the stage at which the cause of an event is determined. The purpose is to answer questions about why events happened. Diagnostic analytics supplements descriptive analytics, because it usually uses the findings from descriptive analytics to reach a conclusion about the cause. The usual approach is to study the data for anomalies with the help of statistical techniques. Combined with the descriptive analysis, this reveals possible trends and their underlying explanations.
  • Predictive Analytics: With the aid of machine learning, predictive analytics predicts possible outcomes from the available data. It relies mostly on historical data, which helps provide insight into what might happen in the future.
  • Prescriptive Analytics: This focuses on which specific actions should be taken, based on the available information, to achieve a set target. Insight from the data informs the decision, because without the right information and knowledge from the data, the final decision cannot be considered objective. Note that machine learning is one of the driving forces behind prescriptive analytics.
  • Cognitive Analytics: Cognitive analytics draws on existing data and patterns to build a knowledge base. It derives conclusions from these patterns, which then serve as learning for the system. Simply put, it learns from past data and uses that knowledge to predict the likelihood of future events. Effective cognitive analytics depends on machine learning algorithms and uses several natural language processing concepts to make sense of previously untapped data sources, such as call center conversation logs and product reviews.
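
The first three analytics types above can be illustrated with a small sketch. This is only an illustration in plain Python, not how Power BI implements them: the sales figures, the function names, and the threshold of two standard deviations are all assumptions made for the example, and a straight-line trend stands in for a real machine learning model.

```python
from statistics import mean, stdev

# Hypothetical monthly sales figures (illustrative values only).
monthly_sales = [100, 104, 98, 102, 160, 101, 99, 103]


def roi(gain, cost):
    """Descriptive: return on investment, expressed as a fraction of cost."""
    return (gain - cost) / cost


def find_anomalies(values, threshold=2.0):
    """Diagnostic: positions of values lying more than `threshold` sample
    standard deviations from the mean (a simple z-score style check)."""
    mu, sigma = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]


def forecast_next(values):
    """Predictive: fit a least-squares line to the history and extrapolate
    one step ahead. A stand-in for a trained model, not a replacement."""
    n = len(values)
    xs = range(n)
    x_mean, y_mean = mean(xs), mean(values)
    slope = (
        sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
        / sum((x - x_mean) ** 2 for x in xs)
    )
    intercept = y_mean - slope * x_mean
    return slope * n + intercept


print(roi(1500, 1000))                # what happened: ROI of a project
print(find_anomalies(monthly_sales))  # where to look when asking "why?"
print(forecast_next(monthly_sales))   # what might happen next month
```

The anomaly check only points at where to look; explaining why the unusual month occurred still needs the descriptive context around it.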

Roles in data

You will often find the following roles in a company:

  • Business analyst
  • Data Analyst
  • Data Engineer
  • Data Scientist
  • Database administrator
  • Business analyst: While the roles of business analyst and data analyst overlap, what each does with the collected data is what sets them apart. A business analyst is closer to the business and analyses the data that comes from visualisations. Note that the two roles are often combined into a single responsibility held by one person, the data analyst.
  • Data Analyst: This role enables a company to gain deeper insight from its data through tools like Microsoft Power BI. Data analysts are responsible for profiling, cleaning, and transforming data. Their responsibilities also include designing and building scalable and effective semantic models, and enabling and implementing advanced analytics capabilities in reports for analysis. They work with the relevant stakeholders to identify the necessary data and requirements and turn them into relevant and meaningful insights. They work mostly with data engineers to identify and locate appropriate data sources that meet the stakeholders' requirements. Data analysts often advise data engineers on how best to improve data gathering at the source.
  • Data engineer: Data engineers secure the flow of structured and unstructured data from multiple sources. The data platforms that they use can include relational databases, nonrelational databases, data streams, and file stores. Data engineers also ensure that data services securely and seamlessly integrate across data platforms. Primary responsibilities of data engineers include the use of on-premises and cloud data services and tools to ingest, egress, and transform data from multiple sources. Data engineers collaborate with business stakeholders to identify and meet data requirements. They design and implement solutions. The duties of a data engineer and a database administrator overlap; however, a data engineer's scope of work goes well beyond looking after a database and the server where it's hosted, and likely doesn't include overall operational data management.
  • Data Scientist: Data scientists perform advanced analytics to extract value from data. Their work can vary from descriptive analytics to predictive analytics. Descriptive analytics evaluate data through a process known as exploratory data analysis (EDA). Predictive analytics are used in machine learning to apply modeling techniques that can detect anomalies or patterns. These analytics are important parts of forecast models. Descriptive and predictive analytics are only partial aspects of data scientists' work. Some data scientists might work in the realm of deep learning, performing iterative experiments to solve a complex data problem by using customized algorithms. On the surface, it might seem that a data scientist and data analyst are far apart in the work that they do, but this conjecture is untrue. A data scientist looks at data to determine the questions that need answers and will often devise a hypothesis or an experiment and then turn to the data analyst to assist with the data visualization and reporting.
  • Database administrator: A database administrator implements and manages the operational aspects of cloud-native and hybrid data platform solutions that are built on Microsoft Azure data services and Microsoft SQL Server. A database administrator is responsible for the overall availability and consistent performance and optimizations of the database solutions. They work with stakeholders to identify and implement the policies, tools, and processes for data backup and recovery plans. The role of a database administrator is different from the role of a data engineer. A database administrator monitors and manages the overall health of a database and the hardware that it resides on, whereas a data engineer is involved in the process of data wrangling, in other words, ingesting, transforming, validating, and cleaning data to meet business needs and requirements. The database administrator is also responsible for managing the overall security of the data, granting and restricting user access and privileges to the data as determined by business needs and requirements.

Tasks of a data analyst

The following five steps detail the data analyst's key functional areas (PMVAM):

  1. Prepare
  2. Model
  3. Visualise
  4. Analyse
  5. Manage

Prepare:

  • Data preparation is the process of taking raw data and turning it into information that is trusted and understandable. It involves, among other things, ensuring the integrity of the data, correcting wrong or inaccurate data, identifying missing data, converting data from one structure to another or from one type to another, or even a task as simple as making data more readable.
  • Data preparation also involves understanding how you're going to get and connect to the data and the performance implications of the decisions. When connecting to data, you need to make decisions to ensure that models and reports meet, and perform to, acknowledged requirements and expectations.
  • Privacy and security assurances are also important. These assurances can include anonymizing data to avoid oversharing or preventing people from seeing personally identifiable information when it isn't needed. Alternatively, helping to ensure privacy and security can involve removing that data completely if it doesn't fit in with the story that you're trying to shape.
  • Data preparation can often be a lengthy process. Data analysts follow a series of steps and methods to prepare data for placement into a proper context and state that eliminates poor data quality and allows it to be turned into valuable insights.
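
The preparation tasks above (making data readable, converting types, identifying missing data, and limiting exposure of personally identifiable information) can be sketched in a few lines of plain Python. The rows, field names, and helper functions below are hypothetical and exist only for illustration; in Power BI this work would normally be done in Power Query rather than in code like this.

```python
import hashlib

# Hypothetical raw rows, as they might arrive from a CSV export.
raw_rows = [
    {"customer": " alice ", "email": "Alice@Example.com", "amount": "120.50"},
    {"customer": "Bob", "email": "bob@example.com", "amount": ""},
    {"customer": "carol", "email": "carol@example.com", "amount": "n/a"},
]


def parse_amount(text):
    """Convert text to a float; return None when the value is missing or invalid."""
    try:
        return float(text)
    except ValueError:
        return None


def pseudonymize(value):
    """Replace an identifier with a short, stable SHA-256 digest.
    Note: unsalted hashing is pseudonymization, not full anonymization."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def prepare(rows):
    """Return cleaned rows: tidy names, typed amounts, and no raw email."""
    cleaned = []
    for row in rows:
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "customer_key": pseudonymize(row["email"].strip().lower()),
            "amount": parse_amount(row["amount"].strip()),
        })
    return cleaned


prepared = prepare(raw_rows)

# Identify missing data so it can be corrected or excluded deliberately.
missing = [row["customer"] for row in prepared if row["amount"] is None]
print(missing)
```

Flagging missing values as None, rather than silently replacing them with zero, keeps the decision about how to handle them visible to the analyst.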

Model:

  • The most critical step after data preparation is modelling.
  • Data modelling is an iterative process, and it is crucial that we understand our data through the preparation stage before we model.
  • Simply put, data modelling is the process of determining how the tables are related to each other. This process is done by defining and creating relationships between the tables. From that point, you can enhance the model by defining metrics and adding custom calculations to enrich your data.
  • Creating an effective and proper semantic model is a critical step in helping organizations understand and gain valuable insights into the data. An effective semantic model makes reports more accurate, allows the data to be explored faster and efficiently, decreases time for the report writing process, and simplifies future report maintenance.
  • The model is another critical component that has a direct effect on the performance of your report and overall data analysis. A poorly designed model can have a drastically negative impact on the general accuracy and performance of your report. Conversely, a well-designed model with well-prepared data will ensure a properly efficient and trusted report. This notion is more prevalent when you are working with data at scale.
  • From a Power BI perspective, if your report is performing slowly, or your refreshes are taking a long time, you will likely need to revisit the data preparation and modeling tasks to optimize your report.
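
The idea of relating tables and then defining a metric on top of the relationship can be sketched as follows. This is a plain Python illustration under assumed data: the two tables, their column names, and the revenue_by_category function are hypothetical. In Power BI the relationship would be defined in the model and the metric written as a DAX measure.

```python
from collections import defaultdict

# Hypothetical dimension table: one row per product, keyed by product_id.
products = {
    1: {"name": "Keyboard", "category": "Accessories"},
    2: {"name": "Mouse", "category": "Accessories"},
    3: {"name": "Laptop", "category": "Computers"},
}

# Hypothetical fact table: many sales rows can point at one product
# (a one-to-many relationship through product_id).
sales = [
    {"product_id": 1, "quantity": 2, "unit_price": 50.0},
    {"product_id": 2, "quantity": 5, "unit_price": 20.0},
    {"product_id": 3, "quantity": 1, "unit_price": 900.0},
    {"product_id": 1, "quantity": 1, "unit_price": 50.0},
]


def revenue_by_category(sales_rows, product_table):
    """A custom metric: follow the relationship from each sale to its
    product, then total quantity * unit_price per product category."""
    totals = defaultdict(float)
    for row in sales_rows:
        category = product_table[row["product_id"]]["category"]
        totals[category] += row["quantity"] * row["unit_price"]
    return dict(totals)


print(revenue_by_category(sales, products))
```

Keeping product attributes in their own table means a category is stored once per product instead of once per sale, which is part of why a well-designed model tends to be both more accurate and faster to report on.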

Visualise

  • The visualization task is where you get to bring your data to life. The ultimate goal of the visualize task is to solve business problems. A well-designed report should tell a compelling story about that data, which will enable business decision makers to quickly gain needed insights. By using appropriate visualizations and interactions, you can provide an effective report that guides the reader through the content quickly and efficiently, therefore allowing the reader to follow a narrative into the data.
  • The reports that are created during the visualization task help businesses and decision makers understand what that data means so that accurate and vital decisions can be made. Reports drive the overall actions, decisions, and behaviors of an organization that is trusting and relying on the information that is discovered in the data.
  • The business might communicate that they need all data points on a given report to help them make decisions. As a data analyst, you should take the time to fully understand the problem that the business is trying to solve. Determine whether all their data points are necessary because too much data can make detecting key points difficult. Having a small and concise data story can help find insights quickly.
  • With the built-in AI capabilities in Power BI, data analysts can build powerful reports, without writing any code, that enable users to get insights and answers and find actionable objectives. The AI capabilities in Power BI, such as the built-in AI visuals, enable the discovering of data by asking questions, using the Quick Insights feature, or creating machine learning models directly within Power BI.
  • An important aspect of visualizing data is designing and creating reports for accessibility. As you build reports, it is important to think about people who will be accessing and reading the reports. Reports should be designed with accessibility in mind from the outset so that no special modifications are needed in the future.
  • Many components of your report will help with storytelling. From a color scheme that is complementary and accessible, to fonts and sizing, to picking the right visuals for what is being displayed, they all come together to tell that story.

Analyse

  • The analyze task is the important step of understanding and interpreting the information that is displayed on the report. In your role as a data analyst, you should understand the analytical capabilities of Power BI and use those capabilities to find insights, identify patterns and trends, predict outcomes, and then communicate those insights in a way that everyone can understand.
  • Advanced analytics enables businesses and organizations to ultimately drive better decisions throughout the business and create actionable insights and meaningful results. With advanced analytics, organizations can drill into the data to predict future patterns and trends, identify activities and behaviors, and enable businesses to ask the appropriate questions about their data.
  • Previously, analysing data was considered a difficult task and was solely the responsibility of the data engineer, but with advanced tools like Power BI, data analysts can now perform the same task. Users can quickly gain insights into their data by using visuals and metrics directly from their desktop and then publish those insights to dashboards so that others can find needed information.
  • This feature is another area where AI integrations within Power BI can take your analysis to the next level. Integrations with Azure Machine Learning, cognitive services, and built-in AI visuals will help to enrich your data and analysis.

Manage

  • Power BI consists of many components, including reports, dashboards, workspaces, semantic models, and more. As a data analyst, you are responsible for the management of these Power BI assets, overseeing the sharing and distribution of items, such as reports and dashboards, and ensuring the security of Power BI assets.
  • Apps can be a valuable distribution method for your content and allow easier management for large audiences. This feature also allows you to have custom navigation experiences and link to other assets within your organization to complement your reports.
  • The management of your content helps to foster collaboration between teams and individuals. Sharing and discovery of your content is important for the right people to get the answers that they need. It is also important to help ensure that items are secure. You want to make sure that the right people have access and that you are not leaking data past the correct stakeholders.
  • Proper management can also help reduce data silos within your organization. Data duplication can make data difficult to manage and can introduce data latency when resources are overused. Power BI helps reduce data silos with the use of shared semantic models, and it allows you to reuse data that you have prepared and modeled. For key business data, endorsing a semantic model as certified can help to ensure trust in that data.
  • The management of Power BI assets helps reduce the duplication of efforts and helps ensure security of the data.

Summary

  • In this module, you learned that the role of data analyst is vital to the success of an organization. Additionally, the tasks that data analysts perform help ensure that the business decisions are based on trusted data. You also learned about the different roles in data and how the people in these roles work closely with a data analyst to deliver valuable insights into a business's data assets.