CPA Ireland Diploma in Data Analytics

Due to the continued success of the Diploma in Data Analytics, we are delighted to announce that the next intake of this course will be in September 2022. This course will provide a high-level understanding of the main concepts associated with data analytics through the use of RStudio and Excel. The course will teach you the tools you need to use analytics to formulate and solve business problems and to communicate that analysis to a management team.

Dual Qualification

The Diploma in Data Analytics is approved by the Analytics Institute of Ireland for dual accreditation. This means that anyone who has successfully completed the Diploma in Data Analytics will be eligible for this dual qualification and will have the opportunity to register as a Certified Business Data Analyst with the Analytics Institute of Ireland.


Method: 1/2 day workshop, 5 lecture days & assessment

Location: Online via live stream

Date: September 2022 - March 2023

CPD Credit: 40 hours 

Cost: €1550 (non-members: €1750)

Book your place now for the September 2022 intake.

Learning Outcomes & Key Takeaways

Over the course of this programme, you will learn:

  • Data Transformation & Visualization

    Learn to find, clean/process, and transform data. Apply visualization and transformation to explore data in a systematic way. Engage in exploratory data analysis (EDA) to parse through Big Data. Leverage the graphing tools available in ggplot2, created by Hadley Wickham, to create publication-quality visualizations and reports with minimum fuss. ggplot2 is introduced here and explained from scratch. The ggplot2 syntax, executable in R (and incidentally Python), provides a simplified grammar for producing “elegant graphics for data analysis”. Learn how to create highly nuanced charts simply and extract business/scientific intelligence from vast data frames using a more programmatic and intuitive interface. Learn to specify what variables to plot, display templates, and manipulate general visual properties. We will make extensive use of examples developed in R for Data Science, available on the Tidyverse R page.

  • Data Analytics Tools

    Develop a broad insight into and understanding of data analytics tools and the ability to extract useful knowledge from data. Develop a mastery of basic statistical techniques. Employ basic frameworks like the Normal distribution and Student's t-distribution to understand the preponderance or otherwise of trends or patterns. Establish relationships with statistical confidence. We develop models of qualitative choice with examples drawn from mortgage approval, survivorship and wine quality. Employ basic OLS and random forest modelling for making forecasts. Develop business/scientific intelligence from Machine Learning techniques. Demonstrate real-world applications of Artificial Intelligence being deployed to assess mortgage applications.

  • Develop Excel and VBA skills

    Develop basic data analysis in Excel and VBA. We demonstrate how to implement OLS modelling in Excel. We also provide some training from scratch on how to automate the estimation of mortgage repayments using VBA. (If Excel is not your thing, no problem: use JavaScript in Google Sheets.)

  • Develop EDA skills in R and Python environments - RStudio/Python Anaconda / Google Colab

    Perform data analysis in R and Python. Develop Exploratory Data Analysis and pre-modelling using the R tidyverse and Python Pandas libraries. We develop a series of tutorials to explain some of the powerful data transformation and manipulation features of Pandas. These are excellent for preparing professional-style reports.

    Tidyverse R and base R are incredibly powerful and widely used to execute and communicate forecasts, statistical analysis and modelling. The Tidyverse R suite assembles some of the most versatile R packages: ggplot2, dplyr, tidyr, readr, purrr, and tibble for visualization and data query. The Pandas, Matplotlib and Seaborn packages available in Python similarly provide a full complement of data query and visualization tools. These cutting-edge packages can be transformative in promoting collaboration and disseminating ideas through data intelligence. In particular, the Tidyverse umbrella package from R can be used to tease out many key areas of data analytics. Tidyverse R can also be installed in Google Colab.

  • From Basic Statistical Modelling to Machine Learning in R and Python

    We introduce statistical modelling very gently here by making use of Excel, R and Python. We demonstrate how to estimate basic linear relationships and simple model parameters, and introduce how to estimate model error using the Sum of Squared Residuals, Total Sum of Squares and Explained Sum of Squares. R will also be used to introduce newer forms of statistical modelling: Machine Learning. These tools are available seamlessly in R and Python. We deploy sklearn libraries in Google Colab Python notebooks to model and predict house prices in a training and testing framework. We exploit the Kaggle platform, a free-to-use resource, to access both code and data. In particular, the Titanic Kaggle dataset is presented as a sort of handy "proof of concept" for those new to Machine Learning. We develop an analytical model for predicting survival on board the ill-fated Titanic. We also demonstrate Machine Learning and AI by training models on the HMDA dataset for mortgage origination and vetting. Some examples, techniques and code elaborated by Hal Varian (Chief Economist at Google) for Machine Learning are introduced here and explained in detail. See the link to the relevant article in the Journal of Economic Perspectives. Train a Machine Learning algorithm to determine which mortgage applicants would be successful or not. Evaluate varying Machine Learning models using confusion matrices. Predict wine quality and prices using standard regression techniques and random forest.

    R has grown into a fully fledged data science programming language, replete with a very active community and web resources fully primed to go. We follow Professor Orley Ashenfelter of Princeton and investigate wine quality. We also leverage content and study materials hosted by MIT OpenCourseWare and disseminated freely to users. Christoph Hanck, Martin Arnold, Alexander Gerber and Martin Schmelzer have published an online text, second to none for modelling techniques, which we draw on extensively here.

    Python code can be intuitively executed in Anaconda, in Google Colab and on many other platforms. Significant Python resources and implementations are available from the Python Data Science Handbook. This Google Colab makes use of key Python libraries: NumPy, Pandas, Matplotlib and Scikit-learn. The latter is one of the most popular libraries for machine learning.
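
To give a flavour of the statistical modelling topic described above, here is a minimal Python sketch of how a simple linear relationship can be estimated by OLS and how the variation in the outcome splits into the Sum of Squared Residuals, Total Sum of Squares and Explained Sum of Squares. It is an illustration only, not course material: the function name `ols_decomposition` and the floor-area and price figures are made up for this example, and only NumPy is assumed to be installed.

```python
import numpy as np


def ols_decomposition(x, y):
    """Fit y = intercept + slope * x by OLS and decompose the variation in y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Closed-form OLS estimates for a single explanatory variable
    slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    intercept = y.mean() - slope * x.mean()
    fitted = intercept + slope * x

    ssr = np.sum((y - fitted) ** 2)         # Sum of Squared Residuals
    tss = np.sum((y - y.mean()) ** 2)       # Total Sum of Squares
    ess = np.sum((fitted - y.mean()) ** 2)  # Explained Sum of Squares

    return {
        "intercept": intercept,
        "slope": slope,
        "ssr": ssr,
        "tss": tss,
        "ess": ess,
        "r_squared": ess / tss,
    }


# Hypothetical data: floor area (square metres) and price (thousands of euro)
floor_area = [50, 65, 80, 95, 110, 130]
price = [150, 185, 230, 255, 310, 350]

result = ols_decomposition(floor_area, price)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For a model with an intercept, the Total Sum of Squares equals the Explained Sum of Squares plus the Sum of Squared Residuals, so `r_squared` here is simply the share of the total variation that the fitted line explains.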


The Data Analytics course opened me up to a world of opportunity in learning various tools that can be used to enhance the skills required for my daily tasks at work. Excellent online lecture delivery by Brian and good support from the CPA team.

Gabriel Oguntuase, CPA

Learn more about what's covered on this course

Book your place now!

Book your place now on the September 2022 course

Book now