BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//PR Statistics - ECPv6.10.0//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:PR Statistics
X-ORIGINAL-URL:https://prstats.preprodw.com
X-WR-CALDESC:Events for PR Statistics
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20230326T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20231029T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20240331T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20241027T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20250330T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20251026T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20260329T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20261025T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20270328T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20271031T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20280326T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20281029T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20290325T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20291028T010000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:20300331T010000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:20301027T010000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;VALUE=DATE:20231003
DTEND;VALUE=DATE:20301006
DTSTAMP:20260418T173206
CREATED:20240220T151755Z
LAST-MODIFIED:20240221T133040Z
UID:10000445-1696291200-1917475199@prstats.preprodw.com
SUMMARY:ONLINE COURSE - Introduction to generalised linear models using R and RStudio (IGLMPR)
DESCRIPTION:Delivered remotely (United Kingdom)\n\nCourse Format\nPre-Recorded\n\nAbout this course\nThis course provides a comprehensive practical and theoretical introduction to generalized linear models using R. Generalized linear models are generalizations of linear regression models for situations where the outcome variable is\, for example\, a binary\, ordinal\, or count variable. The specific models we cover include binary\, binomial\, and categorical logistic regression\, Poisson and negative binomial regression for count variables\, as well as extensions for overdispersed and zero-inflated data. We begin by providing a brief overview of the normal general linear model. Understanding this model is vital for the proper understanding of how it is generalized in generalized linear models. Next\, we introduce the widely used binary logistic regression model\, which is a regression model for when the outcome variable is binary. Next\, we cover binomial logistic regression and the multinomial case\, which is for modelling outcome variables that are polychotomous\, i.e.\, have more than two categorically distinct values. We will then cover Poisson regression\, which is widely used for modelling outcome variables that are counts (i.e.\, the number of times something has happened). 
We then cover extensions to accommodate overdispersion\, starting with the quasi-likelihood approach\, then covering the negative binomial and beta-binomial models for counts and discrete proportions\, respectively. Finally\, we will cover zero-inflated Poisson and negative binomial models\, which are for count data with excessive numbers of zero observations.\n\nIntended Audiences\nThis course is aimed at anyone who is interested in using R for data science or statistics. R is widely used in all areas of academic scientific research\, and also widely throughout the public and private sectors.\n\nVenue\nDelivered remotely\n\nCourse Information\nTime zone – NA\nAvailability – NA\nDuration – 3 x 1/2 days\nContact hours – Approx. 12 hours\nECT’s – Equal to 1 ECT’s\nLanguage – English\n\nTeaching Format\nThis course will be largely practical\, hands-on\, and workshop based. For each topic\, there will first be some lecture-style presentation\, i.e.\, using slides or blackboard\, to introduce and explain key concepts and theories. Then\, we will cover how to perform the various statistical analyses using R. Any code that the instructor produces during these sessions will be uploaded to a publicly available GitHub site after each session.\n\nAssumed quantitative knowledge\nA basic understanding of statistical concepts. Specifically\, generalised linear regression models\, statistical significance\, hypothesis testing.\n\nAssumed computer background\nFamiliarity with R. Ability to import/export data\, manipulate data frames\, fit basic statistical models & generate simple exploratory and diagnostic plots.\n\nEquipment and software requirements\nA laptop computer with a working version of R or RStudio is required. 
R and RStudio are both available as free and open source software for PCs\, Macs\, and Linux computers.\n\nParticipants should be able to install additional software on their own computer during the course (please make sure you have administration rights to your computer).\n\nA large monitor and a second screen\, although not absolutely necessary\, could improve the learning experience.\n\nDownload R\nDownload RStudio\n\nPLEASE READ – CANCELLATION POLICY\nCancellations are accepted up to 28 days before the course start date subject to a 25% cancellation fee. Cancellations later than this may be considered\, contact oliverhooker@prstatistics.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances\, a full refund of the course fees will be credited.\n\nIf you are unsure about course suitability\, please get in touch by email to find out more: oliverhooker@prstatistics.com\n\nCOURSE PROGRAMME\n\nDay 1\nTopic 1: The general linear model. We begin by providing an overview of the normal\, as in normal distribution\, general linear model\, including using categorical predictor variables. Although this model is not the focus of the course\, it is the foundation on which generalized linear models are based and so must be understood to understand generalized linear models.\nTopic 2: Binary logistic regression. Our first generalized linear model is the binary logistic regression model\, for use when modelling binary outcome data. 
We will present the assumed theoretical model behind logistic regression\, implement it using R’s glm\, and then show how to interpret its results and perform predictions and (nested) model comparisons.\nTopic 3: Binomial logistic regression. Here\, we show how binary logistic regression can be extended to deal with data on discrete proportions. We will also present alternative link functions to the logit\, such as the probit and complementary log-log links.\n\nDay 2\nTopic 4: Categorical logistic regression. Categorical logistic regression\, also known as multinomial logistic regression\, is for modelling polychotomous data\, i.e.\, data taking more than two categorically distinct values. Like ordinal logistic regression\, categorical logistic regression is also based on an extension of the binary logistic regression case.\nTopic 5: Poisson regression. Poisson regression is a widely used technique for modelling count data\, i.e.\, data where the variable denotes the number of times an event has occurred.\n\nDay 3\nTopic 6: Overdispersion models. The quasi-likelihood approach for both the Poisson and binomial models. Negative binomial regression. The negative binomial model is\, like the Poisson regression model\, used for unbounded count data\, but it is less restrictive than Poisson regression\, specifically in that it can deal with overdispersed data. Beta-binomial regression. The beta-binomial model is an overdispersed alternative to the binomial.\nTopic 7: Zero-inflated models. Zero-inflated count data are data with more zero counts than can be modelled using either a Poisson or negative binomial model. Zero-inflated Poisson or negative binomial models are types of latent variable models.\n\nCourse Instructor\nDr. Rafael De Andrade Moral\nRafael is an Associate Professor of Statistics at Maynooth University\, Ireland. 
With a background in Biology and a PhD in Statistics from the University of São Paulo\, Rafael has a deep passion for teaching and conducting research in statistical modelling applied to Ecology\, Wildlife Management\, Agriculture\, and Environmental Science. As director of the Theoretical and Statistical Ecology Group\, Rafael brings together a community of researchers who use mathematical and statistical tools to better understand the natural world. As an alternative teaching strategy\, Rafael has been producing music videos and parodies to promote Statistics in social media and in the classroom. His personal webpage can be found here\nResearchGate\nGoogleScholar\nORCID\nGitHub
URL:https://prstats.preprodw.com/course/online-course-introduction-to-generalised-linear-models-using-r-and-rstudio-iglmpr/
LOCATION:Delivered remotely (United Kingdom)\, Western European Time\, United Kingdom
CATEGORIES:Previously Recorded Courses
ATTACH;FMTTYPE=image/png:https://prstats.preprodw.com/wp-content/uploads/2022/02/IGLM04R.png
END:VEVENT
BEGIN:VEVENT
DTSTART;VALUE=DATE:20240206
DTEND;VALUE=DATE:20300209
DTSTAMP:20260418T173206
CREATED:20240220T160615Z
LAST-MODIFIED:20240221T135137Z
UID:10000448-1707177600-1896825599@prstats.preprodw.com
SUMMARY:ONLINE COURSE - Introduction to Time Series Analysis using R and RStudio (ITSAPR)
DESCRIPTION:Delivered remotely (United Kingdom)\n\nCourse Format\nPre-Recorded\n\nAbout this course\nIn this three-day course\, we provide a comprehensive practical and theoretical introduction to time series analysis and forecasting methods using R. Forecasting tools are useful in many areas\, such as finance\, meteorology\, ecology\, public policy\, and health. We start by introducing the concepts of time series and stationarity\, which will help us when studying ARIMA-type models. We will also cover autocorrelation functions and series decomposition methods. Then\, we will introduce benchmark forecasting methods\, namely the naïve (or random walk)\, mean\, drift\, and seasonal naïve methods. After that\, we will present different exponential smoothing methods (simple\, Holt’s linear method\, and Holt-Winters seasonal method). Finally\, we will cover autoregressive integrated moving-average (or ARIMA) models\, with and without seasonality. If time allows\, we will introduce regression with ARIMA errors. 
\n\nIntended Audiences\nThis course is aimed at anyone who is interested in forecasting methods and in using R for data science or statistics. R is widely used in all areas of academic scientific research\, and also widely throughout the public and private sectors.\n\nVenue\nDelivered remotely\n\nCourse Information\nTime zone – NA\nAvailability – NA\nDuration – 3 days\nContact hours – Approx. 12 hours\nECT’s – Equal to 1 ECT’s\nLanguage – English\n\nTeaching Format\nThis course will be largely practical\, hands-on\, and workshop based. For each topic\, there will first be some lecture-style presentation\, i.e.\, using slides or blackboard\, to introduce and explain key concepts and theories. Then\, we will cover how to perform the various statistical analyses using R. Any code that the instructor produces during these sessions will be uploaded to a publicly available GitHub site after each session.\n\nAssumed quantitative knowledge\nA basic understanding of R and statistical concepts. Specifically\, linear regression models\, statistical significance\, and hypothesis testing.\n\nAssumed computer background\nFamiliarity with R. Ability to import/export data\, manipulate data frames\, fit basic statistical models & generate simple exploratory and diagnostic plots.\n\nEquipment and software requirements\nA laptop computer with a working version of R or RStudio is required. R and RStudio are both available as free and open source software for PCs\, Macs\, and Linux computers.\n\nParticipants should be able to install additional software on their own computer during the course (please make sure you have administration rights to your computer). 
\n\nA large monitor and a second screen\, although not absolutely necessary\, could improve the learning experience. Participants are also encouraged to keep their webcam active to increase the interaction with the instructor and other students.\n\nDownload R\nDownload RStudio\n\nPLEASE READ – CANCELLATION POLICY\nCancellations are accepted up to 28 days before the course start date subject to a 25% cancellation fee. Cancellations later than this may be considered\, contact oliverhooker@prstatistics.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances\, a full refund of the course fees will be credited.\n\nIf you are unsure about course suitability\, please get in touch by email to find out more: oliverhooker@prstatistics.com\n\nCOURSE PROGRAMME\n\nDay 1\nSection 1: Introductory concepts in time series analysis. White noise\, stationarity\, autocovariance and autocorrelation.\nSection 2: Useful plots in time series analysis. Time plots\, seasonal plots\, autocorrelation plots. Time series decomposition: additive and multiplicative using the fable package in R.\n\nDay 2\nSection 3: Benchmark forecasting methods. The naïve\, mean\, drift\, and seasonal naïve methods.\nSection 4: Exponential smoothing. Simple exponential smoothing\, Holt’s linear method\, Holt-Winters seasonal method\, and fable’s general ETS method.\n\nDay 3\nSection 5: Autoregressive (AR) and moving-average (MA) models. 
Unit root tests for stationarity. How to identify the order of an AR(p) or an MA(q) model using autocorrelation and partial autocorrelation plots.\nSection 6: Autoregressive integrated moving average (ARIMA) models and seasonal ARIMA models. Automatic order selection for a (seasonal) ARIMA model using fable. Linear regression with ARIMA errors.\n\nCourse Instructor\nDr. Rafael De Andrade Moral\nRafael is an Associate Professor of Statistics at Maynooth University\, Ireland. With a background in Biology and a PhD in Statistics from the University of São Paulo\, Rafael has a deep passion for teaching and conducting research in statistical modelling applied to Ecology\, Wildlife Management\, Agriculture\, and Environmental Science. As director of the Theoretical and Statistical Ecology Group\, Rafael brings together a community of researchers who use mathematical and statistical tools to better understand the natural world. As an alternative teaching strategy\, Rafael has been producing music videos and parodies to promote Statistics in social media and in the classroom. His personal webpage can be found here\nResearchGate\nGoogleScholar\nORCID\nGitHub
URL:https://prstats.preprodw.com/course/online-course-introduction-to-time-series-analysis-using-r-and-rstudio-itsapr/
LOCATION:Delivered remotely (United Kingdom)\, Western European Time\, United Kingdom
CATEGORIES:Previously Recorded Courses
ATTACH;FMTTYPE=image/jpeg:https://prstats.preprodw.com/wp-content/uploads/2022/02/MDAR-scaled.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;VALUE=DATE:20240325
DTEND;VALUE=DATE:20300102
DTSTAMP:20260418T173206
CREATED:20240709T132655Z
LAST-MODIFIED:20240709T132700Z
UID:10000465-1711324800-1893542399@prstats.preprodw.com
SUMMARY:ONLINE COURSE - Advancing in R (ADVRPR)
DESCRIPTION:Delivered remotely (United Kingdom)\n\nCourse Format\nPre-Recorded\n\nCourse Details\nThis course is designed to provide attendees with a comprehensive understanding of statistical modelling and its applications in various fields\, such as ecology\, biology\, sociology\, agriculture\, and health. We cover all foundational aspects of modelling\, including all coding aspects\, ranging from data wrangling\, visualisation and exploratory data analysis\, to generalized linear mixed models\, assessing goodness-of-fit and carrying out model comparison.\n\nData wrangling\nFor data wrangling\, we focus on tools provided by R's tidyverse. Data wrangling is the art of taking raw and messy data and formatting and cleaning it so that data analysis and visualization may be performed on it. Done poorly\, it can be a time-consuming\, laborious\, and error-prone process. Fortunately\, the tools provided by R's tidyverse allow us to do data wrangling in a fast\, efficient\, and high-level manner\, which can have dramatic consequences for the ease and speed with which we analyse data. We start with how to read data of different types into R\, and then cover in detail all the dplyr tools such as select\, filter\, mutate\, and others. Here\, we will also cover the pipe operator (%>%) to create data wrangling pipelines that take raw messy data on the one end and return cleaned tidy data on the other. 
We then cover how to perform descriptive or summary statistics on our data using dplyr’s group_by and summarise functions. We then turn to combining and merging data. Here\, we will consider how to concatenate data frames\, including concatenating all data files in a folder\, as well as cover the powerful SQL-like join operations that allow us to merge information in different data frames. The final topic we will consider is how to “pivot” data from a “wide” to “long” format and back using tidyr’s pivot_longer and pivot_wider functions.\n\nData visualisation\nFor visualisation\, we focus on the ggplot2 package. We begin by providing a brief overview of the general principles of data visualization\, and an overview of the general principles behind ggplot. We then proceed to cover the major types of plots for visualizing distributions of univariate data: histograms\, density plots\, barplots\, and Tukey boxplots. In all of these cases\, we will consider how to visualize multiple distributions simultaneously on the same plot using different colours and "facet" plots. We then turn to the visualization of bivariate data using scatterplots. Here\, we will explore how to apply linear and nonlinear smoothing functions to the data\, how to add marginal histograms to the scatterplot\, add labels to points\, and scale each point by the value of a third variable. We then cover some additional plot types that are often related but not identical to those major types covered during the beginning of the course: frequency polygons\, area plots\, line plots\, uncertainty plots\, violin plots\, and geospatial mapping. We then consider more fine-grained control of the plot by changing axis scales\, axis labels\, axis tick points\, colour palettes\, and ggplot "themes". Finally\, we consider how to make plots for presentations and publications. 
Here\, we will introduce how to insert plots into documents using RMarkdown\, and also how to create labelled grids of subplots of the kind seen in many published articles.\n\nGeneralized linear models\nGeneralized linear models are generalizations of linear regression models for situations where the outcome variable is\, for example\, a binary\, ordinal\, or count variable. The specific models we cover include binary\, binomial\, and categorical logistic regression\, Poisson and negative binomial regression for count variables\, as well as extensions for overdispersed and zero-inflated data. We begin by providing a brief overview of the normal general linear model. Understanding this model is vital for the proper understanding of how it is generalized in generalized linear models. Next\, we introduce the widely used binary logistic regression model\, which is a regression model for when the outcome variable is binary. Next\, we cover binomial logistic regression and the multinomial case\, which is for modelling outcome variables that are polychotomous\, i.e.\, have more than two categorically distinct values. We will then cover Poisson regression\, which is widely used for modelling outcome variables that are counts (i.e.\, the number of times something has happened). We then cover extensions to accommodate overdispersion\, starting with the quasi-likelihood approach\, then covering the negative binomial and beta-binomial models for counts and discrete proportions\, respectively. Finally\, we will cover zero-inflated Poisson and negative binomial models\, which are for count data with excessive numbers of zero observations.\n\nMixed models\nWe will focus primarily on multilevel linear models\, but also cover multilevel generalized linear models. Likewise\, we will also describe Bayesian approaches to multilevel modelling. We will begin by focusing on random effects multilevel models. These models make it clear how multilevel models are in fact models of models. 
In addition\, random effects models serve as a solid basis for understanding mixed effects\, i.e.\, fixed and random effects\, models. In this coverage of random effects\, we will also cover the important concepts of statistical shrinkage in the estimation of effects\, as well as intraclass correlation. We then proceed to cover linear mixed effects models\, particularly focusing on varying intercept and/or varying slopes regression models. We will then cover further aspects of linear mixed effects models\, including multilevel models for nested and crossed data\, and group-level predictor variables. Towards the end of the course we also cover generalized linear mixed models (GLMMs)\, how to accommodate overdispersion through individual-level random effects\, as well as Bayesian approaches to multilevel models using the brms R package.\n\nModel selection and model simplification\nThroughout the course we consider the fundamental issue of how to measure model fit and a model’s predictive performance\, and discuss a wide range of other major model fit measurement concepts like likelihood\, log likelihood\, deviance\, and residual sums of squares. We thoroughly explore nested model comparison\, particularly in general and generalized linear models\, and their mixed effects counterparts. We discuss out-of-sample generalization\, and introduce leave-one-out cross-validation and the Akaike Information Criterion (AIC). We also cover general concepts and methods related to variable selection\, including stepwise regression\, ridge regression\, Lasso\, and elastic nets. Finally\, we turn to model averaging\, which may represent a preferable alternative to model selection.\n\nIntended Audiences\nThis course is aimed at anyone who is interested in using R for data science or statistics. 
R is widely used in all areas of academic scientific research\, and also widely throughout the public and private sectors.\n\nVenue\nDelivered remotely\n\nCourse Information\nTime zone – NA\nAvailability – NA\nDuration – 5 days\nContact hours – Approx. 35 hours\nECT’s – Equal to 1 ECT’s\nLanguage – English\n\nTeaching Format\nThis course will be largely practical\, hands-on\, and workshop based. For each topic\, there will first be some lecture-style presentation\, i.e.\, using slides or blackboard\, to introduce and explain key concepts and theories. Then\, we will cover how to perform the various statistical analyses using R. Any code that the instructor produces during these sessions will be uploaded to a publicly available GitHub site after each session. For the breaks between sessions\, and between days\, optional exercises will be provided. Solutions to these exercises will be presented and briefly discussed after each break.\nThe course will take place online using Zoom. On each day\, the live video broadcasts will occur during UK local time at:\n• 10am-12pm\n• 1pm-3pm\n• 4pm-6pm\nAll sessions will be video recorded and made available to all attendees as soon as possible\, hopefully soon after each 2hr session.\nIf some sessions are not at a convenient time due to different time zones\, attendees are encouraged to join as many of the live broadcasts as possible. For example\, attendees from North America may be able to join the live sessions from 3pm-5pm and 6pm-8pm\, and then catch up with the 12pm-2pm recorded session once it is uploaded. Joining any live sessions that are possible will allow attendees to benefit from asking questions and having discussions\, rather than just watching prerecorded sessions. 
\nAt the start of the first day\, we will ensure that everyone is comfortable with how Zoom works\, and we’ll discuss the procedure for asking questions and raising comments.\nAlthough not strictly required\, using a large monitor or preferably even a second monitor will make the learning experience better\, as you will be able to see my RStudio and your own RStudio simultaneously.\nAll the sessions will be video recorded\, and made available immediately on a private video hosting website. Any materials\, such as slides\, data sets\, etc.\, will be shared via GitHub.\n\nAssumed quantitative knowledge\nA basic understanding of statistical concepts. Specifically\, generalised linear regression models\, statistical significance\, hypothesis testing.\n\nAssumed computer background\nFamiliarity with R. Ability to import/export data\, manipulate data frames\, fit basic statistical models & generate simple exploratory and diagnostic plots.\n\nEquipment and software requirements\nA laptop computer with a working version of R or RStudio is required. R and RStudio are both available as free and open source software for PCs\, Macs\, and Linux computers.\n\nParticipants should be able to install additional software on their own computer during the course (please make sure you have administration rights to your computer).\n\nA large monitor and a second screen\, although not absolutely necessary\, could improve the learning experience. Participants are also encouraged to keep their webcam active to increase the interaction with the instructor and other students. 
\n\nDownload R\nDownload RStudio\nDownload Zoom\n\nPLEASE READ – CANCELLATION POLICY\nCancellations are accepted up to 28 days before the course start date subject to a 25% cancellation fee. Cancellations later than this may be considered\, contact oliverhooker@prstatistics.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances\, a full refund of the course fees will be credited.\n\nIf you are unsure about course suitability\, please get in touch by email to find out more: oliverhooker@prstatistics.com\n\nCOURSE PROGRAMME\n\nDay 1\nTopic 1: Reading in data. We will begin by reading data into R using tools such as readr and readxl. Almost all types of data can be read into R\, and here we will consider many of the main types\, such as csv\, xlsx\, sav\, etc. Here\, we will also consider how to control how data are parsed\, e.g.\, so that they are read as dates\, numbers\, strings\, etc.\nTopic 2: Wrangling with dplyr. We will next cover the very powerful dplyr R package. This package supplies a number of so-called "verbs" (select\, rename\, slice\, filter\, mutate\, arrange\, etc.)\, each of which focuses on a key data manipulation task\, such as selecting or changing variables. All of these verbs can be chained together using "pipes" (represented by %>%). Together\, these create powerful data wrangling pipelines that take raw data as input and return cleaned data as output. 
Here\, we will also learn about the key concept of "tidy data"\, which is roughly where each row of a data frame is an observation and each column is a variable. \nTopic 3: Summarizing data. The summarize and group_by tools in dplyr can be used with great effect to summarize data using descriptive statistics. \nTopic 4: Merging and joining data frames. There are multiple ways to combine data frames\, with the simplest being "bind" operations\, which are effectively horizontal or vertical concatenations. Much more powerful are the SQL-like "join" operations. Here\, we will consider the inner_join\, left_join\, right_join\, and full_join operations. In this section\, we will also consider how to use purrr to read in and automatically merge large sets of files. \nTopic 5: Pivoting data. Sometimes we need to change data frames from "long" to "wide" formats\, or vice versa. The R package tidyr provides the tools pivot_longer and pivot_wider for doing this. \n\nDay 2 \nTopic 1: What is data visualization? Data visualization is a means to explore and understand our data and should be a major part of any data analysis. Here\, we briefly discuss why data visualization is so important and what the major principles behind it are. \nTopic 2: Introducing ggplot. Though there are many options for visualization in R\, ggplot is simply the best. Here\, we briefly introduce the major principles behind how ggplot works\, namely how it is a layered grammar of graphics. \nTopic 3: Visualizing univariate data. Here\, we cover a set of major tools for visualizing distributions over single variables: histograms\, density plots\, barplots\, and Tukey boxplots. In each case\, we will explore how to plot multiple groups of data simultaneously using different colours and also using facet plots. \nTopic 4: Scatterplots. 
Scatterplots and their variants are used to visualize bivariate data. Here\, in addition to covering how to visualize multiple groups using colours and facets\, we will also cover how to add marginal plots to the scatterplots\, how to label points\, and how to obtain linear and nonlinear smoothing of the plots. \nTopic 5: More plot types. Having already covered the most widely used general purpose plots\, we now turn to cover a range of other major plot types: frequency polygons\, area plots\, line plots\, uncertainty plots\, violin plots\, and geospatial mapping. Each of these is an important and widely used type of plot\, and knowing them will expand your repertoire. \nTopic 6: Fine control of plots. Thus far\, we will have mostly used the defaults for the plot styles and layouts. Here\, we will introduce how to modify things like the limits and scales on the axes\, the positions and nature of the axis ticks\, the colour palettes that are used\, and the different types of ggplot themes that are available. \nTopic 7: Plots for publications and presentations. Thus far\, we have primarily focused on data visualization as a means of interactively exploring data. Often\, however\, we also want to present our plots in\, for example\, published articles or slide presentations. It is simple to save a plot in different file formats and then insert it into a document. However\, a much more efficient way of doing this is to use RMarkdown to run the R code and automatically insert the resulting figure into\, for example\, a Word document\, pdf document\, html page\, etc. In addition\, here we will also cover how to make labelled grids of subplots like those found in many scientific articles. \n\nDay 3 \nTopic 1: The general linear model. 
We begin by providing an overview of the normal (as in normal distribution) general linear model\, including using categorical predictor variables. Although this model is not the focus of the course\, it is the foundation on which generalized linear models are based and so must be understood to understand generalized linear models. \nTopic 2: Binary logistic regression. Our first generalized linear model is the binary logistic regression model\, for use when modelling binary outcome data. We will present the assumed theoretical model behind logistic regression\, implement it using R’s glm\, and then show how to interpret its results\, perform predictions\, and perform (nested) model comparisons. \nTopic 3: Binomial logistic regression. Here\, we show how the binary logistic regression can be extended to deal with data on discrete proportions. We will also present alternative link functions to the logit\, such as the probit and complementary log-log links. \nTopic 4: Categorical logistic regression. Categorical logistic regression\, also known as multinomial logistic regression\, is for modelling polychotomous data\, i.e.\, data taking more than two categorically distinct values. Categorical logistic regression is based on an extension of the binary logistic regression case. \nTopic 5: Poisson regression. Poisson regression is a widely used technique for modelling count data\, i.e.\, data where the variable denotes the number of times an event has occurred. \n\nDay 4 \nTopic 1: Measuring model fit. Here\, the concept of conditional probability of the observed data\, or of future data\, is of vital importance. This is intimately related to\, though distinct from\, the concept of likelihood and the likelihood function\, which is in turn related to the concept of the log likelihood or deviance of a model. Here\, we also show how these concepts are related to concepts of residual sums of squares\, root mean square error (rmse)\, and deviance residuals. 
\nTopic 2: Nested model comparison. In this section\, we cover how to do nested model comparison in general linear models\, generalized linear models\, and their mixed effects (multilevel) counterparts. First\, we precisely define what is meant by a nested model. Then we show how nested model comparison can be accomplished in general linear models with F tests\, which we will also discuss in relation to R^2 and adjusted R^2. In generalized linear models\, we can accomplish nested model comparison using deviance based chi-square tests via Wilks’s theorem. \nTopic 3: Overdispersion models. The quasi-likelihood approach for both the Poisson and binomial models. Negative binomial regression. The negative binomial model is\, like the Poisson regression model\, used for unbounded count data\, but it is less restrictive than Poisson regression\, specifically by dealing with overdispersed data. Beta-binomial regression. The beta-binomial model is an overdispersed alternative to the binomial. \nTopic 4: Zero inflated models. Zero inflated count data is where there are more zero counts than would be expected under a Poisson or negative binomial model. Zero inflated Poisson or negative binomial models are types of latent variable models. \nTopic 5: Random effects models. The defining feature of multilevel models is that they are models of models. We begin by using a binomial random effects model to illustrate this. Specifically\, we show how multilevel models are models of the variability in models of different clusters or groups of data. \nTopic 6: Normal random effects models. Normal (as in normal distribution) random effects models are the key to understanding the more general and widely used linear mixed effects models. Here\, we also cover the key concepts of statistical shrinkage and intraclass correlation. 
\n\nDay 5 \nTopic 1: Out of sample predictive performance: cross validation and information criteria. Here\, we describe how to measure out of sample predictive performance\, which measures how well a model can generalize to new data. This is arguably the gold standard for evaluating any statistical model. A practical means to measure out of sample predictive performance is cross-validation\, especially leave-one-out cross-validation. Leave-one-out cross-validation can\, in relatively simple models\, be approximated by the Akaike Information Criterion (AIC)\, which can be exceptionally simple to calculate. We will discuss how to interpret AIC values\, and describe other related information criteria\, some of which will be used in more detail in later sections. \nTopic 2: Linear mixed effects models. Next\, we turn to multilevel linear models\, also known as linear mixed effects models. We specifically deal with the cases of varying intercept and/or varying slope linear regression models. \nTopic 3: Multilevel models for nested data. Here\, we will consider multilevel linear models for nested (as in groups of groups) data. As an example\, we will look at multilevel linear models applied to data from students within classes that are themselves within different schools\, and where we model the variability of effects across the classes and across the schools. \nTopic 4: Multilevel models for crossed data. In some multilevel models\, each observation occurs in multiple groups\, but these groups are not nested. For example\, animals may be members of different species and in different locations\, but the species are not subsets of locations\, nor vice versa. These are known as crossed or multiclass data structures. \nTopic 5: Group level predictors. In some multilevel regression models\, predictor variables are sometimes associated with individuals\, and sometimes associated with their groups. 
In this section\, we consider how to handle these two situations. \nTopic 6: Generalized linear mixed models (GLMMs). Here\, we extend the linear mixed model to the exponential family of distributions and showcase an example using the Poisson GLMM. We also cover how to accommodate overdispersion through individual-level random effects. \nTopic 7: Bayesian multilevel models. All of the models that we have considered can be handled\, often more easily\, using Bayesian models. Here\, we provide a brief introduction to Bayesian models and show how to fit examples of the models that we have considered using Bayesian methods and the brms R package. \nTopic 8: Variable selection. Variable selection is a type of nested model comparison. It is also one of the most widely used model selection methods\, and variable selection of some kind is almost always done routinely in data analysis. In particular\, we cover stepwise regression (and its limitations)\, all subsets methods\, ridge regression\, Lasso\, and elastic nets. \nTopic 9: Model averaging. Rather than selecting one model from a set of candidates\, it is arguably always better to perform model averaging\, using all the candidate models\, weighted by their predictive performance. We show how to perform model averaging using information criteria. \n\nCourse Instructor\n\nDr. Rafael De Andrade Moral \nRafael is an Associate Professor of Statistics at Maynooth University\, Ireland. With a background in Biology and a PhD in Statistics from the University of São Paulo\, Rafael has a deep passion for teaching and conducting research in statistical modelling applied to Ecology\, Wildlife Management\, Agriculture\, and Environmental Science. As director of the Theoretical and Statistical Ecology Group\, Rafael brings together a community of researchers who use mathematical and statistical tools to better understand the natural world. 
As an alternative teaching strategy\, Rafael has been producing music videos and parodies to promote Statistics on social media and in the classroom.
URL:https://prstats.preprodw.com/course/advancing-in-r-advrpr/
LOCATION:Delivered remotely (United Kingdom)\, Western European Time\, United Kingdom
CATEGORIES:Previously Recorded Courses
ATTACH;FMTTYPE=image/jpeg:https://prstats.preprodw.com/wp-content/uploads/2024/01/nick-owuor-astro-nic-portraits-wDifg5xc9Z4-unsplash-scaled.jpg
END:VEVENT
END:VCALENDAR