LaTeX & R Markdown: Data Analysis
Within many businesses and workplaces, employees rely heavily upon spreadsheets. These spreadsheets are most often manipulated using Microsoft Excel, a tool that has been in popular use for several decades.
Today, however, there is a newer competitor to Excel that lets one create not just spreadsheets, but virtually any kind of document! This tool is known as R Markdown.
R Markdown is a document format built on the R programming language, which is used increasingly often within the fields of data science and financial analysis. R Markdown, as the name implies, allows one to combine R and Markdown in one single document, and alongside them LaTeX, Python, JavaScript, SQL, Git, Bash, and more! What makes this so powerful is that you can use R Markdown not only for data science applications, but also for automating tasks normally performed using Excel. With R Markdown, you can do everything from automatically turning a .CSV file into a full-blown financial report, to automating the process of setting up a machine as a database or web server!
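As a rough illustration of the "file in, report out" idea, an R Markdown document can be rendered from R itself with `rmarkdown::render()`. The sketch below is only an assumption about how one might wrap that call: the helper name `render_report` and its arguments `rmd_path` and `out_dir` are hypothetical, and PDF output additionally needs a working LaTeX installation.

```r
# Hypothetical helper (not part of the original project): render an
# R Markdown file to PDF from within R.
render_report <- function(rmd_path, out_dir = dirname(rmd_path)) {
  # Fail early with a readable message if the input file is missing
  if (!file.exists(rmd_path)) {
    stop("R Markdown file not found: ", rmd_path)
  }
  # rmarkdown::render() knits the R chunks and passes the result to pandoc
  rmarkdown::render(
    input = rmd_path,
    output_format = "pdf_document",
    output_dir = out_dir
  )
}

# Example call, assuming a file named Report.rmd in the working directory:
# render_report("Report.rmd")
```

Wrapping the call in a function keeps the file check and the output location in one place, so the same helper could be reused for other reports.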
Below is a demonstration of R Markdown that combines R, Git, and LaTeX to automatically produce a LaTeX-formatted PDF report from a data analysis of any spreadsheet or table supplied in a file format that Rio can import. Please note that Bash has not been used to make the code execute itself automatically, as this causes problems on some operating systems.
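Much of the analysis in the report comes down to a handful of growth-rate calculations on time series. The following is a minimal sketch of that arithmetic in base R, using a made-up quarterly series; the name `sales` and its values are purely illustrative and do not come from the original data.

```r
# Made-up quarterly series, purely for illustration
sales <- ts(c(100, 102, 105, 103, 108), frequency = 4, start = c(2020, 1))

# Exact quarter-on-quarter growth in percent: (x_t - x_{t-1}) / x_{t-1} * 100
exact_growth <- diff(sales) / stats::lag(sales, -1) * 100

# Log approximation of the same growth rate: 100 * (log x_t - log x_{t-1})
log_growth <- diff(log(sales)) * 100

# Annualized rate, compounding the quarterly rate over four quarters
annualized <- ((1 + exact_growth / 100)^4 - 1) * 100

# Simple approximation of the annualized rate: four times the quarterly rate
annualized_approx <- 4 * exact_growth

# Side-by-side comparison of the four series
round(cbind(exact_growth, log_growth, annualized, annualized_approx), 2)
```

For small changes the log approximation stays close to the exact rate, and the gap widens as the period-on-period change grows, which is why the report also tracks the approximation errors.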
Financial Analysis, Spreadsheet Manipulation, & Data Science in R Markdown:
Report.rmd
---
title: "Financial Analysis, Spreadsheet Manipulation, & Data Science in R Markdown - Demo"
output: pdf_document
---

The following libraries were used on this project. All formatting is written in \LaTeX\ and all other code is written in R. \newline

```{r include = TRUE}
## GitHub is used to download other libraries
library("gh")
## Rio is used for data input/output
library("rio")
## tstools is used for time series manipulation
library("tstools")
## tidyr is used for data organization
library("tidyr")
## zoo is for time series organization
library("zoo")
## Pandoc takes .md files and outputs
## files in formats like PDF or HTML
library("pandoc")
## Markdown is a markup language
library("markdown")
## knitr translates other code into
## the markdown markup language
library("knitr")
## Rmarkdown gives files to knitr
library("rmarkdown")
```

```{r include = FALSE}
pandoc_install()
## install_formats()

###
### File data input block - This block parses file-related information.
###

## File location - Change this to the location of the file, including the filename itself
filepath = "/home/user/Downloads/data.csv"
## First line - Change this to the number of lines that must be skipped before the data begins
filestart = 0
## Ignored characters - Change this to any string or character in the dataset that should be ignored
fileignore = "ignore"
## Naming - TRUE means the first line lists the name(s) of each column. FALSE means the first line is just data.
filenamed = TRUE
## Columns - Which column of data would you like to select?
filecolumn = "expenditure"
## Export location - Change this to the location where you would like the outputted files to be saved.
exportpath = "/home/user/Downloads/"
## Exported Filename - Change this to whatever name you would like your exported file to have.
exportfile = "export"

###
### Data selection input block - This block is how the user selects things like the timeframe they are interested in looking at.
###

## Stock versus Flow - Set to 0.5 if you would like to use flow variables and 1 if you would like to use stock variables
flowstockset <- 0.5

## The beginning of the chosen data. For exceptionally large datasets, this allows only a portion to be parsed initially.
datastartyear <- 1950
datastartquarter <- 1
datastartmonth <- 1
datastartday <- 15

## The beginning of the chosen timeframe. This allows one to analyse just one given subset of the imported data.
startyear <- 1950
startquarter <- 1
startmonth <- 1
startday <- 15

## Index times - To do things like calculate the Index Base 100, select your chosen timeframe here.
## Remember - To select just one year, the starts and ends should match. Different values imply a range other than 1 year.
indexstartyear <- 2000
indexstartquarter <- 1
indexstartmonth <- 3
indexstartday <- 15
indexendyear <- 2000
indexendquarter <- 1
indexendmonth <- 3
indexendday <- 15

## The end of the chosen timeframe. This allows one to analyse just one given subset of the imported data.
endyear <- 2020
endquarter <- 4
endmonth <- 12
endday <- 15

## The end of the chosen data. For exceptionally large datasets, this allows only a portion to be parsed initially.
dataendyear <- 2020
dataendquarter <- 4
dataendmonth <- 12
dataendday <- 15

###
### File data loading block - This logic parses and partially sanitizes data contained within a file of any format supported by Rio
###

## Parsing the arbitrary file to a variable, skipping any rows that need to be skipped
dataraw <- import(filepath, skip=filestart, header=filenamed)
## Exporting the contents of the variable to a temp file in the CSV format
export(dataraw, "temp.csv")
## Declaring the path to this temp file as a string
path <- "temp.csv"
## Importing the temp file using native CSV importing functions in R
dataimport <- read.csv(path, na.strings=fileignore, header=TRUE)
## Deleting the temp file now that it is no longer needed
file.remove(path)

###
### Data sanitizing block - This logic continues the process of sanitizing the imported data
###

## Column prep - Parsing possible column names into their own variables - This is purely for the sake of making the code easier to read and understand
monthcheck <- "month"
monthscheck <- "months"
mcheck <- "m"
quartercheck <- "quarter"
quarterscheck <- "quarters"
qcheck <- "q"
yearcheck <- "year"
yearscheck <- "years"
ycheck <- "y"

## Column & Row import - Parsing the actual names of the existing columns
columnnames <- tolower(colnames(dataimport))
rownames <- tolower(rownames(dataimport))

## Row name checks - Determining whether the table needs to be rotated by checking the names of the table rows
monthcheckpass <- monthcheck[monthcheck %in% rownames]
monthscheckpass <- monthscheck[monthscheck %in% rownames]
mcheckpass <- mcheck[mcheck %in% rownames]
quartercheckpass <- quartercheck[quartercheck %in% rownames]
quarterscheckpass <- quarterscheck[quarterscheck %in% rownames]
qcheckpass <- qcheck[qcheck %in% rownames]
yearcheckpass <- yearcheck[yearcheck %in% rownames]
yearscheckpass <- yearscheck[yearscheck %in% rownames]
ycheckpass <- ycheck[ycheck %in% rownames]

## Row name sanitization - Ensuring the row names on rotated tables
## are better standardized for easier data manipulation
## (the comparisons use names(dataimport); names(data) referred to an object that is never defined here)
if(length(ycheckpass) > 0) {
  names(dataimport)[names(dataimport) == "y"] <- "year"
}
if(length(yearscheckpass) > 0) {
  names(dataimport)[names(dataimport) == "years"] <- "year"
}
if(length(qcheckpass) > 0) {
  names(dataimport)[names(dataimport) == "q"] <- "quarter"
}
if(length(quarterscheckpass) > 0) {
  names(dataimport)[names(dataimport) == "quarters"] <- "quarter"
}
if(length(mcheckpass) > 0) {
  names(dataimport)[names(dataimport) == "m"] <- "month"
}
if(length(monthscheckpass) > 0) {
  names(dataimport)[names(dataimport) == "months"] <- "month"
}

## Table transposition - Ensuring rotated tables are reoriented correctly
if(length(c(monthcheckpass, quartercheckpass, yearcheckpass)) > 0) {
  dataimport <- t(dataimport)
}

## Column name checks - Determining whether or not column names referenced during column prep exist in the imported table
monthcheckpass <- monthcheck[monthcheck %in% columnnames]
monthscheckpass <- monthscheck[monthscheck %in% columnnames]
mcheckpass <- mcheck[mcheck %in% columnnames]
quartercheckpass <- quartercheck[quartercheck %in% columnnames]
quarterscheckpass <- quarterscheck[quarterscheck %in% columnnames]
qcheckpass <- qcheck[qcheck %in% columnnames]
yearcheckpass <- yearcheck[yearcheck %in% columnnames]
yearscheckpass <- yearscheck[yearscheck %in% columnnames]
ycheckpass <- ycheck[ycheck %in% columnnames]

## Column name sanitization - Ensuring the column names are better standardized for easier data manipulation
if(length(ycheckpass) > 0) {
  names(dataimport)[names(dataimport) == "y"] <- "year"
}
if(length(yearscheckpass) > 0) {
  names(dataimport)[names(dataimport) == "years"] <- "year"
}
if(length(qcheckpass) > 0) {
  names(dataimport)[names(dataimport) == "q"] <- "quarter"
}
if(length(quarterscheckpass) > 0) {
  names(dataimport)[names(dataimport) == "quarters"] <- "quarter"
}
if(length(mcheckpass) > 0) {
  names(dataimport)[names(dataimport) == "m"] <- "month"
}
if(length(monthscheckpass) > 0) {
  names(dataimport)[names(dataimport) == "months"] <- "month"
}

## Column name
## checks - Determining whether or not the standardized column names are present
monthcheckpass <- monthcheck[monthcheck %in% columnnames]
quartercheckpass <- quartercheck[quartercheck %in% columnnames]

## Month insertion - If quarters are present yet months are not, a months column is inserted and populated.
if(length(quartercheckpass) > 0) {
  if(!(length(monthcheckpass) > 0)) {
    colnames(dataimport) <- tolower(colnames(dataimport))
    newtablemanip <- NULL
    newtablemanipyear <- ts(data=dataimport["year"])
    newtablemanipquarter <- ts(data=dataimport["quarter"])
    newtablemanipvalue <- ts(data=dataimport[3])
    yearmanip <- tsqm(newtablemanipyear)
    quartermanip <- tsqm(newtablemanipquarter)
    valuemanip <- tsqm(newtablemanipvalue)
    yearlonger <- time(yearmanip)
    quarterlonger <- time(quartermanip)
    valuelonger <- time(valuemanip)
    yearvals <- as.vector(yearmanip)
    quartervals <- as.vector(quartermanip)
    valuevals <- as.vector(valuemanip)
    yearA <- as.yearmon(yearlonger)
    yearB <- as.integer(floor(yearlonger))
    yearC <- months(yearA, abbreviate=TRUE)
    quarterA <- as.yearmon(quarterlonger)
    quarterB <- as.integer(floor(quarterlonger))
    quarterC <- months(quarterA, abbreviate=TRUE)
    valueA <- as.yearmon(valuelonger)
    ## The value series uses its own time index here (previously quarterlonger)
    valueB <- as.integer(floor(valuelonger))
    valueC <- months(valueA, abbreviate=TRUE)
    yearlongest <- data.frame(year=yearB, month=yearC, value=yearvals)
    quarterlongest <- data.frame(year=quarterB, month=quarterC, value=quartervals)
    valuelongest <- data.frame(year=valueB, month=valueC, value=valuevals)
    warpeddata <- data.frame(year=yearlongest$value, quarter=quarterlongest$value, month=valueC, value=valuelongest$value)
    namechecker <- tolower(colnames(dataimport))
    namechecker <- namechecker[namechecker != "month"]
    namechecker <- namechecker[namechecker != "year"]
    namechecker <- namechecker[namechecker != "quarter"]
    names(warpeddata)[names(warpeddata) == "value"] <- namechecker[1]
    export(warpeddata, "temp.csv")
    temppath <- "temp.csv"
    dataimport <- read.csv(temppath, na.strings=fileignore,
                           header=TRUE)
    file.remove(temppath)
  }
}
```

In order to process data, the data in question must be extracted from a file. For this, Rio is used to import files, then tidyr, tstools, and zoo are used to reorganise the imported data for the purposes of normalization. Doing this allows the code to extract datasets from the widest possible range of different sources. The associated code to do this takes up 229 lines, and thus is not shown in full. \newline

\Large
\textbf{Section A: Data Visualization}
\normalsize

1. \textit{Plotting a series using a line chart before including a description of some form.} \newline

```{r include = TRUE}
## Time series - Monthly, Quarterly, & Yearly
## The quarterly series is indexed by quarter and the annual series by year, rather than by month
monthlytimeseries <- ts(dataimport[filecolumn], frequency=12, start=c(datastartyear,datastartmonth), end=c(dataendyear,dataendmonth))
quarterlytimeseries <- ts(dataimport[filecolumn], frequency=4, start=c(datastartyear,datastartquarter), end=c(dataendyear,dataendquarter))
annualtimeseries <- ts(dataimport[filecolumn], frequency=1, start=datastartyear, end=dataendyear)
```

```{r include = FALSE}
## Time series - Decimal Date Formatting
monthlytimeseriesdec <- time(monthlytimeseries, offset=flowstockset)
quarterlytimeseriesdec <- time(quarterlytimeseries, offset=flowstockset)
annualtimeseriesdec <- time(annualtimeseries, offset=flowstockset)
monthlytimeseriesquad <- monthlytimeseriesdec^2
quarterlytimeseriesquad <- quarterlytimeseriesdec^2
annualtimeseriesquad <- annualtimeseriesdec^2

## Nets & Averages - Total Change over the full time Series, and average values
monthlytimesum <- sum(monthlytimeseries, na.rm=TRUE)
quarterlytimesum <- sum(quarterlytimeseries, na.rm=TRUE)
annualtimesum <- sum(annualtimeseries, na.rm=TRUE)
monthlytimeavg <- mean(monthlytimeseries, na.rm=TRUE)
quarterlytimeavg <- mean(quarterlytimeseries, na.rm=TRUE)
annualtimeavg <- mean(annualtimeseries, na.rm=TRUE)

## Log series - Monthly, Quarterly, & Yearly.
## These help us find the log approximations of growth rate.
monthlylogseries <- log(monthlytimeseries)
quarterlylogseries <- log(quarterlytimeseries)
annuallogseries <- log(annualtimeseries)

## Log series - Decimal Date Formatting
monthlylogseriesdec <- time(monthlylogseries, offset=flowstockset)
quarterlylogseriesdec <- time(quarterlylogseries, offset=flowstockset)
annuallogseriesdec <- time(annuallogseries, offset=flowstockset)
monthlylogseriesquad <- monthlylogseriesdec^2
quarterlylogseriesquad <- quarterlylogseriesdec^2
annuallogseriesquad <- annuallogseriesdec^2

## Log approximation series - Fully compiled log approximations of growth rates in percentages
monthlylogapproxseries <- diff(monthlylogseries)*100
quarterlylogapproxseries <- diff(quarterlylogseries)*100
annuallogapproxseries <- diff(annuallogseries)*100

## Log approximation series - Decimal Date Formatting
monthlylogapproxseriesdec <- time(monthlylogapproxseries, offset=flowstockset)
quarterlylogapproxseriesdec <- time(quarterlylogapproxseries, offset=flowstockset)
annuallogapproxseriesdec <- time(annuallogapproxseries, offset=flowstockset)
monthlylogapproxseriesquad <- monthlylogapproxseriesdec^2
quarterlylogapproxseriesquad <- quarterlylogapproxseriesdec^2
annuallogapproxseriesquad <- annuallogapproxseriesdec^2

## Nets & Averages - Total Change over the full log series, and average values
monthlylogsum <- sum(monthlylogseries, na.rm=TRUE)
monthlylogapproxsum <- sum(monthlylogapproxseries, na.rm=TRUE)
quarterlylogsum <- sum(quarterlylogseries, na.rm=TRUE)
quarterlylogapproxsum <- sum(quarterlylogapproxseries, na.rm=TRUE)
annuallogsum <- sum(annuallogseries, na.rm=TRUE)
annuallogapproxsum <- sum(annuallogapproxseries, na.rm=TRUE)
monthlylogavg <- mean(monthlylogseries, na.rm=TRUE)
monthlylogapproxavg <- mean(monthlylogapproxseries, na.rm=TRUE)
quarterlylogavg <- mean(quarterlylogseries, na.rm=TRUE)
quarterlylogapproxavg <- mean(quarterlylogapproxseries, na.rm=TRUE)
annuallogavg <- mean(annuallogseries,
                     na.rm=TRUE)
annuallogapproxavg <- mean(annuallogapproxseries, na.rm=TRUE)

## Monthly growth - Exact Value
monthlygrowth <- diff(monthlytimeseries)/stats::lag(monthlytimeseries,-1)*100
## Monthly growth - Exact Value - Decimal Date Formatting
monthlygrowthdec <- time(monthlygrowth, offset=flowstockset)
monthlygrowthquad <- monthlygrowthdec^2

## Monthly growth - Exact Value Quartised
monthlygrowthquartised <- ((monthlygrowth/100+1)^4 - 1)
monthlygrowthquartised <- monthlygrowthquartised*100
## Monthly growth - Exact Value Quartised - Decimal Date Formatting
monthlygrowthquartiseddec <- time(monthlygrowthquartised, offset=flowstockset)
monthlygrowthquartisedquad <- monthlygrowthquartiseddec^2

## Monthly growth - Log Approximation Quartised
monthlygrowthquartisedlog <- ((monthlylogapproxseries/100+1)^4 - 1)
monthlygrowthquartisedlog <- monthlygrowthquartisedlog*100
## Monthly growth - Log Approximation Quartised - Decimal Date Formatting
monthlygrowthquartisedlogdec <- time(monthlygrowthquartisedlog, offset=flowstockset)
monthlygrowthquartisedlogquad <- monthlygrowthquartisedlogdec^2

## Monthly growth - Approximate Quartisation of Exact Monthly Value
monthlygrowthquartisedapprox <- 4*monthlygrowth
## Monthly growth - Approximate Quartisation of Exact Monthly Value - Decimal Date Formatting
monthlygrowthquartisedapproxdec <- time(monthlygrowthquartisedapprox, offset=flowstockset)
## Squared like every other "quad" series (the ^2 was missing)
monthlygrowthquartisedapproxquad <- monthlygrowthquartisedapproxdec^2

## Monthly growth - Approximate Quartisation of the Log Approximation
monthlygrowthquartisedapproxlog <- 4*monthlylogapproxseries
## Monthly growth - Approximate Quartisation of the Log Approximation - Decimal Date Formatting
monthlygrowthquartisedapproxlogdec <- time(monthlygrowthquartisedapproxlog, offset=flowstockset)
monthlygrowthquartisedapproxlogquad <- monthlygrowthquartisedapproxlogdec^2

## Monthly growth - Exact Value Annualized
monthlygrowthannualized <- ((monthlygrowth/100+1)^12 - 1)
monthlygrowthannualized <-
  monthlygrowthannualized*100
## Monthly growth - Exact Value Annualized - Decimal Date Formatting
monthlygrowthannualizeddec <- time(monthlygrowthannualized, offset=flowstockset)
monthlygrowthannualizedquad <- monthlygrowthannualizeddec^2

## Monthly growth - Log Approximation Annualized
monthlygrowthannualizedlog <- ((monthlylogapproxseries/100+1)^12 - 1)
monthlygrowthannualizedlog <- monthlygrowthannualizedlog*100
## Monthly growth - Log Approximation Annualized - Decimal Date Formatting
monthlygrowthannualizedlogdec <- time(monthlygrowthannualizedlog, offset=flowstockset)
monthlygrowthannualizedlogquad <- monthlygrowthannualizedlogdec^2

## Monthly growth - Approximate Annualization of Exact Monthly Value
monthlygrowthannualizedapprox <- 12*monthlygrowth
## Monthly growth - Approximate Annualization of Exact Monthly Value - Decimal Date Formatting
monthlygrowthannualizedapproxdec <- time(monthlygrowthannualizedapprox, offset=flowstockset)
monthlygrowthannualizedapproxquad <- monthlygrowthannualizedapproxdec^2

## Monthly growth - Approximate Annualization of the Log Approximation
monthlygrowthannualizedapproxlog <- 12*monthlylogapproxseries
## Monthly growth - Approximate Annualization of the Log Approximation - Decimal Date Formatting
monthlygrowthannualizedapproxlogdec <- time(monthlygrowthannualizedapproxlog, offset=flowstockset)
monthlygrowthannualizedapproxlogquad <- monthlygrowthannualizedapproxlogdec^2

## Nets & Averages - Total Change over the full exact monthly growth series set, and average values
monthlygrowthsum <- sum(monthlygrowth, na.rm=TRUE)
monthlygrowthquartisedsum <- sum(monthlygrowthquartised, na.rm=TRUE)
monthlygrowthquartisedlogsum <- sum(monthlygrowthquartisedlog, na.rm=TRUE)
monthlygrowthquartisedapproxsum <- sum(monthlygrowthquartisedapprox, na.rm=TRUE)
monthlygrowthquartisedapproxlogsum <- sum(monthlygrowthquartisedapproxlog, na.rm=TRUE)
monthlygrowthannualizedsum <- sum(monthlygrowthannualized, na.rm=TRUE)
monthlygrowthannualizedlogsum <-
  sum(monthlygrowthannualizedlog, na.rm=TRUE)
monthlygrowthannualizedapproxsum <- sum(monthlygrowthannualizedapprox, na.rm=TRUE)
monthlygrowthannualizedapproxlogsum <- sum(monthlygrowthannualizedapproxlog, na.rm=TRUE)
monthlygrowthavg <- mean(monthlygrowth, na.rm=TRUE)
monthlygrowthquartisedavg <- mean(monthlygrowthquartised, na.rm=TRUE)
monthlygrowthquartisedlogavg <- mean(monthlygrowthquartisedlog, na.rm=TRUE)
monthlygrowthquartisedapproxavg <- mean(monthlygrowthquartisedapprox, na.rm=TRUE)
monthlygrowthquartisedapproxlogavg <- mean(monthlygrowthquartisedapproxlog, na.rm=TRUE)
monthlygrowthannualizedavg <- mean(monthlygrowthannualized, na.rm=TRUE)
monthlygrowthannualizedlogavg <- mean(monthlygrowthannualizedlog, na.rm=TRUE)
monthlygrowthannualizedapproxavg <- mean(monthlygrowthannualizedapprox, na.rm=TRUE)
monthlygrowthannualizedapproxlogavg <- mean(monthlygrowthannualizedapproxlog, na.rm=TRUE)

## Quarterly growth - Exact Value
quarterlygrowth <- diff(quarterlytimeseries)/stats::lag(quarterlytimeseries,-1)*100
## Quarterly growth - Exact Value - Decimal Date Formatting
quarterlygrowthdec <- time(quarterlygrowth, offset=flowstockset)
quarterlygrowthquad <- quarterlygrowthdec^2

## Quarterly growth - Exact Value Annualized
## The rate is in percent, so it is divided by 100 before compounding (as in the monthly case)
quarterlygrowthannualized <- ((quarterlygrowth/100+1)^4 - 1)
quarterlygrowthannualized <- quarterlygrowthannualized*100
## Quarterly growth - Exact Value Annualized - Decimal Date Formatting
quarterlygrowthannualizeddec <- time(quarterlygrowthannualized, offset=flowstockset)
quarterlygrowthannualizedquad <- quarterlygrowthannualizeddec^2

## Quarterly growth - Log Approximation Annualized
quarterlygrowthannualizedlog <- ((quarterlylogapproxseries/100+1)^4 - 1)
quarterlygrowthannualizedlog <- quarterlygrowthannualizedlog*100
## Quarterly growth - Log Approximation Annualized - Decimal Date Formatting
quarterlygrowthannualizedlogdec <- time(quarterlygrowthannualizedlog, offset=flowstockset)
quarterlygrowthannualizedlogquad <- quarterlygrowthannualizedlogdec^2

## Quarterly growth - Approximate Annualization of Exact Quarterly Value
quarterlygrowthannualizedapprox <- 4*quarterlygrowth
## Quarterly growth - Approximate Annualization of Exact Quarterly Value - Decimal Date Formatting
quarterlygrowthannualizedapproxdec <- time(quarterlygrowthannualizedapprox, offset=flowstockset)
quarterlygrowthannualizedapproxquad <- quarterlygrowthannualizedapproxdec^2

## Quarterly growth - Approximate Annualization of the Log Approximation
quarterlygrowthannualizedapproxlog <- 4*quarterlylogapproxseries
## Quarterly growth - Approximate Annualization of the Log Approximation - Decimal Date Formatting
quarterlygrowthannualizedapproxlogdec <- time(quarterlygrowthannualizedapproxlog, offset=flowstockset)
quarterlygrowthannualizedapproxlogquad <- quarterlygrowthannualizedapproxlogdec^2

## Nets & Averages - Total Change over the full exact quarterly growth series set, and average values
quarterlygrowthsum <- sum(quarterlygrowth, na.rm=TRUE)
quarterlygrowthannualizedsum <- sum(quarterlygrowthannualized, na.rm=TRUE)
quarterlygrowthannualizedlogsum <- sum(quarterlygrowthannualizedlog, na.rm=TRUE)
quarterlygrowthannualizedapproxsum <- sum(quarterlygrowthannualizedapprox, na.rm=TRUE)
quarterlygrowthannualizedapproxlogsum <- sum(quarterlygrowthannualizedapproxlog, na.rm=TRUE)
quarterlygrowthavg <- mean(quarterlygrowth, na.rm=TRUE)
quarterlygrowthannualizedavg <- mean(quarterlygrowthannualized, na.rm=TRUE)
quarterlygrowthannualizedlogavg <- mean(quarterlygrowthannualizedlog, na.rm=TRUE)
quarterlygrowthannualizedapproxavg <- mean(quarterlygrowthannualizedapprox, na.rm=TRUE)
quarterlygrowthannualizedapproxlogavg <- mean(quarterlygrowthannualizedapproxlog, na.rm=TRUE)

## Annual growth - Exact Value
annualgrowth <- diff(annualtimeseries)/stats::lag(annualtimeseries,-1)*100
## Annual growth - Exact Value - Decimal Date Formatting
annualgrowthdec <- time(annualgrowth, offset=flowstockset)
annualgrowthquad <- annualgrowthdec^2

## Nets & Averages - Total Change
## over the full exact annual growth series, and average values
annualgrowthsum <- sum(annualgrowth, na.rm=TRUE)
annualgrowthavg <- mean(annualgrowth, na.rm=TRUE)

## Approximation Errors - The precise degree to which the approximations and exact values differ
monthlygrowthlogerror <- monthlygrowth - monthlylogapproxseries
monthlygrowthlogerrormin <- min(monthlygrowth - monthlylogapproxseries)
monthlygrowthlogerrormean <- mean(monthlygrowth - monthlylogapproxseries)
monthlygrowthlogerrormax <- max(monthlygrowth - monthlylogapproxseries)
monthlygrowthlogerrordec <- time(monthlygrowthlogerror, offset=flowstockset)
monthlygrowthlogerrorquad <- monthlygrowthlogerrordec^2

monthlygrowthannualizedapproxerror <- monthlygrowthannualized - monthlygrowthannualizedapprox
monthlygrowthannualizedapproxerrormin <- min(monthlygrowthannualized - monthlygrowthannualizedapprox)
monthlygrowthannualizedapproxerrormean <- mean(monthlygrowthannualized - monthlygrowthannualizedapprox)
monthlygrowthannualizedapproxerrormax <- max(monthlygrowthannualized - monthlygrowthannualizedapprox)
monthlygrowthannualizedapproxerrordec <- time(monthlygrowthannualizedapproxerror, offset=flowstockset)
monthlygrowthannualizedapproxerrorquad <- monthlygrowthannualizedapproxerrordec^2

monthlygrowthannualizedlogerror <- monthlygrowthannualized - monthlygrowthannualizedlog
monthlygrowthannualizedlogerrormin <- min(monthlygrowthannualized - monthlygrowthannualizedlog)
monthlygrowthannualizedlogerrormean <- mean(monthlygrowthannualized - monthlygrowthannualizedlog)
monthlygrowthannualizedlogerrormax <- max(monthlygrowthannualized - monthlygrowthannualizedlog)
monthlygrowthannualizedlogerrordec <- time(monthlygrowthannualizedlogerror, offset=flowstockset)
monthlygrowthannualizedlogerrorquad <- monthlygrowthannualizedlogerrordec^2

monthlygrowthannualizedlogapproxerror <- monthlygrowthannualized - monthlygrowthannualizedapproxlog
monthlygrowthannualizedlogapproxerrormin <- min(monthlygrowthannualized -
  monthlygrowthannualizedapproxlog)
## The min, mean, and max here use the same series as monthlygrowthannualizedlogapproxerror above
monthlygrowthannualizedlogapproxerrormean <- mean(monthlygrowthannualized - monthlygrowthannualizedapproxlog)
monthlygrowthannualizedlogapproxerrormax <- max(monthlygrowthannualized - monthlygrowthannualizedapproxlog)
monthlygrowthannualizedlogapproxerrordec <- time(monthlygrowthannualizedlogapproxerror, offset=flowstockset)
monthlygrowthannualizedlogapproxerrorquad <- monthlygrowthannualizedlogapproxerrordec^2

monthlylogapproxerror <- monthlygrowthannualized - monthlylogapproxseries
monthlylogapproxerrormin <- min(monthlygrowthannualized - monthlylogapproxseries)
monthlylogapproxerrormean <- mean(monthlygrowthannualized - monthlylogapproxseries)
monthlylogapproxerrormax <- max(monthlygrowthannualized - monthlylogapproxseries)
monthlylogapproxerrordec <- time(monthlylogapproxerror, offset=flowstockset)
monthlylogapproxerrorquad <- monthlylogapproxerrordec^2

monthlygrowthquartisedapproxerror <- monthlygrowthquartised - monthlygrowthquartisedapprox
monthlygrowthquartisedapproxerrormin <- min(monthlygrowthquartised - monthlygrowthquartisedapprox)
monthlygrowthquartisedapproxerrormean <- mean(monthlygrowthquartised - monthlygrowthquartisedapprox)
monthlygrowthquartisedapproxerrormax <- max(monthlygrowthquartised - monthlygrowthquartisedapprox)
monthlygrowthquartisedapproxerrordec <- time(monthlygrowthquartisedapproxerror, offset=flowstockset)
monthlygrowthquartisedapproxerrorquad <- monthlygrowthquartisedapproxerrordec^2

quarterlygrowthlogerror <- quarterlygrowth - quarterlylogapproxseries
quarterlygrowthlogerrormin <- min(quarterlygrowth - quarterlylogapproxseries)
quarterlygrowthlogerrormean <- mean(quarterlygrowth - quarterlylogapproxseries)
quarterlygrowthlogerrormax <- max(quarterlygrowth - quarterlylogapproxseries)
quarterlygrowthlogerrordec <- time(quarterlygrowthlogerror, offset=flowstockset)
quarterlygrowthlogerrorquad <- quarterlygrowthlogerrordec^2

quarterlygrowthannualizedapproxerror <- quarterlygrowthannualized -
  quarterlygrowthannualizedapprox
quarterlygrowthannualizedapproxerrormin <- min(quarterlygrowthannualized - quarterlygrowthannualizedapprox)
quarterlygrowthannualizedapproxerrormean <- mean(quarterlygrowthannualized - quarterlygrowthannualizedapprox)
quarterlygrowthannualizedapproxerrormax <- max(quarterlygrowthannualized - quarterlygrowthannualizedapprox)
quarterlygrowthannualizedapproxerrordec <- time(quarterlygrowthannualizedapproxerror, offset=flowstockset)
quarterlygrowthannualizedapproxerrorquad <- quarterlygrowthannualizedapproxerrordec^2

quarterlygrowthannualizedlogerror <- quarterlygrowthannualized - quarterlygrowthannualizedlog
quarterlygrowthannualizedlogerrormin <- min(quarterlygrowthannualized - quarterlygrowthannualizedlog)
quarterlygrowthannualizedlogerrormean <- mean(quarterlygrowthannualized - quarterlygrowthannualizedlog)
quarterlygrowthannualizedlogerrormax <- max(quarterlygrowthannualized - quarterlygrowthannualizedlog)
quarterlygrowthannualizedlogerrordec <- time(quarterlygrowthannualizedlogerror, offset=flowstockset)
quarterlygrowthannualizedlogerrorquad <- quarterlygrowthannualizedlogerrordec^2

quarterlylogapproxerror <- quarterlygrowthannualized - quarterlylogapproxseries
quarterlylogapproxerrormin <- min(quarterlygrowthannualized - quarterlylogapproxseries)
quarterlylogapproxerrormean <- mean(quarterlygrowthannualized - quarterlylogapproxseries)
quarterlylogapproxerrormax <- max(quarterlygrowthannualized - quarterlylogapproxseries)
quarterlylogapproxerrordec <- time(quarterlylogapproxerror, offset=flowstockset)
quarterlylogapproxerrorquad <- quarterlylogapproxerrordec^2

quarterlygrowthannualizedlogapproxerror <- quarterlygrowthannualized - quarterlygrowthannualizedapproxlog
quarterlygrowthannualizedlogapproxerrormin <- min(quarterlygrowthannualized - quarterlygrowthannualizedapproxlog)
quarterlygrowthannualizedlogapproxerrormean <- mean(quarterlygrowthannualized - quarterlygrowthannualizedapproxlog)
quarterlygrowthannualizedlogapproxerrormax <- max(quarterlygrowthannualized - quarterlygrowthannualizedapproxlog)
quarterlygrowthannualizedlogapproxerrordec <- time(quarterlygrowthannualizedlogapproxerror, offset=flowstockset)
quarterlygrowthannualizedlogapproxerrorquad <- quarterlygrowthannualizedlogapproxerrordec^2

annualgrowthlogerror <- annualgrowth - annuallogapproxseries
annualgrowthlogerrormin <- min(annualgrowth - annuallogapproxseries)
annualgrowthlogerrormean <- mean(annualgrowth - annuallogapproxseries)
annualgrowthlogerrormax <- max(annualgrowth - annuallogapproxseries)
annualgrowthlogerrordec <- time(annualgrowthlogerror, offset=flowstockset)
annualgrowthlogerrorquad <- annualgrowthlogerrordec^2

## Index base 100 - Computing base value at index of the selected year or timeframe
base <- window(annualtimeseries, c(indexstartyear,indexstartmonth), c(indexendyear,indexendmonth))
base <- c(base)
indexvalue <- annualtimeseries/base*100

## Index Series - Index base 100 series at the computed index for the chosen timespan
annualindexseries <- ts(indexvalue, start=c(startyear, startmonth), end=c(endyear, endmonth))
annualindexseriesdiffs <- window(indexvalue, c(startyear, startmonth), c(endyear, endmonth))-100
## Index Series - Decimal Date Formatting
annualindexseriesdec <- time(annualindexseries, offset=flowstockset)
annualindexseriesdiffsdec <- time(annualindexseriesdiffs, offset=flowstockset)
annualindexseriesquad <- annualindexseriesdec^2
annualindexseriesdiffsquad <- annualindexseriesdiffsdec^2

## Aggregation - Monthly to Quarterly
quarterlyaggregate <- aggregate(monthlytimeseries, nfrequency=4, FUN=sum)
## Aggregation - Monthly to Quarterly - Decimal Date Formatting
quarterlyaggregatedec <- time(quarterlyaggregate, offset=flowstockset)

## Aggregation - Monthly to Yearly
annualaggregate <- aggregate(monthlytimeseries, FUN=sum)
## Aggregation - Monthly to Yearly - Decimal Date Formatting
annualaggregatedec <- time(annualaggregate, offset=flowstockset)

## Aggregation - Quarterly to Yearly
quarterlyannualaggregate <- aggregate(quarterlytimeseries, nfrequency=1, FUN=sum)
## Aggregation - Quarterly to Yearly - Decimal Date Formatting
quarterlyannualaggregatedec <- time(quarterlyannualaggregate, offset=flowstockset)

## Aggregated Index Series - Index base 100 series using an aggregate of the chosen timespan
aggregatedbase <- ts(annualaggregate, start=c(indexstartyear,indexstartmonth), end=c(indexendyear,indexendmonth))
aggregatedbase <- c(aggregatedbase)
annualaggregatedindexseries <- annualaggregate/aggregatedbase*100
annualaggregatedindexseriesdiffs <- window(annualaggregatedindexseries, c(startyear,startmonth), c(endyear,endmonth))-100
## Aggregated Index Series - Decimal Date Formatting
annualaggregatedindexseriesdec <- time(annualaggregatedindexseries, offset=flowstockset)
annualaggregatedindexseriesdiffsdec <- time(annualaggregatedindexseriesdiffs, offset=flowstockset)

## Trending - Fitting Linear Trends
monthlytimeserieslinfit <- lm(monthlytimeseries~monthlytimeseriesdec)
quarterlytimeserieslinfit <- lm(quarterlytimeseries~quarterlytimeseriesdec)
annualtimeserieslinfit <- lm(annualtimeseries~annualtimeseriesdec)
monthlytimeseriescoef <- coef(monthlytimeserieslinfit)
quarterlytimeseriescoef <- coef(quarterlytimeserieslinfit)
annualtimeseriescoef <- coef(annualtimeserieslinfit)
monthlytimeserieslintrend <- monthlytimeseriescoef[1] + monthlytimeseriescoef[2]*monthlytimeseriesdec
quarterlytimeserieslintrend <- quarterlytimeseriescoef[1] + quarterlytimeseriescoef[2]*quarterlytimeseriesdec
annualtimeserieslintrend <- annualtimeseriescoef[1] + annualtimeseriescoef[2]*annualtimeseriesdec

monthlylogserieslinfit <- lm(monthlylogseries~monthlylogseriesdec)
quarterlylogserieslinfit <- lm(quarterlylogseries~quarterlylogseriesdec)
annuallogserieslinfit <- lm(annuallogseries~annuallogseriesdec)
monthlylogseriescoef <- coef(monthlylogserieslinfit)
quarterlylogseriescoef <- coef(quarterlylogserieslinfit)
annuallogseriescoef <- coef(annuallogserieslinfit)
monthlylogserieslintrend <- monthlylogseriescoef[1] + monthlylogseriescoef[2]*monthlylogseriesdec
quarterlylogserieslintrend <- quarterlylogseriescoef[1] + quarterlylogseriescoef[2]*quarterlylogseriesdec
annuallogserieslintrend <- annuallogseriescoef[1] + annuallogseriescoef[2]*annuallogseriesdec
monthlylogapproxserieslinfit <- lm(monthlylogapproxseries~monthlylogapproxseriesdec)
quarterlylogapproxserieslinfit <- lm(quarterlylogapproxseries~quarterlylogapproxseriesdec)
annuallogapproxserieslinfit <- lm(annuallogapproxseries~annuallogapproxseriesdec)
monthlylogapproxseriescoef <- coef(monthlylogapproxserieslinfit)
quarterlylogapproxseriescoef <- coef(quarterlylogapproxserieslinfit)
annuallogapproxseriescoef <- coef(annuallogapproxserieslinfit)
monthlylogapproxserieslintrend <- monthlylogapproxseriescoef[1] + monthlylogapproxseriescoef[2]*monthlylogapproxseriesdec
quarterlylogapproxserieslintrend <- quarterlylogapproxseriescoef[1] + quarterlylogapproxseriescoef[2]*quarterlylogapproxseriesdec
annuallogapproxserieslintrend <- annuallogapproxseriescoef[1] + annuallogapproxseriescoef[2]*annuallogapproxseriesdec
monthlygrowthlinfit <- lm(monthlygrowth~monthlygrowthdec)
monthlygrowthquartisedlinfit <- lm(monthlygrowthquartised~monthlygrowthquartiseddec)
monthlygrowthquartisedloglinfit <- lm(monthlygrowthquartisedlog~monthlygrowthquartisedlogdec)
monthlygrowthquartisedapproxlinfit <- lm(monthlygrowthquartisedapprox~monthlygrowthquartisedapproxdec)
monthlygrowthquartisedapproxloglinfit <- lm(monthlygrowthquartisedapproxlog~monthlygrowthquartisedapproxlogdec)
monthlygrowthannualizedlinfit <- lm(monthlygrowthannualized~monthlygrowthannualizeddec)
monthlygrowthannualizedloglinfit <- lm(monthlygrowthannualizedlog~monthlygrowthannualizedlogdec)
monthlygrowthannualizedapproxlinfit <- lm(monthlygrowthannualizedapprox~monthlygrowthannualizedapproxdec)
monthlygrowthannualizedapproxloglinfit <- lm(monthlygrowthannualizedapproxlog~monthlygrowthannualizedapproxlogdec)
monthlygrowthcoef <- coef(monthlygrowthlinfit)
monthlygrowthquartisedcoef <- coef(monthlygrowthquartisedlinfit)
monthlygrowthquartisedlogcoef <- coef(monthlygrowthquartisedloglinfit)
monthlygrowthquartisedapproxcoef <- coef(monthlygrowthquartisedapproxlinfit)
monthlygrowthquartisedapproxlogcoef <- coef(monthlygrowthquartisedapproxloglinfit)
monthlygrowthannualizedcoef <- coef(monthlygrowthannualizedlinfit)
monthlygrowthannualizedlogcoef <- coef(monthlygrowthannualizedloglinfit)
monthlygrowthannualizedapproxcoef <- coef(monthlygrowthannualizedapproxlinfit)
monthlygrowthannualizedapproxlogcoef <- coef(monthlygrowthannualizedapproxloglinfit)
monthlygrowthlintrend <- monthlygrowthcoef[1] + monthlygrowthcoef[2]*monthlygrowthdec
monthlygrowthquartisedlintrend <- monthlygrowthquartisedcoef[1] + monthlygrowthquartisedcoef[2]*monthlygrowthquartiseddec
monthlygrowthquartisedloglintrend <- monthlygrowthquartisedlogcoef[1] + monthlygrowthquartisedlogcoef[2]*monthlygrowthquartisedlogdec
monthlygrowthquartisedapproxlintrend <- monthlygrowthquartisedapproxcoef[1] + monthlygrowthquartisedapproxcoef[2]*monthlygrowthquartisedapproxdec
monthlygrowthquartisedapproxloglintrend <- monthlygrowthquartisedapproxlogcoef[1] + monthlygrowthquartisedapproxlogcoef[2]*monthlygrowthquartisedapproxlogdec
monthlygrowthannualizedlintrend <- monthlygrowthannualizedcoef[1] + monthlygrowthannualizedcoef[2]*monthlygrowthannualizeddec
## The log trend uses the log series' own time index, matching the regressor of its fit
monthlygrowthannualizedloglintrend <- monthlygrowthannualizedlogcoef[1] + monthlygrowthannualizedlogcoef[2]*monthlygrowthannualizedlogdec
monthlygrowthannualizedapproxlintrend <- monthlygrowthannualizedapproxcoef[1] + monthlygrowthannualizedapproxcoef[2]*monthlygrowthannualizedapproxdec
monthlygrowthannualizedapproxloglintrend <- monthlygrowthannualizedapproxlogcoef[1] + monthlygrowthannualizedapproxlogcoef[2]*monthlygrowthannualizedapproxlogdec
quarterlygrowthlinfit <- lm(quarterlygrowth~quarterlygrowthdec)
quarterlygrowthannualizedlinfit <-
lm(quarterlygrowthannualized~quarterlygrowthannualizeddec)
quarterlygrowthannualizedloglinfit <- lm(quarterlygrowthannualizedlog~quarterlygrowthannualizedlogdec)
quarterlygrowthannualizedapproxlinfit <- lm(quarterlygrowthannualizedapprox~quarterlygrowthannualizedapproxdec)
quarterlygrowthannualizedapproxloglinfit <- lm(quarterlygrowthannualizedapproxlog~quarterlygrowthannualizedapproxlogdec)
quarterlygrowthcoef <- coef(quarterlygrowthlinfit)
quarterlygrowthannualizedcoef <- coef(quarterlygrowthannualizedlinfit)
quarterlygrowthannualizedlogcoef <- coef(quarterlygrowthannualizedloglinfit)
quarterlygrowthannualizedapproxcoef <- coef(quarterlygrowthannualizedapproxlinfit)
quarterlygrowthannualizedapproxlogcoef <- coef(quarterlygrowthannualizedapproxloglinfit)
quarterlygrowthlintrend <- quarterlygrowthcoef[1] + quarterlygrowthcoef[2]*quarterlygrowthdec
quarterlygrowthannualizedlintrend <- quarterlygrowthannualizedcoef[1] + quarterlygrowthannualizedcoef[2]*quarterlygrowthannualizeddec
quarterlygrowthannualizedloglintrend <- quarterlygrowthannualizedlogcoef[1] + quarterlygrowthannualizedlogcoef[2]*quarterlygrowthannualizedlogdec
quarterlygrowthannualizedapproxlintrend <- quarterlygrowthannualizedapproxcoef[1] + quarterlygrowthannualizedapproxcoef[2]*quarterlygrowthannualizedapproxdec
quarterlygrowthannualizedapproxloglintrend <- quarterlygrowthannualizedapproxlogcoef[1] + quarterlygrowthannualizedapproxlogcoef[2]*quarterlygrowthannualizedapproxlogdec
annualgrowthlinfit <- lm(annualgrowth~annualgrowthdec)
annualgrowthcoef <- coef(annualgrowthlinfit)
annualgrowthlintrend <- annualgrowthcoef[1] + annualgrowthcoef[2]*annualgrowthdec
monthlygrowthlogerrorlinfit <- lm(monthlygrowthlogerror~monthlygrowthlogerrordec)
monthlygrowthannualizedapproxerrorlinfit <- lm(monthlygrowthannualizedapproxerror~monthlygrowthannualizedapproxerrordec)
monthlygrowthannualizedlogerrorlinfit <- lm(monthlygrowthannualizedlogerror~monthlygrowthannualizedlogerrordec)
monthlygrowthannualizedlogapproxerrorlinfit <- lm(monthlygrowthannualizedlogapproxerror~monthlygrowthannualizedlogapproxerrordec)
monthlylogapproxerrorlinfit <- lm(monthlylogapproxerror~monthlylogapproxerrordec)
monthlygrowthquartisedapproxerrorlinfit <- lm(monthlygrowthquartisedapproxerror~monthlygrowthquartisedapproxerrordec)
monthlygrowthlogerrorcoef <- coef(monthlygrowthlogerrorlinfit)
monthlygrowthannualizedapproxerrorcoef <- coef(monthlygrowthannualizedapproxerrorlinfit)
monthlygrowthannualizedlogerrorcoef <- coef(monthlygrowthannualizedlogerrorlinfit)
monthlygrowthannualizedlogapproxerrorcoef <- coef(monthlygrowthannualizedlogapproxerrorlinfit)
monthlylogapproxerrorcoef <- coef(monthlylogapproxerrorlinfit)
monthlygrowthquartisedapproxerrorcoef <- coef(monthlygrowthquartisedapproxerrorlinfit)
monthlygrowthlogerrorlintrend <- monthlygrowthlogerrorcoef[1] + monthlygrowthlogerrorcoef[2]*monthlygrowthlogerrordec
monthlygrowthannualizedapproxerrorlintrend <- monthlygrowthannualizedapproxerrorcoef[1] + monthlygrowthannualizedapproxerrorcoef[2]*monthlygrowthannualizedapproxerrordec
monthlygrowthannualizedlogerrorlintrend <- monthlygrowthannualizedlogerrorcoef[1] + monthlygrowthannualizedlogerrorcoef[2]*monthlygrowthannualizedlogerrordec
monthlygrowthannualizedlogapproxerrorlintrend <- monthlygrowthannualizedlogapproxerrorcoef[1] + monthlygrowthannualizedlogapproxerrorcoef[2]*monthlygrowthannualizedlogapproxerrordec
monthlylogapproxerrorlintrend <- monthlylogapproxerrorcoef[1] + monthlylogapproxerrorcoef[2]*monthlylogapproxerrordec
monthlygrowthquartisedapproxerrorlintrend <- monthlygrowthquartisedapproxerrorcoef[1] + monthlygrowthquartisedapproxerrorcoef[2]*monthlygrowthquartisedapproxerrordec
quarterlygrowthlogerrorlinfit <- lm(quarterlygrowthlogerror~quarterlygrowthlogerrordec)
quarterlygrowthannualizedapproxerrorlinfit <- lm(quarterlygrowthannualizedapproxerror~quarterlygrowthannualizedapproxerrordec)
quarterlygrowthannualizedlogerrorlinfit <-
lm(quarterlygrowthannualizedlogerror~quarterlygrowthannualizedlogerrordec)
quarterlylogapproxerrorlinfit <- lm(quarterlylogapproxerror~quarterlylogapproxerrordec)
quarterlygrowthannualizedlogapproxerrorlinfit <- lm(quarterlygrowthannualizedlogapproxerror~quarterlygrowthannualizedlogapproxerrordec)
quarterlygrowthlogerrorcoef <- coef(quarterlygrowthlogerrorlinfit)
quarterlygrowthannualizedapproxerrorcoef <- coef(quarterlygrowthannualizedapproxerrorlinfit)
quarterlygrowthannualizedlogerrorcoef <- coef(quarterlygrowthannualizedlogerrorlinfit)
quarterlylogapproxerrorcoef <- coef(quarterlylogapproxerrorlinfit)
quarterlygrowthannualizedlogapproxerrorcoef <- coef(quarterlygrowthannualizedlogapproxerrorlinfit)
quarterlygrowthlogerrorlintrend <- quarterlygrowthlogerrorcoef[1] + quarterlygrowthlogerrorcoef[2]*quarterlygrowthlogerrordec
quarterlygrowthannualizedapproxerrorlintrend <- quarterlygrowthannualizedapproxerrorcoef[1] + quarterlygrowthannualizedapproxerrorcoef[2]*quarterlygrowthannualizedapproxerrordec
quarterlygrowthannualizedlogerrorlintrend <- quarterlygrowthannualizedlogerrorcoef[1] + quarterlygrowthannualizedlogerrorcoef[2]*quarterlygrowthannualizedlogerrordec
quarterlylogapproxerrorlintrend <- quarterlylogapproxerrorcoef[1] + quarterlylogapproxerrorcoef[2]*quarterlylogapproxerrordec
quarterlygrowthannualizedlogapproxerrorlintrend <- quarterlygrowthannualizedlogapproxerrorcoef[1] + quarterlygrowthannualizedlogapproxerrorcoef[2]*quarterlygrowthannualizedlogapproxerrordec
annualgrowthlogerrorlinfit <- lm(annualgrowthlogerror~annualgrowthlogerrordec)
annualgrowthlogerrorcoef <- coef(annualgrowthlogerrorlinfit)
annualgrowthlogerrorlintrend <- annualgrowthlogerrorcoef[1] + annualgrowthlogerrorcoef[2]*annualgrowthlogerrordec
annualindexserieslinfit <- lm(annualindexseries~annualindexseriesdec)
annualindexseriesdiffslinfit <- lm(annualindexseriesdiffs~annualindexseriesdiffsdec)
annualindexseriescoef <- coef(annualindexserieslinfit)
annualindexseriesdiffscoef <-
coef(annualindexseriesdiffslinfit)
annualindexserieslintrend <- annualindexseriescoef[1] + annualindexseriescoef[2]*annualindexseriesdec
annualindexseriesdiffslintrend <- annualindexseriesdiffscoef[1] + annualindexseriesdiffscoef[2]*annualindexseriesdiffsdec
quarterlyaggregatelinfit <- lm(quarterlyaggregate~quarterlyaggregatedec)
annualaggregatelinfit <- lm(annualaggregate~annualaggregatedec)
quarterlyannualaggregatelinfit <- lm(quarterlyannualaggregate~quarterlyannualaggregatedec)
quarterlyaggregatecoef <- coef(quarterlyaggregatelinfit)
annualaggregatecoef <- coef(annualaggregatelinfit)
quarterlyannualaggregatecoef <- coef(quarterlyannualaggregatelinfit)
quarterlyaggregatelintrend <- quarterlyaggregatecoef[1] + quarterlyaggregatecoef[2]*quarterlyaggregatedec
annualaggregatelintrend <- annualaggregatecoef[1] + annualaggregatecoef[2]*annualaggregatedec
quarterlyannualaggregatelintrend <- quarterlyannualaggregatecoef[1] + quarterlyannualaggregatecoef[2]*quarterlyannualaggregatedec
annualaggregatedindexserieslinfit <- lm(annualaggregatedindexseries~annualaggregatedindexseriesdec)
annualaggregatedindexseriesdiffslinfit <- lm(annualaggregatedindexseriesdiffs~annualaggregatedindexseriesdiffsdec)
annualaggregatedindexseriescoef <- coef(annualaggregatedindexserieslinfit)
annualaggregatedindexseriesdiffscoef <- coef(annualaggregatedindexseriesdiffslinfit)
annualaggregatedindexserieslintrend <- annualaggregatedindexseriescoef[1] + annualaggregatedindexseriescoef[2]*annualaggregatedindexseriesdec
annualaggregatedindexseriesdiffslintrend <- annualaggregatedindexseriesdiffscoef[1] + annualaggregatedindexseriesdiffscoef[2]*annualaggregatedindexseriesdiffsdec
## Trending - Fitting Quadratic Trends
monthlytimeseriesquadfit <- lm(monthlytimeseries~monthlytimeseriesdec+monthlytimeseriesquad)
quarterlytimeseriesquadfit <- lm(quarterlytimeseries~quarterlytimeseriesdec+quarterlytimeseriesquad)
annualtimeseriesquadfit <- lm(annualtimeseries~annualtimeseriesdec+annualtimeseriesquad)
monthlytimeseriesquadcoef <- coef(monthlytimeseriesquadfit)
quarterlytimeseriesquadcoef <- coef(quarterlytimeseriesquadfit)
annualtimeseriesquadcoef <- coef(annualtimeseriesquadfit)
monthlytimeseriesquadtrend <- monthlytimeseriesquadcoef[1] + monthlytimeseriesquadcoef[2]*monthlytimeseriesdec + monthlytimeseriesquadcoef[3]*monthlytimeseriesquad
quarterlytimeseriesquadtrend <- quarterlytimeseriesquadcoef[1] + quarterlytimeseriesquadcoef[2]*quarterlytimeseriesdec + quarterlytimeseriesquadcoef[3]*quarterlytimeseriesquad
annualtimeseriesquadtrend <- annualtimeseriesquadcoef[1] + annualtimeseriesquadcoef[2]*annualtimeseriesdec + annualtimeseriesquadcoef[3]*annualtimeseriesquad
monthlylogseriesquadfit <- lm(monthlylogseries~monthlylogseriesdec+monthlylogseriesquad)
quarterlylogseriesquadfit <- lm(quarterlylogseries~quarterlylogseriesdec+quarterlylogseriesquad)
annuallogseriesquadfit <- lm(annuallogseries~annuallogseriesdec+annuallogseriesquad)
monthlylogseriesquadcoef <- coef(monthlylogseriesquadfit)
quarterlylogseriesquadcoef <- coef(quarterlylogseriesquadfit)
annuallogseriesquadcoef <- coef(annuallogseriesquadfit)
monthlylogseriesquadtrend <- monthlylogseriesquadcoef[1] + monthlylogseriesquadcoef[2]*monthlylogseriesdec + monthlylogseriesquadcoef[3]*monthlylogseriesquad
quarterlylogseriesquadtrend <- quarterlylogseriesquadcoef[1] + quarterlylogseriesquadcoef[2]*quarterlylogseriesdec + quarterlylogseriesquadcoef[3]*quarterlylogseriesquad
annuallogseriesquadtrend <- annuallogseriesquadcoef[1] + annuallogseriesquadcoef[2]*annuallogseriesdec + annuallogseriesquadcoef[3]*annuallogseriesquad
```

It should be noted that many datasets are organised on a quarterly time-scale, not monthly. This is one of the issues that is addressed during the previously mentioned data normalization phase, meaning there are no errors or warnings when using this code with the given dataset.
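The aggregation and index base 100 steps from the chunk above can be tried in isolation on a toy series. What follows is only a minimal sketch under stated assumptions: the data is made up, and every `toy...` name is hypothetical and not part of the report. It uses only base R (`ts`, `aggregate`, `window`) and is written as a plain fenced block rather than an `{r}` chunk, so knitr does not evaluate it when the report is rendered.

```r
## Toy monthly flow series covering two full years (hypothetical data)
toymonthly <- ts(1:24, frequency = 12, start = c(2000, 1))

## Monthly to quarterly and monthly to yearly, summing within each period
toyquarterly <- aggregate(toymonthly, nfrequency = 4, FUN = sum)
toyannual <- aggregate(toymonthly, nfrequency = 1, FUN = sum)

## Index base 100: divide by the value in the base year, then multiply by 100
toybase <- as.numeric(window(toyannual, start = 2000, end = 2000))
toyindex <- toyannual / toybase * 100

print(toyquarterly)
print(toyannual)
print(toyindex)
```

Summing is the natural choice for a flow variable such as expenditure; for a stock variable, taking the last value of each period would usually be more appropriate.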
\newline

```{r echo = FALSE}
## Creating the window for the plots
par(mfrow=c(1,3))
## Plotting the initial time series sets onto line graphs
plot(monthlytimeseries, main="Monthly Time Series", xlab="Time", ylab="Expenditure")
plot(quarterlytimeseries, main="Quarterly Time Series", xlab="Time", ylab="Expenditure")
plot(annualtimeseries, main="Annual Time Series", xlab="Time", ylab="Expenditure")
```

The above set of line graphs shows whether expenditure increases, decreases, or remains the same over time. This sentence is a placeholder: automatically inserting an appropriate description for any series has not yet been implemented.
\newline

2. \textit{Repeating the previous steps using the log-scale.}
\newline

```{r echo = FALSE}
## Creating the window for the plots
par(mfrow=c(1,3))
## Plotting the log series sets onto line graphs
plot(monthlylogseries, main="Monthly Log Series", xlab="Time", ylab="Expenditure (Log)")
plot(quarterlylogseries, main="Quarterly Log Series", xlab="Time", ylab="Expenditure (Log)")
plot(annuallogseries, main="Annual Log Series", xlab="Time", ylab="Expenditure (Log)")
```

On the log scale, a steady growth rate shows up as a roughly straight line, whereas a graph that fluctuates greatly implies a more volatile growth rate. Automatically generating an appropriate description has not yet been implemented.
\newline

3. 
\textit{Plotting the annualized growth and including a space for a description.}
\newline
The exact calculations, rather than the log approximations, have been used for the growth rates themselves, as well as for the annualization of those growth rates:
\newline

```{r echo = FALSE}
## Creating the window for the plots
par(mfrow=c(1,3))
## Plotting the annualized growth rates
plot(monthlygrowthannualized, main="Monthly Growth - Annualized", xlab="Time", ylab="Growth")
plot(quarterlygrowthannualized, main="Quarterly Growth - Annualized", xlab="Time", ylab="Growth")
plot(annualgrowth, main="Annual Growth", xlab="Time", ylab="Growth")
```

While all code and graphing is accomplished using R, all text, including this line, is written in LaTeX. This is possible through the use of R Markdown, which I personally find to be an incredibly powerful tool for automatically generating professional-looking reports from arbitrary datasets.
\newline

\Large
\textbf{Section B: Time Series Decomposition}
\normalsize

1. \textit{Fitting linear and quadratic trends and plotting these trends to a line graph.}
\newline
As can be seen from the below set of line plots, for some series the linear trend tracks the original line more closely, whereas for others it is the quadratic trend that tracks it more closely.
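Judging by eye which trend "maps more closely" can be backed up with a number. The sketch below is an illustration only, not the method used in this report: it fits both trends to a made-up quarterly series and compares them with `AIC`. All `toy...` names are hypothetical, and centring the time index is my own choice here to keep the squared term numerically well behaved. Note that the quadratic fit can never have a larger residual sum of squares than the linear one, which is why a criterion that penalises the extra coefficient is used instead.

```r
## Toy quarterly series with a deliberately curved trend (hypothetical data)
set.seed(1)
toyseries <- ts(5 + 0.5 * (1:40) + 0.05 * (1:40)^2 + rnorm(40),
                frequency = 4, start = c(2000, 1))

## Decimal dates, centred so that the time index and its square are less collinear
toytime <- as.numeric(time(toyseries))
toytime <- toytime - mean(toytime)

## Linear and quadratic trends fitted by ordinary least squares
toylinfit <- lm(as.numeric(toyseries) ~ toytime)
toyquadfit <- lm(as.numeric(toyseries) ~ toytime + I(toytime^2))

## Fitted values turned back into time series so that lines() could overlay them on a plot
toylintrend <- ts(unname(fitted(toylinfit)), frequency = 4, start = c(2000, 1))
toyquadtrend <- ts(unname(fitted(toyquadfit)), frequency = 4, start = c(2000, 1))

## Lower AIC indicates the better trade-off between fit and number of coefficients
toycompare <- c(linear = AIC(toylinfit), quadratic = AIC(toyquadfit))
print(toycompare)
```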
\newline

```{r echo = FALSE}
## Creating the window for the plots
par(mfrow=c(1,3))
## Plotting the original time series alongside the corresponding linear and quadratic trends
## Legend colours, line types and widths are listed explicitly so they match the lines drawn
plot(monthlytimeseries, main="Monthly Time Series", xlab="Time", ylab="Expenditure", lwd=2)
lines(monthlytimeserieslintrend, col="green", lwd=1, lty=2)
lines(monthlytimeseriesquadtrend, col="red", lwd=1, lty=2)
legend("topleft", c("Monthly", "Trend (L)", "Trend (Q)"), col=c("black", "green", "red"), lty=c(1, 2, 2), lwd=c(2, 1, 1), bty='n')
plot(quarterlytimeseries, main="Quarterly Time Series", xlab="Time", ylab="Expenditure", lwd=2)
lines(quarterlytimeserieslintrend, col="green", lwd=1, lty=2)
lines(quarterlytimeseriesquadtrend, col="red", lwd=1, lty=2)
legend("topleft", c("Quarterly", "Trend (L)", "Trend (Q)"), col=c("black", "green", "red"), lty=c(1, 2, 2), lwd=c(2, 1, 1), bty='n')
plot(annualtimeseries, main="Annual Time Series", xlab="Time", ylab="Expenditure", lwd=2)
lines(annualtimeserieslintrend, col="green", lwd=1, lty=2)
lines(annualtimeseriesquadtrend, col="red", lwd=1, lty=2)
legend("topleft", c("Annual", "Trend (L)", "Trend (Q)"), col=c("black", "green", "red"), lty=c(1, 2, 2), lwd=c(2, 1, 1), bty='n')
```

2. \textit{Fitting linear and quadratic trends to the log series.}
\newline
Remember, just because the linear trend fits a particular series most closely does not mean the linear trend will also fit the log of that series most closely. The same principle applies to quadratic trends.
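A small worked case makes this concrete. The sketch below assumes a made-up annual series that grows at exactly 5% per year (all `toy...` names are hypothetical): a straight line fits its level poorly, yet fits its log essentially perfectly, and the slope of that log-linear trend is the log growth rate.

```r
## Toy annual series growing at exactly 5% per year (hypothetical data)
toylevel <- ts(100 * 1.05^(0:29), start = 1990)
toyyear <- as.numeric(time(toylevel))

## Linear trend fitted to the level and to the log of the same series
toylevelfit <- lm(as.numeric(toylevel) ~ toyyear)
toylogfit <- lm(log(as.numeric(toylevel)) ~ toyyear)

## The slope of the log-linear trend recovers the log growth rate, log(1.05)
toyslope <- unname(coef(toylogfit)[2])
print(c(slope = toyslope, loggrowth = log(1.05)))

## Largest absolute residual: sizeable for the level, essentially zero for the log
print(c(level = max(abs(resid(toylevelfit))), log = max(abs(resid(toylogfit)))))
```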
\newline

```{r echo = FALSE}
## Creating the window for the plots
par(mfrow=c(1,3))
## Plotting the log series alongside the corresponding linear and quadratic trends
## Legend colours, line types and widths are listed explicitly so they match the lines drawn
plot(monthlylogseries, main="Monthly Log Series", xlab="Time", ylab="Expenditure (Log)", lwd=2)
lines(monthlylogserieslintrend, col="green", lwd=1, lty=2)
lines(monthlylogseriesquadtrend, col="red", lwd=1, lty=2)
legend("topleft", c("Monthly", "Trend (L)", "Trend (Q)"), col=c("black", "green", "red"), lty=c(1, 2, 2), lwd=c(2, 1, 1), bty='n')
plot(quarterlylogseries, main="Quarterly Log Series", xlab="Time", ylab="Expenditure (Log)", lwd=2)
lines(quarterlylogserieslintrend, col="green", lwd=1, lty=2)
lines(quarterlylogseriesquadtrend, col="red", lwd=1, lty=2)
legend("topleft", c("Quarterly", "Trend (L)", "Trend (Q)"), col=c("black", "green", "red"), lty=c(1, 2, 2), lwd=c(2, 1, 1), bty='n')
plot(annuallogseries, main="Annual Log Series", xlab="Time", ylab="Expenditure (Log)", lwd=2)
lines(annuallogserieslintrend, col="green", lwd=1, lty=2)
lines(annuallogseriesquadtrend, col="red", lwd=1, lty=2)
legend("topleft", c("Annual", "Trend (L)", "Trend (Q)"), col=c("black", "green", "red"), lty=c(1, 2, 2), lwd=c(2, 1, 1), bty='n')
```

\textit{The following steps use the log of the series and the trends calculated previously.}
\newline

3. \textit{Plotting the detrended series using the trend that best fit the series.}
\newline
The graphed data, shown below, is the detrended log series of the original extracted data, obtained by subtracting the quadratic trend. As one can see, the short-term fluctuations are very clearly highlighted. These fluctuations may still be visible when looking at the original dataset, however.
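Detrending, and the order-5 moving average applied to the detrended series in the next step, can be sketched on a toy log series. This is an illustration under assumptions, not the report's own code: the data is made up, every `toy...` name is hypothetical, and the trend is fitted on a simple 1 to 40 index rather than on decimal dates. `stats::filter` with five equal weights gives a centred moving average, so the first two and last two values have no full window and come back as `NA`.

```r
## Toy quarterly log series: quadratic trend plus a repeating four-quarter pattern (hypothetical data)
toyidx <- 1:40
toylog <- ts(2 + 0.03 * toyidx + 0.0005 * toyidx^2 + rep(c(0.05, -0.02, 0.01, -0.04), 10),
             frequency = 4, start = c(2000, 1))

## Detrending: subtract the fitted quadratic trend from the log series
toyfit <- lm(as.numeric(toylog) ~ toyidx + I(toyidx^2))
toydetrended <- ts(as.numeric(toylog) - unname(fitted(toyfit)),
                   frequency = 4, start = c(2000, 1))

## Centred moving average of order 5 applied to the detrended series
toyma <- stats::filter(toydetrended, filter = rep(1/5, 5))
print(round(toyma, 4))
```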
\newline

```{r echo = FALSE}
## Creating the window for the plots
par(mfrow=c(1,3))
## Plotting the detrended log series sets onto line graphs
plot((monthlylogseries-monthlylogseriesquadtrend), main="Monthly Log Series", xlab="Time", ylab="Detrended Value")
plot((quarterlylogseries-quarterlylogseriesquadtrend), main="Quarterly Log Series", xlab="Time", ylab="Detrended Value")
plot((annuallogseries-annuallogseriesquadtrend), main="Annual Log Series", xlab="Time", ylab="Detrended Value")
```

4. \textit{Computing then plotting the cyclical component using a moving average of order 5.}
\newline
The below line plot shows the detrended log series of the original extracted data, alongside its cyclical component, calculated using a moving average of order 5 on the series detrended with the quadratic trend.
\newline
As should be highlighted by the below set of plots, the extent to which the actual measured data diverges from the quadratic trend is seasonal in many cases.

```{r echo = FALSE}
## Creating the window for the plots
par(mfrow=c(2,1))
## Plotting the detrended log series with the moving average overlaid
plot((quarterlylogseries-quarterlylogseriesquadtrend), main="Quarterly Log Series", xlab="Time", ylab="Expenditure (Log)")
lines(filter((quarterlylogseries-quarterlylogseriesquadtrend), filter=rep(1/5,5)), col="red", lty=2)
plot((annuallogseries-annuallogseriesquadtrend), main="Annual Log Series", xlab="Time", ylab="Expenditure (Log)")
lines(filter((annuallogseries-annuallogseriesquadtrend), filter=rep(1/5,5)), col="red", lty=2)
```

5. \textit{Plotting the low frequency component of the series.}
\newline
As can be seen from the low frequency component of the series, when one ignores the 'noise' in the data, some information becomes easier to identify, like what effect a given seasonal timeframe has.

```{r echo = FALSE}
par(mfrow=c(2,1))
## Low frequency component = quadratic trend + moving average of the detrended series
plot(quarterlylogseriesquadtrend+(filter((quarterlylogseries-quarterlylogseriesquadtrend), filter=rep(1/5,5))), main="Quarterly Log - Low Freq. Comp.", xlab="Time", ylab="Expenditure (Log)")
plot(annuallogseriesquadtrend+(filter((annuallogseries-annuallogseriesquadtrend), filter=rep(1/5,5))), main="Annual Log - Low Freq. Comp.", xlab="Time", ylab="Expenditure (Log)")
```

6. \textit{Computing the seasonal component and representing it on a bar chart.}
\newline
As can be seen from the seasonal bar graph shown below, any seasonal differences that may exist are now much more visible than they were previously.
\newline

```{r echo = FALSE}
## The quarterly series has four seasons per year, so its frequency is set to 4
decompa <- ts(quarterlylogseries-quarterlylogseriesquadtrend, frequency=4)
decompb <- ts(annuallogseries-annuallogseriesquadtrend, frequency=12)
decompquart <- decompose(decompa, filter=rep(1/13,13))
decompannu <- decompose(decompb, filter=rep(1/13,13))
par(mfrow=c(2,1))
barplot(decompquart$seasonal[1:4], main="Seasonal Component (Quarterly)", xlab="Season", ylab="Seasonal Component")
barplot(decompannu$seasonal[1:4], main="Seasonal Component (Annual)", xlab="Season", ylab="Seasonal Component")
```

\Large
\textbf{Section C: Comovement}
\normalsize

1. \textit{Creating a scatter plot of a series expressed in logs against another series also expressed in logs.}
\newline

```{r echo = FALSE}
## File location - Change this to the location of the file, including the filename itself
filepath = "/home/user/Downloads/data2.csv"
## First line - Change this to the number of lines that must be skipped before the data begins
filestart = 0
## Ignored characters - Change this to any string or character in the dataset that should be ignored
fileignore = "ignore"
## Naming - TRUE means the first line lists the name(s) of each column. FALSE means the first line is just data.
filenamed = TRUE
## Columns - Which column of data would you like to select?
filecolumn = "expenditure"
## Export location - Change this to the location where you would like the outputted files to be saved.
exportpath = "/home/user/Downloads/"
## Exported Filename - Change this to whatever name you would like your exported file to have.
exportfile = "export"
###
### Data selection input block - This block is how the user selects things like the timeframe they are interested in looking at.
###
## Stock versus Flow - Set to 0.5 if you would like to use flow variables and 1 if you would like to use stock variables
flowstockset <- 0.5
## The beginning of the chosen data. For exceptionally large datasets, this allows only a portion to be parsed initially.
datastartyear <- 1950
datastartquarter <- 1
datastartmonth <- 1
datastartday <- 15
## The beginning of the chosen timeframe. This allows one to analyse just one given subset of the imported data.
startyear <- 1950
startquarter <- 1
startmonth <- 1
startday <- 15
## Index times - To do things like calculate the Index Base 100, select your chosen timeframe here.
## Remember - To select just one year, the starts and ends should match. Different values imply a range other than 1 year.
indexstartyear <- 2000
indexstartquarter <- 1
indexstartmonth <- 3
indexstartday <- 15
indexendyear <- 2000
indexendquarter <- 1
indexendmonth <- 3
indexendday <- 15
## The end of the chosen timeframe. This allows one to analyse just one given subset of the imported data.
endyear <- 2020
endquarter <- 4
endmonth <- 12
endday <- 15
## The end of the chosen data. For exceptionally large datasets, this allows only a portion to be parsed initially.
dataendyear <- 2020
dataendquarter <- 4
dataendmonth <- 12
dataendday <- 15
###
### File data loading block - This logic parses and partially sanitizes data contained within a file of any format supported by Rio
###
## Parsing the arbitrary file to a variable, skipping any rows that need to be skipped
dataraw <- import(filepath, skip=filestart, header=filenamed)
## Exporting the contents of the variable to a temp file in the CSV format
export(dataraw, "temp.csv")
## Declaring the path to this temp file as a string
path <- "temp.csv"
## Importing the temp file using native CSV importing functions in R
dataimport2 <- read.csv(path, na.strings=fileignore, header=TRUE)
## Deleting the temp file now that it is no longer needed
file.remove(path)
###
### Data sanitizing block - This logic continues the process of sanitizing the imported data
###
## Column prep - Parsing possible column names into their own variables - This is purely for the sake of making the code easier to read and understand
monthcheck <- "month"
monthscheck <- "months"
mcheck <- "m"
quartercheck <- "quarter"
quarterscheck <- "quarters"
qcheck <- "q"
yearcheck <- "year"
yearscheck <- "years"
ycheck <- "y"
## Column & Row import - Parsing the actual names of the existing columns
columnnames <- tolower(colnames(dataimport2))
rownames <- tolower(rownames(dataimport2))
## Row name checks - Determining whether the table needs to be rotated by checking the names of the table rows
monthcheckpass <- monthcheck[monthcheck %in% rownames]
monthscheckpass <- monthscheck[monthscheck %in% rownames]
mcheckpass <- mcheck[mcheck %in% rownames]
quartercheckpass <- quartercheck[quartercheck %in% rownames]
quarterscheckpass <- quarterscheck[quarterscheck %in% rownames]
qcheckpass <- qcheck[qcheck %in% rownames]
yearcheckpass <- yearcheck[yearcheck %in% rownames]
yearscheckpass <- yearscheck[yearscheck %in% rownames]
ycheckpass <- ycheck[ycheck %in% rownames]
## Row name sanitization - Ensuring the row names on rotated
## tables are better standardized for easier data manipulation
## (the row names of dataimport2 itself are renamed, matched case-insensitively)
if(length(ycheckpass) > 0) { rownames(dataimport2)[tolower(rownames(dataimport2)) == "y"] <- "year" }
if(length(yearscheckpass) > 0) { rownames(dataimport2)[tolower(rownames(dataimport2)) == "years"] <- "year" }
if(length(qcheckpass) > 0) { rownames(dataimport2)[tolower(rownames(dataimport2)) == "q"] <- "quarter" }
if(length(quarterscheckpass) > 0) { rownames(dataimport2)[tolower(rownames(dataimport2)) == "quarters"] <- "quarter" }
if(length(mcheckpass) > 0) { rownames(dataimport2)[tolower(rownames(dataimport2)) == "m"] <- "month" }
if(length(monthscheckpass) > 0) { rownames(dataimport2)[tolower(rownames(dataimport2)) == "months"] <- "month" }
## Table transposition - Ensuring rotated tables are reoriented correctly
## (kept as a data frame so that columns can still be selected by name afterwards)
if(length(c(monthcheckpass, quartercheckpass, yearcheckpass)) > 0) {
  dataimport2 <- as.data.frame(t(dataimport2))
  columnnames <- tolower(colnames(dataimport2))
}
## Column name checks - Determining whether or not column names referenced during column prep exist in the imported table
monthcheckpass <- monthcheck[monthcheck %in% columnnames]
monthscheckpass <- monthscheck[monthscheck %in% columnnames]
mcheckpass <- mcheck[mcheck %in% columnnames]
quartercheckpass <- quartercheck[quartercheck %in% columnnames]
quarterscheckpass <- quarterscheck[quarterscheck %in% columnnames]
qcheckpass <- qcheck[qcheck %in% columnnames]
yearcheckpass <- yearcheck[yearcheck %in% columnnames]
yearscheckpass <- yearscheck[yearscheck %in% columnnames]
ycheckpass <- ycheck[ycheck %in% columnnames]
## Column name sanitization - Ensuring the column names are better standardized for easier data manipulation
if(length(ycheckpass) > 0) { names(dataimport2)[tolower(names(dataimport2)) == "y"] <- "year" }
if(length(yearscheckpass) > 0) { names(dataimport2)[tolower(names(dataimport2)) == "years"] <- "year" }
if(length(qcheckpass) > 0) { names(dataimport2)[tolower(names(dataimport2)) == "q"] <- "quarter" }
if(length(quarterscheckpass) > 0) { names(dataimport2)[tolower(names(dataimport2)) == "quarters"] <- "quarter" }
if(length(mcheckpass) > 0) { names(dataimport2)[tolower(names(dataimport2)) == "m"] <- "month" }
if(length(monthscheckpass) > 0) { names(dataimport2)[tolower(names(dataimport2)) == "months"] <-
"month" }
## Column name checks - Determining whether or not the standardized column names are present
## (the names are re-read here because the sanitization above may have renamed them)
columnnames <- tolower(colnames(dataimport2))
monthcheckpass <- monthcheck[monthcheck %in% columnnames]
quartercheckpass <- quartercheck[quartercheck %in% columnnames]
## Month insertion - If quarters are present yet months are not, a months column is inserted and populated.
if(length(quartercheckpass) > 0) {
  if(!(length(monthcheckpass) > 0)) {
    colnames(dataimport2) <- tolower(colnames(dataimport2))
    newtablemanip2 <- NULL
    newtablemanipyear2 <- ts(data=dataimport2["year"])
    newtablemanipquarter2 <- ts(data=dataimport2["quarter"])
    newtablemanipvalue2 <- ts(data=dataimport2[3])
    yearmanip2 <- tsqm(newtablemanipyear2)
    quartermanip2 <- tsqm(newtablemanipquarter2)
    valuemanip2 <- tsqm(newtablemanipvalue2)
    yearlonger2 <- time(yearmanip2)
    quarterlonger2 <- time(quartermanip2)
    valuelonger2 <- time(valuemanip2)
    yearvals2 <- as.vector(yearmanip2)
    quartervals2 <- as.vector(quartermanip2)
    valuevals2 <- as.vector(valuemanip2)
    yearA2 <- as.yearmon(yearlonger2)
    yearB2 <- as.integer(floor(yearlonger2))
    yearC2 <- months(yearA2, abbreviate=TRUE)
    quarterA2 <- as.yearmon(quarterlonger2)
    quarterB2 <- as.integer(floor(quarterlonger2))
    quarterC2 <- months(quarterA2, abbreviate=TRUE)
    valueA2 <- as.yearmon(valuelonger2)
    valueB2 <- as.integer(floor(valuelonger2))
    valueC2 <- months(valueA2, abbreviate=TRUE)
    yearlongest2 <- data.frame(year=yearB2, month=yearC2, value=yearvals2)
    quarterlongest2 <- data.frame(year=quarterB2, month=quarterC2, value=quartervals2)
    valuelongest2 <- data.frame(year=valueB2, month=valueC2, value=valuevals2)
    warpeddata2 <- data.frame(year=yearlongest2$value, quarter=quarterlongest2$value, month=valueC2, value=valuelongest2$value)
    ## The value column takes the first original column name that is not a date column
    namechecker2 <- tolower(colnames(dataimport2))
    namechecker <- namechecker2[namechecker2 != "month"]
    namechecker <- namechecker[namechecker != "year"]
    namechecker <- namechecker[namechecker != "quarter"]
    names(warpeddata2)[names(warpeddata2) == "value"] <- namechecker[1]
    export(warpeddata2, "temp.csv")
    temppath2 <- "temp.csv"
    dataimport2 <- read.csv(temppath2, na.strings=fileignore, header=TRUE)
    file.remove(temppath2)
  }
}
###
### Data manipulation block - This logic attempts to provide the user with a suite of data analysis functions
###
## Time series - Monthly, Quarterly, & Yearly (built from the second dataset, dataimport2)
monthlytimeseries2 <- ts(dataimport2[filecolumn], frequency=12, start=c(datastartyear,datastartmonth), end=c(dataendyear,dataendmonth))
quarterlytimeseries2 <- ts(dataimport2[filecolumn], frequency=4, start=c(datastartyear,datastartmonth), end=c(dataendyear,dataendmonth))
annualtimeseries2 <- ts(dataimport2[filecolumn], frequency=1, start=c(datastartyear,datastartmonth), end=c(dataendyear,dataendmonth))
## Time series - Decimal Date Formatting
monthlytimeseriesdec2 <- time(monthlytimeseries2, offset=flowstockset)
quarterlytimeseriesdec2 <- time(quarterlytimeseries2, offset=flowstockset)
annualtimeseriesdec2 <- time(annualtimeseries2, offset=flowstockset)
monthlytimeseriesquad2 <- monthlytimeseriesdec2^2
quarterlytimeseriesquad2 <- quarterlytimeseriesdec2^2
annualtimeseriesquad2 <- annualtimeseriesdec2^2
## Nets & Averages - Total Change over the full time Series, and average values
monthlytimesum2 <- sum(monthlytimeseries2, na.rm=TRUE)
quarterlytimesum2 <- sum(quarterlytimeseries2, na.rm=TRUE)
annualtimesum2 <- sum(annualtimeseries2, na.rm=TRUE)
monthlytimeavg2 <- mean(monthlytimeseries2, na.rm=TRUE)
quarterlytimeavg2 <- mean(quarterlytimeseries2, na.rm=TRUE)
annualtimeavg2 <- mean(annualtimeseries2, na.rm=TRUE)
## Log series - Monthly, Quarterly, & Yearly. These help us find the log approximations of growth rate.
monthlylogseries2 <- log(monthlytimeseries2)
quarterlylogseries2 <- log(quarterlytimeseries2)
annuallogseries2 <- log(annualtimeseries2)
## Log series - Decimal Date Formatting
monthlylogseriesdec2 <- time(monthlylogseries2, offset=flowstockset)
quarterlylogseriesdec2 <- time(quarterlylogseries2, offset=flowstockset)
annuallogseriesdec2 <- time(annuallogseries2, offset=flowstockset)
monthlylogseriesquad2 <- monthlylogseriesdec2^2
quarterlylogseriesquad2 <- quarterlylogseriesdec2^2
annuallogseriesquad2 <- annuallogseriesdec2^2
## Scatter plot of the first log series against the second log series
par(mfrow=c(1,1))
plot(quarterlylogseries, quarterlylogseries2, main="DAT128 X DAT11 (Log Series)", xlab="DAT128 (Log)", ylab="DAT11 (Log)")
```

```{r include=FALSE}
### Rendering block - Input this to the R console in order to generate a PDF from this file
## rmarkdown::render("/home/user/Downloads/Report.rmd", output_file="/home/user/Downloads/Report.pdf")
```
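The scatter plot in Section C shows comovement visually; a correlation and a fitted slope put numbers on it. The sketch below is an illustration under assumptions rather than part of the report: both series are made up (they are not DAT128 or DAT11) and every `toy...` name is hypothetical. With both series in logs, the slope can be read as the percentage change in one series associated with a 1% change in the other.

```r
## Two toy quarterly series that both trend upwards (hypothetical data)
toyt <- 0:39
toya <- ts(100 * 1.02^toyt, frequency = 4, start = c(2000, 1))
toyb <- ts(50 * 1.03^toyt * (1 + 0.05 * sin(toyt)), frequency = 4, start = c(2000, 1))

## Both series expressed in logs, as in the scatter plot
toyloga <- log(toya)
toylogb <- log(toyb)

## Correlation of the logs: how tightly the scatter clusters around a straight line
toycor <- cor(as.numeric(toyloga), as.numeric(toylogb))

## Slope of log(b) on log(a): percentage change in b associated with a 1% change in a
toyslope <- unname(coef(lm(as.numeric(toylogb) ~ as.numeric(toyloga)))[2])
print(c(correlation = toycor, slope = toyslope))
```

A word of caution on interpretation: two series that both trend upwards will show a high correlation in levels or logs even when they are unrelated, so comparing the detrended series or the growth rates is often the more informative check.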