From this article, a beginner to an advanced R user will be able to reproduce relevant date-related columns from given dates and learn how they are applied in a time series visualisation.
- Linux Arch system
- Installed R
- Date data
Setting up the environment
The core packaged used in the above analysis are the
tidyverse, lubridate, dplyr & date:
Install the packages with:
After installing the packages, load the packages in the R environment:
Set the working directory where your date data has been stored:
Now import the date data:
The attach() function attaches the data to the R search path. It is important to note the type of file will determine the import file code. For this case, the date file in a .csv file. For an excel file use:
Below is the view of the data in the .csv file.
The date.csv file has two variables; Date and Time.
For simple and direct analysis of dates, ensure they are saved in the format “2021-02-23” such that:
“2021” is the year(yyyy)
“02” is the month(mm)
“23” the day(dd)
From the Dates column, the Year, Month, Week, Day columns are to be produced. We also would like to format the date into the “mmddyyyy” format (i.e. without the “-” hyphens). The Time variable is recorded in 12hr format.
Set the Date variable as “Date” datatype;
The class() function returns dataype of a variable.
From the Date column produce the year, month and day(numeric) variable and store them in a dataframe df_date by running:
As noted from the above from the Date, the above code produces the year, month & day columns.
Now, to format days and months from numeric values to abbreviations as well as produces the weeks of the year and the formated “mmddyyyy” column run:
Let’s break down the above:
df_date <- df_date…: Assigns/mounts the final output of the code on the previous output.
add_column(): a tidyverse function that enables one to add a new column(s) to the data frame date and assign values to the columns. One is also able to position the new column.
- Column names and values:
D_time = paste(df_date$month,df_date$day,df_date$year, sep = ""
The variable name for this column is “D_time” and the assigned values are formed by merging the month, day & years columns from the date dataframe.
The output of this column is: 2232021 where (2,23,2021: month, day, year)
Year = df_date$year
Pastes the values of the year variables in date output to the new variable “Year”
Month = month.abb[df_date$month]
date$monthrefers to the numeric monthly assignments of the month column in the date output. The month.abb transforms these numerical months into abbreviations. (Jan, Feb, Mar, …)
Week = paste(lubridate::isoweek(df_date$Date), sep = "")
isoweek()function from the lubridate package returns the numeric week of the year of a given date according to ISO 8601.
Day = weekdays(df_date$Date, abbreviate = TRUE)
weekdays()function returns the day of the week. The
abbreviate = TRUEfunctions to return the abbreviated day of the week. (Mon, Tue, …)
.after = "Date"
The function places the added columns after the Date column in date output.
Now drop the numeric month and day variables and the duplicated “year”:
The final output for the date data will be:
To change the time format from a 12hrs to a 24hrs system, do:
To have a MySQL and a Kibana kind of timestamp, merge the D_time and the 24hrs converted time.
One should have the below output as the final output of the transformed date and time date:
Write the output to a .csv file in your local repository:
Application of Dates & Time
Dates and time are well known for showing real-time periodical trends i.e. events monitored in units of time. In real-time applications, dates are used in the visualisation of stock trends from yahoo finance. Here, we will visualize Apple Inc
(TSLA) & Nike
(NKE) stock trends in the markets using different time units.
To start off, import the relevant database and visualisation packages:Error when loading gists from https://gist.github.com/.
quantmod package is designed to assist the quantitative trader in the development, testing,
and deployment of statistically based trading models. in this package, one can directly extract realtime data from Yahoo Finance.
ggplot2 & plotly will help in visualisation while
maggrittr and broom will assist in general data manipulation.
Now set the period of data you need for visualisation:
quantmod to get the stock prices for Nike, Tesla & Apple. This will automatically generate three xts objects with an index column to reference the dates:
Now create a dataframe that only captures the “adjusted price column of each stock company. This data frame is transformed into a time series object with the function as.xts():
Before plotting the stocks:
- Assigning names to the columns given
- Define the index column as a date.
Date is now ready for visualisation:
1. Daily Stock Price trends
ggplot() in ggplot2 package will help in visualizing the To visualize this. To view the Apple, Tesla and Nike stock trends run:
While the dates on the x-axis have been summarised in years the daily values can be visualised by the
ggplotly() which is from the
The ggplotly(stocks_series) produces a live visual for tracing daily trends:
2. For Tesla (TSLA) trends:
3. Monthly averages for the AAPL and NKE stocks
Unlike the daily trends, the time series below shows the monthly average trend for the
AAPL & NKE stocks. You will notice that the
stocks data only has the index as dates. On that note, one will first have to extract months from the date and add it as a variable after changing the
xts object to a dataframe. Thereafter find monthly averages and plot them. To do this run:
The compiled R code can be found is the git repository.